DSA Blog Series - Part 3: Graphs
🕸 Graphs (Where Paths and Choices Multiply)
21. Graph Representation: Adjacency List vs Matrix
Graphs model relationships between entities—social networks, road maps, web pages, neural networks. A graph consists of vertices (nodes) and edges (connections between nodes).
Two main representations:
Adjacency Matrix: A 2D array where matrix[i][j] = 1 if there's an edge from vertex i to j, else 0. For weighted graphs, store the weight instead of 1.
Pros: O(1) edge lookup—instantly check if vertices are connected. Great for dense graphs. Cons: O(V²) space even for sparse graphs. Iterating over neighbors takes O(V) time.
Adjacency List: An array of lists where list[i] contains all neighbors of vertex i. For weighted graphs, store (neighbour, weight) pairs.
Pros: O(V + E) space—efficient for sparse graphs. Iterating neighbors takes O(degree) time. Most real-world graphs are sparse. Cons: O(degree) edge lookup—must scan the neighbour list.
When to use which?
- Dense graphs (E ≈ V²): Adjacency matrix
- Sparse graphs (E << V²): Adjacency list (most common)
- Frequent edge existence queries: Matrix
- Frequent neighbour iteration: List
Most algorithms (BFS, DFS, Dijkstra) work better with adjacency lists. Social networks, road maps, and web graphs are sparse, making lists the go-to choice.
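As a quick sketch, both representations can be built from the same edge list (the function names and the dict-of-lists layout here are just illustrative choices, assuming vertices are numbered 0 to V-1):

```python
from collections import defaultdict

def build_adj_list(num_vertices, edges):
    """Adjacency list for an undirected graph: vertex -> list of neighbours."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return adj

def build_adj_matrix(num_vertices, edges):
    """Adjacency matrix for an undirected graph: matrix[u][v] = 1 if edge exists."""
    matrix = [[0] * num_vertices for _ in range(num_vertices)]
    for u, v in edges:
        matrix[u][v] = matrix[v][u] = 1
    return matrix

edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
adj = build_adj_list(4, edges)    # adj[2] -> [0, 1, 3]
mat = build_adj_matrix(4, edges)  # mat[0][1] -> 1, mat[0][3] -> 0
```

Note the trade-off in action: `mat[u][v]` answers an edge query in O(1), while `adj[u]` hands you exactly the neighbours of u with no wasted scanning.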
22. Breadth-First Search (BFS)
BFS explores a graph level by level, like ripples spreading in water. It visits all neighbors of a vertex before moving to their neighbors.
Algorithm:
- Start at a source vertex, mark it visited
- Add it to a queue
- While queue isn't empty:
  - Dequeue a vertex
  - Visit all its unvisited neighbours
  - Mark them visited and enqueue them
Time complexity: O(V + E)—visit each vertex once, explore each edge once.
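The steps above can be sketched in Python (a minimal version, assuming a dict-of-lists adjacency representation; the `bfs_distances` name is illustrative):

```python
from collections import deque

def bfs_distances(adj, source):
    """Return shortest edge-counts from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()           # FIFO: process in discovery order
        for v in adj[u]:
            if v not in dist:         # unvisited neighbour
                dist[v] = dist[u] + 1 # one level deeper than u
                queue.append(v)
    return dist

adj = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}
# bfs_distances(adj, 0) -> {0: 0, 1: 1, 2: 1, 3: 2, 4: 3}
```

Because vertices leave the queue in order of distance, the first time BFS reaches a vertex is along a shortest path.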
Key properties:
- BFS finds the shortest path (fewest edges) in unweighted graphs
- Visits vertices in order of distance from source
- Uses a queue (FIFO structure)
Applications:
- Shortest path in unweighted graphs: GPS navigation in equal-weight road networks
- Level-order traversal: Process tree nodes level by level
- Connected components: Find all isolated groups in a network
- Peer-to-peer networks: Locate closest nodes
- Social networks: Find degrees of separation (friends of friends)
- Web crawlers: Systematically explore websites
Example: In a social network, BFS from person A finds:
- Level 1: Direct friends
- Level 2: Friends of friends
- Level 3: Friends of friends of friends
BFS is your first tool for exploring graphs systematically. Its guarantee of shortest paths in unweighted graphs makes it indispensable.
23. Depth-First Search (DFS)
DFS explores a graph by going as deep as possible before backtracking. It's like exploring a maze by following one path until you hit a dead end, then backtracking.
Algorithm (recursive):
- Mark current vertex as visited
- For each unvisited neighbour:
  - Recursively call DFS on that neighbour
Algorithm (iterative):
- Push source vertex onto a stack
- While stack isn't empty:
  - Pop a vertex
  - If not visited, mark it visited
  - Push all unvisited neighbours onto the stack
Time complexity: O(V + E)—same as BFS but explores differently.
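Both variants can be sketched briefly (illustrative names, same dict-of-lists adjacency representation as before):

```python
def dfs_recursive(adj, u, visited=None):
    """Recursive DFS: mark u, then recurse into each unvisited neighbour."""
    if visited is None:
        visited = set()
    visited.add(u)
    for v in adj[u]:
        if v not in visited:
            dfs_recursive(adj, v, visited)
    return visited

def dfs_iterative(adj, source):
    """Iterative DFS using an explicit stack (LIFO)."""
    visited, stack = set(), [source]
    while stack:
        u = stack.pop()
        if u not in visited:
            visited.add(u)
            for v in adj[u]:
                if v not in visited:
                    stack.append(v)
    return visited

adj = {0: [1, 2], 1: [0], 2: [0, 3], 3: [2], 4: []}
# Both return the set of vertices reachable from 0: {0, 1, 2, 3}
```

Note that vertex 4 is never reached — a reminder that traversal from one source only covers that source's connected component.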
Key properties:
- Uses a stack (LIFO structure) or recursion
- May not find the shortest path
- Explores branches fully before moving to siblings
- A natural fit for "explore all possibilities" problems
Applications:
- Cycle detection: Check if a graph contains cycles
- Topological sorting: Order tasks with dependencies
- Path finding: Find any path between two vertices
- Maze solving: Explore until you find the exit
- Connected components: Identify isolated groups
- Graph coloring: Assign colors so no neighbors share colors
DFS vs BFS:
- BFS: Shortest path, level-by-level exploration, uses queue
- DFS: Path existence, exhaustive search, uses stack/recursion
Both are O(V + E), but the choice depends on the problem. Need shortest path? BFS. Need to explore all possibilities or check for cycles? DFS.
24. Cycle Detection in Graphs
Detecting cycles is crucial in many applications: finding deadlocks, validating dependencies, checking if a graph is a tree.
In Directed Graphs (using DFS):
Track three states for each vertex:
- White: Unvisited
- Gray: Currently being explored (in recursion stack)
- Black: Fully explored
Algorithm:
- Start DFS from any vertex
- Mark it gray
- For each neighbour:
  - If neighbour is gray: Cycle detected! (back edge)
  - If neighbour is white: Recursively explore
- Mark current vertex black
A back edge (edge to an ancestor in DFS tree) indicates a cycle.
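The three-state idea translates directly (a minimal sketch; the `has_cycle` name and integer color constants are illustrative):

```python
WHITE, GRAY, BLACK = 0, 1, 2

def has_cycle(adj):
    """Detect a cycle in a directed graph via DFS coloring."""
    color = {u: WHITE for u in adj}

    def visit(u):
        color[u] = GRAY               # u is on the current DFS path
        for v in adj[u]:
            if color[v] == GRAY:      # back edge to an ancestor: cycle
                return True
            if color[v] == WHITE and visit(v):
                return True
        color[u] = BLACK              # fully explored, safe
        return False

    return any(color[u] == WHITE and visit(u) for u in adj)

# has_cycle({0: [1], 1: [2], 2: [0]}) -> True   (0 -> 1 -> 2 -> 0)
# has_cycle({0: [1], 1: [2], 2: []})  -> False  (a DAG)
```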
In Undirected Graphs:
Simpler approach—during DFS, if you encounter a visited neighbour that isn't the parent, a cycle exists.
Why it matters: If you visit B from A, then later visit A from B (and B wasn't A's parent), you've found a cycle.
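The undirected check is a few lines of DFS (a sketch under the same adjacency-list assumption; the function name is illustrative):

```python
def has_cycle_undirected(adj):
    """DFS: a visited neighbour that isn't the parent means a cycle."""
    visited = set()

    def visit(u, parent):
        visited.add(u)
        for v in adj[u]:
            if v not in visited:
                if visit(v, u):
                    return True
            elif v != parent:         # already seen, and not the edge we came by
                return True
        return False

    return any(u not in visited and visit(u, None) for u in adj)

# Triangle 0-1-2 has a cycle; the path 0-1-2 does not.
```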
Applications:
- Deadlock detection: Circular waits in resource allocation
- Dependency validation: Circular dependencies in build systems
- Course scheduling: Can prerequisites be satisfied?
- Network routing: Avoid routing loops
Special case—DAG (Directed Acyclic Graph): Graphs with no cycles. Many algorithms (topological sort, dynamic programming on graphs) only work on DAGs.
Cycle detection is a fundamental graph property check. Master it for both directed and undirected cases.
25. Topological Sorting
Topological sorting orders vertices in a directed acyclic graph (DAG) such that for every edge u → v, u comes before v. Think of it as scheduling tasks with dependencies.
Example: You can't wear shoes before socks. Topological sort gives: socks → shoes.
Algorithm 1: DFS-based (most elegant)
- Perform DFS on all unvisited vertices
- After exploring all neighbours of a vertex, push it onto a stack
- The stack (reversed) is the topological order
Why it works: A vertex is pushed only after all its descendants are processed, ensuring dependencies are satisfied.
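A minimal sketch of the DFS-based approach (illustrative name; assumes the input is a DAG, so no cycle check is done here):

```python
def topo_sort_dfs(adj):
    """DFS-based topological sort of a DAG."""
    visited, order = set(), []

    def visit(u):
        visited.add(u)
        for v in adj[u]:
            if v not in visited:
                visit(v)
        order.append(u)           # pushed only after all descendants

    for u in adj:
        if u not in visited:
            visit(u)
    return order[::-1]            # reverse the "stack" to get the order

deps = {'a': ['b', 'c'], 'b': ['d'], 'c': ['d'], 'd': []}
# Any valid result puts 'a' before 'b' and 'c', and both before 'd'.
```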
Algorithm 2: Kahn's Algorithm (BFS-based)
- Calculate in-degree (incoming edges) for all vertices
- Add all vertices with in-degree 0 to a queue
- While queue isn't empty:
  - Dequeue a vertex, add it to the result
  - Decrement the in-degree of its neighbours
  - Enqueue any neighbour whose in-degree becomes 0
Cycle detection bonus: If topological sort produces fewer than V vertices, the graph has a cycle.
Time complexity: O(V + E) for both approaches.
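Kahn's algorithm, including the cycle-detection bonus, can be sketched as (illustrative name, same adjacency-list assumption):

```python
from collections import deque

def kahns_topo_sort(adj):
    """Kahn's algorithm; returns None if the graph contains a cycle."""
    in_degree = {u: 0 for u in adj}
    for u in adj:
        for v in adj[u]:
            in_degree[v] += 1
    queue = deque(u for u in adj if in_degree[u] == 0)
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in adj[u]:
            in_degree[v] -= 1
            if in_degree[v] == 0:     # all of v's prerequisites are done
                queue.append(v)
    # Fewer than V vertices processed means a cycle blocked progress.
    return order if len(order) == len(adj) else None

# kahns_topo_sort({'a': ['b'], 'b': ['c'], 'c': []}) -> ['a', 'b', 'c']
# kahns_topo_sort({0: [1], 1: [2], 2: [0]})          -> None (cycle)
```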
Real-world applications:
- Course scheduling: Take prerequisites before advanced courses
- Build systems: Compile dependencies before targets (Make, Maven)
- Task scheduling: Execute tasks in dependency order
- Event scheduling: Organise events with timing constraints
- Spreadsheet formula evaluation: Calculate cells in correct order
Key insight: Topological sort only works on DAGs. If there's a cycle, no valid ordering exists—you can't have mutual dependencies.
26. Shortest Path Algorithms (Dijkstra, Bellman-Ford)
Finding the shortest path between vertices is one of the most practical graph problems. Different algorithms handle different constraints.
Dijkstra's Algorithm:
It works on graphs with non-negative edge weights and takes a greedy approach.
Algorithm:
- Initialise distances: source = 0, all others = infinity
- Use a min-heap (priority queue) starting with source
- While heap isn't empty:
  - Extract vertex u with minimum distance
  - For each neighbour v:
    - If distance[u] + weight(u, v) < distance[v]:
      - Update distance[v]
      - Add v to heap
Time complexity: O((V + E) log V) with min-heap.
Key insight: Greedy works because negative weights don't exist—once a vertex is processed, we've found its shortest path.
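A minimal sketch using Python's `heapq` (assuming a weighted adjacency list of `(neighbour, weight)` pairs; the function name is illustrative):

```python
import heapq

def dijkstra(adj, source):
    """adj: {u: [(v, weight), ...]}; returns shortest distances from source."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float('inf')):
            continue                  # stale heap entry, u already settled
        for v, w in adj[u]:
            nd = d + w
            if nd < dist.get(v, float('inf')):
                dist[v] = nd          # relax the edge u -> v
                heapq.heappush(heap, (nd, v))
    return dist

adj = {'a': [('b', 1), ('c', 4)], 'b': [('c', 2)], 'c': []}
# dijkstra(adj, 'a') -> {'a': 0, 'b': 1, 'c': 3}   (a -> b -> c beats a -> c)
```

Rather than removing entries from the heap when a shorter path is found, this version lazily skips stale entries on pop — a common trick with `heapq`, which has no decrease-key operation.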
Bellman-Ford Algorithm:
Handles negative edge weights. Detects negative cycles.
Algorithm:
- Initialise distances: source = 0, others = infinity
- Relax all edges V-1 times:
  - For each edge u → v:
    - If distance[u] + weight(u, v) < distance[v]:
      - Update distance[v]
- Check for negative cycles: If any edge can still be relaxed, a negative cycle exists
Time complexity: O(V × E)—slower but more versatile.
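The relaxation loop is short enough to show whole (a sketch assuming an edge-list input of `(u, v, weight)` triples and integer vertex labels; the name is illustrative):

```python
def bellman_ford(edges, num_vertices, source):
    """edges: list of (u, v, w). Returns distances, or None on a negative cycle."""
    dist = [float('inf')] * num_vertices
    dist[source] = 0
    for _ in range(num_vertices - 1):   # relax all edges V-1 times
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    for u, v, w in edges:               # still relaxable => negative cycle
        if dist[u] + w < dist[v]:
            return None
    return dist

# bellman_ford([(0, 1, 4), (0, 2, 1), (2, 1, -2)], 3, 0) -> [0, -1, 1]
# bellman_ford([(0, 1, 1), (1, 0, -2)], 2, 0)            -> None
```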
When to use which:
- Non-negative weights: Dijkstra (faster)
- Negative weights possible: Bellman-Ford
- All-pairs shortest paths: Floyd-Warshall (different approach, O(V³))
Applications: GPS navigation, network routing, arbitrage detection (negative cycles in currency exchange), resource optimisation.
27. Minimum Spanning Tree (Prim's & Kruskal's)
A spanning tree connects all vertices with the minimum number of edges (V-1 edges, no cycles). A Minimum Spanning Tree (MST) does this with minimum total edge weight.
Problem: Connect all cities with roads minimising total construction cost.
Prim's Algorithm (vertex-centric):
Grows the MST one vertex at a time.
Algorithm:
- Start with any vertex in the MST
- Maintain a min-heap of edges connecting MST to non-MST vertices
- Repeat until all vertices included:
  - Extract minimum weight edge
  - If it connects to a new vertex, add it to MST
  - Add new edges from this vertex to heap
Time complexity: O(E log V) with min-heap.
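A compact sketch of Prim's (illustrative name; assumes an undirected weighted adjacency list and returns just the total MST weight for brevity):

```python
import heapq

def prims_mst(adj, start=0):
    """adj: {u: [(v, w), ...]} undirected; returns total MST weight."""
    in_mst, total = {start}, 0
    heap = [(w, v) for v, w in adj[start]]
    heapq.heapify(heap)
    while heap and len(in_mst) < len(adj):
        w, u = heapq.heappop(heap)
        if u in in_mst:
            continue                  # edge leads back into the tree, skip
        in_mst.add(u)
        total += w
        for v, w2 in adj[u]:
            if v not in in_mst:
                heapq.heappush(heap, (w2, v))
    return total

# Square graph: 0-1 (1), 1-2 (2), 2-3 (1), 3-0 (4). MST weight = 1+2+1 = 4.
square = {0: [(1, 1), (3, 4)], 1: [(0, 1), (2, 2)],
          2: [(1, 2), (3, 1)], 3: [(2, 1), (0, 4)]}
```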
Kruskal's Algorithm (edge-centric):
Considers edges in order of increasing weight.
Algorithm:
- Sort all edges by weight
- Initialise each vertex as a separate component (Union-Find)
- For each edge in sorted order:
  - If it connects two different components:
    - Add edge to MST
    - Merge components (union operation)
- Stop when MST has V-1 edges
Time complexity: O(E log E) = O(E log V) since E ≤ V².
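Kruskal's pairs naturally with Union-Find, covered in the next section; here is a sketch with a bare-bones inline version (illustrative name; edges are `(weight, u, v)` triples so that `sorted` orders them by weight):

```python
def kruskals_mst(num_vertices, edges):
    """edges: list of (w, u, v); returns total MST weight via Union-Find."""
    parent = list(range(num_vertices))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees flat
            x = parent[x]
        return x

    total, used = 0, 0
    for w, u, v in sorted(edges):          # cheapest edges first
        ru, rv = find(u), find(v)
        if ru != rv:                       # connects two different components
            parent[ru] = rv
            total += w
            used += 1
            if used == num_vertices - 1:   # MST complete
                break
    return total

# Same square graph as before: MST weight = 4.
```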
Which to use?
- Dense graphs: Prim's (more edges to process)
- Sparse graphs: Kruskal's (sorting fewer edges)
- In practice, performance is similar
Real-world applications:
- Network design: Minimise cable/pipeline length
- Approximation algorithms: TSP approximation uses MST
- Cluster analysis: Minimise within-cluster distances
- Circuit design: Minimise wire length
Both algorithms are greedy and always produce an MST. The choice depends on graph density and implementation preference.
28. Union-Find / Disjoint Set Union (DSU)
Union-Find is a data structure that efficiently tracks a partition of elements into disjoint sets. It supports two operations: finding which set an element belongs to and merging two sets.
Operations:
- Find(x): Return the representative (root) of x's set
- Union(x, y): Merge the sets containing x and y
Naive implementation: Each element points to its parent. Find follows parent pointers to root. Union makes one root point to another.
Problem: Trees can become skewed, making Find O(n).
Optimisation 1: Union by Rank. Always attach the shorter tree under the taller tree's root. This keeps trees balanced, ensuring O(log n) operations.
Optimisation 2: Path Compression. During Find, make every node on the path point directly to the root. This flattens the tree, making future operations faster.
Combined: Union by rank + path compression gives nearly O(1) amortised time—specifically O(α(n)) where α is the inverse Ackermann function, which grows incredibly slowly (≤ 4 for any practical n).
Implementation sketch (Python-style):

parent[i] = i initially (each element is its own set)
rank[i] = 0 initially

def find(x):
    if parent[x] != x:
        parent[x] = find(parent[x])  # path compression
    return parent[x]

def union(x, y):
    root_x, root_y = find(x), find(y)
    if root_x == root_y:
        return
    # union by rank: attach the shorter tree under the taller root
    if rank[root_x] < rank[root_y]:
        root_x, root_y = root_y, root_x
    parent[root_y] = root_x
    if rank[root_x] == rank[root_y]:
        rank[root_x] += 1
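Packaged as a small self-contained class, the same logic looks like this (the `DSU` name and the component-counting demo are illustrative):

```python
class DSU:
    def __init__(self, n):
        self.parent = list(range(n))   # each element starts as its own set
        self.rank = [0] * n

    def find(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])  # path compression
        return self.parent[x]

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return False               # already in the same set
        if self.rank[rx] < self.rank[ry]:  # union by rank
            rx, ry = ry, rx
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1
        return True

dsu = DSU(5)
dsu.union(0, 1)
dsu.union(3, 4)
# Components are now {0, 1}, {2}, {3, 4} — three sets, three roots.
```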
Applications:
- Kruskal's MST algorithm: Check if adding edge creates cycle
- Connected components: Track which vertices are connected
- Image segmentation: Group similar pixels
- Network connectivity: Check if nodes can communicate
- Percolation theory: Model fluid flow through porous materials
Union-Find is elegant, efficient, and appears in countless algorithms. It's a must-know for graph problems and competitive programming.
29. Advanced Graph Topics Preview
As you progress beyond fundamentals, several advanced graph concepts become important:
Strongly Connected Components (SCCs): In directed graphs, subgraphs where every vertex is reachable from every other. Found using Kosaraju's or Tarjan's algorithm in O(V + E). Applications: analysing web link structure, finding dependency loops.
Articulation Points and Bridges: Vertices/edges whose removal disconnects the graph. Critical for network reliability—identify single points of failure. Found using modified DFS in O(V + E).
Bipartite Graphs: Graphs whose vertices can be divided into two sets where edges only connect vertices from different sets. Detected using graph coloring (2-color test) with BFS/DFS. Applications: matching problems, scheduling, resource allocation.
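The 2-color test mentioned above is a short BFS (a sketch under the usual adjacency-list assumption; the function name is illustrative):

```python
from collections import deque

def is_bipartite(adj):
    """2-color the graph with BFS; same-color neighbours mean not bipartite."""
    color = {}
    for start in adj:                     # outer loop handles disconnected graphs
        if start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in color:
                    color[v] = 1 - color[u]   # opposite color to u
                    queue.append(v)
                elif color[v] == color[u]:    # edge inside one color class
                    return False
    return True

# An even cycle (square) is bipartite; an odd cycle (triangle) is not.
```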
Network Flow: Maximum flow through a network from source to sink. Ford-Fulkerson method, Edmonds-Karp algorithm. Applications: traffic routing, supply chain optimisation, bipartite matching.
Traveling Salesman Problem (TSP): Visit all vertices once with minimum cost. NP-hard, but solvable approximately or with dynamic programming for small graphs. Applications: route optimisation, circuit board drilling.
Graph Coloring: Assign colors to vertices so no adjacent vertices share colors. NP-hard for general graphs, but polynomial for special cases. Applications: register allocation in compilers, frequency assignment, scheduling.
All-Pairs Shortest Paths: Floyd-Warshall algorithm finds shortest paths between all pairs of vertices in O(V³). Applications: network analysis, transitive closure.
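Floyd-Warshall fits in a few lines — three nested loops over a distance matrix (a sketch; `inf` marks missing edges and the matrix is updated in place):

```python
def floyd_warshall(dist):
    """dist: V x V matrix (inf for no edge, 0 on diagonal); updated in place."""
    n = len(dist)
    for k in range(n):                    # allow k as an intermediate vertex
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist

INF = float('inf')
dist = [[0, 3, INF],
        [INF, 0, 1],
        [7, INF, 0]]
floyd_warshall(dist)
# dist is now [[0, 3, 4], [8, 0, 1], [7, 10, 0]]
```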
These topics build on the fundamentals you've learned. Each solves specific real-world problems and appears in technical interviews and competitive programming.
30. Graph Algorithms in Practice
Understanding when and how to apply graph algorithms separates theoretical knowledge from practical problem-solving skills.
Problem identification:
- Multiple entities with relationships → Graph representation
- Finding paths or connections → BFS/DFS
- Optimisation (shortest/minimum) → Dijkstra, MST, network flow
- Ordering with dependencies → Topological sort
- Grouping related items → Connected components, Union-Find
Common problem patterns:
Pattern 1: "Can we reach B from A?" → BFS or DFS
Pattern 2: "What's the shortest/cheapest way?" → Dijkstra (positive weights), Bellman-Ford (negative weights), BFS (unweighted)
Pattern 3: "Connect everything minimally" → MST (Prim's or Kruskal's)
Pattern 4: "Process tasks with dependencies" → Topological sort, check for cycles
Pattern 5: "Are these items connected?" → Union-Find, connected components
Pattern 6: "Find all paths/combinations" → DFS with backtracking
Real-world examples:
Social networks: Graphs where users are vertices, friendships are edges. BFS finds degrees of separation. Community detection uses clustering algorithms.
Maps and navigation: Road networks as graphs. Dijkstra's finds shortest routes. Real-time updates require dynamic algorithms.
Compilers: Data-flow analysis, control-flow graphs, register allocation (graph coloring), instruction scheduling (topological sort).
Web search: PageRank uses graph algorithms to rank web pages by link structure.
Recommendation systems: Bipartite graphs connect users and items. Graph algorithms find patterns and make predictions.
Implementation tips:
- Choose the right representation (usually adjacency list)
- Handle disconnected graphs (run algorithm on all components)
- Watch for cycles in directed graphs
- Consider edge weights (positive, negative, or zero)
- Optimise for your graph's density and size
Graphs are everywhere in computing. Mastering graph algorithms opens doors to solving complex real-world problems across all domains.
What's Next?
You've now conquered graphs—one of the most versatile and powerful data structures. From simple traversals to sophisticated shortest path algorithms, these techniques solve countless real-world problems.
Practice Tips:
- Implement BFS and DFS from scratch
- Solve graph problems on LeetCode (start with medium difficulty)
- Visualise algorithms using graph drawing tools
- Practice identifying when a problem is actually a graph problem
- Implement Union-Find with both optimisations