Why This Matters
Minimum Spanning Tree (MST) algorithms answer a deceptively simple question: how do you connect everything with the least total cost? This concept powers real-world applications from network design and circuit layout to clustering algorithms and approximation schemes for NP-hard problems. You'll encounter MSTs whenever efficiency meets connectivity, making them a cornerstone of algorithm design.
Understanding these algorithms means grasping more than just their steps. You're being tested on greedy algorithm strategies, data structure selection, and time complexity trade-offs. Don't just memorize which algorithm sorts edges first; know why certain approaches excel on sparse versus dense graphs, how the cut property guarantees correctness, and when to reach for each tool.
Greedy Edge-Selection Algorithms
These algorithms build the MST by greedily selecting edges based on weight. Their correctness relies on the cut property: for any cut of the graph (a partition of vertices into two non-empty sets), the minimum-weight edge crossing that cut must belong to some MST.
Kruskal's Algorithm
- Sorts all edges by weight globally, then processes them in ascending order. This makes it edge-centric rather than growing from a single source.
- Uses a disjoint-set (union-find) data structure to track connected components. When considering an edge, you check whether its two endpoints are already in the same component. If they are, adding the edge would create a cycle, so you skip it. If not, you union the two components. With path compression and union by rank, each operation runs in nearly constant time (inverse Ackermann).
- Time complexity: O(E log E), dominated by the initial sort. This makes it a strong choice for sparse graphs where E is much smaller than V².
Steps:
1. Sort all edges by weight (ascending).
2. Initialize each vertex as its own component using union-find.
3. For each edge (in sorted order), check if its endpoints belong to different components.
4. If yes, add the edge to the MST and union the two components.
5. If no, skip it (it would create a cycle).
6. Stop once you've added V − 1 edges.
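The steps above can be sketched in Python. This is a minimal illustration, not a production implementation; the function name, the `(weight, u, v)` edge format, and the example graph are all hypothetical:

```python
# Kruskal's algorithm: sort edges globally, use union-find to reject cycles.
# Assumes vertices are numbered 0..num_vertices-1 and the graph is connected.
def kruskal(num_vertices, edges):
    """Return (mst_edges, total_weight); edges are (weight, u, v) tuples."""
    parent = list(range(num_vertices))
    rank = [0] * num_vertices

    def find(x):
        # Path compression (halving): shortcut pointers toward the root.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(x, y):
        rx, ry = find(x), find(y)
        if rx == ry:
            return False              # same component: edge would create a cycle
        if rank[rx] < rank[ry]:
            rx, ry = ry, rx
        parent[ry] = rx               # union by rank: attach shorter under taller
        if rank[rx] == rank[ry]:
            rank[rx] += 1
        return True

    mst, total = [], 0
    for w, u, v in sorted(edges):     # the O(E log E) sort dominates
        if union(u, v):
            mst.append((u, v, w))
            total += w
            if len(mst) == num_vertices - 1:
                break                 # stop at V - 1 edges
    return mst, total
```

With path compression and union by rank, each `find`/`union` runs in near-constant (inverse-Ackermann) time, so the sort dominates as the bullet above notes.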
Reverse-Delete Algorithm
- Works backward from the full edge set. Instead of adding cheap edges, it removes expensive ones. It considers edges in decreasing weight order and deletes each one unless doing so would disconnect the graph.
- Time complexity depends on the connectivity check. Sorting the edges takes O(E log E), but each candidate deletion also requires verifying that the graph stays connected; with a naive BFS/DFS check per edge the total is O(E(V + E)). Even with sophisticated dynamic-connectivity structures, it is typically slower in practice than the near-constant union-find lookups Kruskal's uses.
Compare: Kruskal's vs. Reverse-Delete: both begin by sorting every edge by weight, but Kruskal's builds up by adding safe edges while Reverse-Delete tears down by removing unnecessary ones. If asked to explain greedy MST strategies, Kruskal's is your cleaner example; Reverse-Delete demonstrates that greedy approaches can work in multiple directions.
Vertex-Growing Algorithms
These algorithms grow the MST outward from a starting vertex, repeatedly adding the cheapest edge that connects the current tree to a new vertex. The priority queue is their central data structure.
Prim's Algorithm
- Grows from a single seed vertex. At each step, it picks the minimum-weight edge connecting the current tree to a vertex not yet in the tree, then adds that vertex and edge.
- Priority queue is essential. Using a binary heap gives O((E + V) log V) complexity. Switching to a Fibonacci heap improves this to O(E + V log V), which matters on dense graphs where E is large relative to V.
- Excels on dense graphs. When E ≈ V², the vertex-focused approach avoids the expensive edge sort that Kruskal's requires.
Steps:
1. Pick any starting vertex and add it to the tree.
2. Insert all edges from that vertex into the priority queue.
3. Extract the minimum-weight edge from the queue. If it connects to an unvisited vertex, add that vertex and edge to the tree.
4. Insert all edges from the newly added vertex that lead to unvisited vertices into the queue.
5. Repeat steps 3-4 until all vertices are in the tree.
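The steps above can be sketched with Python's binary-heap `heapq` module (a minimal illustration; the function name and `(weight, u, v)` edge format are hypothetical, and a connected graph is assumed):

```python
import heapq
from collections import defaultdict

def prim(num_vertices, edges, start=0):
    """Grow the MST from `start`, always taking the cheapest crossing edge."""
    adj = defaultdict(list)
    for w, u, v in edges:
        adj[u].append((w, v))
        adj[v].append((w, u))

    visited = {start}
    heap = [(w, start, v) for w, v in adj[start]]  # edges leaving the seed
    heapq.heapify(heap)
    mst, total = [], 0
    while heap and len(visited) < num_vertices:
        w, u, v = heapq.heappop(heap)      # minimum-weight edge in the queue
        if v in visited:
            continue                       # stale entry: endpoint already in tree
        visited.add(v)
        mst.append((u, v, w))
        total += w
        for w2, nxt in adj[v]:             # new frontier edges
            if nxt not in visited:
                heapq.heappush(heap, (w2, v, nxt))
    return mst, total
```

Stale heap entries are skipped lazily rather than updated in place, a common simplification when a decrease-key operation isn't available.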
Jarník's Algorithm (Prim-Jarník Algorithm)
- Historically the original vertex-growing approach. Jarník published this method in 1930, predating Prim's 1957 rediscovery.
- Identical mechanics to Prim's in every way: same priority queue usage, same growth pattern, same time complexity.
- The name appears in theoretical contexts. Knowing this is the same algorithm helps avoid confusion when reading academic sources or proofs that reference "Jarník's algorithm."
Compare: Prim's vs. Kruskal's is the classic MST showdown. Prim's is vertex-centric (grow the tree), Kruskal's is edge-centric (sort and merge). Choose Prim's for dense graphs, especially with an adjacency matrix representation. Choose Kruskal's for sparse graphs or when edges arrive pre-sorted.
Component-Merging Algorithms
This approach treats the graph as a forest of single-vertex trees and repeatedly merges components by their minimum outgoing edges.
Borůvka's Algorithm
- Processes all components simultaneously. In each phase, every component identifies its minimum-weight outgoing edge (the cheapest edge connecting it to a different component), and all such edges are added at once.
- Logarithmic phases guarantee efficiency. Because the number of components at least halves each round (every component merges with at least one neighbor), the algorithm completes in at most O(log V) phases. Each phase scans all edges once, giving a total time complexity of O(E log V).
- Highly parallelizable. Since each component's minimum-edge search is independent of the others, these operations can run in parallel. This makes Borůvka's the natural choice for distributed computing and GPU implementations.
Steps:
1. Initialize each vertex as its own component.
2. For each component, find the minimum-weight edge connecting it to a different component.
3. Add all such edges (handling ties consistently to avoid issues with non-unique edge weights).
4. Merge the connected components.
5. Repeat steps 2-4 until only one component remains.
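The phases above can be sketched as follows (a minimal illustration assuming a connected graph; the function name and `(weight, u, v)` edge format are hypothetical). Comparing full `(w, u, v)` tuples provides the consistent tie-breaking needed so duplicate weights can't create cycles:

```python
def boruvka(num_vertices, edges):
    """Merge every component along its cheapest outgoing edge, phase by phase."""
    parent = list(range(num_vertices))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    mst, total, components = [], 0, num_vertices
    while components > 1:
        best = {}                          # component root -> cheapest outgoing edge
        for w, u, v in edges:              # one O(E) scan per phase
            ru, rv = find(u), find(v)
            if ru == rv:
                continue                   # internal edge, not outgoing
            for root in (ru, rv):
                if root not in best or (w, u, v) < best[root]:
                    best[root] = (w, u, v)
        for w, u, v in best.values():      # add all chosen edges at once
            ru, rv = find(u), find(v)
            if ru != rv:                   # pair may already have merged this phase
                parent[ru] = rv
                mst.append((u, v, w))
                total += w
                components -= 1
    return mst, total
```

Each `while` iteration is one phase; because the component count at least halves per phase, the loop runs O(log V) times.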
Compare: Borůvka's vs. Prim's: both grow trees, but Prim's grows one tree sequentially while Borůvka's grows all trees in parallel. Borůvka's O(E log V) complexity can outperform Prim's on certain graph structures, and its parallel nature makes it the go-to choice for large-scale distributed systems.
Quick Reference Table
| Characteristic | Algorithm(s) |
| --- | --- |
| Edge-centric greedy selection | Kruskal's, Reverse-Delete |
| Vertex-growing with priority queue | Prim's, Jarník's |
| Parallel component merging | Borůvka's |
| Best for sparse graphs | Kruskal's, Borůvka's |
| Best for dense graphs | Prim's, Jarník's |
| Uses disjoint-set (union-find) | Kruskal's, Reverse-Delete |
| Uses priority queue | Prim's, Jarník's |
| Suitable for distributed computing | Borůvka's |
Self-Check Questions
- Which two algorithms both begin by sorting every edge by weight, yet approach the problem from opposite directions?
- If you're given a dense graph represented as an adjacency matrix, which algorithm would you choose and why? What data structure makes this choice efficient?
- Compare Kruskal's and Prim's in terms of their greedy strategies, data structures, and ideal graph types.
- Why is Borůvka's algorithm particularly well-suited for distributed computing, and what property of its execution enables parallelization?
- An exam question asks you to prove that a greedy MST algorithm produces an optimal solution. Which fundamental property of MSTs would you invoke, and how does it guarantee that locally optimal edge choices lead to a globally optimal tree?