Minimum Spanning Tree Algorithms
A Minimum Spanning Tree (MST) is the cheapest way to connect every node in a weighted, undirected graph without forming any cycles. MST algorithms show up constantly in network design problems: laying cable between buildings, connecting routers, or planning road systems where you want full connectivity at minimum cost.
This section covers the two classic MST algorithms, Prim's and Kruskal's, along with the properties that make MSTs work.

Minimum spanning tree properties
An MST is a subset of edges from a weighted, connected, undirected graph that connects all vertices with the lowest possible total edge weight. Because it's a tree, it contains no cycles, and it always has exactly V − 1 edges for V vertices.
Whether a graph has one unique MST or several valid ones depends on the edge weights:
- Distinct edge weights guarantee a unique MST.
- Duplicate edge weights can produce multiple valid MSTs (each with the same total weight, but using different edges).
Two properties are especially useful for proving correctness of MST algorithms:
- Cut property: Take any cut (a partition of vertices into two groups). The minimum-weight edge crossing that cut must be in the MST. This is why greedy choices work: picking the lightest crossing edge is always safe.
- Cycle property: For any cycle in the graph, the heaviest edge in that cycle is not in the MST. If it were included, you could swap it out for a cheaper edge and reduce total weight.
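These properties are easy to see on a small graph. The sketch below is a brute-force check (not an MST algorithm): it enumerates every spanning tree of a 4-cycle with duplicated edge weights and confirms that more than one tree achieves the minimum total. The graph and helper names are illustrative.

```python
from itertools import combinations

# A square with two pairs of duplicate weights, so multiple MSTs exist.
# Vertices 0-3; edges as (u, v, weight).
edges = [(0, 1, 1), (1, 2, 1), (2, 3, 2), (3, 0, 2)]


def is_spanning_tree(edge_subset, num_vertices):
    """True if the edge subset connects all vertices with no cycle."""
    if len(edge_subset) != num_vertices - 1:
        return False
    parent = list(range(num_vertices))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v, _ in edge_subset:
        ru, rv = find(u), find(v)
        if ru == rv:          # edge closes a cycle
            return False
        parent[ru] = rv
    return True


# Enumerate all spanning trees, then keep the minimum-weight ones.
trees = [t for t in combinations(edges, 3) if is_spanning_tree(t, 4)]
best = min(sum(w for _, _, w in t) for t in trees)
msts = [t for t in trees if sum(w for _, _, w in t) == best]
print(len(msts), best)  # 2 4 — two distinct MSTs, same total weight
```

Both minimum trees drop one of the two weight-2 edges, which is exactly the cycle property in action: the heaviest edge on the 4-cycle is never kept.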

Steps of Prim's algorithm
Prim's grows the MST outward from a single starting vertex, always attaching the nearest reachable vertex. Think of it as expanding a connected region one vertex at a time.
- Initialize an empty set to hold MST vertices. Assign a key value to every vertex: set the starting vertex's key to 0 and all others to infinity.
- While there are vertices not yet in the MST:
- Select the vertex with the smallest key value that isn't in the MST yet.
- Add that vertex to the MST set.
- For each neighbor of the newly added vertex, check: is the edge weight to that neighbor less than the neighbor's current key? If so, update the neighbor's key to the smaller weight (and record which edge produced it).
- Stop when all vertices are in the MST. The recorded edges form the tree.
The key values act as "best known cost to reach this vertex from the MST so far." A priority queue (typically a binary heap) makes the selection step efficient by quickly extracting the minimum-key vertex.
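The steps above can be sketched in Python with the standard-library heapq module as the priority queue, using lazy deletion (stale heap entries are skipped on pop instead of being decreased in place). The function name and adjacency-list format are illustrative choices, not a fixed API.

```python
import heapq


def prim_mst(adj, start=0):
    """Prim's algorithm with a binary heap (lazy-deletion variant).

    adj: adjacency list {vertex: [(neighbor, weight), ...]}.
    Returns (total_weight, mst_edges). Assumes the graph is connected.
    """
    visited = set()
    mst_edges = []
    total = 0
    # Heap entries are (key, vertex, parent); the start vertex has key 0.
    heap = [(0, start, None)]
    while heap:
        key, v, parent = heapq.heappop(heap)
        if v in visited:
            continue                 # stale entry: a cheaper key already won
        visited.add(v)
        total += key
        if parent is not None:
            mst_edges.append((parent, v, key))
        for u, w in adj[v]:
            if u not in visited:     # offer edge (v, u) as a candidate key
                heapq.heappush(heap, (w, u, v))
    return total, mst_edges


graph = {
    0: [(1, 4), (2, 1)],
    1: [(0, 4), (2, 2), (3, 5)],
    2: [(0, 1), (1, 2), (3, 8)],
    3: [(1, 5), (2, 8)],
}
weight, tree = prim_mst(graph)
print(weight)  # 8: edges (0,2,1), (2,1,2), (1,3,5)
```

Lazy deletion trades a slightly larger heap for much simpler code than a true decrease-key operation; the asymptotic bound stays O(E log V) since the heap never holds more than E entries.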
Steps of Kruskal's algorithm
Kruskal's takes a different approach: instead of growing from one vertex, it considers all edges globally, cheapest first, and merges components together.
- Sort all edges in non-decreasing order of weight.
- Initialize an empty edge set for the MST. Create a disjoint-set (union-find) data structure where each vertex starts in its own component.
- Iterate through the sorted edges, from lightest to heaviest:
- For the current edge (u, v), use Find to check whether u and v are in the same component.
- If they're in different components, add the edge to the MST and call Union to merge their components.
- If they're already in the same component, skip the edge (adding it would create a cycle).
- Stop once the MST contains V − 1 edges.
Cycle detection here is handled entirely by the union-find structure. Two vertices in the same set are already connected, so any edge between them would close a cycle.
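These steps can be sketched as follows, with union-find implemented using path compression and union by size. The function name and edge format are illustrative; the sample graph is the same one used informally throughout this section.

```python
def kruskal_mst(num_vertices, edges):
    """Kruskal's algorithm with union-find.

    edges: list of (weight, u, v). Returns (total_weight, mst_edges).
    """
    parent = list(range(num_vertices))
    size = [1] * num_vertices

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression (halving)
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra == rb:
            return False                   # same component: would close a cycle
        if size[ra] < size[rb]:
            ra, rb = rb, ra
        parent[rb] = ra                    # attach smaller tree under larger
        size[ra] += size[rb]
        return True

    total = 0
    mst_edges = []
    for w, u, v in sorted(edges):          # lightest edge first
        if union(u, v):
            mst_edges.append((u, v, w))
            total += w
            if len(mst_edges) == num_vertices - 1:
                break                      # tree is complete, stop early
    return total, mst_edges


edges = [(4, 0, 1), (1, 0, 2), (2, 1, 2), (5, 1, 3), (8, 2, 3)]
weight, tree = kruskal_mst(4, edges)
print(weight)  # 8: edges (0,2,1), (1,2,2), (1,3,5)
```

Putting the weight first in each tuple lets the plain `sorted` call order edges correctly with no key function, and the early break skips any remaining heavy edges once the tree is complete.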
Prim's vs Kruskal's algorithms
Both algorithms are greedy: they make locally optimal choices that lead to a globally optimal MST. When all edge weights are distinct, they produce the exact same tree.
Time complexity
| Algorithm | Complexity | Breakdown |
|---|---|---|
| Prim's (binary heap) | O(E log V) | Each edge relaxation and vertex extraction costs O(log V) |
| Kruskal's | O(E log E) | Dominated by sorting; union-find operations run in O(α(V)) amortized per call, which is nearly constant |

Since E ≤ V², log E is at most 2 log V, so both algorithms have the same asymptotic order, O(E log V).
When to use which
- Prim's works well on dense graphs (many edges) because it processes vertices via a priority queue and pairs naturally with an adjacency matrix. It doesn't need to sort all edges upfront.
- Kruskal's tends to be faster in practice on sparse graphs (few edges) because sorting a small edge list is cheap, and the union-find operations are nearly constant time.
If you're choosing between them on an exam, a good rule of thumb: dense graph → Prim's, sparse graph → Kruskal's. If edge weights are already sorted, Kruskal's gets even cheaper since you skip the sorting step.