Fiveable

🧠Thinking Like a Mathematician Unit 10 Review

10.7 Greedy algorithms

Written by the Fiveable Content Team • Last updated August 2025

Greedy algorithm fundamentals

A greedy algorithm solves a problem by making the best-looking choice at each step, without worrying about what comes later. This approach is surprisingly effective for many optimization problems, though it doesn't always guarantee the best possible answer.

Definition and characteristics

A greedy algorithm works through an iterative decision-making process: at each step, it picks whatever option looks best right now, then moves on and never revisits that choice.

Key traits of greedy algorithms:

  • They build a solution incrementally, one piece at a time
  • Each choice is irrevocable: once you pick it, you're committed
  • They're typically simple to implement and computationally efficient
  • They don't always produce the globally optimal solution

The core idea is straightforward: if you always make the locally optimal choice, you hope it leads to a globally optimal result. Sometimes it does, sometimes it doesn't. Knowing when greedy works is the real skill.

Optimization problems

Greedy algorithms are most commonly applied to optimization problems, where you're trying to maximize or minimize some objective (shortest path, lowest cost, most items selected, etc.).

They work best when the problem has two properties:

  • Greedy choice property: making the locally best choice actually leads to a global optimum
  • Optimal substructure: the optimal solution contains optimal solutions to its subproblems

Classic examples include finding a minimum spanning tree (Kruskal's algorithm) and computing shortest paths (Dijkstra's algorithm). For NP-hard problems like the traveling salesman problem, greedy algorithms won't find the exact best answer, but they can provide reasonable approximate solutions quickly.

Local vs global optima

A local optimum is the best solution among nearby alternatives. A global optimum is the best solution among all possible solutions.

The fundamental risk with greedy algorithms is getting trapped at a local optimum. Because greedy never backtracks, it can settle on a solution that looks great compared to its neighbors but falls short of the true best answer. This is the core trade-off: greedy algorithms are fast, but they sacrifice the guarantee of finding the absolute best solution.

Other techniques like simulated annealing and genetic algorithms are specifically designed to escape local optima, though they come with their own trade-offs in complexity and runtime.

Design principles

Three principles determine whether a greedy approach will actually work for a given problem. If you can verify these hold, you can be confident the greedy solution is correct.

Greedy choice property

This property means that choosing the locally optimal option at each step will eventually produce a globally optimal solution. In other words, you never need to undo a previous decision.

This is what separates problems where greedy works from problems where it doesn't. For example, Huffman coding has this property: assigning the shortest codes to the most frequent symbols at each step always leads to an optimal overall encoding.

Proving the greedy choice property is essential for showing that a greedy algorithm is correct. Without it, you're just hoping for the best.

Optimal substructure

A problem has optimal substructure if the optimal solution to the whole problem contains optimal solutions to its subproblems.

Consider shortest paths: if the shortest path from A to C goes through B, then the A-to-B portion of that path must itself be the shortest path from A to B. If it weren't, you could swap in a shorter A-to-B path and get a shorter overall route, which contradicts your starting assumption.

This property also shows up in dynamic programming. The difference is that greedy algorithms commit to one choice immediately, while dynamic programming explores multiple subproblem solutions before deciding.

Iterative decision-making

Greedy algorithms make decisions sequentially and permanently:

  1. Evaluate the available options
  2. Pick the one that looks best according to your greedy criterion
  3. Commit to that choice (no going back)
  4. Repeat with the remaining problem

The greedy criterion you choose matters enormously. In Kruskal's algorithm, the criterion is "pick the lowest-weight edge that doesn't create a cycle." A different criterion could produce a completely different (and possibly wrong) result.

Common greedy algorithms

These three algorithms appear frequently in courses and interviews. Each one illustrates a different way greedy logic can be applied.

Kruskal's algorithm

Kruskal's algorithm finds the minimum spanning tree of a weighted, undirected graph (the subset of edges that connects all vertices with the lowest total weight).

How it works:

  1. Sort all edges by weight, from lowest to highest
  2. Pick the lowest-weight edge that doesn't create a cycle
  3. Add it to the spanning tree
  4. Repeat until you've connected all vertices (you'll have V - 1 edges)

To efficiently check whether adding an edge creates a cycle, Kruskal's uses a disjoint-set (union-find) data structure.

  • Time complexity: O(E log E) where E is the number of edges
  • Applications: network design, clustering, image segmentation
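The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a production implementation: the function name `kruskal` is ours, vertices are assumed to be numbered 0 to n-1, and edges are (weight, u, v) tuples.

```python
def kruskal(n, edges):
    # Union-find over vertices 0..n-1, used to detect cycles.
    parent = list(range(n))

    def find(x):
        # Find the component root, compressing the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst, total = [], 0
    for w, u, v in sorted(edges):       # greedy: cheapest edge first
        ru, rv = find(u), find(v)
        if ru != rv:                    # edge joins two components: no cycle
            parent[ru] = rv             # union the components
            mst.append((u, v, w))
            total += w
    return mst, total
```

On a triangle with edge weights 1, 2, and 3, this keeps the two cheapest edges and rejects the third because it would close a cycle.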

Dijkstra's algorithm

Dijkstra's algorithm computes the shortest path from a single source vertex to every other vertex in a weighted graph (with non-negative weights).

How it works:

  1. Set the distance to the source as 0 and all other distances as infinity
  2. Mark all vertices as unvisited
  3. Pick the unvisited vertex with the smallest tentative distance
  4. Update the distances of its unvisited neighbors if a shorter path is found through the current vertex
  5. Mark the current vertex as visited
  6. Repeat until all vertices are visited
  • Time complexity: O((V + E) log V) with a binary heap
  • Applications: GPS navigation, routing protocols, social network analysis
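Here is a minimal Python sketch of those steps using a binary heap (the `heapq` module). The function name `dijkstra` and the adjacency-dict graph format `{u: [(v, weight), ...]}` are our assumptions; every vertex is assumed to appear as a key.

```python
import heapq

def dijkstra(graph, source):
    # Step 1: source distance 0, everything else infinity.
    dist = {v: float("inf") for v in graph}
    dist[source] = 0
    pq = [(0, source)]                  # (tentative distance, vertex)
    while pq:
        # Greedy step: closest unvisited vertex.
        d, u = heapq.heappop(pq)
        if d > dist[u]:                 # stale queue entry, skip it
            continue
        for v, w in graph[u]:
            if d + w < dist[v]:         # relax the edge u -> v
                dist[v] = d + w
                heapq.heappush(pq, (dist[v], v))
    return dist
```

Instead of literally marking vertices visited, this version skips stale heap entries, which is the standard idiom with Python's `heapq`.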

Huffman coding

Huffman coding creates an optimal set of prefix codes for data compression, assigning shorter binary codes to more frequent symbols.

How it works:

  1. Count the frequency of each symbol in the data
  2. Create a leaf node for each symbol and add them to a priority queue (sorted by frequency)
  3. Remove the two nodes with the lowest frequency
  4. Create a new internal node with these two as children, with frequency equal to their sum
  5. Add the new node back to the queue
  6. Repeat until only one node remains (the root of the Huffman tree)
  7. Assign codes by traversing the tree: left = 0, right = 1
  • Time complexity: O(n log n) where n is the number of unique symbols
  • Applications: JPEG, MP3, ZIP, and other compression formats
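The tree-building steps above can be sketched in Python with a priority queue. This is an illustrative sketch (the function name `huffman_codes` is ours): leaves are symbols, internal nodes are (left, right) tuples, and a counter breaks frequency ties so the heap never compares nodes directly.

```python
import heapq

def huffman_codes(freqs):
    # Step 1-2: one leaf per symbol, in a min-heap keyed by frequency.
    heap = [(f, i, s) for i, (s, f) in enumerate(sorted(freqs.items()))]
    heapq.heapify(heap)
    count = len(heap)
    if count == 1:                          # degenerate single-symbol input
        return {heap[0][2]: "0"}
    while len(heap) > 1:
        # Steps 3-5: merge the two lowest-frequency nodes.
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, count, (left, right)))
        count += 1
    # Step 7: traverse the tree, left = 0 and right = 1.
    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:
            codes[node] = prefix
    walk(heap[0][2], "")
    return codes
```

With frequencies a:5, b:2, c:1, the most frequent symbol `a` ends up with a one-bit code and the rarer symbols get two-bit codes, as expected.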

Analysis techniques

To trust a greedy algorithm, you need to prove it actually works and understand how fast it runs.

Proof of correctness

Proving a greedy algorithm is correct means showing it always produces an optimal solution, not just sometimes. Two common proof strategies:

  • Proof by contradiction: Assume the greedy solution isn't optimal. Show that any supposedly better solution can be transformed into the greedy solution without losing quality, creating a contradiction.
  • Mathematical induction: Show the greedy choice is safe at the first step, then prove that if it's safe for k steps, it's safe for k + 1 steps.

Both approaches require demonstrating the greedy choice property and optimal substructure. For example, Kruskal's algorithm is proven correct using the cut property: the lightest edge crossing any cut of the graph must belong to the minimum spanning tree.

Time complexity

Time complexity measures how an algorithm's running time grows as the input gets larger, expressed using Big O notation.

For greedy algorithms, the dominant cost is often a sorting step. The greedy selection itself is usually linear, but sorting the input first costs O(n log n). Your choice of data structure also matters: using a binary heap in Dijkstra's algorithm brings the complexity down to O((V + E) log V) compared to O(V^2) with a simple array.

Space complexity

Space complexity measures how much memory the algorithm needs beyond the input itself.

  • Kruskal's algorithm needs O(V) extra space for the disjoint-set data structure
  • Dijkstra's algorithm needs O(V) for the distance array and priority queue
  • Huffman coding needs O(n) for the tree structure

Space complexity becomes critical when working with large datasets or in memory-constrained environments.

Advantages and limitations

Efficiency vs optimality

Greedy algorithms are typically fast, often running in O(n log n) or better. The trade-off is that they don't always find the best possible solution.

When greedy does guarantee optimality (Huffman coding, Kruskal's algorithm), you get the best of both worlds: a correct answer computed efficiently. When it doesn't (the 0/1 knapsack problem, for instance), you get a fast answer that might be good enough but isn't guaranteed to be the best.

Problem suitability

Greedy works well when:

  • The problem has the greedy choice property and optimal substructure
  • You need a fast solution and can verify greedy applies

Greedy does not work well when:

  • The problem requires considering future consequences (0/1 knapsack)
  • Backtracking or exploring multiple paths is necessary
  • Local choices can lead you away from the global optimum

The activity selection problem (pick the most non-overlapping activities) is a textbook greedy success. The traveling salesman problem (visit all cities with minimum total distance) is a textbook greedy failure.

Approximation algorithms

Even when greedy can't find the exact optimal solution, it can still be useful. For NP-hard problems where exact solutions are computationally impractical, greedy algorithms often serve as approximation algorithms with guaranteed performance bounds.

For example, the greedy approach to the set cover problem achieves an approximation ratio of O(ln n), meaning its solution is at most a logarithmic factor worse than optimal. That's a strong guarantee for a problem where finding the exact answer could take exponential time.

Applications in computer science

Scheduling problems

Greedy algorithms are natural fits for scheduling:

  • Activity/interval selection: sort by finish time, always pick the earliest-finishing compatible activity. This maximizes the number of non-overlapping activities.
  • Job sequencing with deadlines: prioritize jobs by profit and schedule each in the latest available slot before its deadline.
  • Earliest Deadline First (EDF): a real-time scheduling strategy that always runs the task with the nearest deadline.

These show up in operating systems, project management, and resource allocation.

Data compression

  • Huffman coding builds optimal prefix codes for lossless compression
  • Run-length encoding replaces repeated sequences with a count and value (e.g., "AAAA" becomes "4A")
  • LZW compression greedily builds a dictionary of recurring patterns as it scans the data

These techniques underpin file formats like ZIP, GZIP, GIF, and PNG.
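Of these, run-length encoding is simple enough to sketch in full. This is a toy version (the function name `rle_encode` is ours) that greedily extends each run as far as it goes before emitting a count-and-value pair:

```python
def rle_encode(s):
    # Encode runs of repeated characters, e.g. "AAAABB" -> "4A2B".
    out, i = [], 0
    while i < len(s):
        j = i
        while j < len(s) and s[j] == s[i]:  # extend the current run
            j += 1
        out.append(f"{j - i}{s[i]}")        # emit count, then value
        i = j
    return "".join(out)
```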

Network design

  • Minimum spanning trees (Kruskal's, Prim's) find the cheapest way to connect all nodes in a network
  • Dijkstra's algorithm powers routing and navigation systems
  • Greedy channel allocation in wireless networks assigns frequencies to maximize spectrum usage

Applications span telecommunications, transportation, and computer networking.

Comparison with other approaches

Greedy vs dynamic programming

Both exploit optimal substructure, but they differ in a key way:

  • Greedy makes one choice at each step and never looks back
  • Dynamic programming considers all subproblem solutions before making a decision

Dynamic programming guarantees optimality but is often slower and uses more memory. Greedy is faster but only works when the greedy choice property holds.

The coin change problem illustrates this well. With US coin denominations (25, 10, 5, 1), greedy works perfectly: always pick the largest coin that fits. But with denominations like {4, 3, 1}, greedy fails. To make 6 cents, greedy picks 4+1+1 (three coins), but the optimal answer is 3+3 (two coins). Dynamic programming handles both cases correctly.
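The contrast is easy to see in code. Below is a minimal sketch of both approaches (the function names `greedy_coins` and `dp_coins` are ours); the same inputs show greedy succeeding on US denominations and failing on {4, 3, 1}.

```python
def greedy_coins(denoms, amount):
    # Greedy: always take the largest coin that still fits.
    coins = 0
    for d in sorted(denoms, reverse=True):
        coins += amount // d
        amount %= d
    return coins

def dp_coins(denoms, amount):
    # Dynamic programming: best[a] = fewest coins that make amount a.
    INF = float("inf")
    best = [0] + [INF] * amount
    for a in range(1, amount + 1):
        for d in denoms:
            if d <= a and best[a - d] + 1 < best[a]:
                best[a] = best[a - d] + 1
    return best[amount]
```

For amount 6 with denominations {4, 3, 1}, the greedy version returns 3 coins (4+1+1) while the DP version returns 2 (3+3); for 41 cents with US coins, both return 4.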

Greedy vs brute force

  • Brute force checks every possible solution and guarantees finding the best one
  • Greedy checks only one path through the decision space

Brute force is impractical for large inputs because the number of possible solutions grows exponentially. Greedy sacrifices the guarantee of optimality for dramatically better speed.

Greedy vs divide-and-conquer

  • Greedy builds a solution incrementally through a sequence of local decisions
  • Divide-and-conquer splits the problem into independent subproblems, solves each recursively, then combines results

Divide-and-conquer applies to a broader range of problems (merge sort, quicksort, binary search) and often guarantees optimality. Greedy tends to have simpler implementations when it applies.

Implementation strategies

Data structures for greedy algorithms

The right data structure can make or break a greedy algorithm's performance:

  • Priority queues (heaps): efficiently retrieve the minimum or maximum element. Essential for Dijkstra's algorithm and Huffman coding.
  • Disjoint-set (union-find): tracks connected components and detects cycles. Used in Kruskal's algorithm.
  • Balanced BSTs: support fast insertion, deletion, and lookup when you need sorted access.
  • Adjacency lists: represent graphs compactly for network-related algorithms.

Sorting in greedy algorithms

Many greedy algorithms start with a sorting step because the greedy criterion depends on processing elements in a specific order.

  • Activity selection sorts by finish time
  • Kruskal's sorts edges by weight
  • Fractional knapsack sorts by value-to-weight ratio

Since sorting typically costs O(n log n), it often dominates the overall time complexity. The greedy selection loop after sorting is usually just O(n).

Incremental construction

Greedy algorithms build solutions piece by piece:

  1. Start with an empty solution
  2. At each step, add the element that best satisfies the greedy criterion
  3. Update your data structures to reflect the new partial solution
  4. Stop when the solution is complete or no more valid additions exist

Kruskal's algorithm is a clean example: start with no edges, add the cheapest valid edge at each step, and stop when you have V - 1 edges.

Case studies and examples

Coin change problem

Goal: Make a given amount using the fewest coins possible.

Greedy strategy: Always pick the largest denomination that doesn't exceed the remaining amount.

With US coins (25¢, 10¢, 5¢, 1¢), making 41¢:

  1. Pick 25¢ (remaining: 16¢)
  2. Pick 10¢ (remaining: 6¢)
  3. Pick 5¢ (remaining: 1¢)
  4. Pick 1¢ (remaining: 0¢)

Result: 4 coins, which is optimal.

But with denominations {4, 3, 1}, making 6: greedy picks 4+1+1 = three coins, while 3+3 = two coins is better. This shows that greedy only works for certain coin systems. Time complexity is O(n) where n is the number of denominations.

Activity selection problem

Goal: Select the maximum number of non-overlapping activities from a set of activities with start and finish times.

Greedy strategy:

  1. Sort activities by finish time
  2. Select the first activity
  3. For each remaining activity, select it if its start time is ≥ the finish time of the last selected activity

This works because picking the earliest-finishing activity leaves the most room for future activities. The greedy choice property holds here, so this always produces an optimal solution.

  • Time complexity: O(n log n) for sorting, O(n) for selection
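The strategy above fits in a few lines of Python. This is a sketch under our own naming (`select_activities`), with activities as (start, finish) pairs:

```python
def select_activities(activities):
    # Greedy: sort by finish time, keep each activity that starts
    # no earlier than the previous pick finished.
    chosen = []
    last_finish = float("-inf")
    for start, finish in sorted(activities, key=lambda a: a[1]):
        if start >= last_finish:        # compatible with the last pick
            chosen.append((start, finish))
            last_finish = finish
    return chosen
```

On the classic instance {(1,4), (3,5), (0,6), (5,7), (3,9), (5,9), (6,10), (8,11)}, it selects (1,4), (5,7), and (8,11), which is a maximum-size compatible set.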

Fractional knapsack problem

Goal: Fill a knapsack of limited capacity with items to maximize total value. You can take fractions of items.

Greedy strategy:

  1. Calculate the value-to-weight ratio for each item
  2. Sort items by this ratio in decreasing order
  3. Take as much of the highest-ratio item as fits
  4. If there's remaining capacity, move to the next item
  5. Continue until the knapsack is full

This greedy approach is optimal for the fractional version. The 0/1 knapsack (where you must take whole items or nothing) does not have the greedy choice property, so it requires dynamic programming instead.

  • Time complexity: O(n log n) for sorting, O(n) for selection
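Those steps translate directly into Python. A minimal sketch, assuming items are (value, weight) pairs and the function name `fractional_knapsack` is ours:

```python
def fractional_knapsack(items, capacity):
    total = 0.0
    # Greedy: process items by value-to-weight ratio, best first.
    for value, weight in sorted(items, key=lambda it: it[0] / it[1],
                                reverse=True):
        if capacity <= 0:
            break
        take = min(weight, capacity)    # whole item, or the fraction that fits
        total += value * (take / weight)
        capacity -= take
    return total
```

On the standard example with items (value 60, weight 10), (100, 20), (120, 30) and capacity 50, it takes the first two items whole plus two-thirds of the third, for a total value of 240.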

Advanced topics

Matroids and greedy algorithms

A matroid is a mathematical structure that generalizes the idea of linear independence. It provides a formal framework for proving when greedy algorithms are guaranteed to find optimal solutions.

The key result: if a problem's feasible solutions form a matroid, then the greedy algorithm (selecting elements in order of weight/value) always produces an optimal solution. Minimum spanning trees and certain scheduling problems have matroid structure, which is why greedy works perfectly for them.

Randomized greedy algorithms

These algorithms introduce randomness into the greedy decision-making process. Instead of always picking the single best option, they might randomly choose among the top few options.

This can improve average-case performance and help avoid pathological worst-case inputs. Randomized rounding in set cover approximation is one example. Performance is analyzed using expected values and probability theory rather than worst-case guarantees.

Online greedy algorithms

Standard greedy algorithms see the entire input before making decisions. Online greedy algorithms must make irrevocable decisions as input arrives, without knowing what comes next.

Performance is measured using competitive analysis: comparing the online algorithm's result to what an optimal offline algorithm (with full knowledge) would achieve. Online interval scheduling, for example, can be analyzed by its competitive ratio, which bounds how much worse the online solution can be compared to the optimal.