Fiveable
🧠Thinking Like a Mathematician Unit 7 Review

7.6 Graph traversals

Written by the Fiveable Content Team • Last updated August 2025

Types of graph traversals

Graph traversals are systematic methods for visiting every vertex in a graph. They're essential because many problems in computer science and math reduce to "explore a graph and find something." The two core approaches are depth-first search (DFS) and breadth-first search (BFS), and they differ in the order they visit vertices.

Depth-first search (DFS)

DFS explores as far as possible along each branch before backtracking. Think of it like navigating a maze by always taking the next unexplored turn and only retracing your steps when you hit a dead end.

  • Uses a stack data structure (either explicitly or implicitly through recursion)
  • Prioritizes depth over breadth, reaching leaf nodes quickly
  • Useful for topological sorting, cycle detection, and exploring tree-like structures
  • Can be implemented recursively or iteratively

Breadth-first search (BFS)

BFS explores all vertices at the current depth before moving to the next depth level. Picture dropping a stone in water: the ripples spread outward evenly in all directions.

  • Uses a queue data structure to manage exploration order
  • Guarantees finding the shortest path in unweighted graphs (fewest edges)
  • Effective for level-order traversals and distance calculations
  • Typically implemented iteratively

Comparison of DFS vs BFS

Feature                    | DFS                                     | BFS
Data structure             | Stack                                   | Queue
Memory (deep graphs)       | Generally less                          | Generally more
Shortest path (unweighted) | Not guaranteed                          | Guaranteed
Implementation             | Often recursive                         | Typically iterative
Best for                   | Exploring deep structures, backtracking | Finding nearby nodes, shortest paths

The choice between them depends on what you need. If you want the shortest path in an unweighted graph, use BFS. If you need to explore exhaustively or detect cycles, DFS is often simpler.

Depth-first search implementation

DFS is a cornerstone algorithm that demonstrates recursive thinking. Its logic is simple: go deeper whenever you can, backtrack when you must. That simplicity makes it surprisingly powerful.

Stack-based implementation

You can implement DFS iteratively using an explicit stack:

  1. Push the starting vertex onto the stack and mark it as visited
  2. Pop the top vertex from the stack and process it
  3. Push all of its unvisited neighbors onto the stack, marking them as visited
  4. Repeat steps 2–3 until the stack is empty

This approach avoids recursion, which is useful when the graph is so deep that recursive calls might exceed system stack limits.
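The steps above can be sketched in Python. This is a minimal sketch; the adjacency-list dictionary and the vertex names are illustrative:

```python
def dfs_iterative(graph, start):
    """Iterative DFS with an explicit stack; returns vertices in visit order."""
    visited = {start}              # step 1: mark the start as visited
    stack = [start]                # step 1: push the start
    order = []
    while stack:                   # step 4: loop until the stack is empty
        vertex = stack.pop()       # step 2: pop and process
        order.append(vertex)
        for neighbor in graph[vertex]:
            if neighbor not in visited:
                visited.add(neighbor)   # step 3: mark on push
                stack.append(neighbor)
    return order

# Illustrative graph stored as an adjacency list
graph = {
    'A': ['B', 'C'],
    'B': ['D'],
    'C': ['D'],
    'D': [],
}
order = dfs_iterative(graph, 'A')  # ['A', 'C', 'D', 'B']
```

Note that the last-pushed neighbor is explored first, which is why 'C' is visited before 'B' here.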

Recursive implementation

The recursive version naturally mirrors DFS because each function call acts like pushing onto the call stack:

  1. Mark the current vertex as visited and process it
  2. For each unvisited neighbor, make a recursive call
  3. The base case is simply: if a vertex is already visited, return

The code tends to be shorter and more readable, but very deep graphs can cause a stack overflow.
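A recursive sketch of the same traversal, again on an illustrative graph; the call stack replaces the explicit stack:

```python
def dfs_recursive(graph, vertex, visited=None, order=None):
    """Recursive DFS; each call frame plays the role of a stack entry."""
    if visited is None:
        visited, order = set(), []
    visited.add(vertex)            # step 1: mark and process
    order.append(vertex)
    for neighbor in graph[vertex]:
        if neighbor not in visited:     # steps 2-3: recurse only on unvisited
            dfs_recursive(graph, neighbor, visited, order)
    return order

graph = {'A': ['B', 'C'], 'B': ['D'], 'C': ['D'], 'D': []}
order = dfs_recursive(graph, 'A')  # ['A', 'B', 'D', 'C']
```

Unlike the iterative version, neighbors are explored in the order they appear in the list, so 'B' comes before 'C'.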

Pre-order vs post-order

These terms describe when you process a vertex relative to its descendants:

  • Pre-order processes a vertex before exploring its children. This is useful for copying a tree structure or checking a property on the way down.
  • Post-order processes a vertex after all its children have been explored. This is the right choice for tasks like evaluating expression trees or safely deleting nodes, and reversing the post-order finish times of a DFS yields a topological ordering of a directed acyclic graph (DAG).

Time and space complexity

  • Time: O(V + E) where V is the number of vertices and E is the number of edges. Every vertex and edge gets examined exactly once in a connected graph.
  • Space: O(V) in the worst case for the stack (explicit or recursive). For trees specifically, this becomes O(h) where h is the tree's height.

Breadth-first search implementation

BFS explores a graph level by level, which makes it the natural choice whenever you care about distance from a starting point.

Queue-based implementation

Here's how BFS works step by step:

  1. Enqueue the starting vertex and mark it as visited
  2. Dequeue a vertex from the front of the queue and process it
  3. Enqueue all of its unvisited neighbors, marking each as visited
  4. Repeat steps 2–3 until the queue is empty

Because the queue is first-in-first-out, vertices are always processed in order of their distance from the start.
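The steps above can be sketched in Python using `collections.deque` as the queue; the graph and vertex names are illustrative:

```python
from collections import deque

def bfs(graph, start):
    """BFS with a FIFO queue; returns vertices in increasing-distance order."""
    visited = {start}
    queue = deque([start])         # step 1: enqueue the start
    order = []
    while queue:                   # step 4: loop until the queue is empty
        vertex = queue.popleft()   # step 2: dequeue from the front
        order.append(vertex)
        for neighbor in graph[vertex]:
            if neighbor not in visited:
                visited.add(neighbor)   # step 3: mark on enqueue
                queue.append(neighbor)
    return order

graph = {'A': ['B', 'C'], 'B': ['D'], 'C': ['D'], 'D': []}
order = bfs(graph, 'A')  # ['A', 'B', 'C', 'D']
```

Compare this with DFS on the same graph: BFS reaches 'D' last because 'D' is two edges from the start.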

Level-order traversal

BFS naturally performs a level-order traversal: it visits all vertices at distance 0 (the start), then distance 1, then distance 2, and so on. This property makes BFS ideal for:

  • Finding the minimum number of edges between two vertices
  • Problems that require processing nodes level by level (like printing a tree by depth)
  • Building a breadth-first tree that encodes shortest paths from the source

You can track which "level" you're on by recording the queue size at the start of each round.
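The queue-size trick can be sketched as follows; the example graph is illustrative:

```python
from collections import deque

def bfs_levels(graph, start):
    """Group vertices by distance from start by snapshotting the queue size."""
    visited = {start}
    queue = deque([start])
    levels = []
    while queue:
        level_size = len(queue)    # everything currently queued is one level
        level = []
        for _ in range(level_size):
            vertex = queue.popleft()
            level.append(vertex)
            for neighbor in graph[vertex]:
                if neighbor not in visited:
                    visited.add(neighbor)
                    queue.append(neighbor)
        levels.append(level)
    return levels

graph = {'A': ['B', 'C'], 'B': ['D'], 'C': ['D'], 'D': []}
levels = bfs_levels(graph, 'A')  # [['A'], ['B', 'C'], ['D']]
```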

Time and space complexity

  • Time: O(V + E), same as DFS. Each vertex and edge is visited once.
  • Space: O(V) for the queue in the worst case. For trees, this becomes O(w) where w is the maximum width of the tree. Wide, shallow graphs can make BFS memory-hungry.

Applications of traversals

Graph traversals aren't just academic exercises. They're the building blocks for solving real, complex problems.

Pathfinding algorithms

  • DFS works well for maze solving and generating game decision trees, though it won't necessarily find the shortest path.
  • BFS guarantees the shortest path in unweighted graphs, making it a go-to for simple routing problems.
  • Dijkstra's algorithm extends BFS-like exploration to weighted graphs, always expanding the lowest-cost vertex next.
  • A* search adds a heuristic estimate to Dijkstra's approach, making pathfinding faster when you have a good guess of the remaining distance. GPS navigation and game AI rely on this heavily.

Topological sorting

Topological sorting applies DFS to a directed acyclic graph (DAG) and produces a linear ordering of vertices where for every directed edge (u, v), vertex u appears before v.

This is used in task scheduling (which tasks must finish before others can start), build systems (compile dependencies in the right order), and package managers. As a bonus, if DFS encounters a back edge during this process, it means the graph has a cycle and no valid topological order exists. Time complexity: O(V + E).
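A minimal sketch of DFS-based topological sorting: record each vertex in post-order and reverse the result. The build-dependency graph and its task names are illustrative:

```python
def topological_sort(graph):
    """DFS-based topological sort of a DAG via reversed post-order."""
    visited, order = set(), []

    def visit(vertex):
        visited.add(vertex)
        for neighbor in graph[vertex]:
            if neighbor not in visited:
                visit(neighbor)
        order.append(vertex)       # post-order: after all descendants

    for vertex in graph:
        if vertex not in visited:
            visit(vertex)
    return order[::-1]             # reverse post-order is a topological order

# An edge u -> v means u must happen before v
graph = {'compile': ['link'], 'link': ['package'],
         'test': ['package'], 'package': []}
order = topological_sort(graph)
```

Any output where 'compile' precedes 'link', 'link' precedes 'package', and 'test' precedes 'package' is a valid topological order for this graph.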

Connected components

BFS or DFS can identify connected components in undirected graphs by starting a traversal from an unvisited vertex and marking everything reachable as one component, then repeating for any remaining unvisited vertices.
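That procedure can be sketched with BFS; the small two-component graph is illustrative:

```python
from collections import deque

def connected_components(graph):
    """Label each vertex of an undirected graph with a component id via BFS."""
    component = {}
    current = 0
    for start in graph:
        if start in component:     # already swept up by an earlier traversal
            continue
        current += 1               # new component: flood-fill from start
        component[start] = current
        queue = deque([start])
        while queue:
            vertex = queue.popleft()
            for neighbor in graph[vertex]:
                if neighbor not in component:
                    component[neighbor] = current
                    queue.append(neighbor)
    return component

# Two components: {A, B} and the isolated vertex C
graph = {'A': ['B'], 'B': ['A'], 'C': []}
component = connected_components(graph)
```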

For directed graphs, Kosaraju's algorithm finds strongly connected components (groups of vertices where every vertex can reach every other vertex in the group). Applications include social network community detection and image segmentation.

Cycle detection

DFS detects cycles by tracking which vertices are currently on the recursion stack (sometimes called "gray" vertices in a color-coding scheme):

  • In a directed graph, a back edge (an edge pointing to a vertex still on the recursion stack) indicates a cycle.
  • In an undirected graph, revisiting a vertex that isn't the immediate parent of the current vertex indicates a cycle.

This is important for verifying whether a graph is a tree (connected and acyclic) or a DAG, and for deadlock detection in operating systems.
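For the directed case, the recursion-stack ("gray vertex") idea can be sketched like this; both example graphs are illustrative:

```python
def has_cycle(graph):
    """Detect a cycle in a directed graph by tracking the recursion stack."""
    visited, on_stack = set(), set()

    def visit(vertex):
        visited.add(vertex)
        on_stack.add(vertex)       # "gray": currently being explored
        for neighbor in graph[vertex]:
            if neighbor in on_stack:        # back edge -> cycle
                return True
            if neighbor not in visited and visit(neighbor):
                return True
        on_stack.discard(vertex)   # "black": fully explored
        return False

    return any(vertex not in visited and visit(vertex) for vertex in graph)

acyclic = {'A': ['B'], 'B': ['C'], 'C': []}
cyclic = {'A': ['B'], 'B': ['C'], 'C': ['A']}
```

Here `has_cycle(cyclic)` is true because the edge C → A points back to a vertex still on the recursion stack, while `has_cycle(acyclic)` is false.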

Graph representation

How you store a graph in memory directly affects how efficiently traversals run. There are three common representations, each with different trade-offs.

Adjacency matrix

A 2D array where matrix[i][j] equals 1 (or a weight) if there's an edge between vertices i and j, and 0 otherwise.

  • Edge lookup: O(1), just check one array cell
  • Space: O(V^2), which wastes memory on sparse graphs
  • Best for: Dense graphs or algorithms like Floyd-Warshall that need fast edge-weight access

Adjacency list

An array of lists where each entry contains the neighbors of that vertex.

  • Space: O(V + E), much more efficient for sparse graphs
  • Neighbor iteration: Fast, since you only loop through actual neighbors
  • Best for: Most practical applications. This is the default choice for DFS and BFS implementations.

Edge list

A simple list of all edges, each stored as a pair (or triple, if weighted) of vertices.

  • Space: O(E), very compact
  • Best for: Algorithms that process edges directly, like Kruskal's minimum spanning tree algorithm
  • Drawback: Checking whether a specific edge exists or finding all neighbors of a vertex requires scanning the entire list, making it slow for traversals
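As a small illustration of how the representations relate, an undirected edge list can be converted into an adjacency list in O(V + E) time; the vertex numbering here is illustrative:

```python
from collections import defaultdict

def to_adjacency_list(num_vertices, edges):
    """Convert an undirected edge list into an adjacency list."""
    adjacency = defaultdict(list)
    for u, v in edges:             # each edge contributes two directed entries
        adjacency[u].append(v)
        adjacency[v].append(u)
    # Materialize every vertex, including any with no edges
    return {vertex: adjacency[vertex] for vertex in range(num_vertices)}

edges = [(0, 1), (0, 2), (1, 2)]   # a triangle on vertices 0, 1, 2
adjacency = to_adjacency_list(3, edges)
# {0: [1, 2], 1: [0, 2], 2: [0, 1]}
```

This one-time conversion is usually worth it before running DFS or BFS, since traversals repeatedly ask for a vertex's neighbors.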

Traversal strategies

These advanced strategies build on DFS and BFS to handle more complex search scenarios.

Iterative deepening

Iterative deepening depth-first search (IDDFS) combines the space efficiency of DFS with the completeness of BFS:

  1. Perform DFS with a depth limit of 0 (only the start node)
  2. If the goal isn't found, increase the depth limit by 1 and repeat
  3. Continue until the goal is found

This guarantees finding the shallowest solution (like BFS) while using only O(bd) memory (like DFS), where b is the branching factor and d is the solution depth. The repeated work seems wasteful, but most of the effort is concentrated in the deepest level, so the overhead is modest. Time complexity: O(b^d).
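The loop of depth-limited searches can be sketched as follows; the graph, the vertex names, and the `max_depth` cap are illustrative:

```python
def depth_limited_dfs(graph, vertex, goal, limit):
    """DFS that refuses to descend more than `limit` edges from `vertex`."""
    if vertex == goal:
        return True
    if limit == 0:                 # depth budget exhausted: cut off this branch
        return False
    return any(depth_limited_dfs(graph, neighbor, goal, limit - 1)
               for neighbor in graph[vertex])

def iddfs(graph, start, goal, max_depth=50):
    """Run depth-limited DFS with limits 0, 1, 2, ... until the goal appears."""
    for limit in range(max_depth + 1):
        if depth_limited_dfs(graph, start, goal, limit):
            return limit           # depth of the shallowest occurrence
    return None

graph = {'A': ['B', 'C'], 'B': ['D'], 'C': ['D'], 'D': []}
depth = iddfs(graph, 'A', 'D')  # shallowest solution is 2 edges deep
```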

Bidirectional search

Instead of searching only from the start, this strategy searches simultaneously from the start and from the goal, stopping when the two searches meet.

  • Can dramatically reduce the search space. If a unidirectional BFS explores O(b^d) nodes, bidirectional search explores roughly O(b^(d/2)) from each side.
  • Requires that you know the goal state in advance and can search backward from it.
  • The tricky part is correctly detecting when the two frontiers overlap and confirming the combined path is optimal.

A* search

A* uses a priority queue and evaluates each node by f(n) = g(n) + h(n), where g(n) is the actual cost from the start and h(n) is a heuristic estimate of the remaining cost to the goal.

  • If the heuristic is admissible (never overestimates) and consistent (satisfies the triangle inequality), A* guarantees finding the optimal path.
  • With a perfect heuristic, A* goes straight to the goal. With no heuristic (h = 0), it reduces to Dijkstra's algorithm.
  • Widely used in game AI and robotics for efficient pathfinding.
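A minimal sketch of A* on a weighted adjacency list, using `heapq` as the priority queue. The graph, edge weights, and heuristic values are illustrative (the heuristic is admissible for this graph):

```python
import heapq

def a_star(graph, start, goal, heuristic):
    """A* search: graph[u] is a list of (neighbor, edge_cost) pairs."""
    # Queue entries are (f, g, vertex, path) so the heap orders by f = g + h
    frontier = [(heuristic[start], 0, start, [start])]
    best_g = {start: 0}            # cheapest known cost to reach each vertex
    while frontier:
        f, g, vertex, path = heapq.heappop(frontier)
        if vertex == goal:
            return g, path
        for neighbor, cost in graph[vertex]:
            new_g = g + cost
            if new_g < best_g.get(neighbor, float('inf')):
                best_g[neighbor] = new_g
                heapq.heappush(frontier, (new_g + heuristic[neighbor],
                                          new_g, neighbor, path + [neighbor]))
    return None                    # goal unreachable

graph = {
    'A': [('B', 1), ('C', 4)],
    'B': [('C', 1), ('D', 5)],
    'C': [('D', 1)],
    'D': [],
}
heuristic = {'A': 3, 'B': 2, 'C': 1, 'D': 0}   # never overestimates
cost, path = a_star(graph, 'A', 'D', heuristic)  # 3, ['A', 'B', 'C', 'D']
```

Setting every heuristic value to 0 turns this into Dijkstra's algorithm, as the text notes.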

Time complexity analysis

Understanding when traversals are fast or slow helps you pick the right tool for the job.

Best-case scenarios

  • For DFS and BFS, if the target happens to be the starting vertex or an immediate neighbor, you find it in O(1).
  • A* performs best when the heuristic closely matches the true cost, allowing it to explore very few nodes.
  • These best cases are rarely representative of typical performance.

Worst-case scenarios

  • Both DFS and BFS have O(V + E) worst-case time, which occurs when you must traverse the entire graph (e.g., the target doesn't exist).
  • A* degrades to Dijkstra's algorithm with a poor heuristic, giving O(E + V log V).
  • Iterative deepening: O(b^d) where b is the branching factor and d is the solution depth.

Average-case complexity

Average-case performance depends heavily on graph structure. For random graphs, DFS and BFS both average O(V + E). A*'s average performance is dominated by heuristic quality: a good heuristic can make it orders of magnitude faster than Dijkstra's, while a bad one provides little benefit. Bidirectional search typically outperforms unidirectional search on average because it explores less of the graph.

Space complexity considerations

Memory can be the bottleneck, especially with large graphs. Knowing the space requirements of each algorithm helps you avoid running out of memory.

Memory usage in DFS

  • Recursive DFS uses O(h) space where h is the maximum recursion depth. For a balanced tree, that's O(log n); for a skewed tree, it's O(n).
  • Iterative DFS with an explicit stack uses O(V) in the worst case.
  • You can reduce overhead by using bit vectors instead of hash sets to track visited vertices.
  • For very large graphs, iterative deepening keeps memory usage low while still being complete.

Memory usage in BFS

  • BFS requires O(V) space for the queue in the worst case. In practice, the queue holds all vertices at the current frontier, so wide graphs are expensive.
  • For trees, space is O(w) where w is the maximum width. A complete binary tree's widest level has roughly half of all nodes, so BFS on such a tree uses much more memory than DFS.
  • Techniques like frontier search can reduce memory by only storing the current and next levels.

Trade-offs between time and space

DFS saves memory but may not find the shortest path. BFS finds the shortest path but uses more memory. Iterative deepening trades redundant computation for DFS-level memory usage with BFS-level completeness. A* balances both, but its memory usage grows with heuristic quality and graph size.

The right choice depends on your constraints: if memory is tight, lean toward DFS or iterative deepening. If you need shortest paths, BFS or A* is the way to go.

Graph traversal in practice

Real-world applications

  • Social networks: Friend recommendations (BFS to find people within a few connections) and influence mapping
  • Web crawling: Search engines use BFS-like strategies to discover and index web pages
  • Network routing: Internet protocols use shortest-path algorithms derived from BFS and Dijkstra's
  • Build systems: Topological sorting (via DFS) determines the correct compilation order for dependencies
  • Game AI and GPS: A* search powers pathfinding in video games and navigation apps

Optimization techniques

  • Use bit vectors for visited-node tracking when vertices are numbered sequentially; they're faster and more memory-efficient than hash sets.
  • Prefer iterative implementations over recursive ones for large graphs to avoid stack overflow.
  • Apply early termination when you're searching for a specific target rather than exploring the whole graph.
  • Use domain-specific heuristics with A* to dramatically reduce the number of nodes explored.

Parallel and distributed traversals

For graphs too large to fit on one machine (like the web graph or large social networks), traversals can be distributed:

  • Divide the graph into subgraphs and process them on separate machines or threads
  • Map-reduce frameworks handle algorithms like PageRank, which is essentially repeated BFS-like propagation
  • Load balancing through work-stealing algorithms keeps all processors busy
  • The main challenge is communication overhead: vertices on partition boundaries require coordination between machines