Types of graph traversals
Graph traversals are systematic methods for visiting every vertex in a graph. They're essential because many problems in computer science and math reduce to "explore a graph and find something." The two core approaches are depth-first search (DFS) and breadth-first search (BFS), and they differ in the order they visit vertices.
Depth-first search (DFS)
DFS explores as far as possible along each branch before backtracking. Think of it like navigating a maze by always taking the next unexplored turn and only retracing your steps when you hit a dead end.
- Uses a stack data structure (either explicitly or implicitly through recursion)
- Prioritizes depth over breadth, reaching leaf nodes quickly
- Useful for topological sorting, cycle detection, and exploring tree-like structures
- Can be implemented recursively or iteratively
Breadth-first search (BFS)
BFS explores all vertices at the current depth before moving to the next depth level. Picture dropping a stone in water: the ripples spread outward evenly in all directions.
- Uses a queue data structure to manage exploration order
- Guarantees finding the shortest path in unweighted graphs (fewest edges)
- Effective for level-order traversals and distance calculations
- Typically implemented iteratively
Comparison of DFS vs BFS
| Feature | DFS | BFS |
|---|---|---|
| Data structure | Stack | Queue |
| Memory (deep graphs) | Generally less | Generally more |
| Shortest path (unweighted) | Not guaranteed | Guaranteed |
| Implementation | Often recursive | Typically iterative |
| Best for | Exploring deep structures, backtracking | Finding nearby nodes, shortest paths |
The choice between them depends on what you need. If you want the shortest path in an unweighted graph, use BFS. If you need to explore exhaustively or detect cycles, DFS is often simpler.
Depth-first search
DFS is a cornerstone algorithm that demonstrates recursive thinking. Its logic is simple: go deeper whenever you can, backtrack when you must. That simplicity makes it surprisingly powerful.
Stack-based implementation
You can implement DFS iteratively using an explicit stack:
- Push the starting vertex onto the stack and mark it as visited
- Pop the top vertex from the stack and process it
- Push all of its unvisited neighbors onto the stack, marking them as visited
- Repeat steps 2–3 until the stack is empty
This approach avoids recursion, which is useful when the graph is so deep that recursive calls might exceed system stack limits.
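The steps above can be sketched in Python. This is a minimal illustration, assuming the graph is an adjacency list (a dict mapping each vertex to a list of neighbors):

```python
def dfs_iterative(graph, start):
    """Depth-first search with an explicit stack.

    `graph` is assumed to be an adjacency list:
    a dict mapping each vertex to a list of neighbors.
    """
    visited = {start}
    order = []                         # the order vertices get processed
    stack = [start]
    while stack:
        vertex = stack.pop()           # take the most recently pushed vertex
        order.append(vertex)           # "process" the vertex
        for neighbor in graph[vertex]:
            if neighbor not in visited:
                visited.add(neighbor)  # mark on push, matching the steps above
                stack.append(neighbor)
    return order
```

Marking vertices as visited when they are pushed (rather than when popped) prevents the same vertex from being pushed twice.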
Recursive implementation
The recursive version naturally mirrors DFS because each function call acts like pushing onto the call stack:
- Mark the current vertex as visited and process it
- For each unvisited neighbor, make a recursive call
- The base case is simply: if a vertex is already visited, return
The code tends to be shorter and more readable, but very deep graphs can cause a stack overflow.
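A recursive sketch of the same traversal, again assuming an adjacency-list dict:

```python
def dfs_recursive(graph, vertex, visited=None, order=None):
    """Recursive DFS; the call stack replaces the explicit stack."""
    if visited is None:
        visited, order = set(), []
    visited.add(vertex)
    order.append(vertex)               # process the current vertex
    for neighbor in graph[vertex]:
        if neighbor not in visited:    # implicit base case: skip visited vertices
            dfs_recursive(graph, neighbor, visited, order)
    return order
```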
Pre-order vs post-order
These terms describe when you process a vertex relative to its descendants:
- Pre-order processes a vertex before exploring its children. This is useful for copying a tree structure or serializing it top-down.
- Post-order processes a vertex after all its children have been explored. This is the right choice for tasks like evaluating expression trees or safely deleting nodes, and reversing the post-order sequence of a directed acyclic graph (DAG) yields a topological ordering.
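Both orders can be recorded in a single DFS pass. A small sketch over a hypothetical adjacency-list dict:

```python
def dfs_orders(graph, vertex, visited=None, pre=None, post=None):
    """Record pre-order and post-order visit sequences in one DFS."""
    if visited is None:
        visited, pre, post = set(), [], []
    visited.add(vertex)
    pre.append(vertex)                 # pre-order: before exploring children
    for neighbor in graph[vertex]:
        if neighbor not in visited:
            dfs_orders(graph, neighbor, visited, pre, post)
    post.append(vertex)                # post-order: after all children are done
    return pre, post
```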
Time and space complexity
- Time: O(V + E), where V is the number of vertices and E is the number of edges. Every vertex and edge gets examined exactly once in a connected graph.
- Space: O(V) in the worst case for the stack (explicit or recursive). For trees specifically, this becomes O(h), where h is the tree's height.
Breadth-first search
BFS explores a graph level by level, which makes it the natural choice whenever you care about distance from a starting point.
Queue-based implementation
Here's how BFS works step by step:
- Enqueue the starting vertex and mark it as visited
- Dequeue a vertex from the front of the queue and process it
- Enqueue all of its unvisited neighbors, marking each as visited
- Repeat steps 2–3 until the queue is empty
Because the queue is first-in-first-out, vertices are always processed in order of their distance from the start.
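The steps above, sketched in Python with the standard library's `deque` as the queue (assuming an adjacency-list dict):

```python
from collections import deque

def bfs(graph, start):
    """Breadth-first search returning vertices in visit order."""
    visited = {start}
    order = []
    queue = deque([start])
    while queue:
        vertex = queue.popleft()       # FIFO: nearest unprocessed vertex first
        order.append(vertex)
        for neighbor in graph[vertex]:
            if neighbor not in visited:
                visited.add(neighbor)  # mark on enqueue to avoid duplicates
                queue.append(neighbor)
    return order
```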
Level-order traversal
BFS naturally performs a level-order traversal: it visits all vertices at distance 0 (the start), then distance 1, then distance 2, and so on. This property makes BFS ideal for:
- Finding the minimum number of edges between two vertices
- Problems that require processing nodes level by level (like printing a tree by depth)
- Building a breadth-first tree that encodes shortest paths from the source
You can track which "level" you're on by recording the queue size at the start of each round.
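The queue-size trick looks like this in practice; a sketch that groups vertices by distance from the start, assuming an adjacency-list dict:

```python
from collections import deque

def bfs_levels(graph, start):
    """Group vertices by their distance (level) from `start`."""
    visited = {start}
    levels = []
    queue = deque([start])
    while queue:
        level = []
        for _ in range(len(queue)):    # snapshot: size of the current frontier
            vertex = queue.popleft()
            level.append(vertex)
            for neighbor in graph[vertex]:
                if neighbor not in visited:
                    visited.add(neighbor)
                    queue.append(neighbor)
        levels.append(level)
    return levels
```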
Time and space complexity
- Time: O(V + E), same as DFS. Each vertex and edge is visited once.
- Space: O(V) for the queue in the worst case. For trees, this becomes O(w), where w is the maximum width of the tree. Wide, shallow graphs can make BFS memory-hungry.
Applications of traversals
Graph traversals aren't just academic exercises. They're the building blocks for solving real, complex problems.
Pathfinding algorithms
- DFS works well for maze solving and generating game decision trees, though it won't necessarily find the shortest path.
- BFS guarantees the shortest path in unweighted graphs, making it a go-to for simple routing problems.
- Dijkstra's algorithm extends BFS-like exploration to weighted graphs, always expanding the lowest-cost vertex next.
- A* search adds a heuristic estimate to Dijkstra's approach, making pathfinding faster when you have a good guess of the remaining distance. GPS navigation and game AI rely on this heavily.
Topological sorting
Topological sorting applies DFS to a directed acyclic graph (DAG) and produces a linear ordering of vertices where for every directed edge (u, v), vertex u appears before v.
This is used in task scheduling (which tasks must finish before others can start), build systems (compile dependencies in the right order), and package managers. As a bonus, if DFS encounters a back edge during this process, it means the graph has a cycle and no valid topological order exists. Time complexity: O(V + E).
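The reverse-post-order idea can be sketched in a few lines of Python; this assumes a cycle-free adjacency-list dict (no cycle check is shown here):

```python
def topological_sort(graph):
    """Topological order of a DAG: the reverse of DFS post-order."""
    visited, order = set(), []

    def visit(vertex):
        visited.add(vertex)
        for neighbor in graph[vertex]:
            if neighbor not in visited:
                visit(neighbor)
        order.append(vertex)           # post-order: all dependencies finished

    for vertex in graph:               # cover disconnected parts of the DAG
        if vertex not in visited:
            visit(vertex)
    return order[::-1]                 # reverse post-order = topological order
```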
Connected components
BFS or DFS can identify connected components in undirected graphs by starting a traversal from an unvisited vertex and marking everything reachable as one component, then repeating for any remaining unvisited vertices.
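A minimal sketch of that loop, using iterative DFS over an undirected adjacency-list dict:

```python
def connected_components(graph):
    """Return the connected components of an undirected graph."""
    visited = set()
    components = []
    for start in graph:
        if start in visited:
            continue
        component = []
        stack = [start]                # fresh DFS per unvisited vertex
        visited.add(start)
        while stack:
            vertex = stack.pop()
            component.append(vertex)
            for neighbor in graph[vertex]:
                if neighbor not in visited:
                    visited.add(neighbor)
                    stack.append(neighbor)
        components.append(component)
    return components
```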
For directed graphs, Kosaraju's algorithm finds strongly connected components (groups of vertices where every vertex can reach every other vertex in the group). Applications include social network community detection and image segmentation.
Cycle detection
DFS detects cycles by tracking which vertices are currently on the recursion stack (sometimes called "gray" vertices in a color-coding scheme):
- In a directed graph, a back edge (an edge pointing to a vertex still on the recursion stack) indicates a cycle.
- In an undirected graph, revisiting a vertex that isn't the immediate parent of the current vertex indicates a cycle.
This is important for verifying whether a graph is a tree (connected and acyclic) or a DAG, and for deadlock detection in operating systems.
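The color-coding scheme for directed graphs can be sketched like this, assuming an adjacency-list dict:

```python
def has_cycle(graph):
    """Detect a cycle in a directed graph via DFS coloring."""
    WHITE, GRAY, BLACK = 0, 1, 2       # unvisited / on recursion stack / done
    color = {v: WHITE for v in graph}

    def visit(vertex):
        color[vertex] = GRAY
        for neighbor in graph[vertex]:
            if color[neighbor] == GRAY:        # back edge: cycle found
                return True
            if color[neighbor] == WHITE and visit(neighbor):
                return True
        color[vertex] = BLACK
        return False

    return any(color[v] == WHITE and visit(v) for v in graph)
```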
Graph representation
How you store a graph in memory directly affects how efficiently traversals run. There are three common representations, each with different trade-offs.
Adjacency matrix
A 2D array where matrix[i][j] equals 1 (or a weight) if there's an edge between vertices i and j, and 0 otherwise.
- Edge lookup: O(1), just check one array cell
- Space: O(V²), which wastes memory on sparse graphs
- Best for: Dense graphs or algorithms like Floyd-Warshall that need fast edge-weight access
Adjacency list
An array of lists where each entry contains the neighbors of that vertex.
- Space: O(V + E), much more efficient for sparse graphs
- Neighbor iteration: Fast, since you only loop through actual neighbors
- Best for: Most practical applications. This is the default choice for DFS and BFS implementations.
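Converting between representations is routine; a small sketch that builds an adjacency list from an edge list (the function name and `directed` flag are illustrative):

```python
def build_adjacency_list(num_vertices, edges, directed=False):
    """Convert an edge list into an adjacency-list dict."""
    adjacency = {v: [] for v in range(num_vertices)}
    for u, v in edges:
        adjacency[u].append(v)
        if not directed:
            adjacency[v].append(u)     # undirected: store both directions
    return adjacency
```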
Edge list
A simple list of all edges, each stored as a pair (or triple, if weighted) of vertices.
- Space: O(E), very compact
- Best for: Algorithms that process edges directly, like Kruskal's minimum spanning tree algorithm
- Drawback: Checking whether a specific edge exists or finding all neighbors of a vertex requires scanning the entire list, making it slow for traversals
Traversal strategies
These advanced strategies build on DFS and BFS to handle more complex search scenarios.
Iterative deepening
Iterative deepening depth-first search (IDDFS) combines the space efficiency of DFS with the completeness of BFS:
- Perform DFS with a depth limit of 0 (only the start node)
- If the goal isn't found, increase the depth limit by 1 and repeat
- Continue until the goal is found
This guarantees finding the shallowest solution (like BFS) while using only O(bd) memory (like DFS), where b is the branching factor and d is the solution depth. The repeated work seems wasteful, but most of the effort is concentrated in the deepest level, so the overhead is modest. Time complexity: O(b^d).
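A compact sketch of IDDFS for an adjacency-list dict. This simple version omits a visited set, which is fine for trees and DAGs; in graphs with cycles the depth limit still bounds the recursion, but repeated states waste work:

```python
def iddfs(graph, start, goal, max_depth=50):
    """Iterative deepening DFS: repeated depth-limited searches.

    Returns the depth of the shallowest solution, or None if not
    found within `max_depth`.
    """
    def depth_limited(vertex, depth):
        if vertex == goal:
            return True
        if depth == 0:                 # hit the current depth limit
            return False
        return any(depth_limited(n, depth - 1) for n in graph[vertex])

    for limit in range(max_depth + 1):  # depth limit 0, 1, 2, ...
        if depth_limited(start, limit):
            return limit
    return None
```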
Bidirectional search
Instead of searching only from the start, this strategy searches simultaneously from the start and from the goal, stopping when the two searches meet.
- Can dramatically reduce the search space. If a unidirectional BFS explores O(b^d) nodes, bidirectional search explores roughly O(b^(d/2)) from each side.
- Requires that you know the goal state in advance and can search backward from it.
- The tricky part is correctly detecting when the two frontiers overlap and confirming the combined path is optimal.
A* search algorithm
A* uses a priority queue and evaluates each node by f(n) = g(n) + h(n), where g(n) is the actual cost from the start and h(n) is a heuristic estimate of the remaining cost to the goal.
- If the heuristic is admissible (never overestimates) and consistent (satisfies the triangle inequality), A* guarantees finding the optimal path.
- With a perfect heuristic, A* goes straight to the goal. With no heuristic (h(n) = 0), it reduces to Dijkstra's algorithm.
- Widely used in game AI and robotics for efficient pathfinding.
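A sketch of A* on a hypothetical 2D grid (0 = free cell, 1 = wall, 4-directional moves), using Manhattan distance as an admissible heuristic:

```python
import heapq

def a_star(grid, start, goal):
    """Shortest path length on a 0/1 grid, or None if unreachable."""
    rows, cols = len(grid), len(grid[0])

    def h(cell):                       # heuristic: Manhattan distance to goal
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(h(start), 0, start)]  # entries are (f, g, cell)
    best_g = {start: 0}
    while open_heap:
        f, g, cell = heapq.heappop(open_heap)
        if cell == goal:
            return g
        if g > best_g.get(cell, float("inf")):
            continue                   # stale heap entry, skip it
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g + 1
                if ng < best_g.get((nr, nc), float("inf")):
                    best_g[(nr, nc)] = ng
                    heapq.heappush(open_heap, (ng + h((nr, nc)), ng, (nr, nc)))
    return None
```

Because the Manhattan heuristic never overestimates on a 4-directional grid, the first time the goal is popped its g value is the optimal cost.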
Time complexity analysis
Understanding when traversals are fast or slow helps you pick the right tool for the job.
Best-case scenarios
- For DFS and BFS, if the target happens to be the starting vertex or an immediate neighbor, you find it in O(1).
- A* performs best when the heuristic closely matches the true cost, allowing it to explore very few nodes.
- These best cases are rarely representative of typical performance.
Worst-case scenarios
- Both DFS and BFS have O(V + E) worst-case time, which occurs when you must traverse the entire graph (e.g., the target doesn't exist).
- A* degrades to Dijkstra's algorithm with a poor heuristic, giving O((V + E) log V) with a binary-heap priority queue.
- Iterative deepening: O(b^d), where b is the branching factor and d is the solution depth.
Average-case complexity
Average-case performance depends heavily on graph structure. For random graphs, DFS and BFS both average O(V + E). A*'s average performance is dominated by heuristic quality: a good heuristic can make it orders of magnitude faster than Dijkstra's, while a bad one provides little benefit. Bidirectional search typically outperforms unidirectional search on average because it explores less of the graph.
Space complexity considerations
Memory can be the bottleneck, especially with large graphs. Knowing the space requirements of each algorithm helps you avoid running out of memory.
Memory usage in DFS
- Recursive DFS uses O(h) space, where h is the maximum recursion depth. For a balanced tree, that's O(log n); for a skewed tree, it's O(n).
- Iterative DFS with an explicit stack uses O(V) in the worst case.
- You can reduce overhead by using bit vectors instead of hash sets to track visited vertices.
- For very large graphs, iterative deepening keeps memory usage low while still being complete.
Memory usage in BFS
- BFS requires O(V) space for the queue in the worst case. In practice, the queue holds all vertices at the current frontier, so wide graphs are expensive.
- For trees, space is O(w), where w is the maximum width. A complete binary tree's widest level has roughly half of all nodes, so BFS on such a tree uses much more memory than DFS.
- Techniques like frontier search can reduce memory by only storing the current and next levels.
Trade-offs between time and space
DFS saves memory but may not find the shortest path. BFS finds the shortest path but uses more memory. Iterative deepening trades redundant computation for DFS-level memory usage with BFS-level completeness. A* balances both, but its memory usage grows with heuristic quality and graph size.
The right choice depends on your constraints: if memory is tight, lean toward DFS or iterative deepening. If you need shortest paths, BFS or A* is the way to go.
Graph traversal in practice
Real-world applications
- Social networks: Friend recommendations (BFS to find people within a few connections) and influence mapping
- Web crawling: Search engines use BFS-like strategies to discover and index web pages
- Network routing: Internet protocols use shortest-path algorithms derived from BFS and Dijkstra's
- Build systems: Topological sorting (via DFS) determines the correct compilation order for dependencies
- Game AI and GPS: A* search powers pathfinding in video games and navigation apps
Optimization techniques
- Use bit vectors for visited-node tracking when vertices are numbered sequentially; they're faster and more memory-efficient than hash sets.
- Prefer iterative implementations over recursive ones for large graphs to avoid stack overflow.
- Apply early termination when you're searching for a specific target rather than exploring the whole graph.
- Use domain-specific heuristics with A* to dramatically reduce the number of nodes explored.
Parallel and distributed traversals
For graphs too large to fit on one machine (like the web graph or large social networks), traversals can be distributed:
- Divide the graph into subgraphs and process them on separate machines or threads
- Map-reduce frameworks handle algorithms like PageRank, which is essentially repeated BFS-like propagation
- Load balancing through work-stealing algorithms keeps all processors busy
- The main challenge is communication overhead: vertices on partition boundaries require coordination between machines