Graphs represent relationships between objects using vertices (also called nodes) and edges (connections between them). Different types of graphs capture different kinds of relationships, and knowing which type you're working with determines which representations and algorithms apply.

Directed vs undirected graphs

A directed graph (or digraph) has edges that point in one direction. Think of web page links: page A can link to page B without B linking back. You draw these with arrows.

An undirected graph has edges that go both ways. A Facebook friendship is a good example: if you're friends with someone, they're friends with you too. You draw these with simple lines.

This distinction matters because it changes how you store and traverse the graph. In a directed graph's adjacency matrix, the entry at row $i$ , column $j$ can differ from the entry at row $j$ , column $i$ . In an undirected graph, the matrix is always symmetric.

Weighted vs unweighted graphs

Weighted graphs assign a numerical value to each edge. These weights might represent distances between cities, costs of network connections, or travel times. Weights can be positive, negative, or zero depending on the problem.

Unweighted graphs treat every edge the same. They only care about whether a connection exists, not how "strong" or "costly" it is.

The distinction drives algorithm choice. Shortest-path problems on weighted graphs need algorithms like Dijkstra's, while on unweighted graphs a simple BFS will do.

Simple vs multigraphs

A simple graph has at most one edge between any pair of vertices and no self-loops (edges from a vertex to itself).
A multigraph allows multiple edges between the same pair of vertices and may include self-loops.

Most problems in this course deal with simple graphs. Multigraphs show up when you need to model situations like multiple flight routes between two cities or parallel roads connecting two intersections.

Graph representations

Once you've identified what type of graph you're working with, you need a way to store it. Each representation below makes different trade-offs between memory usage and how fast you can perform common operations.

Adjacency matrix

A two-dimensional $V \times V$ array where $V$ is the number of vertices. The entry $a_{ij}$ equals 1 if there's an edge from vertex $i$ to vertex $j$ , and 0 otherwise. For weighted graphs, you store the weight instead of 1.

Example: For a 4-vertex undirected graph where vertex 0 connects to vertices 1 and 2, and vertex 1 connects to vertex 3:

	0	1	2	3
0	0	1	1	0
1	1	0	0	1
2	1	0	0	0
3	0	1	0	0

Space: $O(V^2)$ regardless of how many edges exist
Best for: Dense graphs (many edges) and when you need instant edge-existence checks in $O(1)$
Drawback: Wastes a lot of space on sparse graphs

Adjacency list

An array of lists where each index corresponds to a vertex, and the list at that index contains all of that vertex's neighbors. For weighted graphs, each entry stores a (neighbor, weight) pair.

Example (same graph as above):

0 → [1, 2]
1 → [0, 3]
2 → [0]
3 → [1]
Space: $O(V + E)$ where $E$ is the number of edges
Best for: Sparse graphs and algorithms that need to iterate over a vertex's neighbors (like BFS and DFS)
Drawback: Checking whether a specific edge exists takes $O(\text{degree}(v))$ time, not constant time

Edge list

A flat list of all edges. Each entry is a pair (or triple for weighted graphs) like $(u, v)$ or $(u, v, w)$ .

Example: [(0,1), (0,2), (1,3)]

Space: $O(E)$
Best for: Algorithms that process edges one at a time, like Kruskal's minimum spanning tree algorithm
Drawback: Finding a specific edge or listing a vertex's neighbors requires scanning the entire list

Incidence matrix

A $V \times E$ matrix where rows are vertices and columns are edges. An entry is 1 if that vertex is an endpoint of that edge, 0 otherwise. For directed graphs, use 1 for the tail (outgoing) and $-1$ for the head (incoming).

Space: $O(VE)$ , which can get very large
Best for: Analyzing vertex-edge relationships in theoretical contexts
Drawback: Rarely used in practice because of its size

Properties of representations

Space complexity

Representation	Space	Best When...
Adjacency matrix	$O(V^2)$	Graph is dense
Adjacency list	$O(V + E)$	Graph is sparse
Edge list	$O(E)$	You only need to iterate edges
Incidence matrix	$O(VE)$	Theoretical analysis

A graph is considered sparse when $E$ is much less than $V^2$ , and dense when $E$ approaches $V^2$ .

Time complexity for operations

Operation	Adjacency Matrix	Adjacency List
Check edge exists	$O(1)$	$O(\text{degree}(v))$
Add an edge	$O(1)$	$O(1)$
Remove an edge	$O(1)$	$O(\text{degree}(v))$
Find all neighbors	$O(V)$	$O(\text{degree}(v))$

Notice the pattern: the adjacency matrix gives you constant-time edge checks but forces you to scan an entire row to find neighbors. The adjacency list gives you fast neighbor access but slower edge lookups.

Suitability for different algorithms

BFS and DFS work best with adjacency lists, since they need to visit all neighbors of each vertex.
Floyd-Warshall (all-pairs shortest paths) naturally fits the adjacency matrix, since it updates entries in a $V \times V$ table.
Kruskal's algorithm works efficiently with an edge list, since it sorts and processes edges one by one.

Conversion between representations

Matrix to list conversion

Create an empty list for each vertex.
Walk through each row $i$ of the matrix.
For every column $j$ where $a_{ij} \neq 0$ , add vertex $j$ to vertex $i$ 's neighbor list.

Time complexity: $O(V^2)$ because you must examine every entry in the matrix.

List to matrix conversion

Create a $V \times V$ matrix initialized to all zeros.
For each vertex $i$ , iterate through its neighbor list.
For each neighbor $j$ , set $a_{ij} = 1$ (or the edge weight).

Time complexity: $O(V + E)$ since you visit each vertex and each edge once.

Efficiency considerations

Converting from matrix to list can save significant space when the graph is sparse. Converting from list to matrix makes sense when your algorithm needs many fast edge-existence checks. Always weigh whether the conversion cost is worth the performance gain for the operations you'll be doing afterward.

Special graph structures

Certain graph structures have properties that make them easier to work with or that require specialized approaches.

Trees and forests

A tree is a connected graph with no cycles. A tree with $V$ vertices always has exactly $V - 1$ edges. Between any two vertices in a tree, there's exactly one path.

A forest is a collection of disjoint trees (a graph with no cycles that may not be connected).

Trees show up everywhere: file systems, organizational hierarchies, decision trees, and as substructures in many graph algorithms.

Bipartite graphs

A bipartite graph has vertices that can be split into two groups so that every edge connects a vertex in one group to a vertex in the other. No edges exist within the same group.

The key test: a graph is bipartite if and only if it contains no odd-length cycles. You can check this with a BFS-based two-coloring algorithm.

Bipartite graphs model matching problems like assigning workers to jobs or students to project groups.

Complete graphs

A complete graph $K_n$ connects every pair of its $n$ vertices with exactly one edge. The total number of edges is $\frac{n(n-1)}{2}$ .

For example, $K_5$ has $\frac{5 \times 4}{2} = 10$ edges. Complete graphs represent the densest possible simple graphs and often serve as worst-case inputs for algorithm analysis.

Applications in mathematics

Graph theory problems

Vertex coloring asks: what's the minimum number of colors needed so no two adjacent vertices share a color? That minimum is the graph's chromatic number.
Hamiltonian cycle asks whether there's a cycle that visits every vertex exactly once.
Graph isomorphism asks whether two graphs have the same structure, even if their vertices are labeled differently.

These problems connect to real-world tasks like exam scheduling (coloring), route planning (Hamiltonian cycles), and pattern recognition (isomorphism).

Network analysis

Centrality measures (degree, betweenness, closeness) identify the most important or influential vertices in a network.
Community detection algorithms find clusters of tightly connected vertices.
Flow networks model how much "stuff" (data, traffic, fluid) can move through a system from source to sink.

Optimization problems

The Traveling Salesman Problem (TSP) seeks the shortest route visiting all vertices and returning to the start.
The maximum flow problem finds the greatest possible flow from a source to a sink in a weighted directed graph.
Minimum spanning tree finds the subset of edges that connects all vertices with the lowest total weight.

These problems have direct applications in logistics, network design, and resource allocation.

Algorithmic considerations

Traversal algorithms

Depth-First Search (DFS) goes as deep as possible along one branch before backtracking. It uses a stack (or recursion).

Breadth-First Search (BFS) visits all neighbors of the current vertex before moving to the next level. It uses a queue.

Both run in $O(V + E)$ time with an adjacency list. BFS also finds shortest paths in unweighted graphs, which DFS does not.

Shortest path algorithms

Algorithm	Handles Negative Weights?	Finds	Time Complexity
Dijkstra's	No	Single-source	$O((V + E) \log V)$
Bellman-Ford	Yes	Single-source	$O(VE)$
Floyd-Warshall	Yes	All pairs	$O(V^3)$

Pick Dijkstra's when all weights are non-negative (it's the fastest). Use Bellman-Ford when negative weights exist. Use Floyd-Warshall when you need shortest paths between every pair of vertices.

Minimum spanning tree

Kruskal's algorithm: Sort all edges by weight, then add them one by one (skipping any that would create a cycle). Works well with an edge list.
Prim's algorithm: Start from any vertex and repeatedly add the cheapest edge connecting the tree to a new vertex. Works well with an adjacency list or matrix.

Both produce optimal results for connected, undirected, weighted graphs.

Data structures for graphs

Arrays and matrices

Static arrays work well for adjacency matrices when the graph size is known and fixed. Dynamic arrays (like Python lists or C++ vectors) allow the graph to grow. Matrices give constant-time edge lookups but waste space when the graph is sparse.

Linked lists

Linked lists are a natural fit for adjacency lists. They handle variable-length neighbor lists and allow efficient edge insertion and deletion. The trade-off is extra memory for storing pointers and slightly slower access compared to arrays.

Hash tables for graphs

Hash tables (like Python dictionaries or hash maps) can store adjacency information with $O(1)$ average-time lookups. They're particularly useful for large, sparse graphs where you need fast edge queries but don't want to allocate a full $V \times V$ matrix.

Graph visualization

Layout algorithms

Force-directed layouts simulate vertices as charged particles that repel each other while edges act as springs. This tends to produce clean, readable layouts.
Hierarchical layouts arrange vertices in levels, which works well for trees and directed acyclic graphs (DAGs).
Circular layouts place vertices evenly around a circle, making cycle structures easy to spot.

All layout algorithms try to minimize edge crossings and reveal the graph's structure visually.

Tools for graph drawing

NetworkX (Python): great for both graph computation and basic visualization
Gephi: standalone software for advanced graph analysis and rendering
D3.js: JavaScript library for interactive, web-based graph visualizations

Interpretation of visual representations

When reading a graph visualization, pay attention to how visual properties encode data. Node size or color often represents centrality or category. Edge thickness or color can indicate weight. Clusters of tightly connected nodes suggest community structure. Interactive tools let you zoom and filter, which is essential for exploring large graphs.

Advanced topics

Hypergraphs

A hypergraph generalizes the idea of a graph by allowing a single edge (called a hyperedge) to connect any number of vertices, not just two. This is useful for modeling relationships that involve more than two entities at once, like a chemical reaction involving multiple molecules or a database record linking several fields.

Dynamic graphs

Dynamic graphs change over time as vertices and edges are added or removed. Social networks are a classic example: people join, leave, and form new connections constantly. The challenge is maintaining useful properties (like shortest paths or connectivity) efficiently as the graph evolves, rather than recomputing everything from scratch.

Graph databases

Graph databases like Neo4j and Amazon Neptune are designed specifically for storing and querying graph-structured data. They're optimized for traversal operations and pattern matching, making them well-suited for social networks, recommendation systems, and knowledge graphs where relationships between entities are the primary focus.