🕸️Networked Life

Key Social Network Analysis Metrics

Study smarter with Fiveable

Get study guides, practice questions, and cheatsheets for all your subjects. Join 500,000+ students with a 96% pass rate.

Get Started

Why This Matters

Understanding network metrics isn't just about memorizing formulas—it's about grasping why certain nodes become powerful, how information spreads, and what makes some networks resilient while others fragment. You're being tested on your ability to identify which metric answers which question: Who's the most connected? Who controls information flow? How tightly clustered is a community? These distinctions matter because they reveal fundamentally different types of influence and network behavior.

The metrics in this guide fall into distinct conceptual categories: centrality measures (who matters and why), structural properties (how the network is organized), and efficiency measures (how well information travels). Don't just memorize that betweenness centrality involves shortest paths—know that it identifies brokers who can make or break communication across groups. When you can explain what each metric reveals about network dynamics, you're thinking like a network scientist.

Centrality Measures: Who Holds Power?

Centrality metrics answer the fundamental question: which nodes are most important? But "important" means different things depending on context. A node can be central because it has many friends, because it bridges communities, or because its friends are themselves influential. Understanding these distinctions is essential.

Degree Centrality

Counts direct connections—the simplest measure of how "popular" or connected a node is in the network
Hub identification relies on high degree centrality; these nodes are often first adopters who can spark viral spread
Limitation: treats all connections equally, missing whether those connections are themselves influential or peripheral

Betweenness Centrality

Measures brokerage power—how often a node sits on the shortest path between other pairs of nodes
Bridge nodes with high betweenness control information flow; removing them can disconnect communities entirely
Calculated as $C_B(v) = \sum_{s \neq v \neq t} \frac{\sigma_{st}(v)}{\sigma_{st}}$ where $\sigma_{st}$ is the total shortest paths between $s$ and $t$ , and $\sigma_{st}(v)$ counts those passing through $v$

Closeness Centrality

Measures reachability—the inverse of a node's average distance to all other nodes in the network
Low average distance means information originating from this node spreads quickly; ideal for broadcasting or rapid dissemination
Expressed as $C_C(v) = \frac{n-1}{\sum_{u \neq v} d(v,u)}$ where $d(v,u)$ is the shortest path length between nodes

Eigenvector Centrality

Quality over quantity—a node's importance depends on being connected to other important nodes, not just having many connections
Recursive definition means your centrality score is proportional to the sum of your neighbors' scores
Captures elite networks where being connected to the "right" people matters more than knowing many people

Compare: Degree vs. Eigenvector Centrality—both count connections, but eigenvector weights them by neighbor importance. A node with few connections to highly central nodes can outrank a node with many peripheral connections. If asked to identify "influential" nodes, clarify which type of influence the question targets.

Algorithmic Centrality: Computational Approaches

Some centrality measures emerged from computational problems rather than pure sociology. PageRank famously powered Google's search engine by treating the web as a network where link structure reveals importance.

PageRank

Random surfer model—imagines someone clicking links randomly; PageRank is the probability of landing on each node at equilibrium
Damping factor (typically $d = 0.85$ ) accounts for users jumping to random pages rather than always following links
Beyond web search, PageRank identifies influential accounts in social networks, important papers in citation networks, and key species in food webs

Compare: PageRank vs. Eigenvector Centrality—both consider connection quality, but PageRank adds the damping factor and handles directed networks naturally. PageRank also divides a node's influence among its outgoing links, so linking to many pages dilutes your "vote."

Local Structure: Clustering and Community

Not all network properties focus on individual nodes. Clustering and modularity reveal how nodes organize into groups—essential for understanding social cohesion, echo chambers, and community dynamics.

Clustering Coefficient

Measures triangle density—the fraction of a node's neighbors that are also connected to each other
High clustering indicates tightly-knit groups where "friends of friends are friends"; common in social networks but rare in random graphs
Local vs. global: individual nodes have clustering coefficients; the network average reveals overall tendency toward clique formation

Modularity

Quantifies community structure—compares actual within-group connections to what you'd expect in a random network with the same degree sequence
Modularity score $Q$ ranges from $-0.5$ to $1$ ; values above $0.3$ typically indicate significant community structure
Community detection algorithms optimize modularity to partition networks into meaningful groups

Compare: Clustering Coefficient vs. Modularity—clustering measures local "cliquishness" around individual nodes, while modularity assesses global division into distinct communities. A network can have high clustering but low modularity if triangles don't organize into separable groups.

Global Properties: Network-Wide Measures

These metrics describe the network as a whole rather than individual nodes. They answer questions about overall connectivity, efficiency, and structure.

Network Density

Proportion of possible edges that exist—calculated as $\frac{2m}{n(n-1)}$ for undirected networks with $m$ edges and $n$ nodes
Dense networks facilitate rapid information spread but require more maintenance; sparse networks are cheaper but may fragment easily
Real social networks are typically sparse; even Facebook's friendship network has density far below 1%

Average Path Length

Mean shortest distance between all pairs of reachable nodes in the network
Small-world property emerges when average path length is low despite high clustering—the "six degrees of separation" phenomenon
Efficiency indicator: lower values mean faster information diffusion and easier coordination across the network

Diameter

Maximum shortest path—the longest distance between any two connected nodes in the network
Worst-case scenario for information travel; reveals how "stretched out" a network can be
Sensitive to outliers: a single long chain of nodes can dramatically increase diameter without affecting most node pairs

Compare: Average Path Length vs. Diameter—average path length gives typical separation, while diameter captures the extreme. A network with low average path length but high diameter has most nodes close together but some isolated chains. Both matter for understanding information spread dynamics.

Quick Reference Table

Concept	Best Examples
Direct connectivity	Degree Centrality, Network Density
Brokerage and control	Betweenness Centrality
Reach and efficiency	Closeness Centrality, Average Path Length
Influence through connections	Eigenvector Centrality, PageRank
Local clustering	Clustering Coefficient
Community structure	Modularity
Network extremes	Diameter

Self-Check Questions

A node has relatively few connections but extremely high betweenness centrality. What role does this node likely play in the network, and why might removing it be particularly damaging?
Compare degree centrality and eigenvector centrality: under what circumstances would these two metrics rank the same node very differently?
A social network has high clustering coefficient but low modularity. What does this suggest about its structure? Describe what such a network might look like.
You're analyzing a network and find that average path length is very low (around 4) while diameter is very high (around 20). What does this tell you about the network's topology?
If you needed to identify the best node for rapidly spreading a message to the entire network, which centrality measure would you prioritize and why? How would your answer change if you instead wanted to identify nodes whose removal would most disrupt communication?

🕸️Networked Life

Key Social Network Analysis Metrics

Why This Matters

Centrality Measures: Who Holds Power?

Degree Centrality

Betweenness Centrality

Closeness Centrality

Eigenvector Centrality

Algorithmic Centrality: Computational Approaches

PageRank

Local Structure: Clustering and Community

Clustering Coefficient

Modularity

Global Properties: Network-Wide Measures

Network Density

Average Path Length

Diameter

Quick Reference Table

Self-Check Questions

history

social science

english & capstone

arts

science

math & computer science

world languages

high school exams

honors classes

college classes

hs classes