Fiveable

🔀Stochastic Processes Unit 5 Review


5.4 Stationary distributions


Written by the Fiveable Content Team • Last updated August 2025

Definition of stationary distributions

A stationary distribution is a probability distribution over the states of a Markov chain that remains unchanged when you apply the transition matrix. Think of it as the equilibrium of the system: once the chain settles into this distribution, it stays there forever.

Formally, a stationary distribution is a row vector π = (π₁, π₂, …, πₙ) satisfying two conditions:

  • πP = π, where P is the transition probability matrix
  • Σᵢ πᵢ = 1 and πᵢ ≥ 0 for all i

The entries of π tell you the long-run proportion of time the chain spends in each state. If a chain has states {1, 2, 3} and π = (0.2, 0.5, 0.3), then in the long run the chain spends 50% of its time in state 2.
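To make the definition concrete, here is a short check that a candidate distribution really is a fixed point of the transition matrix. The 3-state matrix and the candidate vector are toy values chosen for illustration, not taken from the text above:

```python
# Check whether a candidate distribution is stationary for a toy
# 3-state transition matrix (both chosen purely for illustration).

def apply_transition(pi, P):
    """Compute the row-vector product pi @ P."""
    n = len(P)
    return [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]

# Toy transition matrix: each row sums to 1.
P = [[0.5, 0.5, 0.0],
     [0.25, 0.5, 0.25],
     [0.0, 0.5, 0.5]]

pi = [0.25, 0.5, 0.25]           # candidate stationary distribution
pi_next = apply_transition(pi, P)

print(pi_next)                   # unchanged: [0.25, 0.5, 0.25]
```

Applying the transition matrix leaves the vector unchanged, which is exactly the condition πP = π.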

Properties of stationary distributions

Invariance under state transitions

The defining property of a stationary distribution is that it's a fixed point of the transition dynamics. If the chain's state distribution at time t is π, then at time t + 1 it's πP = π, and by induction:

πPⁿ = π  for all n ≥ 0

This means that if you start the chain with initial distribution π, the marginal distribution at every future time step is still π. The chain looks statistically identical at every point in time.

Uniqueness of stationary distributions

A stationary distribution is not always unique. A reducible chain (one with multiple closed communicating classes) can have infinitely many stationary distributions, since you can form convex combinations of the stationary distributions within each closed class.

Uniqueness is guaranteed when the chain is irreducible (all states communicate). In that case, if a stationary distribution exists, there is exactly one. This is a direct consequence of the Perron-Frobenius theorem applied to stochastic matrices. The practical upshot: for an irreducible chain, the long-run behavior doesn't depend on where you started.

Existence of stationary distributions

Three concepts control whether a stationary distribution exists: irreducibility, aperiodicity, and positive recurrence. They play different roles, so it's worth being precise about which condition does what.

Irreducible Markov chains

A Markov chain is irreducible if every state can be reached from every other state in some finite number of steps. Formally, for all states i, j, there exists n > 0 such that Pⁿ(i, j) > 0.

Irreducibility alone does not guarantee a stationary distribution exists. Consider a symmetric random walk on ℤ: it's irreducible, but no stationary distribution exists because the state space is infinite and the chain is null recurrent. What irreducibility does guarantee is that if a stationary distribution exists, it's unique.

Positive recurrence

A state i is positive recurrent if the expected return time mᵢ = E[Tᵢ | X₀ = i] is finite, where Tᵢ is the first return time to state i. In an irreducible chain, either all states are positive recurrent or none are.

Positive recurrence is the condition that actually ensures existence. For an irreducible, positive recurrent chain, the unique stationary distribution is given by:

πᵢ = 1/mᵢ

This formula connects the stationary probability of a state directly to how quickly the chain returns to it. States you visit more frequently get higher stationary probability.

For finite-state irreducible chains, positive recurrence is automatic (you always return in finite expected time), so a stationary distribution always exists.
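The relation πᵢ = 1/mᵢ can be seen numerically by simulating a chain and measuring how long it takes, on average, to return to a state. The 2-state matrix below is a toy example; its stationary distribution, found by solving πP = π, is π = (4/7, 3/7), so the mean return time to state 0 should be about 7/4 = 1.75:

```python
# Estimate the mean return time m_0 by simulation and compare with 1/pi_0.
# Toy 2-state chain; pi = (4/7, 3/7) from solving pi P = pi directly.
import random

random.seed(0)
P = [[0.7, 0.3],
     [0.4, 0.6]]

def step(state):
    """Take one step of the chain from `state`."""
    return 0 if random.random() < P[state][0] else 1

steps = 200_000
state = 0
visits_to_0 = 0
for _ in range(steps):
    state = step(state)
    if state == 0:
        visits_to_0 += 1

mean_return_time = steps / visits_to_0   # estimate of m_0
print(mean_return_time)                  # close to 1/pi_0 = 7/4 = 1.75
```

The simulated mean return time matches 1/π₀, illustrating that frequently visited states have short return times and high stationary probability.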


Aperiodicity

A state has period d if returns to that state can only happen at times that are multiples of d, and d is the largest such integer. A state is aperiodic if d = 1. In an irreducible chain, all states share the same period.

Aperiodicity is not required for the stationary distribution to exist. A periodic, irreducible, positive recurrent chain still has a unique stationary distribution. What aperiodicity controls is convergence: whether Pⁿ(i, j) → πⱼ as n → ∞. Without aperiodicity, the chain cycles and Pⁿ doesn't converge, even though π still exists and still describes the time-average behavior.

To summarize: irreducible + positive recurrent ⇒ unique stationary distribution exists. Adding aperiodicity further guarantees that Pⁿ converges to π (the limiting distribution equals the stationary distribution).

Computation of stationary distributions

Solving the balance equations

To find π directly, solve the linear system πP = π subject to Σᵢ πᵢ = 1.

  1. Rewrite πP = π as π(P − I) = 0, where I is the identity matrix.

  2. This gives you n equations in n unknowns, but they're linearly dependent: the rows of P − I sum to zero, so P − I is singular.

  3. Drop any one of the n equations and replace it with the normalization constraint Σᵢ πᵢ = 1.

  4. Solve the resulting n × n system.

Example: Suppose P is the 2 × 2 matrix with rows (0.7, 0.3) and (0.4, 0.6). The equation πP = π gives:

  • 0.7π₁ + 0.4π₂ = π₁ ⟹ −0.3π₁ + 0.4π₂ = 0
  • Combined with π₁ + π₂ = 1

From the first equation: π₂ = (0.3/0.4)π₁ = 0.75π₁. Substituting into the normalization: π₁ + 0.75π₁ = 1, so π₁ = 4/7 ≈ 0.571 and π₂ = 3/7 ≈ 0.429.
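The same four steps can be carried out numerically. This sketch uses NumPy's linear solver on the 2 × 2 example; the transpose appears because the solver expects equations of the form Ax = b with x as a column vector:

```python
# Solve pi P = pi with the normalization constraint, following the steps
# above: replace one balance equation by sum(pi) = 1, then solve.
import numpy as np

P = np.array([[0.7, 0.3],
              [0.4, 0.6]])
n = P.shape[0]

# pi (P - I) = 0  is equivalent to  (P - I)^T pi^T = 0
A = (P - np.eye(n)).T
A[-1, :] = 1.0               # replace the last equation with sum(pi) = 1
b = np.zeros(n)
b[-1] = 1.0

pi = np.linalg.solve(A, b)
print(pi)                     # [4/7, 3/7] ~ [0.571, 0.429]
```

The result matches the hand calculation above: π = (4/7, 3/7).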

Detailed balance (reversibility shortcut)

If you can find π satisfying the detailed balance equations:

πᵢPᵢⱼ = πⱼPⱼᵢ  for all i, j

then π is automatically a stationary distribution (after normalizing). Detailed balance says the probability flow from i to j equals the flow from j to i. This is a stronger condition than the global balance equations, but when it holds, it often makes computation much easier since you can solve pairwise rather than simultaneously.
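For a birth-death chain (one that only moves to neighboring states), the pairwise equations can be solved by a single forward pass. The small reversible chain below is a toy example chosen so that detailed balance holds:

```python
# Build pi pairwise from the detailed balance equations for a small
# birth-death chain (a toy example where reversibility holds).
P = [[0.5, 0.5, 0.0],
     [0.25, 0.5, 0.25],
     [0.0, 0.5, 0.5]]

# Solve pairwise: pi[i+1] = pi[i] * P[i][i+1] / P[i+1][i]
pi = [1.0]
for i in range(len(P) - 1):
    pi.append(pi[i] * P[i][i + 1] / P[i + 1][i])

total = sum(pi)
pi = [x / total for x in pi]    # normalize so the entries sum to 1
print(pi)                        # [0.25, 0.5, 0.25]

# Verify detailed balance: pi_i P_ij == pi_j P_ji for all i, j
ok = all(abs(pi[i] * P[i][j] - pi[j] * P[j][i]) < 1e-12
         for i in range(3) for j in range(3))
print(ok)                        # True
```

Notice that each πᵢ₊₁ is determined from πᵢ alone, one pair at a time, rather than by solving all the balance equations simultaneously.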

Power method for approximation

When the state space is large, solving the linear system directly can be expensive. The power method offers an iterative alternative:

  1. Choose any initial distribution π⁽⁰⁾.
  2. Compute π⁽ᵏ⁺¹⁾ = π⁽ᵏ⁾P at each iteration.
  3. Repeat until ‖π⁽ᵏ⁺¹⁾ − π⁽ᵏ⁾‖ is below your desired tolerance.

For an irreducible, aperiodic chain, this converges to π regardless of the starting distribution. The rate of convergence depends on the second-largest eigenvalue of P in absolute value: the closer it is to 1, the slower convergence will be.
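The three steps above can be sketched directly, here on the 2 × 2 matrix from the worked example (the tolerance and iteration cap are illustrative choices):

```python
# Power-method sketch: iterate pi <- pi P until successive iterates agree.
P = [[0.7, 0.3],
     [0.4, 0.6]]

def power_method(P, tol=1e-12, max_iter=10_000):
    n = len(P)
    pi = [1.0 / n] * n                      # any starting distribution works
    for _ in range(max_iter):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi

pi = power_method(P)
print(pi)    # converges to [4/7, 3/7]
```

Starting from the uniform distribution, the iterates converge to the same π = (4/7, 3/7) found by solving the balance equations directly.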

Limiting distributions vs. stationary distributions

These two concepts are related but not identical, and confusing them is a common mistake.

  • A stationary distribution π satisfies πP = π. It describes a fixed point.
  • A limiting distribution is limₙ→∞ μPⁿ, where μ is some initial distribution. It describes convergence behavior.

When they coincide

If the chain is irreducible and aperiodic (and positive recurrent), then the limiting distribution exists and equals the unique stationary distribution, regardless of the initial distribution μ. That is:

limₙ→∞ Pⁿ(i, j) = πⱼ  for all i, j

When they differ

  • Periodic chains: The stationary distribution exists, but Pⁿ(i, j) oscillates rather than converging. The time-average (1/N) Σₙ₌₀ᴺ⁻¹ Pⁿ(i, j) → πⱼ, but the pointwise limit does not exist.
  • Reducible chains: The limiting behavior depends on the starting state, and there may be multiple stationary distributions.

The ergodic theorem ties these together: for an irreducible, positive recurrent chain, the fraction of time spent in state j converges to πⱼ almost surely, even if the chain is periodic. Aperiodicity is only needed for the stronger statement about Pⁿ converging entry-wise.
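The simplest example of this gap is the deterministic two-state flip chain, which has period 2 and stationary distribution π = (0.5, 0.5). The sketch below tracks Pⁿ(0, 0): the entry oscillates between 1 and 0, yet its time-average converges to π₀:

```python
# Periodic chain: P^n oscillates, but the time-average of P^n(0,0)
# converges to pi_0 = 0.5.
P = [[0.0, 1.0],
     [1.0, 0.0]]                 # deterministic flip: period 2

def mat_mult(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

Pn = [[1.0, 0.0], [0.0, 1.0]]    # P^0 = I
entries = []                      # P^n(0, 0) for n = 0..9
for _ in range(10):
    entries.append(Pn[0][0])
    Pn = mat_mult(Pn, P)

print(entries)                    # alternates: 1.0, 0.0, 1.0, 0.0, ...
print(sum(entries) / len(entries))  # time-average: 0.5
```

So π exists and describes the time-average behavior even though limₙ→∞ Pⁿ does not exist for this chain.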

Applications of stationary distributions

Queueing theory

In a stable queue (arrival rate less than service rate), the system reaches a steady state described by a stationary distribution. For the M/M/1 queue with arrival rate λ and service rate μ (where ρ = λ/μ < 1), the stationary distribution over the number of customers n in the system is geometric:

πₙ = (1 − ρ)ρⁿ,  n = 0, 1, 2, …

From this you can derive performance metrics like the average queue length (ρ/(1 − ρ)) and average waiting time via Little's law.

PageRank algorithm

Google's original PageRank models a "random surfer" clicking links as a Markov chain. States are web pages, and transitions follow outgoing links uniformly at random. To ensure irreducibility and aperiodicity, the model adds a damping factor α (typically 0.85): at each step, with probability α the surfer follows a link, and with probability 1 − α they jump to a uniformly random page.

The stationary distribution of this modified chain assigns each page its PageRank score. Pages that are linked to by many important pages get higher stationary probability, which is why the algorithm captures a notion of "importance."
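A minimal sketch of this idea, assuming a hypothetical 3-page link graph: power iteration on the damped chain converges to the stationary distribution, and the page with the most incoming link mass ranks highest.

```python
# PageRank sketch on a toy 3-page link graph, damping alpha = 0.85.
links = {0: [1, 2], 1: [2], 2: [0]}   # hypothetical link structure
n = len(links)
alpha = 0.85

rank = [1.0 / n] * n
for _ in range(200):                  # power iteration on the modified chain
    nxt = [(1 - alpha) / n] * n       # random-jump mass
    for page, outs in links.items():
        for target in outs:           # link-following mass, split uniformly
            nxt[target] += alpha * rank[page] / len(outs)
    rank = nxt

print(rank)   # sums to 1; page 2 ranks highest (linked by both others)
```

Page 2 receives links from both other pages, so it ends up with the largest stationary probability, matching the intuition that PageRank rewards being linked to.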

MCMC and statistical sampling

Markov Chain Monte Carlo methods (like the Metropolis-Hastings algorithm) construct a Markov chain whose stationary distribution is a target distribution you want to sample from. By running the chain long enough, the samples approximate draws from the target. The detailed balance condition is used to design the transition probabilities so that the desired distribution is indeed stationary.
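A minimal Metropolis-Hastings sketch on a small discrete state space, with an arbitrary target distribution chosen for illustration: the acceptance rule uses only ratios of the unnormalized target, and detailed balance then guarantees the target is the chain's stationary distribution.

```python
# Metropolis-Hastings sketch: build a chain on {0,...,4} whose stationary
# distribution is a given target (weights chosen for illustration).
import random

random.seed(1)
weights = [1.0, 2.0, 4.0, 2.0, 1.0]           # unnormalized target
target = [w / sum(weights) for w in weights]

def mh_step(x):
    """Propose a symmetric +/-1 move; accept with min(1, target ratio)."""
    prop = x + random.choice([-1, 1])
    if prop < 0 or prop > 4:
        return x                               # reject out-of-range proposals
    if random.random() < min(1.0, weights[prop] / weights[x]):
        return prop
    return x

samples = []
x = 2
for _ in range(200_000):
    x = mh_step(x)
    samples.append(x)

freq = [samples.count(s) / len(samples) for s in range(5)]
print(freq)    # close to target = [0.1, 0.2, 0.4, 0.2, 0.1]
```

The empirical frequencies approach the target distribution, even though the sampler never needed the normalizing constant of the target.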