Definition of stationary distributions
A stationary distribution is a probability distribution over the states of a Markov chain that remains unchanged when you apply the transition matrix. Think of it as the equilibrium of the system: once the chain settles into this distribution, it stays there forever.
Formally, a stationary distribution is a row vector π satisfying two conditions:
- π = πP, where P is the transition probability matrix
- Σ_i π_i = 1 and π_i ≥ 0 for all i
The entries of π tell you the long-run proportion of time the chain spends in each state. If a chain has states {1, 2, 3} and π = (0.25, 0.5, 0.25), then in the long run the chain spends 50% of its time in state 2.
Properties of stationary distributions
Invariance under state transitions
The defining property of a stationary distribution is that it's a fixed point of the transition dynamics. If the chain's state distribution at time n is π, then at time n + 1 it's πP = π, and by induction:
πP^n = π for all n ≥ 0
This means that if you start the chain with initial distribution π, the marginal distribution at every future time step is still π. The chain looks statistically identical at every point in time.
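As a quick numerical check of this fixed-point property (the 3-state matrix below is illustrative, not from the text), we can find π as the left eigenvector of P for eigenvalue 1 and confirm it is unchanged by repeated transitions:

```python
import numpy as np

# A small 3-state transition matrix (rows sum to 1); values are illustrative.
P = np.array([
    [0.5, 0.3, 0.2],
    [0.2, 0.6, 0.2],
    [0.3, 0.3, 0.4],
])

# The stationary distribution is the left eigenvector of P for eigenvalue 1,
# i.e. an eigenvector of P.T, normalized to sum to 1.
eigvals, eigvecs = np.linalg.eig(P.T)
pi = np.real(eigvecs[:, np.argmax(np.real(eigvals))])
pi = pi / pi.sum()

# Invariance: applying P any number of times leaves pi unchanged.
pi_after_10_steps = pi @ np.linalg.matrix_power(P, 10)
print(np.allclose(pi_after_10_steps, pi))  # True
```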
Uniqueness of stationary distributions
A stationary distribution is not always unique. A reducible chain (one with multiple closed communicating classes) can have infinitely many stationary distributions, since you can form convex combinations of the stationary distributions within each closed class.
Uniqueness is guaranteed when the chain is irreducible (all states communicate). In that case, if a stationary distribution exists, there is exactly one. This is a direct consequence of the Perron-Frobenius theorem applied to stochastic matrices. The practical upshot: for an irreducible chain, the long-run behavior doesn't depend on where you started.
Existence of stationary distributions
Three concepts control whether a stationary distribution exists: irreducibility, aperiodicity, and positive recurrence. They play different roles, so it's worth being precise about which condition does what.
Irreducible Markov chains
A Markov chain is irreducible if every state can be reached from every other state in some finite number of steps. Formally, for all states i and j, there exists n ≥ 1 such that (P^n)_{ij} > 0.
Irreducibility alone does not guarantee a stationary distribution exists. Consider the symmetric random walk on the integers ℤ: it's irreducible, but no stationary distribution exists because the state space is infinite and the chain is null recurrent. What irreducibility does guarantee is that if a stationary distribution exists, it's unique.
Positive recurrence
A state i is positive recurrent if the expected return time m_i = E[T_i | X_0 = i] is finite, where T_i = min{n ≥ 1 : X_n = i} is the first return time to state i. In an irreducible chain, either all states are positive recurrent or none are.
Positive recurrence is the condition that actually ensures existence. For an irreducible, positive recurrent chain, the unique stationary distribution is given by:
π_i = 1 / m_i
This formula connects the stationary probability of a state directly to how quickly the chain returns to it. States you visit more frequently get higher stationary probability.
For finite-state irreducible chains, positive recurrence is automatic (you always return in finite expected time), so a stationary distribution always exists.
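The relation π_i = 1/m_i can be checked by simulation. A minimal sketch, assuming an illustrative two-state chain whose stationary distribution is (2/3, 1/3), so the mean return time to state 0 should be close to 1/π_0 = 1.5:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative two-state chain; its stationary distribution is (2/3, 1/3).
P = np.array([[0.7, 0.3],
              [0.6, 0.4]])

# Record the gaps between successive visits to state 0.
state = 0
steps_since_visit = 0
return_times = []
for _ in range(200_000):
    state = rng.choice(2, p=P[state])
    steps_since_visit += 1
    if state == 0:
        return_times.append(steps_since_visit)
        steps_since_visit = 0

mean_return_time = sum(return_times) / len(return_times)
print(mean_return_time)  # close to 1 / pi_0 = 1.5
```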

Aperiodicity
A state has period d if returns to that state can only happen at times that are multiples of d, and d is the largest such integer: d = gcd{n ≥ 1 : (P^n)_{ii} > 0}. A state is aperiodic if d = 1. In an irreducible chain, all states share the same period.
Aperiodicity is not required for the stationary distribution to exist. A periodic, irreducible, positive recurrent chain still has a unique stationary distribution. What aperiodicity controls is convergence: whether (P^n)_{ij} → π_j as n → ∞. Without aperiodicity, the chain cycles and P^n doesn't converge, even though π still exists and still describes the time-average behavior.
To summarize: irreducible + positive recurrent ⇒ a unique stationary distribution exists. Adding aperiodicity further guarantees that every row of P^n converges to π (the limiting distribution equals the stationary distribution).
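A tiny demonstration of this distinction (the chain here is chosen purely for illustration): the deterministic two-state flip chain has period 2, so P^n oscillates forever, yet the uniform distribution is still stationary.

```python
import numpy as np

# A period-2 chain: it deterministically alternates between the two states.
P = np.array([[0.0, 1.0],
              [1.0, 0.0]])

# The stationary distribution exists and is uniform.
pi = np.array([0.5, 0.5])
print(np.allclose(pi @ P, pi))  # True: invariance holds

# But P^n never converges: even powers give the identity, odd powers give P.
print(np.linalg.matrix_power(P, 4))  # identity matrix
print(np.linalg.matrix_power(P, 5))  # back to P
```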
Computation of stationary distributions
Solving the balance equations
To find π directly, solve the linear system π = πP subject to Σ_i π_i = 1.
- Rewrite π = πP as π(I - P) = 0, where I is the identity matrix.
- This gives you n equations in n unknowns, but they're linearly dependent (each row of I - P sums to zero, so the matrix has rank at most n - 1).
- Drop any one of the equations and replace it with the normalization constraint Σ_i π_i = 1.
- Solve the resulting system.
Example: Suppose P = [[0.7, 0.3], [0.6, 0.4]] (an illustrative two-state chain). The equation π = πP gives:
- π_1 = 0.7π_1 + 0.6π_2
- Combined with π_1 + π_2 = 1
From the first equation: 0.3π_1 = 0.6π_2, so π_1 = 2π_2. Substituting into the normalization: 2π_2 + π_2 = 1, so π_2 = 1/3 and π_1 = 2/3.
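The procedure above can be sketched in a few lines of Python (the two-state matrix here is illustrative):

```python
import numpy as np

def stationary_distribution(P: np.ndarray) -> np.ndarray:
    """Solve pi = pi P by replacing one balance equation with sum(pi) = 1."""
    n = P.shape[0]
    A = (np.eye(n) - P).T      # the system (I - P)^T pi^T = 0
    A[-1, :] = 1.0             # replace the last equation with normalization
    b = np.zeros(n)
    b[-1] = 1.0
    return np.linalg.solve(A, b)

# Illustrative two-state chain: the solution is pi = (2/3, 1/3).
P = np.array([[0.7, 0.3],
              [0.6, 0.4]])
print(stationary_distribution(P))
```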
Detailed balance (reversibility shortcut)
If you can find π satisfying the detailed balance equations:
π_i P_{ij} = π_j P_{ji} for all states i, j
then π is automatically a stationary distribution (after normalizing). Detailed balance says the probability flow from i to j equals the flow from j to i. This is a stronger condition than the global balance equations, but when it holds, it often makes computation much easier since you can solve pairwise rather than simultaneously.
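A sketch of the pairwise solve for a small birth-death chain, where transitions only connect neighboring states (the matrix is illustrative):

```python
import numpy as np

# A 3-state birth-death chain (illustrative values): transitions only
# between neighboring states, which makes detailed balance solvable pairwise.
P = np.array([
    [0.50, 0.50, 0.00],
    [0.25, 0.50, 0.25],
    [0.00, 0.50, 0.50],
])

# Solve pairwise: pi_{i+1} = pi_i * P[i, i+1] / P[i+1, i], then normalize.
pi = np.ones(3)
for i in range(2):
    pi[i + 1] = pi[i] * P[i, i + 1] / P[i + 1, i]
pi /= pi.sum()

# Verify detailed balance and, as a consequence, stationarity.
for i in range(3):
    for j in range(3):
        assert np.isclose(pi[i] * P[i, j], pi[j] * P[j, i])
print(np.allclose(pi @ P, pi))  # True
```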
Power method for approximation
When the state space is large, solving the linear system directly can be expensive. The power method offers an iterative alternative:
- Choose any initial distribution π^(0).
- Compute π^(t+1) = π^(t) P at each iteration.
- Repeat until ||π^(t+1) - π^(t)|| is below your desired tolerance.
For an irreducible, aperiodic chain, this converges to π regardless of the starting distribution. The rate of convergence depends on the second-largest eigenvalue of P in absolute value: the closer it is to 1, the slower convergence will be.
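A minimal power-method sketch (the stopping rule, tolerance, and example matrix are illustrative choices):

```python
import numpy as np

def power_method(P, tol=1e-10, max_iter=10_000):
    """Iterate pi <- pi P from the uniform distribution until convergence."""
    n = P.shape[0]
    pi = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        nxt = pi @ P
        if np.abs(nxt - pi).sum() < tol:   # L1 distance between iterates
            return nxt
        pi = nxt
    return pi

# Same illustrative two-state chain; the answer is (2/3, 1/3).
P = np.array([[0.7, 0.3],
              [0.6, 0.4]])
print(power_method(P))
```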
Limiting distributions vs. stationary distributions
These two concepts are related but not identical, and confusing them is a common mistake.
- A stationary distribution satisfies π = πP. It describes a fixed point.
- A limiting distribution is lim_{n→∞} μP^n, where μ is some initial distribution. It describes convergence behavior.

When they coincide
If the chain is irreducible and aperiodic (and positive recurrent), then the limiting distribution exists and equals the unique stationary distribution, regardless of the initial distribution μ. That is:
lim_{n→∞} μP^n = π for every initial distribution μ
When they differ
- Periodic chains: The stationary distribution π exists, but P^n oscillates rather than converging. The time-average (1/N) Σ_{n=1}^{N} (P^n)_{ij} → π_j, but the pointwise limit lim_{n→∞} (P^n)_{ij} does not exist.
- Reducible chains: The limiting behavior depends on the starting state, and there may be multiple stationary distributions.
The ergodic theorem ties these together: for an irreducible, positive recurrent chain, the fraction of time spent in state i converges to π_i almost surely, even if the chain is periodic. Aperiodicity is only needed for the stronger statement about P^n converging entry-wise.
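The time-average statement can be seen numerically even for a periodic chain: simulating the deterministic two-state flip chain (an illustrative example), the fraction of time in each state settles at 1/2 even though P^n never converges.

```python
import numpy as np

rng = np.random.default_rng(1)

# Period-2 chain: P^n never converges, yet the fraction of time
# spent in each state still converges to the stationary value 1/2.
P = np.array([[0.0, 1.0],
              [1.0, 0.0]])

state = 0
visits = np.zeros(2)
n_steps = 100_000
for _ in range(n_steps):
    state = rng.choice(2, p=P[state])
    visits[state] += 1

print(visits / n_steps)  # close to [0.5, 0.5]
```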
Applications of stationary distributions
Queueing theory
In a stable queue (arrival rate less than service rate), the system reaches a steady state described by a stationary distribution. For the M/M/1 queue with arrival rate λ and service rate μ (where ρ = λ/μ < 1), the stationary distribution over the number of customers in the system is geometric:
π_n = (1 - ρ)ρ^n for n = 0, 1, 2, …
From this you can derive performance metrics like average queue length (L = ρ/(1 - ρ)) and average waiting time via Little's law.
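A short sketch of these formulas, with illustrative rates λ = 3 and μ = 5 (so ρ = 0.6):

```python
# M/M/1 queue: arrival rate lam, service rate mu, utilization rho = lam/mu < 1.
lam, mu = 3.0, 5.0
rho = lam / mu

# Geometric stationary distribution over the number in system, truncated
# at 50 terms (the tail beyond that is negligible for rho = 0.6).
pi = [(1 - rho) * rho**n for n in range(50)]

L = sum(n * p for n, p in enumerate(pi))  # mean number in system, ~ rho/(1-rho)
W = L / lam                               # mean time in system, by Little's law
print(round(L, 3), round(W, 3))           # ~1.5 and ~0.5
```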
PageRank algorithm
Google's original PageRank models a "random surfer" clicking links as a Markov chain. States are web pages, and transitions follow outgoing links uniformly at random. To ensure irreducibility and aperiodicity, the model adds a damping factor d (typically d = 0.85): at each step, with probability d the surfer follows a link, and with probability 1 - d they jump to a uniformly random page.
The stationary distribution of this modified chain assigns each page its PageRank score. Pages that are linked to by many important pages get higher stationary probability, which is why the algorithm captures a notion of "importance."
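A toy sketch on a hypothetical 4-page link graph (the graph, and the use of plain power iteration, are illustrative choices; production PageRank handles dangling pages and much larger graphs):

```python
import numpy as np

# Hypothetical 4-page link graph: page -> pages it links to.
links = {0: [1, 2], 1: [2], 2: [0], 3: [0, 2]}
n, d = 4, 0.85

# Damped transition matrix: follow a link w.p. d, teleport w.p. 1 - d.
P = np.full((n, n), (1 - d) / n)
for page, outs in links.items():
    for out in outs:
        P[page, out] += d / len(outs)

# PageRank = stationary distribution of this chain, via power iteration.
rank = np.full(n, 1.0 / n)
for _ in range(100):
    rank = rank @ P
print(rank.round(3))  # page 2, which receives the most links, ranks highest
```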
MCMC and statistical sampling
Markov Chain Monte Carlo methods (like the Metropolis-Hastings algorithm) construct a Markov chain whose stationary distribution is a target distribution you want to sample from. By running the chain long enough, the samples approximate draws from the target. The detailed balance condition is used to design the transition probabilities so that the desired distribution is indeed stationary.
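A minimal Metropolis-Hastings sketch targeting a standard normal with a Gaussian random-walk proposal (all choices here — target, proposal scale, burn-in — are illustrative):

```python
import numpy as np

rng = np.random.default_rng(42)

def log_target(x):
    return -0.5 * x * x  # unnormalized log-density of N(0, 1)

# Random-walk Metropolis: the accept/reject step enforces detailed
# balance with respect to the target, making it the stationary distribution.
x = 0.0
samples = []
for _ in range(50_000):
    proposal = x + rng.normal(scale=1.0)
    # Accept with probability min(1, target(proposal) / target(x)).
    if np.log(rng.random()) < log_target(proposal) - log_target(x):
        x = proposal
    samples.append(x)

samples = np.array(samples[5_000:])  # discard burn-in
print(samples.mean().round(2), samples.std().round(2))  # near 0.0 and 1.0
```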