Fiveable

🔀Stochastic Processes Unit 6 Review


6.1 Definition and properties of continuous-time Markov chains


Written by the Fiveable Content Team • Last updated August 2025

Continuous-time Markov chains (CTMCs) model systems that transition between discrete states over continuous time, where the future depends only on the present state. They show up throughout queueing theory, reliability engineering, and biological modeling, providing a tractable framework for systems driven by random events.

This guide covers the formal definition of CTMCs, their key properties (memorylessness, stationarity, ergodicity), the generator matrix and Kolmogorov equations that govern their dynamics, state classification, first passage times, and common applications.

Definition of continuous-time Markov chains

A continuous-time Markov chain is a stochastic process \{X(t), t \geq 0\} with a discrete state space that evolves over continuous time. Transitions between states occur at random times, and the waiting time in each state follows an exponential distribution.

The defining feature is the Markov property: the future evolution of the process depends only on the current state, not on how the process arrived there. Formally, for all times s, t \geq 0 and states i, j:

P(X(s+t) = j \mid X(s) = i, \{X(u) : 0 \leq u < s\}) = P(X(s+t) = j \mid X(s) = i)

State space in CTMCs

The state space is the set of all possible states the system can occupy. For CTMCs, this space is discrete (finite or countably infinite). Typical examples include:

  • Number of customers in a queue (states: 0, 1, 2, ...)
  • Operational status of a machine (states: working, under repair, failed)
  • Number of molecules of a given species in a chemical reaction

The choice of state space determines the dimension of the generator matrix and the complexity of the analysis.

Transition probabilities for CTMCs

Transition probabilities describe the likelihood of moving from one state to another over a given time interval. For a CTMC, the transition probability from state i to state j over time t is:

P_{ij}(t) = P(X(t) = j \mid X(0) = i)

Because holding times are exponentially distributed, these probabilities inherit the memoryless property. The collection of all P_{ij}(t) values forms the transition probability matrix P(t), which satisfies P(0) = I (the identity matrix).

Chapman-Kolmogorov equations

The Chapman-Kolmogorov equations let you compute transition probabilities over longer intervals by breaking them into two shorter intervals. For any states i, j and times s, t \geq 0:

P_{ij}(s+t) = \sum_k P_{ik}(s) P_{kj}(t)

The idea: to go from i to j in time s + t, the process must be in some intermediate state k at time s. You sum over all possible intermediate states. In matrix form, this is simply P(s+t) = P(s)P(t), which is the semigroup property.
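The semigroup property is easy to check numerically. Below is a minimal sketch using an illustrative two-state generator (the rates are made up for demonstration):

```python
import numpy as np
from scipy.linalg import expm

# Illustrative two-state generator (rates made up for the example).
Q = np.array([[-1.0,  1.0],
              [ 2.0, -2.0]])

def P(t):
    """Transition probability matrix P(t) = e^{tQ}."""
    return expm(t * Q)

s, t = 0.3, 0.7
lhs = P(s + t)        # direct transition probabilities over s + t
rhs = P(s) @ P(t)     # Chapman-Kolmogorov: sum over intermediate states
```

Here `lhs` and `rhs` agree to machine precision, and each row of `P(t)` sums to 1, as a stochastic matrix must.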

Time-homogeneous CTMCs

A CTMC is time-homogeneous if the transition probabilities depend only on the elapsed time, not on when you start observing:

P(X(s+t) = j \mid X(s) = i) = P(X(t) = j \mid X(0) = i) \quad \text{for all } s \geq 0

Most CTMCs studied in this course are time-homogeneous. This assumption greatly simplifies the analysis because the generator matrix Q is constant, and you can derive stationary distributions and long-run behavior without worrying about time-varying rates.

Properties of continuous-time Markov chains

CTMCs have several structural properties that make them analytically tractable and distinguish them from more general stochastic processes.

Memoryless property of exponential distributions

The exponential distribution governing holding times has a unique memoryless property: the probability of leaving the current state in the next t units of time doesn't depend on how long you've already been there.

For an exponentially distributed random variable T with rate \lambda:

P(T > s + t \mid T > s) = P(T > t) \quad \text{for all } s, t \geq 0

This is what makes the Markov property work in continuous time. If holding times were not memoryless, knowing how long the process had been in a state would give you information about when it would leave, violating the Markov property. The exponential distribution is the only continuous distribution with this property.

Stationary distribution of CTMCs

A stationary distribution \pi is a probability distribution over the state space that remains unchanged as the process evolves. If the chain starts in distribution \pi, it stays in distribution \pi for all future times.

For an irreducible, positive recurrent CTMC, the stationary distribution exists, is unique, and satisfies:

\pi_j = \lim_{t \to \infty} P_{ij}(t) \quad \text{for any initial state } i

The stationary distribution can be found by solving \pi Q = 0 subject to \sum_j \pi_j = 1. It tells you the long-run fraction of time the process spends in each state.
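Solving \pi Q = 0 with the normalization constraint is a small linear-algebra exercise. A sketch with an illustrative two-state generator:

```python
import numpy as np

# Illustrative two-state generator (rates made up for the example).
Q = np.array([[-1.0,  1.0],
              [ 2.0, -2.0]])

# pi Q = 0 gives one balance equation per state, but they are linearly
# dependent, so replace one of them with the normalization sum(pi) = 1.
A = Q.T.copy()
A[-1, :] = 1.0               # overwrite last balance equation
b = np.zeros(len(Q))
b[-1] = 1.0
pi = np.linalg.solve(A, b)   # pi = [2/3, 1/3]: slow to leave state 0, fast to return
```

The chain spends two thirds of its time in state 0, consistent with its departure rate (1) being half the return rate (2).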

Reversibility in Markov chains

A CTMC is reversible if the process looks statistically identical when run forward or backward in time. The condition for reversibility is the detailed balance equations:

\pi_i q_{ij} = \pi_j q_{ji} \quad \text{for all states } i, j

This says the probability flow from i to j equals the flow from j to i in stationarity. Reversibility is a stronger condition than simply having a stationary distribution. When it holds, you can often read off the stationary distribution directly from the rates without solving a full system of equations. Reversible chains appear frequently in queueing networks (Jackson networks) and statistical mechanics.

Ergodicity and limiting behavior

A CTMC is ergodic if it is irreducible and positive recurrent. For an ergodic chain:

  • A unique stationary distribution \pi exists
  • The limiting distribution equals \pi regardless of the initial state
  • Time averages converge to ensemble averages: the fraction of time spent in state j converges to \pi_j

Ergodicity is the property that justifies using the stationary distribution to compute long-run performance measures like average queue length or system utilization.

Transition rate matrices

The transition rate matrix (or generator matrix) encodes all the infinitesimal transition rates of a CTMC in a single matrix. It's the continuous-time analogue of the transition probability matrix in discrete-time chains.

Generator matrix Q

The generator matrix Q is a square matrix whose entries q_{ij} specify the instantaneous rate of transitioning from state i to state j. Its key structural property is that every row sums to zero:

q_{ii} = -\sum_{j \neq i} q_{ij}

This zero-row-sum condition reflects conservation of probability: probability leaving state i must go somewhere.


Diagonal elements of Q

The diagonal entry q_{ii} equals the negative of the total departure rate from state i. Its absolute value |q_{ii}| is the rate parameter of the exponential holding time in state i, so:

  • The expected time spent in state i before transitioning is \frac{1}{|q_{ii}|}
  • A large |q_{ii}| means short visits to state i; a small |q_{ii}| means long visits

Off-diagonal elements of Q

The off-diagonal entries q_{ij} (for i \neq j) are the transition rates from state i to state j. They are always non-negative.

Given that the process is in state i, the probability of the next transition being to state j is:

\frac{q_{ij}}{|q_{ii}|} = \frac{q_{ij}}{\sum_{k \neq i} q_{ik}}

This separates the CTMC into two components: how long you stay (governed by |q_{ii}|) and where you go next (governed by the ratios of off-diagonal entries). This decomposition is the basis of the embedded discrete-time Markov chain.
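The holding-time/jump-chain decomposition gives a direct simulation recipe: draw an exponential holding time, then pick the next state from the embedded chain. A minimal sketch with an illustrative two-state generator (the function names are my own):

```python
import numpy as np

# Illustrative two-state generator (rates made up for the example).
Q = np.array([[-1.0,  1.0],
              [ 2.0, -2.0]])

def simulate(Q, x0, t_end, rng):
    """Simulate a CTMC path as (time, state) pairs: alternate exponential
    holding times with jumps drawn from the embedded discrete-time chain."""
    t, x = 0.0, x0
    path = [(t, x)]
    while True:
        rate = -Q[x, x]                      # total departure rate |q_ii|
        t += rng.exponential(1.0 / rate)     # holding time ~ Exp(|q_ii|)
        if t >= t_end:
            return path
        jump_probs = np.clip(Q[x], 0.0, None) / rate   # q_ij / |q_ii|
        x = rng.choice(len(Q), p=jump_probs)
        path.append((t, x))

def occupation_fraction(path, t_end, state):
    """Fraction of [0, t_end] the path spends in `state`."""
    times = [t for t, _ in path] + [t_end]
    return sum(t2 - t1
               for (t1, x), t2 in zip(path, times[1:]) if x == state) / t_end

rng = np.random.default_rng(0)
path = simulate(Q, x0=0, t_end=2000.0, rng=rng)
frac = occupation_fraction(path, 2000.0, 0)   # should be near pi_0 = 2/3
```

Over a long horizon the empirical occupation fraction of state 0 approaches its stationary probability 2/3, illustrating the ergodic time-average property discussed above.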

Relationship between Q and transition probabilities

The generator matrix Q and the transition probability matrix P(t) are connected through differential equations and the matrix exponential.

From Q to P(t): The transition probabilities satisfy the Kolmogorov forward equation \frac{d}{dt}P(t) = P(t)Q, and the solution for time-homogeneous chains is:

P(t) = e^{tQ}

From P(t) to Q: The generator can be recovered as:

Q = \lim_{t \to 0} \frac{P(t) - I}{t}

where I is the identity matrix. This is why Q is called the infinitesimal generator: it captures the instantaneous rate of change of the transition probabilities.
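Both directions can be verified numerically with scipy's matrix exponential (the generator and step size below are illustrative):

```python
import numpy as np
from scipy.linalg import expm

# Illustrative generator.
Q = np.array([[-1.0,  1.0],
              [ 2.0, -2.0]])

# From Q to P(t): the matrix exponential yields a stochastic matrix.
P = expm(0.5 * Q)
row_sums = P.sum(axis=1)        # each row sums to 1

# From P(t) back to Q: difference quotient (P(h) - I) / h at small h.
h = 1e-6
Q_approx = (expm(h * Q) - np.eye(2)) / h
```

The difference quotient recovers Q up to an error of order h, matching the limit definition of the infinitesimal generator.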

Kolmogorov equations

The Kolmogorov equations are systems of ordinary differential equations that describe how transition probabilities evolve over time. They come in two forms, depending on whether you condition on the starting state or the ending state.

Kolmogorov forward equations

The forward equations (also called the master equations) are:

\frac{d}{dt}P_{ij}(t) = \sum_k P_{ik}(t) \, q_{kj}

In matrix form: P'(t) = P(t)Q.

The interpretation: the rate of change of P_{ij}(t) depends on the probability of being in each intermediate state k at time t, multiplied by the rate of transitioning from k to j. You're "looking forward" from the intermediate states to the destination.

Kolmogorov backward equations

The backward equations are:

\frac{d}{dt}P_{ij}(t) = \sum_k q_{ik} \, P_{kj}(t)

In matrix form: P'(t) = QP(t).

Here you're "looking backward" from the starting state: the rate of change depends on the rate of leaving state i to each neighbor k, multiplied by the probability of getting from k to j in the remaining time. The backward equations always have a unique solution, while the forward equations may require additional regularity conditions for uniqueness.

Solutions to Kolmogorov equations

For time-homogeneous CTMCs, both the forward and backward equations have the same solution:

P(t) = e^{tQ}

Computing the matrix exponential can be done in several ways:

  • Eigenvalue decomposition: If Q = S \Lambda S^{-1}, then e^{tQ} = S \, e^{t\Lambda} \, S^{-1}, where e^{t\Lambda} is diagonal with entries e^{t\lambda_k}
  • Laplace transforms: Transform the ODE system, solve algebraically, then invert
  • Direct series expansion: e^{tQ} = I + tQ + \frac{(tQ)^2}{2!} + \cdots (useful for small t or theoretical arguments)

Numerical methods for Kolmogorov equations

When analytical solutions aren't feasible (large state spaces, complex structure), numerical methods are used:

  • Uniformization (Jensen's method): Convert the CTMC into a discrete-time chain by adding self-loops so all states have the same total departure rate, then weight the discrete-time transition probabilities by Poisson probabilities. This method is numerically stable and widely used.
  • Runge-Kutta methods: Standard ODE solvers applied to P'(t) = P(t)Q
  • Monte Carlo simulation: Generate sample paths and estimate transition probabilities from empirical frequencies

Uniformization is generally preferred for CTMCs because it avoids the numerical instability issues that can arise with stiff ODE solvers.
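A sketch of uniformization, compared against the matrix exponential (the generator and truncation tolerance are illustrative):

```python
import numpy as np
from scipy.linalg import expm
from scipy.stats import poisson

# Illustrative generator.
Q = np.array([[-1.0,  1.0],
              [ 2.0, -2.0]])

def uniformized_P(Q, t, tol=1e-12):
    """Approximate P(t) by uniformization (Jensen's method)."""
    Lam = float(max(-np.diag(Q)))      # uniformization rate >= every |q_ii|
    U = np.eye(len(Q)) + Q / Lam       # DTMC: self-loops make all rates uniform
    n_max = int(poisson.isf(tol, Lam * t)) + 1   # truncate Poisson tail at tol
    P, Un = np.zeros_like(Q), np.eye(len(Q))
    for n in range(n_max + 1):
        P += poisson.pmf(n, Lam * t) * Un        # weight U^n by Poisson mass
        Un = Un @ U
    return P

P_unif = uniformized_P(Q, 0.8)
P_exact = expm(0.8 * Q)
```

Every term in the sum is a product of non-negative stochastic matrices and non-negative Poisson weights, which is why the method avoids the cancellation errors that plague other approaches.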

Classification of states

Classifying states reveals the long-term structure of a CTMC: which states communicate, which are visited infinitely often, and which are eventually abandoned.

Communicating classes in CTMCs

Two states i and j communicate (written i \leftrightarrow j) if there is a positive probability of reaching j from i and reaching i from j, both in finite time.

A communicating class is a maximal set of states that all communicate with each other. The communicating classes partition the state space. A CTMC is irreducible if the entire state space forms a single communicating class (every state can reach every other state).

Absorbing states vs transient states

An absorbing state is one with no outgoing transitions to other states: once the process enters, it stays forever. In terms of the generator matrix, state i is absorbing if q_{ij} = 0 for all j \neq i (equivalently, q_{ii} = 0).

A transient state (in the context of absorbing chains) is a non-absorbing state from which the process will eventually reach an absorbing state with probability 1. In a CTMC with absorbing states, every sample path eventually gets trapped in some absorbing state.


Recurrent states vs transient states

  • A state is recurrent if, starting from that state, the process returns to it with probability 1
  • A state is transient if there is a positive probability of never returning to it

In a finite-state CTMC, every communicating class is either entirely recurrent or entirely transient. At least one communicating class must be recurrent. Transient states are visited only finitely many times before the process settles into a recurrent class.

Positive recurrent vs null recurrent states

Recurrent states are further divided:

  • Positive recurrent: The expected return time is finite. These states have well-defined stationary probabilities (\pi_i > 0).
  • Null recurrent: The process returns with probability 1, but the expected return time is infinite. These states have \pi_i = 0 and do not contribute to a proper stationary distribution.

Null recurrence can only occur in CTMCs with countably infinite state spaces. In any finite-state irreducible CTMC, all states are positive recurrent.

First passage times

First passage times measure how long it takes for a CTMC to reach a target state for the first time. They capture transient behavior and are central to reliability analysis, queueing delays, and performance evaluation.

Definition of first passage time

The first passage time from state i to state j is:

T_{ij} = \inf\{t > 0 : X(t) = j \mid X(0) = i\}

This is a random variable whose distribution depends on the structure of the chain between states i and j. When i = j, T_{ii} is called the return time to state i.

Distribution of first passage times

The distribution of T_{ij} is generally not exponential (unless the transition is direct with no intermediate states). It can be characterized by:

  • The PDF f_{ij}(t): the probability density of first arriving at j at time t
  • The CDF F_{ij}(t) = P(T_{ij} \leq t): the probability of reaching j within time t

For simple chains, the first passage time distribution can be a convolution of exponentials (a phase-type distribution). For more complex chains, Laplace transforms are typically used to characterize the distribution.

Expected first passage times

The expected first passage time E[T_{ij}] can be computed by solving a system of linear equations derived from the generator matrix. For each i \neq j:

-1 = \sum_{k} q_{ik} \, E[T_{kj}] \quad \text{with boundary condition } E[T_{jj}] = 0

The -1 on the left comes from the fact that time passes at rate 1. These equations can also be set up using first-step analysis: condition on the first transition out of state i and use the law of total expectation.
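In code, the system amounts to deleting the target state's row and column from Q and solving one linear system. The 3-state generator below is made up for illustration:

```python
import numpy as np

# Illustrative 3-state generator (rates made up for the example).
Q = np.array([[-2.0,  1.0,  1.0],
              [ 1.0, -1.0,  0.0],
              [ 1.0,  1.0, -2.0]])

def expected_first_passage(Q, j):
    """Solve sum_k q_ik E[T_kj] = -1 for all i != j,
    with the boundary condition E[T_jj] = 0."""
    n = len(Q)
    idx = [k for k in range(n) if k != j]   # drop the target state
    m_sub = np.linalg.solve(Q[np.ix_(idx, idx)], -np.ones(n - 1))
    m = np.zeros(n)
    m[idx] = m_sub                          # m[j] stays 0
    return m

m = expected_first_passage(Q, j=0)
```

Sanity check by first-step analysis: from state 1 the only move is to state 0 at rate 1, so E[T_{10}] = 1; from state 2 the process waits a mean 1/2 and then goes to 0 or 1 with equal probability, so E[T_{20}] = 1/2 + (1/2)(1) = 1 as well.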

Relationship to hitting times

Hitting times generalize first passage times to sets of states rather than a single state. The hitting time of a set A, starting from state i, is:

T_i(A) = \inf\{t > 0 : X(t) \in A \mid X(0) = i\}

A first passage time T_{ij} is the special case where A = \{j\}. The same linear equation techniques used for first passage times extend directly to hitting times, with boundary conditions E[T_i(A)] = 0 for i \in A.

Applications of continuous-time Markov chains

CTMCs provide a natural modeling framework for systems where events occur at random times governed by exponential (or approximately exponential) inter-event distributions.

Queueing systems as CTMCs

Queueing systems are one of the most common CTMC applications. Consider an M/M/1 queue (Poisson arrivals at rate \lambda, exponential service at rate \mu, one server):

  • The state is the number of customers in the system
  • Arrival transitions: state n \to n+1 at rate \lambda
  • Service transitions: state n \to n-1 at rate \mu (for n \geq 1)

Performance measures like average queue length, waiting time, and server utilization follow directly from the stationary distribution, which exists when \lambda < \mu.
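Assuming \lambda < \mu, the standard closed-form M/M/1 results can be coded directly from the stationary distribution \pi_n = (1 - \rho)\rho^n with \rho = \lambda/\mu (the rates below are illustrative):

```python
# Closed-form M/M/1 performance measures, valid when lam < mu.
def mm1_measures(lam, mu):
    assert lam < mu, "queue is unstable unless lam < mu"
    rho = lam / mu               # server utilization
    L = rho / (1 - rho)          # mean number in system
    W = 1.0 / (mu - lam)         # mean time in system; Little's law: L = lam * W
    return rho, L, W

# Illustrative rates: arrivals at 2 per hour, service at 5 per hour.
rho, L, W = mm1_measures(lam=2.0, mu=5.0)
```

With these rates the server is busy 40% of the time, there are 2/3 customers in the system on average, and the mean sojourn time is 1/3 hour; note L = \lambda W, as Little's law requires.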

Birth-death processes

Birth-death processes are CTMCs where transitions only occur between neighboring states: from state n to n+1 (birth, rate \lambda_n) or n-1 (death, rate \mu_n).

The stationary distribution (when it exists) has a product form:

\pi_n = \pi_0 \prod_{k=0}^{n-1} \frac{\lambda_k}{\mu_{k+1}}

where \pi_0 is determined by normalization. Birth-death processes model population dynamics, simple queues, and chemical reactions with single-molecule creation and destruction.
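The product form translates directly into code. Here is a sketch for a finite (truncated) birth-death chain with constant illustrative rates:

```python
import numpy as np

def birth_death_stationary(birth, death):
    """Stationary distribution on states 0..N from the product form
    pi_n = pi_0 * prod_{k=0}^{n-1} lambda_k / mu_{k+1}.
    `birth` holds lambda_0..lambda_{N-1}; `death` holds mu_1..mu_N."""
    ratios = np.concatenate(([1.0],
                             np.cumprod(np.asarray(birth) / np.asarray(death))))
    return ratios / ratios.sum()   # normalize so the probabilities sum to 1

# Illustrative truncated M/M/1: lambda = 1, mu = 2 on states 0..4,
# so pi_n is proportional to (1/2)^n.
pi = birth_death_stationary([1.0] * 4, [2.0] * 4)
```

Each successive state is half as likely as the previous one, exactly the geometric shape the product form predicts for constant \lambda/\mu = 1/2.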

Reliability models using CTMCs

In reliability engineering, CTMCs model systems whose components fail and get repaired at exponential rates. A typical two-component system might have states:

  • Both working
  • Component A failed, B working
  • Component B failed, A working
  • Both failed

Transitions correspond to failures (at rate \lambda) and repairs (at rate \mu). From the CTMC, you can compute system availability (long-run fraction of time the system is operational), mean time to failure (expected first passage time to a failed state), and mean time to repair.
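For the two-component system, availability follows from the stationary distribution of the 4-state chain. The sketch below assumes independent failures and a dedicated repair crew per component, with illustrative rates:

```python
import numpy as np

# States: 0 = both working, 1 = A failed, 2 = B failed, 3 = both failed.
# Illustrative rates: each component fails at rate lam and has its own
# repair crew working at rate mu.
lam, mu = 0.1, 1.0
Q = np.array([
    [-2 * lam,         lam,         lam,      0.0],
    [      mu, -(mu + lam),         0.0,      lam],
    [      mu,         0.0, -(mu + lam),      lam],
    [     0.0,          mu,          mu, -2 * mu],
])

# Stationary distribution: solve pi Q = 0 with sum(pi) = 1.
A = Q.T.copy()
A[-1, :] = 1.0
b = np.zeros(4)
b[-1] = 1.0
pi = np.linalg.solve(A, b)

availability = 1.0 - pi[3]   # long-run fraction of time not in "both failed"
```

Because the two components evolve independently here, the result matches the product form: the both-failed probability is (\lambda/(\lambda+\mu))^2, giving availability of about 99.2% at these rates.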

Markov decision processes in continuous time

Continuous-time Markov decision processes (CTMDPs) add a control layer on top of CTMCs. At each state, a decision-maker chooses an action that affects the transition rates and earns rewards.

The goal is to find a policy (a rule for choosing actions) that maximizes expected total discounted reward or long-run average reward. The underlying state dynamics are still governed by a CTMC, but the generator matrix Q depends on the chosen actions. Solutions typically involve solving Hamilton-Jacobi-Bellman equations or using uniformization to convert the problem into a discrete-time MDP.