
🔀Stochastic Processes Unit 10 Review


10.1 Definition and properties of martingales


Written by the Fiveable Content Team • Last updated August 2025

Definition of martingales

A martingale is a stochastic process that models a "fair game": your best prediction of any future value, given everything you know so far, is simply the current value. This concept sits at the heart of modern probability theory and underpins major results in statistics, mathematical finance, and stochastic analysis.

Three conditions must hold for a process $\{X_t\}$ to be a martingale with respect to a filtration $\{\mathcal{F}_t\}$:

  1. Adapted: $X_t$ is $\mathcal{F}_t$-measurable for every $t$.
  2. Integrable: $E[|X_t|] < \infty$ for every $t$.
  3. Martingale property: $E[X_{t+1} \mid \mathcal{F}_t] = X_t$ for all $t$.

If condition 3 is replaced with $E[X_{t+1} \mid \mathcal{F}_t] \leq X_t$, the process is a supermartingale (unfavorable game). If $\geq$, it's a submartingale (favorable game).
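As a concrete sanity check, the simple symmetric random walk (start at 0, add $\pm 1$ with equal probability) satisfies all three conditions. A minimal simulation sketch, with illustrative function names not taken from any particular library:

```python
import random

def random_walk(n_steps, seed=None):
    """Simple symmetric random walk X_0 = 0, X_t = X_{t-1} +/- 1: a martingale."""
    rng = random.Random(seed)
    x, path = 0, [0]
    for _ in range(n_steps):
        x += rng.choice((-1, 1))
        path.append(x)
    return path

def mean_endpoint(n_steps, n_paths, seed=0):
    """Monte Carlo estimate of E[X_n]; the fair-game property forces it to 0."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_paths):
        total += sum(rng.choice((-1, 1)) for _ in range(n_steps))
    return total / n_paths
```

Across many simulated paths, the sample mean of $X_n$ stays near $E[X_0] = 0$ at every horizon, which is the fair-game property in action.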

Adapted stochastic processes

A process $\{X_t\}$ is adapted to a filtration $\{\mathcal{F}_t\}$ if $X_t$ is $\mathcal{F}_t$-measurable for each $t$. In plain terms, the value $X_t$ is determined entirely by information available at time $t$. The process cannot "look ahead." This is not just a technicality; without adaptedness, conditioning on $\mathcal{F}_t$ wouldn't make sense, and the martingale property would be ill-defined.

Filtration in martingales

A filtration $\{\mathcal{F}_t\}_{t \geq 0}$ is a nested (increasing) sequence of $\sigma$-algebras:

$$\mathcal{F}_0 \subseteq \mathcal{F}_1 \subseteq \mathcal{F}_2 \subseteq \cdots$$

Each $\mathcal{F}_t$ encodes all information available up to time $t$. As time advances, you accumulate more information, so the $\sigma$-algebras grow. A martingale is always defined with respect to a particular filtration. The most common choice is the natural filtration $\mathcal{F}_t = \sigma(X_0, X_1, \ldots, X_t)$, generated by the process itself, but you can use any filtration to which the process is adapted.

Conditional expectation property

The core of the definition: for a martingale $\{X_t\}$,

$$E[X_{t+1} \mid \mathcal{F}_t] = X_t \quad \text{for all } t$$

This says that no matter how much past data you have, your optimal forecast of the next value is the present value. By iterating this property (the tower property of conditional expectation), you get the more general statement:

$$E[X_s \mid \mathcal{F}_t] = X_t \quad \text{for all } s > t$$

So the "fair game" property extends to any future horizon, not just one step ahead.
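For a process with finitely many equally likely $\pm 1$ increments, the multi-step property can be verified by exact enumeration. A small sketch; the function name is made up for illustration:

```python
from itertools import product

def cond_exp_future(prefix, s):
    """Exact E[X_s | F_t] for a +/-1 random walk, conditioning on the first
    t = len(prefix) increments; enumerates all 2^(s-t) equally likely futures."""
    t = len(prefix)
    x_t = sum(prefix)
    futures = list(product((-1, 1), repeat=s - t))
    total = sum(x_t + sum(fut) for fut in futures)
    return total / len(futures)
```

Whatever prefix of increments you condition on, the exact answer equals the current value $X_t$, no matter how far ahead $s$ lies.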

Properties of martingales

The martingale definition is simple, but it generates a rich collection of structural results. The properties below give you tools for bounding deviations, computing expectations at random times, and establishing long-run convergence.

Martingale differences

Define the martingale difference sequence by $D_k = X_k - X_{k-1}$. The martingale property is equivalent to:

$$E[D_k \mid \mathcal{F}_{k-1}] = 0 \quad \text{for all } k$$

Each increment is, on average, zero given the past. This formulation is often more convenient for proofs. A useful consequence: martingale differences are uncorrelated (though not necessarily independent), so $\text{Var}(X_n) = \text{Var}(X_0) + \sum_{k=1}^n \text{Var}(D_k)$. Variance can only grow or stay constant, never shrink.
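The variance identity can be checked exactly for the $\pm 1$ random walk, where each difference $D_k$ has variance 1. A brute-force enumeration sketch (illustrative code, not from the text):

```python
from itertools import product

def exact_var_Xn(n):
    """Exact Var(X_n) for the +/-1 random walk via enumeration of all 2^n paths.
    Since Var(D_k) = 1 for each k, the identity predicts Var(X_n) = n."""
    vals = [sum(p) for p in product((-1, 1), repeat=n)]
    mean = sum(vals) / len(vals)
    return sum((v - mean) ** 2 for v in vals) / len(vals)
```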

Martingale stopping times

A stopping time $\tau$ with respect to $\{\mathcal{F}_t\}$ is a random variable taking values in $\{0, 1, 2, \ldots\} \cup \{\infty\}$ such that the event $\{\tau = t\}$ belongs to $\mathcal{F}_t$ for every $t$. The decision to stop at time $t$ must depend only on information available at time $t$, not on the future.

Examples:

  • The first time a random walk hits level $a$: $\tau = \inf\{t : X_t = a\}$.
  • The first time a process exceeds a threshold: $\tau = \inf\{t : X_t > c\}$.

"I'll sell when the price next drops" is a valid stopping time. "I'll sell one step before the price peaks" is not, because it requires future knowledge.
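Along a realized path, a hitting time of this kind is computed by scanning values already seen, never looking ahead. A minimal sketch with a hypothetical helper:

```python
def hitting_time(path, a):
    """First index t with path[t] == a (a stopping time); returns None if the
    level is never hit, corresponding to tau = infinity."""
    for t, x in enumerate(path):
        if x == a:
            return t
    return None
```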

Optional stopping theorem

If $\{X_t\}$ is a martingale and $\tau$ is a stopping time, you might hope that $E[X_\tau] = E[X_0]$. This holds under suitable conditions, but not in general. The standard version:

If $\tau$ is almost surely bounded (i.e., $\tau \leq N$ a.s. for some constant $N$), then $E[X_\tau] = E[X_0]$.

Weaker conditions also work. For instance, if $E[\tau] < \infty$ and the differences are bounded, the conclusion still holds. The theorem is powerful because it tells you that no stopping rule can create an advantage in a fair game. This is why the "doubling" strategy in roulette fails in practice (it requires unbounded wealth and time).
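The bounded-τ version can be checked by simulation: stop a $\pm 1$ walk when it first leaves $(-a, a)$, truncating at a fixed horizon so the stopping time is bounded. A sketch under those assumptions:

```python
import random

def stopped_mean(a, n_max, n_paths, seed=1):
    """Monte Carlo E[X_tau] for tau = min(first exit from (-a, a), n_max).
    tau <= n_max a.s., so optional stopping gives E[X_tau] = E[X_0] = 0."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_paths):
        x = 0
        for _ in range(n_max):
            x += rng.choice((-1, 1))
            if abs(x) >= a:
                break
        total += x
    return total / n_paths
```

Even though stopped paths mostly end at $\pm a$, the gains and losses balance exactly in expectation.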

Martingale convergence theorem

(Doob's Martingale Convergence Theorem): If $\{X_t\}$ is a martingale (or submartingale) satisfying $\sup_t E[|X_t|] < \infty$, then $X_t \to X_\infty$ almost surely, where $X_\infty$ is a finite random variable.

The $L^1$-boundedness condition prevents the process from "escaping to infinity." Note a subtlety: almost sure convergence does not guarantee $E[X_\infty] = E[X_0]$. For that, you need the stronger condition of uniform integrability (see below). The proof relies on counting upcrossings of intervals $[a, b]$ and showing they must be finite.
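A classic example of this subtlety: $M_n = \prod_{k \leq n} 2U_k$ with $U_k$ iid Uniform(0,1) is a nonnegative martingale (each factor has mean 1), so it converges almost surely; since $E[\log(2U_k)] < 0$, the limit is 0, yet $E[M_n] = 1$ for every $n$. A simulation sketch:

```python
import random

def product_martingale(n, seed):
    """M_n = product of 2*U_k for k <= n, U_k ~ Uniform(0,1). E[2*U_k] = 1, so
    M is a nonnegative martingale. It converges a.s. to 0 (it is not uniformly
    integrable), so E[M_infinity] = 0 != 1 = E[M_0]."""
    rng = random.Random(seed)
    m = 1.0
    for _ in range(n):
        m *= 2 * rng.random()
    return m
```

Almost every long path is tiny even though every $E[M_n]$ equals 1: the mass escapes into rarer and rarer huge values.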

Uniformly integrable martingales

A martingale $\{X_t\}$ is uniformly integrable (UI) if

$$\lim_{K \to \infty} \sup_t E[|X_t| \, \mathbf{1}_{\{|X_t| > K\}}] = 0$$

This is stronger than $\sup_t E[|X_t|] < \infty$. UI martingales enjoy the best convergence properties:

  • $X_t \to X_\infty$ almost surely and in $L^1$.
  • $E[X_\infty] = E[X_0]$.
  • The optional stopping theorem holds for all stopping times (not just bounded ones): $E[X_\tau] = E[X_0]$.
  • The process can be "closed": $X_t = E[X_\infty \mid \mathcal{F}_t]$ for all $t$.

A practical sufficient condition: if $X_t = E[Y \mid \mathcal{F}_t]$ for some integrable random variable $Y$, then $\{X_t\}$ is automatically a UI martingale.

Martingale representation theorem

In continuous time, suppose $\{M_t\}$ is a martingale with respect to the natural filtration of a Brownian motion $\{W_t\}$. The martingale representation theorem states there exists a predictable process $\{H_t\}$ such that:

$$M_t = M_0 + \int_0^t H_s \, dW_s$$

Every such martingale can be written as a stochastic integral against Brownian motion. This is a deep structural result: it says Brownian motion is the sole source of randomness in its own filtration. In finance, this theorem guarantees that contingent claims can be replicated by a trading strategy, which is the mathematical foundation of derivative pricing.


Azuma-Hoeffding inequality

This concentration inequality bounds how far a martingale can stray from its starting point. If $\{X_t\}$ is a martingale with bounded differences $|X_k - X_{k-1}| \leq c_k$ almost surely, then for any $\epsilon > 0$:

$$P(|X_n - X_0| \geq \epsilon) \leq 2 \exp\!\left(-\frac{\epsilon^2}{2 \sum_{k=1}^n c_k^2}\right)$$

The tail probability decays exponentially in $\epsilon^2$. This is extremely useful in combinatorics, computer science, and statistics for showing that functions of many independent (or weakly dependent) random variables concentrate around their mean.
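For the $\pm 1$ random walk (so $c_k = 1$), the bound is easy to compare against simulated tail frequencies. A sketch with illustrative function names:

```python
import math
import random

def azuma_bound(n, eps):
    """Azuma-Hoeffding bound on P(|X_n - X_0| >= eps) when every c_k = 1."""
    return 2.0 * math.exp(-eps ** 2 / (2.0 * n))

def empirical_tail(n, eps, n_paths, seed=2):
    """Empirical frequency of |X_n| >= eps for the +/-1 random walk."""
    rng = random.Random(seed)
    count = 0
    for _ in range(n_paths):
        x = sum(rng.choice((-1, 1)) for _ in range(n))
        if abs(x) >= eps:
            count += 1
    return count / n_paths
```

With $n = 100$ and $\epsilon = 30$ (three standard deviations), the bound is about 0.022 while the simulated frequency is noticeably smaller; the inequality is valid but not tight.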

Martingale transformations

A martingale transformation builds a new process from an existing martingale by reweighting its increments. Think of it as a betting strategy: at each step, you choose how much to wager (based only on past information), and the resulting cumulative gain or loss forms a new martingale.

Martingale transform definition

Given a martingale $\{X_t\}$ and a predictable process $\{H_t\}$ (meaning $H_t$ is $\mathcal{F}_{t-1}$-measurable), the martingale transform is:

$$Y_t = \sum_{k=1}^t H_k (X_k - X_{k-1})$$

Because $H_k$ depends only on information up to time $k-1$ and $E[X_k - X_{k-1} \mid \mathcal{F}_{k-1}] = 0$, you can verify that $\{Y_t\}$ is again a martingale. The predictability of $H_t$ is essential: you must decide your bet before seeing the outcome. If $H_t$ could depend on $\mathcal{F}_t$, the transform would not generally preserve the martingale property.
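A simulation sketch of the transform, using an arbitrary predictable betting rule (the strategy and names are illustrative, not from the text):

```python
import random

def transform_mean(n, n_paths, seed=3):
    """Monte Carlo E[Y_n] for the transform Y_t = sum H_k (X_k - X_{k-1}),
    where X is the +/-1 random walk and H_k = 1 if X_{k-1} < 0 else 2.
    H is predictable, so Y is again a martingale and E[Y_n] = 0."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        x, y = 0, 0.0
        for _ in range(n):
            h = 1 if x < 0 else 2   # bet chosen before the next step is seen
            step = rng.choice((-1, 1))
            x += step
            y += h * step
        total += y
    return total / n_paths
```

Changing `h` to peek at `step` before choosing would break predictability and, in general, the martingale property.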

Discrete vs continuous time

| Feature | Discrete time | Continuous time |
| --- | --- | --- |
| Transform | $Y_t = \sum_{k=1}^t H_k (X_k - X_{k-1})$ | $Y_t = \int_0^t H_s \, dX_s$ |
| Predictability | $H_k$ is $\mathcal{F}_{k-1}$-measurable | $H_s$ is predictable (left-continuous, adapted) |
| Integrability condition | Mild (e.g., bounded $H_k$) | $E\!\left[\int_0^t H_s^2 \, d[X]_s\right] < \infty$ |

The continuous-time version requires the machinery of stochastic integration (Itô integrals). The discrete case is a finite sum and much more elementary, but the conceptual idea is the same.

Quadratic variation process

The quadratic variation of a martingale tracks accumulated squared increments. In discrete time:

$$[X]_t = \sum_{k=1}^t (X_k - X_{k-1})^2$$

In continuous time, it's the limit of such sums over increasingly fine partitions:

$$[X]_t = \lim_{|\Pi| \to 0} \sum_{i} (X_{t_i} - X_{t_{i-1}})^2$$

For standard Brownian motion, $[W]_t = t$, which is a key fact in stochastic calculus. Quadratic variation appears throughout the theory: in the Itô isometry $E\!\left[\left(\int_0^t H_s \, dW_s\right)^2\right] = E\!\left[\int_0^t H_s^2 \, ds\right]$, in Itô's formula, and in characterizing the volatility of martingales.
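The fact that $[W]_t = t$ can be checked numerically by summing squared increments of a simulated Brownian path. An illustrative sketch, where each increment over a step of size $dt$ is drawn as $N(0, dt)$:

```python
import math
import random

def quadratic_variation(T, n_steps, seed=4):
    """Sum of squared increments of a simulated Brownian path on [0, T].
    As the partition is refined, this converges to [W]_T = T."""
    rng = random.Random(seed)
    dt = T / n_steps
    qv = 0.0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))   # Brownian increment ~ N(0, dt)
        qv += dw * dw
    return qv
```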

Applications of martingales

Gambling vs investing

The martingale concept originated from gambling. In a fair casino game, your expected fortune after any number of rounds equals your starting fortune. No betting strategy (martingale transform) can turn a fair game into a favorable one. The famous "doubling strategy" (double your bet after each loss) does produce a profit with high probability, but the rare catastrophic loss exactly offsets the frequent small wins, keeping the expected value unchanged.

In finance, the efficient market hypothesis (in its weak form) asserts that asset prices follow a martingale under the real-world measure: past prices contain no exploitable information. Whether real markets satisfy this is debated, but the martingale framework provides the mathematical baseline for testing it.

Pricing of financial derivatives

The fundamental theorem of asset pricing connects arbitrage-free markets to martingales:

A market is free of arbitrage if and only if there exists an equivalent probability measure $\mathbb{Q}$ (the risk-neutral measure) under which discounted asset prices are martingales.

Under $\mathbb{Q}$, the price of a derivative with payoff $\Phi$ at maturity $T$ is:

$$V_0 = E^{\mathbb{Q}}\!\left[e^{-rT} \Phi(S_T)\right]$$

This risk-neutral valuation formula reduces derivative pricing to computing a conditional expectation, which is why martingale theory is so central to quantitative finance.
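For a European call under geometric Brownian motion, this expectation can be evaluated both in closed form (Black-Scholes) and by Monte Carlo under $\mathbb{Q}$, and the two should agree. A self-contained sketch with illustrative parameter choices:

```python
import math
import random

def bs_call(s0, k, r, sigma, t):
    """Black-Scholes price of a European call."""
    d1 = (math.log(s0 / k) + (r + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    cdf = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return s0 * cdf(d1) - k * math.exp(-r * t) * cdf(d2)

def mc_call(s0, k, r, sigma, t, n_paths, seed=5):
    """Risk-neutral Monte Carlo: V_0 = E_Q[e^{-rT} max(S_T - K, 0)], sampling
    S_T from GBM under Q (drift r)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        z = rng.gauss(0.0, 1.0)
        s_t = s0 * math.exp((r - 0.5 * sigma ** 2) * t + sigma * math.sqrt(t) * z)
        total += max(s_t - k, 0.0)
    return math.exp(-r * t) * total / n_paths
```

With enough paths, the Monte Carlo estimate converges to the closed-form price, which is the conditional-expectation formula at work.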

Modeling of stock prices

Geometric Brownian motion is the standard model for stock prices:

$$dS_t = \mu S_t \, dt + \sigma S_t \, dW_t$$

The stock price $S_t$ itself is not a martingale (because of the drift $\mu$), but the discounted price $e^{-rt}S_t$ becomes a martingale under the risk-neutral measure. The Black-Scholes option pricing formula is derived by applying the martingale representation theorem and risk-neutral valuation within this model.

Brownian motion connection

Brownian motion $\{W_t\}$ is the prototypical continuous-time martingale. Several related processes are also martingales:

  • $W_t$ itself (with respect to its natural filtration)
  • $W_t^2 - t$ (connects Brownian motion to quadratic variation)
  • $\exp(\theta W_t - \frac{1}{2}\theta^2 t)$ (the exponential martingale, used heavily in large deviations and change-of-measure arguments)
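The third identity, $E[\exp(\theta W_t - \frac{1}{2}\theta^2 t)] = 1$ for all $t$, is easy to confirm by simulation (sketch with illustrative names):

```python
import math
import random

def exp_martingale_mean(theta, t, n_paths, seed=6):
    """Sample mean of exp(theta*W_t - theta^2*t/2), where W_t ~ N(0, t).
    The exact expectation is 1 for every theta and t."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        w = rng.gauss(0.0, math.sqrt(t))
        total += math.exp(theta * w - 0.5 * theta ** 2 * t)
    return total / n_paths
```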

The martingale representation theorem guarantees that every martingale adapted to the Brownian filtration is a stochastic integral against $W_t$. This makes Brownian motion the universal building block for continuous-path martingales and is the reason Itô calculus is so tightly linked to martingale theory.