
🔀Stochastic Processes Unit 10 Review


10.1 Definition and properties of martingales


Written by the Fiveable Content Team • Last updated August 2025

Definition of martingales

A martingale is a stochastic process that models a "fair game": your best prediction of any future value, given everything you know so far, is simply the current value. This concept sits at the heart of modern probability theory and underpins major results in statistics, mathematical finance, and stochastic analysis.

Three conditions must hold for a process $\{X_t\}$ to be a martingale with respect to a filtration $\{\mathcal{F}_t\}$:

  1. Adapted: $X_t$ is $\mathcal{F}_t$-measurable for every $t$.
  2. Integrable: $E[|X_t|] < \infty$ for every $t$.
  3. Martingale property: $E[X_{t+1} \mid \mathcal{F}_t] = X_t$ for all $t$.

If condition 3 is replaced with $E[X_{t+1} \mid \mathcal{F}_t] \leq X_t$, the process is a supermartingale (unfavorable game). If $\geq$, it's a submartingale (favorable game).
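As a concrete sanity check, the simple symmetric random walk (start at 0, add $\pm 1$ with equal probability) satisfies all three conditions. A minimal simulation sketch, with illustrative function names not taken from any particular library:

```python
import random

def random_walk(n_steps, seed=None):
    """Simple symmetric random walk X_0 = 0, X_t = X_{t-1} +/- 1: a martingale."""
    rng = random.Random(seed)
    x, path = 0, [0]
    for _ in range(n_steps):
        x += rng.choice((-1, 1))
        path.append(x)
    return path

def mean_endpoint(n_steps, n_paths, seed=0):
    """Monte Carlo estimate of E[X_n]; the fair-game property forces it to 0."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_paths):
        total += sum(rng.choice((-1, 1)) for _ in range(n_steps))
    return total / n_paths
```

Across many simulated paths, the sample mean of $X_n$ stays near $E[X_0] = 0$ at every horizon, which is the fair-game property in action.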

Adapted stochastic processes

A process $\{X_t\}$ is adapted to a filtration $\{\mathcal{F}_t\}$ if $X_t$ is $\mathcal{F}_t$-measurable for each $t$. In plain terms, the value $X_t$ is determined entirely by information available at time $t$. The process cannot "look ahead." This is not just a technicality; without adaptedness, conditioning on $\mathcal{F}_t$ wouldn't make sense, and the martingale property would be ill-defined.

Filtration in martingales

A filtration $\{\mathcal{F}_t\}_{t \geq 0}$ is a nested (increasing) sequence of $\sigma$-algebras:

$$\mathcal{F}_0 \subseteq \mathcal{F}_1 \subseteq \mathcal{F}_2 \subseteq \cdots$$

Each $\mathcal{F}_t$ encodes all information available up to time $t$. As time advances, you accumulate more information, so the $\sigma$-algebras grow. A martingale is always defined with respect to a particular filtration. The most common choice is the natural filtration $\mathcal{F}_t = \sigma(X_0, X_1, \ldots, X_t)$, generated by the process itself, but you can use any filtration to which the process is adapted.

Conditional expectation property

The core of the definition: for a martingale $\{X_t\}$,

$$E[X_{t+1} \mid \mathcal{F}_t] = X_t \quad \text{for all } t$$

This says that no matter how much past data you have, your optimal forecast of the next value is the present value. By iterating this property (the tower property of conditional expectation), you get the more general statement:

$$E[X_s \mid \mathcal{F}_t] = X_t \quad \text{for all } s > t$$

So the "fair game" property extends to any future horizon, not just one step ahead.
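For a process with finitely many equally likely $\pm 1$ increments, the multi-step property can be verified by exact enumeration. A small sketch; the function name is made up for illustration:

```python
from itertools import product

def cond_exp_future(prefix, s):
    """Exact E[X_s | F_t] for a +/-1 random walk, conditioning on the first
    t = len(prefix) increments; enumerates all 2^(s-t) equally likely futures."""
    t = len(prefix)
    x_t = sum(prefix)
    futures = list(product((-1, 1), repeat=s - t))
    total = sum(x_t + sum(fut) for fut in futures)
    return total / len(futures)
```

Whatever prefix of increments you condition on, the exact answer equals the current value $X_t$, no matter how far ahead $s$ lies.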

Properties of martingales

The martingale definition is simple, but it generates a rich collection of structural results. The properties below give you tools for bounding deviations, computing expectations at random times, and establishing long-run convergence.

Martingale differences

Define the martingale difference sequence by $D_k = X_k - X_{k-1}$. The martingale property is equivalent to:

$$E[D_k \mid \mathcal{F}_{k-1}] = 0 \quad \text{for all } k$$

Each increment is, on average, zero given the past. This formulation is often more convenient for proofs. A useful consequence: martingale differences are uncorrelated (though not necessarily independent), so $\text{Var}(X_n) = \text{Var}(X_0) + \sum_{k=1}^n \text{Var}(D_k)$. Variance can only grow or stay constant, never shrink.
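The variance identity can be checked exactly for the $\pm 1$ random walk, where each difference $D_k$ has variance 1. A brute-force enumeration sketch (illustrative code, not from the text):

```python
from itertools import product

def exact_var_Xn(n):
    """Exact Var(X_n) for the +/-1 random walk via enumeration of all 2^n paths.
    Since Var(D_k) = 1 for each k, the identity predicts Var(X_n) = n."""
    vals = [sum(p) for p in product((-1, 1), repeat=n)]
    mean = sum(vals) / len(vals)
    return sum((v - mean) ** 2 for v in vals) / len(vals)
```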

Martingale stopping times

A stopping time $\tau$ with respect to $\{\mathcal{F}_t\}$ is a random variable taking values in $\{0, 1, 2, \ldots\} \cup \{\infty\}$ such that the event $\{\tau = t\}$ belongs to $\mathcal{F}_t$ for every $t$. The decision to stop at time $t$ must depend only on information available at time $t$, not on the future.

Examples:

  • The first time a random walk hits level $a$: $\tau = \inf\{t : X_t = a\}$.
  • The first time a process exceeds a threshold: $\tau = \inf\{t : X_t > c\}$.

"I'll sell when the price next drops" is a valid stopping time. "I'll sell one step before the price peaks" is not, because it requires future knowledge.
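Along a realized path, a hitting time of this kind is computed by scanning values already seen, never looking ahead. A minimal sketch with a hypothetical helper:

```python
def hitting_time(path, a):
    """First index t with path[t] == a (a stopping time); returns None if the
    level is never hit, corresponding to tau = infinity."""
    for t, x in enumerate(path):
        if x == a:
            return t
    return None
```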

Optional stopping theorem

If $\{X_t\}$ is a martingale and $\tau$ is a stopping time, you might hope that $E[X_\tau] = E[X_0]$. This holds under suitable conditions, but not in general. The standard version:

If $\tau$ is almost surely bounded (i.e., $\tau \leq N$ a.s. for some constant $N$), then $E[X_\tau] = E[X_0]$.

Weaker conditions also work. For instance, if $E[\tau] < \infty$ and the differences are bounded, the conclusion still holds. The theorem is powerful because it tells you that no stopping rule can create an advantage in a fair game. This is why the "doubling" strategy in roulette fails in practice (it requires unbounded wealth and time).
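The bounded-τ version can be checked by simulation: stop a $\pm 1$ walk when it first leaves $(-a, a)$, truncating at a fixed horizon so the stopping time is bounded. A sketch under those assumptions:

```python
import random

def stopped_mean(a, n_max, n_paths, seed=1):
    """Monte Carlo E[X_tau] for tau = min(first exit from (-a, a), n_max).
    tau <= n_max a.s., so optional stopping gives E[X_tau] = E[X_0] = 0."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_paths):
        x = 0
        for _ in range(n_max):
            x += rng.choice((-1, 1))
            if abs(x) >= a:
                break
        total += x
    return total / n_paths
```

Even though stopped paths mostly end at $\pm a$, the gains and losses balance exactly in expectation.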

Martingale convergence theorem

(Doob's Martingale Convergence Theorem): If $\{X_t\}$ is a martingale (or submartingale) satisfying $\sup_t E[|X_t|] < \infty$, then $X_t \to X_\infty$ almost surely, where $X_\infty$ is a finite random variable.

The $L^1$-boundedness condition prevents the process from "escaping to infinity." Note a subtlety: almost sure convergence does not guarantee $E[X_\infty] = E[X_0]$. For that, you need the stronger condition of uniform integrability (see below). The proof relies on counting upcrossings of intervals $[a, b]$ and showing they must be finite.
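A classic example of this subtlety: $M_n = \prod_{k \leq n} 2U_k$ with $U_k$ iid Uniform(0,1) is a nonnegative martingale (each factor has mean 1), so it converges almost surely; since $E[\log(2U_k)] < 0$, the limit is 0, yet $E[M_n] = 1$ for every $n$. A simulation sketch:

```python
import random

def product_martingale(n, seed):
    """M_n = product of 2*U_k for k <= n, U_k ~ Uniform(0,1). E[2*U_k] = 1, so
    M is a nonnegative martingale. It converges a.s. to 0 (it is not uniformly
    integrable), so E[M_infinity] = 0 != 1 = E[M_0]."""
    rng = random.Random(seed)
    m = 1.0
    for _ in range(n):
        m *= 2 * rng.random()
    return m
```

Almost every long path is tiny even though every $E[M_n]$ equals 1: the mass escapes into rarer and rarer huge values.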

Uniformly integrable martingales

A martingale $\{X_t\}$ is uniformly integrable (UI) if

$$\lim_{K \to \infty} \sup_t E[|X_t| \, \mathbf{1}_{\{|X_t| > K\}}] = 0$$

This is stronger than $\sup_t E[|X_t|] < \infty$. UI martingales enjoy the best convergence properties:

  • $X_t \to X_\infty$ almost surely and in $L^1$.
  • $E[X_\infty] = E[X_0]$.
  • The optional stopping theorem holds for all stopping times (not just bounded ones): $E[X_\tau] = E[X_0]$.
  • The process can be "closed": $X_t = E[X_\infty \mid \mathcal{F}_t]$ for all $t$.

A practical sufficient condition: if $X_t = E[Y \mid \mathcal{F}_t]$ for some integrable random variable $Y$, then $\{X_t\}$ is automatically a UI martingale.

Martingale representation theorem

In continuous time, suppose $\{M_t\}$ is a martingale with respect to the natural filtration of a Brownian motion $\{W_t\}$. The martingale representation theorem states there exists a predictable process $\{H_t\}$ such that:

$$M_t = M_0 + \int_0^t H_s \, dW_s$$

Every such martingale can be written as a stochastic integral against Brownian motion. This is a deep structural result: it says Brownian motion is the sole source of randomness in its own filtration. In finance, this theorem guarantees that contingent claims can be replicated by a trading strategy, which is the mathematical foundation of derivative pricing.


Azuma-Hoeffding inequality

This concentration inequality bounds how far a martingale can stray from its starting point. If $\{X_t\}$ is a martingale with bounded differences $|X_k - X_{k-1}| \leq c_k$ almost surely, then for any $\epsilon > 0$:

$$P(|X_n - X_0| \geq \epsilon) \leq 2 \exp\!\left(-\frac{\epsilon^2}{2 \sum_{k=1}^n c_k^2}\right)$$

The tail probability decays exponentially in $\epsilon^2$. This is extremely useful in combinatorics, computer science, and statistics for showing that functions of many independent (or weakly dependent) random variables concentrate around their mean.
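For the $\pm 1$ random walk (so $c_k = 1$), the bound is easy to compare against simulated tail frequencies. A sketch with illustrative function names:

```python
import math
import random

def azuma_bound(n, eps):
    """Azuma-Hoeffding bound on P(|X_n - X_0| >= eps) when every c_k = 1."""
    return 2.0 * math.exp(-eps ** 2 / (2.0 * n))

def empirical_tail(n, eps, n_paths, seed=2):
    """Empirical frequency of |X_n| >= eps for the +/-1 random walk."""
    rng = random.Random(seed)
    count = 0
    for _ in range(n_paths):
        x = sum(rng.choice((-1, 1)) for _ in range(n))
        if abs(x) >= eps:
            count += 1
    return count / n_paths
```

With $n = 100$ and $\epsilon = 30$ (three standard deviations), the bound is about 0.022 while the simulated frequency is noticeably smaller; the inequality is valid but not tight.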

Martingale transformations

A martingale transformation builds a new process from an existing martingale by reweighting its increments. Think of it as a betting strategy: at each step, you choose how much to wager (based only on past information), and the resulting cumulative gain or loss forms a new martingale.

Martingale transform definition

Given a martingale $\{X_t\}$ and a predictable process $\{H_t\}$ (meaning $H_t$ is $\mathcal{F}_{t-1}$-measurable), the martingale transform is:

$$Y_t = \sum_{k=1}^t H_k (X_k - X_{k-1})$$

Because $H_k$ depends only on information up to time $k-1$ and $E[X_k - X_{k-1} \mid \mathcal{F}_{k-1}] = 0$, you can verify that $\{Y_t\}$ is again a martingale. The predictability of $H_t$ is essential: you must decide your bet before seeing the outcome. If $H_t$ could depend on $\mathcal{F}_t$, the transform would not generally preserve the martingale property.
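A simulation sketch of the transform, using an arbitrary predictable betting rule (the strategy and names are illustrative, not from the text):

```python
import random

def transform_mean(n, n_paths, seed=3):
    """Monte Carlo E[Y_n] for the transform Y_t = sum H_k (X_k - X_{k-1}),
    where X is the +/-1 random walk and H_k = 1 if X_{k-1} < 0 else 2.
    H is predictable, so Y is again a martingale and E[Y_n] = 0."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        x, y = 0, 0.0
        for _ in range(n):
            h = 1 if x < 0 else 2   # bet chosen before the next step is seen
            step = rng.choice((-1, 1))
            x += step
            y += h * step
        total += y
    return total / n_paths
```

Changing `h` to peek at `step` before choosing would break predictability and, in general, the martingale property.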

Discrete vs continuous time

| Feature | Discrete time | Continuous time |
| --- | --- | --- |
| Transform | $Y_t = \sum_{k=1}^t H_k (X_k - X_{k-1})$ | $Y_t = \int_0^t H_s \, dX_s$ |
| Predictability | $H_k$ is $\mathcal{F}_{k-1}$-measurable | $H_s$ is predictable (left-continuous, adapted) |
| Integrability condition | Mild (e.g., bounded $H_k$) | $E\!\left[\int_0^t H_s^2 \, d[X]_s\right] < \infty$ |

The continuous-time version requires the machinery of stochastic integration (Itô integrals). The discrete case is a finite sum and much more elementary, but the conceptual idea is the same.

Quadratic variation process

The quadratic variation of a martingale tracks accumulated squared increments. In discrete time:

$$[X]_t = \sum_{k=1}^t (X_k - X_{k-1})^2$$

In continuous time, it's the limit of such sums over increasingly fine partitions:

$$[X]_t = \lim_{|\Pi| \to 0} \sum_{i} (X_{t_i} - X_{t_{i-1}})^2$$

For standard Brownian motion, $[W]_t = t$, which is a key fact in stochastic calculus. Quadratic variation appears throughout the theory: in the Itô isometry $E\!\left[\left(\int_0^t H_s \, dW_s\right)^2\right] = E\!\left[\int_0^t H_s^2 \, ds\right]$, in Itô's formula, and in characterizing the volatility of martingales.
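The fact that $[W]_t = t$ can be checked numerically by summing squared increments of a simulated Brownian path. An illustrative sketch, where each increment over a step of size $dt$ is drawn as $N(0, dt)$:

```python
import math
import random

def quadratic_variation(T, n_steps, seed=4):
    """Sum of squared increments of a simulated Brownian path on [0, T].
    As the partition is refined, this converges to [W]_T = T."""
    rng = random.Random(seed)
    dt = T / n_steps
    qv = 0.0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))   # Brownian increment ~ N(0, dt)
        qv += dw * dw
    return qv
```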

Applications of martingales

Gambling vs investing

The martingale concept originated from gambling. In a fair casino game, your expected fortune after any number of rounds equals your starting fortune. No betting strategy (martingale transform) can turn a fair game into a favorable one. The famous "doubling strategy" (double your bet after each loss) does produce a profit with high probability, but the rare catastrophic loss exactly offsets the frequent small wins, keeping the expected value unchanged.

In finance, the efficient market hypothesis (in its weak form) asserts that asset prices follow a martingale under the real-world measure: past prices contain no exploitable information. Whether real markets satisfy this is debated, but the martingale framework provides the mathematical baseline for testing it.

Pricing of financial derivatives

The fundamental theorem of asset pricing connects arbitrage-free markets to martingales:

A market is free of arbitrage if and only if there exists an equivalent probability measure $\mathbb{Q}$ (the risk-neutral measure) under which discounted asset prices are martingales.

Under $\mathbb{Q}$, the price of a derivative with payoff $\Phi$ at maturity $T$ is:

$$V_0 = E^{\mathbb{Q}}\!\left[e^{-rT} \Phi(S_T)\right]$$

This risk-neutral valuation formula reduces derivative pricing to computing a conditional expectation, which is why martingale theory is so central to quantitative finance.
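For a European call under geometric Brownian motion, this expectation can be evaluated both in closed form (Black-Scholes) and by Monte Carlo under $\mathbb{Q}$, and the two should agree. A self-contained sketch with illustrative parameter choices:

```python
import math
import random

def bs_call(s0, k, r, sigma, t):
    """Black-Scholes price of a European call."""
    d1 = (math.log(s0 / k) + (r + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    cdf = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return s0 * cdf(d1) - k * math.exp(-r * t) * cdf(d2)

def mc_call(s0, k, r, sigma, t, n_paths, seed=5):
    """Risk-neutral Monte Carlo: V_0 = E_Q[e^{-rT} max(S_T - K, 0)], sampling
    S_T from GBM under Q (drift r)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        z = rng.gauss(0.0, 1.0)
        s_t = s0 * math.exp((r - 0.5 * sigma ** 2) * t + sigma * math.sqrt(t) * z)
        total += max(s_t - k, 0.0)
    return math.exp(-r * t) * total / n_paths
```

With enough paths, the Monte Carlo estimate converges to the closed-form price, which is the conditional-expectation formula at work.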

Modeling of stock prices

Geometric Brownian motion is the standard model for stock prices:

$$dS_t = \mu S_t \, dt + \sigma S_t \, dW_t$$

The stock price $S_t$ itself is not a martingale (because of the drift $\mu$), but the discounted price $e^{-rt}S_t$ becomes a martingale under the risk-neutral measure. The Black-Scholes option pricing formula is derived by applying the martingale representation theorem and risk-neutral valuation within this model.

Brownian motion connection

Brownian motion $\{W_t\}$ is the prototypical continuous-time martingale. Several related processes are also martingales:

  • $W_t$ itself (with respect to its natural filtration)
  • $W_t^2 - t$ (connects Brownian motion to quadratic variation)
  • $\exp(\theta W_t - \frac{1}{2}\theta^2 t)$ (the exponential martingale, used heavily in large deviations and change-of-measure arguments)
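The third identity, $E[\exp(\theta W_t - \frac{1}{2}\theta^2 t)] = 1$ for all $t$, is easy to confirm by simulation (sketch with illustrative names):

```python
import math
import random

def exp_martingale_mean(theta, t, n_paths, seed=6):
    """Sample mean of exp(theta*W_t - theta^2*t/2), where W_t ~ N(0, t).
    The exact expectation is 1 for every theta and t."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        w = rng.gauss(0.0, math.sqrt(t))
        total += math.exp(theta * w - 0.5 * theta ** 2 * t)
    return total / n_paths
```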

The martingale representation theorem guarantees that every martingale adapted to the Brownian filtration is a stochastic integral against $W_t$. This makes Brownian motion the universal building block for continuous-path martingales and is the reason Itô calculus is so tightly linked to martingale theory.