🔀Stochastic Processes Unit 11 Review

11.2 Stochastic differential equations

Written by the Fiveable Content Team • Last updated August 2025

Definition of stochastic differential equations

Stochastic differential equations (SDEs) extend ordinary differential equations to systems driven by randomness. They combine a deterministic component describing average behavior with a stochastic component capturing random fluctuations, typically driven by a Wiener process (Brownian motion).

The general form of an SDE is:

$$dX_t = \mu(X_t, t)\,dt + \sigma(X_t, t)\,dW_t$$

  • $\mu(X_t, t)$ is the drift coefficient, governing the deterministic trend of the process
  • $\sigma(X_t, t)$ is the diffusion coefficient, controlling the intensity of random fluctuations
  • $W_t$ is a Wiener process, a continuous-time Gaussian process with $W_0 = 0$, continuous paths, and independent increments $W_t - W_s \sim \mathcal{N}(0, t - s)$ for $s < t$

The drift term pulls the process in a predictable direction, while the diffusion term adds noise scaled by $\sigma$. Understanding how these two terms interact is central to everything that follows.

Solutions of stochastic differential equations

Existence of solutions

Existence theorems specify when an SDE actually has a solution. The standard sufficient conditions are:

  • Lipschitz continuity of the coefficients: there exists a constant $K$ such that $|\mu(x,t) - \mu(y,t)| + |\sigma(x,t) - \sigma(y,t)| \leq K|x - y|$ for all $x, y$ and all $t$
  • Linear growth condition: $|\mu(x,t)| + |\sigma(x,t)| \leq K(1 + |x|)$ for all $x$ and $t$

The Lipschitz condition keeps the coefficients from varying too sharply in $x$, and the linear growth condition keeps the solution from exploding in finite time. Existence results can be global (solution defined for all $t \geq 0$, as under the two conditions above) or local (defined only up to some stopping time, as is typical when the coefficients are only locally Lipschitz).

Uniqueness of solutions

Under the same Lipschitz condition, uniqueness holds in two senses:

  • Pathwise (strong) uniqueness: any two solutions built on the same Wiener process agree almost surely for all $t$
  • Uniqueness in distribution (weak uniqueness): any two solutions share the same probability law, even if constructed on different probability spaces

Strong uniqueness implies weak uniqueness, but not the other way around.

Explicit solutions vs numerical methods

Closed-form solutions exist only in special cases. The most important example is the linear SDE with constant coefficients, which leads to geometric Brownian motion. For the vast majority of SDEs, you'll need numerical approximation. The general workflow is:

  1. Discretize the time interval $[0, T]$ into steps of size $\Delta t$
  2. Simulate increments $\Delta W_n \sim \mathcal{N}(0, \Delta t)$ of the Wiener process
  3. Update the approximation step by step using a scheme like Euler-Maruyama or Milstein

Itô integral

Definition of Itô integral

The Itô integral extends classical integration to allow integration with respect to a Wiener process. For an adapted process $f(t, \omega)$, it's defined as:

$$\int_0^T f(t)\,dW_t = \lim_{n \to \infty} \sum_{i=0}^{n-1} f(t_i)\,(W_{t_{i+1}} - W_{t_i})$$

where $0 = t_0 < t_1 < \cdots < t_n = T$ is a partition whose mesh shrinks to zero, and the limit is taken in mean square.

The crucial feature: the integrand is evaluated at the left endpoint $t_i$ of each subinterval. This left-endpoint choice is what makes the Itô integral non-anticipating (it only uses information available at the current time).

For square-integrable integrands, meaning $E\left[\int_0^T f(t)^2\,dt\right] < \infty$, the Itô integral is a martingale with zero mean: $E\left[\int_0^T f(t)\,dW_t\right] = 0$.
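
As a rough numerical illustration of the left-endpoint definition, the sketch below approximates $\int_0^T W_t\,dW_t$ on one simulated path and compares it with the known closed form $\frac{1}{2}(W_T^2 - T)$. The helper name `ito_sum`, the seed, and the grid size are assumptions chosen for this example, not part of the original guide.

```python
import numpy as np

def ito_sum(f_vals, w):
    """Left-endpoint sum: sum_i f(t_i) * (W_{t_{i+1}} - W_{t_i})."""
    return np.sum(f_vals[:-1] * np.diff(w))

rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily
T, n = 1.0, 100_000
dt = T / n

# Build one Wiener path on the grid: W_0 = 0, increments ~ N(0, dt).
dW = rng.normal(0.0, np.sqrt(dt), size=n)
W = np.concatenate(([0.0], np.cumsum(dW)))

approx = ito_sum(W, W)           # integrand f(t_i) = W_{t_i}
exact = 0.5 * (W[-1] ** 2 - T)   # closed form of int_0^T W dW on this path
print(approx, exact)
```

The two numbers should agree closely for a fine grid; the gap between them is half the difference between the sampled sum of squared increments and $T$.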

Properties of Itô integral

  • Linearity: $\int_0^T [\alpha f + \beta g]\,dW_t = \alpha \int_0^T f\,dW_t + \beta \int_0^T g\,dW_t$
  • Adaptedness: the integrand must be adapted to the filtration generated by $W_t$ (no peeking into the future)
  • Continuity: the integral is a continuous function of the upper limit $T$
  • Quadratic variation: the quadratic variation of $\int_0^t f\,dW_s$ equals $\int_0^t f(s)^2\,ds$, which is generally nonzero

Note: The Itô integral itself has nonzero quadratic variation. This is a key distinction from ordinary integrals and is precisely what gives rise to the extra term in Itô's lemma.

Itô isometry

The Itô isometry connects the second moment of a stochastic integral to a deterministic integral:

$$E\left[\left(\int_0^T f(t)\,dW_t\right)^2\right] = E\left[\int_0^T f(t)^2\,dt\right]$$

This identity is indispensable for computing variances and proving convergence results. It essentially lets you move between "stochastic world" and "deterministic world" when working with second moments.
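
A short worked example may help here. Taking $f(t) = W_t$ (a choice made only for illustration) and using $E[W_t^2] = t$, the isometry gives the variance of $\int_0^T W_t\,dW_t$ without simulating anything:

```latex
\begin{align*}
E\left[\left(\int_0^T W_t\,dW_t\right)^2\right]
  &= E\left[\int_0^T W_t^2\,dt\right] \\
  &= \int_0^T E\left[W_t^2\right]dt \\
  &= \int_0^T t\,dt = \frac{T^2}{2}.
\end{align*}
```

Since the integral has zero mean, $T^2/2$ is also its variance.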

Itô's lemma

Statement of Itô's lemma

Itô's lemma is the stochastic chain rule. If $X_t$ satisfies $dX_t = \mu\,dt + \sigma\,dW_t$ and $f(t, x)$ is continuously differentiable in $t$ and twice continuously differentiable in $x$, then:

$$df(t, X_t) = \left(\frac{\partial f}{\partial t} + \mu \frac{\partial f}{\partial x} + \frac{1}{2}\sigma^2 \frac{\partial^2 f}{\partial x^2}\right)dt + \sigma \frac{\partial f}{\partial x}\,dW_t$$

The term $\frac{1}{2}\sigma^2 \frac{\partial^2 f}{\partial x^2}$ has no analogue in ordinary calculus. It arises because the Wiener process has nonzero quadratic variation: $dW_t \cdot dW_t = dt$. This is the single most important thing to remember about Itô's lemma.
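
As a minimal worked example (the choice $f(x) = x^2$ and $X_t = W_t$, so $\mu = 0$ and $\sigma = 1$, is an assumption made for illustration), the formula gives:

```latex
\begin{align*}
\frac{\partial f}{\partial t} &= 0, \qquad
\frac{\partial f}{\partial x} = 2x, \qquad
\frac{\partial^2 f}{\partial x^2} = 2, \\
d\left(W_t^2\right) &= \left(0 + 0 \cdot 2W_t + \tfrac{1}{2} \cdot 1 \cdot 2\right)dt + 2W_t\,dW_t
  = dt + 2W_t\,dW_t, \\
\int_0^T W_t\,dW_t &= \tfrac{1}{2}\left(W_T^2 - T\right).
\end{align*}
```

The ordinary chain rule would give $\frac{1}{2}W_T^2$; the extra $-\frac{T}{2}$ comes entirely from the second-derivative term.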

Applications of Itô's lemma

  • Deriving SDE dynamics for transformed processes: if you know $dX_t$ and want $d(\ln X_t)$ or $d(X_t^2)$, Itô's lemma gives the answer directly
  • Financial mathematics: the Black-Scholes PDE is derived by applying Itô's lemma to a portfolio of options and stock
  • Computing moments and distributions of stochastic processes
  • Stochastic optimal control: Itô's lemma underpins the Hamilton-Jacobi-Bellman equation

Generalized Itô's lemma

For a function $f(t, X_t^1, \ldots, X_t^n)$ of multiple Itô processes, the generalized formula includes:

  • Partial derivatives with respect to each process and time
  • Cross-variation terms $dX_t^i \cdot dX_t^j$, which account for correlations between driving Wiener processes

This is essential for multi-dimensional SDEs and models with multiple sources of uncertainty (e.g., multi-asset option pricing).

Stochastic exponential

Definition of stochastic exponential

The stochastic exponential (Doléans-Dade exponential) of a semimartingale $M_t$ is the unique solution to:

$$d\mathcal{E}(M)_t = \mathcal{E}(M)_t\,dM_t, \quad \mathcal{E}(M)_0 = 1$$

For a continuous process $M_t$, the explicit form is:

$$\mathcal{E}(M)_t = \exp\left(M_t - \frac{1}{2}\langle M \rangle_t\right)$$

where $\langle M \rangle_t$ is the quadratic variation of $M$. The subtraction of $\frac{1}{2}\langle M \rangle_t$ is a direct consequence of Itô's lemma applied to the exponential function.

Properties of stochastic exponential

  • Strictly positive: for continuous $M$, $\mathcal{E}(M)_t > 0$ for all $t$, since it's an exponential
  • Multiplicative: $\mathcal{E}(M) \cdot \mathcal{E}(N) = \mathcal{E}(M + N + \langle M, N \rangle)$
  • Inverse: $1/\mathcal{E}(M)$ is itself a stochastic exponential; for continuous $M$ it equals $\mathcal{E}(-M + \langle M \rangle)$
  • Local martingale: if $M$ is a continuous local martingale, then $\mathcal{E}(M)_t$ is a local martingale; it's a true martingale on $[0, T]$ if Novikov's condition $E\left[\exp\left(\frac{1}{2}\langle M \rangle_T\right)\right] < \infty$ holds
  • Reduces to the ordinary exponential $e^{at}$ when the stochastic component vanishes (for example, $M_t = at$)

Relationship to geometric Brownian motion

Geometric Brownian motion (GBM) is the most common SDE in finance:

$$dS_t = \mu S_t\,dt + \sigma S_t\,dW_t$$

Its solution is $S_0$ times the stochastic exponential of $\mu t + \sigma W_t$:

$$S_t = S_0 \exp\left(\left(\mu - \frac{\sigma^2}{2}\right)t + \sigma W_t\right)$$

The $-\frac{\sigma^2}{2}$ correction ensures $E[S_t] = S_0 e^{\mu t}$, which you can verify using Itô's lemma. GBM stays strictly positive, making it suitable for asset price modeling. The Black-Scholes option pricing model is built on this foundation.
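
The mean formula can be checked by sampling the closed-form solution directly. The sketch below is only an illustration: the function name `gbm_exact` and the parameter values ($S_0 = 100$, $\mu = 0.05$, $\sigma = 0.2$, $t = 1$) are assumptions, not values from the guide.

```python
import numpy as np

def gbm_exact(s0, mu, sigma, t, w_t):
    """Closed-form GBM value at time t, given the Wiener value W_t."""
    return s0 * np.exp((mu - 0.5 * sigma**2) * t + sigma * w_t)

rng = np.random.default_rng(1)  # fixed seed, chosen arbitrarily
s0, mu, sigma, t = 100.0, 0.05, 0.2, 1.0  # illustrative parameters
n_paths = 200_000

# W_t ~ N(0, t), so only the terminal Wiener value is needed.
w_t = rng.normal(0.0, np.sqrt(t), size=n_paths)
s_t = gbm_exact(s0, mu, sigma, t, w_t)

sample_mean = s_t.mean()
theory_mean = s0 * np.exp(mu * t)  # E[S_t] = S_0 * exp(mu * t)
print(sample_mean, theory_mean)
```

The sample mean should land near $S_0 e^{\mu t}$, and every simulated value is positive because the solution is an exponential.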

Stratonovich integral

Definition of Stratonovich integral

The Stratonovich integral uses a symmetric, midpoint-type rule instead of the left-endpoint rule. It is written here as the average of the integrand at the two endpoints of each subinterval:

$$\int_0^T f(t) \circ dW_t = \lim_{n \to \infty} \sum_{i=0}^{n-1} \frac{f(t_i) + f(t_{i+1})}{2}(W_{t_{i+1}} - W_{t_i})$$

The circle notation $\circ\,dW_t$ distinguishes it from the Itô integral. Because of the symmetric evaluation, the Stratonovich integral obeys the ordinary chain rule without the extra second-derivative correction term.

Comparison to Itô integral

| Feature | Itô | Stratonovich |
|---|---|---|
| Evaluation point | Left endpoint | Midpoint |
| Chain rule | Modified (extra $\frac{1}{2}\sigma^2 f''$ term) | Ordinary chain rule |
| Martingale property | Yes (integral is a martingale) | Not in general |
| Non-anticipating | Yes | No (uses future values in construction) |

The Stratonovich integral is not non-anticipating, which makes it less natural for filtering and prediction problems. However, it's often preferred in physics because physical systems modeled as limits of smooth noise naturally yield Stratonovich SDEs.
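
To see the two conventions disagree on the same path, the sketch below computes both discrete sums for the integrand $W_t$. The seed and grid size are arbitrary choices for this example; for this integrand the Stratonovich sum telescopes to $\frac{1}{2}W_T^2$ and the gap between the two sums is close to $T/2$.

```python
import numpy as np

rng = np.random.default_rng(2)  # fixed seed, chosen arbitrarily
T, n = 1.0, 100_000

# One Wiener path on a uniform grid.
dW = rng.normal(0.0, np.sqrt(T / n), size=n)
W = np.concatenate(([0.0], np.cumsum(dW)))

# Ito: integrand evaluated at the left endpoint of each subinterval.
ito = np.sum(W[:-1] * dW)

# Stratonovich: average of the integrand at the two endpoints.
strat = np.sum(0.5 * (W[:-1] + W[1:]) * dW)

print("Ito:          ", ito)
print("Stratonovich: ", strat)
print("difference:   ", strat - ito)  # close to T / 2
```

The difference equals half the sum of squared increments, which is the discrete version of the quadratic variation of $W$.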

Stratonovich calculus

To convert between the two frameworks, use the Itô-Stratonovich correction. If $X_t$ has diffusion coefficient $\sigma(X_t)$, meaning $dX_t = \mu(X_t)\,dt + \sigma(X_t)\,dW_t$, and $f$ is differentiable, then:

$$\int_0^T f(X_t) \circ dW_t = \int_0^T f(X_t)\,dW_t + \frac{1}{2}\int_0^T f'(X_t)\sigma(X_t)\,dt$$

This means any Stratonovich SDE can be rewritten as an Itô SDE (and vice versa) by adding or subtracting the correction term. The choice between the two depends on your application: Itô is standard in finance and probability theory; Stratonovich is common in physics and engineering.
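
A small worked conversion, using a driftless Stratonovich SDE with linear diffusion $\sigma(x) = cx$ (the constant $c$ and this particular equation are assumptions chosen for illustration). Applying the correction with $f = \sigma$, so that $f'(x)\sigma(x) = c^2 x$:

```latex
\begin{align*}
\text{Stratonovich form:}\quad dX_t &= c\,X_t \circ dW_t, \\
\text{Ito form:}\quad dX_t &= \tfrac{1}{2}c^2 X_t\,dt + c\,X_t\,dW_t, \\
\text{Solution:}\quad X_t &= X_0 \exp\left(c\,W_t\right).
\end{align*}
```

The solution follows from the ordinary chain rule in the Stratonovich form. It also matches the geometric Brownian motion formula with $\mu = \frac{1}{2}c^2$ and $\sigma = c$, where the drift and the $-\frac{\sigma^2}{2}$ correction cancel.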

Linear stochastic differential equations

Homogeneous linear equations

A homogeneous linear SDE has the form:

$$dX_t = a(t)X_t\,dt + b(t)X_t\,dW_t$$

The solution is given by the stochastic exponential:

$$X_t = X_0 \exp\left(\int_0^t \left(a(s) - \frac{1}{2}b(s)^2\right)ds + \int_0^t b(s)\,dW_s\right)$$

This generalizes the deterministic exponential solution $x(t) = x_0 e^{\int a\,ds}$ to the stochastic setting. These equations model phenomena like population dynamics with random growth rates.

Inhomogeneous linear equations

Inhomogeneous linear SDEs add an external forcing term:

$$dX_t = [a(t)X_t + f(t)]\,dt + [b(t)X_t + g(t)]\,dW_t$$

The solution combines a homogeneous solution with a particular solution found via variation of parameters.

Variation of parameters formula

This is the stochastic analogue of the classical ODE technique:

  1. Solve the corresponding homogeneous equation with $\Phi_0 = 1$ to get the fundamental solution $\Phi_t$
  2. Write the solution as $X_t = \Phi_t\left(X_0 + \int_0^t \Phi_s^{-1}[f(s) - b(s)g(s)]\,ds + \int_0^t \Phi_s^{-1}g(s)\,dW_s\right)$, where the $-b(s)g(s)$ term is an Itô correction that vanishes when $b = 0$ or $g = 0$
  3. The $\Phi_t X_0$ part is the homogeneous solution (from initial conditions), and the remaining integral terms form the particular solution

The formula reduces the inhomogeneous problem to computing stochastic integrals, which can then be handled analytically or numerically.
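
As a worked instance, consider the Ornstein-Uhlenbeck equation with constant coefficients $a = -\theta$, $b = 0$, $f = 0$, $g = \sigma$ (this example and the symbols $\theta$, $\sigma$ are chosen for illustration). Because $b = 0$, the fundamental solution is deterministic:

```latex
\begin{align*}
dX_t &= -\theta X_t\,dt + \sigma\,dW_t, \qquad \Phi_t = e^{-\theta t}, \\
X_t &= e^{-\theta t}X_0 + \sigma \int_0^t e^{-\theta (t - s)}\,dW_s, \\
E[X_t] &= e^{-\theta t}X_0, \qquad
\operatorname{Var}(X_t) = \sigma^2 \int_0^t e^{-2\theta (t - s)}\,ds
  = \frac{\sigma^2}{2\theta}\left(1 - e^{-2\theta t}\right).
\end{align*}
```

The mean uses the zero-mean property of the Itô integral, and the variance (for deterministic $X_0$) uses the Itô isometry.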

Numerical methods for SDEs

Euler-Maruyama method

The simplest and most widely used scheme. Given $dX_t = \mu(X_t)\,dt + \sigma(X_t)\,dW_t$:

$$X_{n+1} = X_n + \mu(X_n)\Delta t + \sigma(X_n)\Delta W_n$$

where $\Delta W_n \sim \mathcal{N}(0, \Delta t)$.

  • Strong convergence order: 0.5 (error in individual paths scales as $(\Delta t)^{0.5}$)
  • Weak convergence order: 1.0 (error in expectations scales as $\Delta t$)

Strong convergence matters when you care about pathwise accuracy. Weak convergence matters when you only need accurate expectations (e.g., option pricing).
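
A minimal sketch of the scheme for a scalar, time-homogeneous SDE follows. The function name `euler_maruyama`, its signature, and the GBM test coefficients ($a = 0.05$, $b = 0.2$) are assumptions made for this example. GBM is used because its exact solution is available for comparison on the same Wiener path.

```python
import numpy as np

def euler_maruyama(mu, sigma, x0, T, n_steps, rng):
    """Simulate one path of dX = mu(X) dt + sigma(X) dW on [0, T]."""
    dt = T / n_steps
    x = np.empty(n_steps + 1)
    x[0] = x0
    # Wiener increments: independent N(0, dt) draws.
    dW = rng.normal(0.0, np.sqrt(dt), size=n_steps)
    for n in range(n_steps):
        x[n + 1] = x[n] + mu(x[n]) * dt + sigma(x[n]) * dW[n]
    return x, dW

# Illustrative test case: GBM with drift a*x and diffusion b*x.
a, b = 0.05, 0.2
rng = np.random.default_rng(3)  # fixed seed, chosen arbitrarily
path, dW = euler_maruyama(lambda x: a * x, lambda x: b * x,
                          x0=1.0, T=1.0, n_steps=1000, rng=rng)

# Exact GBM endpoint driven by the same Wiener path (W_T = sum of increments).
exact_end = 1.0 * np.exp((a - 0.5 * b**2) * 1.0 + b * dW.sum())
print(path[-1], exact_end)
```

Returning the increments alongside the path makes it possible to compare against an exact solution on the same noise, which is what a strong-error check requires.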

Milstein method

The Milstein method adds a correction term to Euler-Maruyama:

$$X_{n+1} = X_n + \mu(X_n)\Delta t + \sigma(X_n)\Delta W_n + \frac{1}{2}\sigma(X_n)\sigma'(X_n)\left[(\Delta W_n)^2 - \Delta t\right]$$

This achieves strong convergence order 1.0, a significant improvement. The extra term involves $\sigma'$, the derivative of the diffusion coefficient. In multiple dimensions, implementing Milstein requires simulating iterated stochastic integrals (Lévy areas), which adds complexity.
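
The sketch below runs Euler-Maruyama and Milstein side by side on GBM and reports the mean absolute endpoint error against the exact solution on the same noise. The function name `compare_endpoint_errors` and all parameter values are assumptions for illustration. For GBM, $\sigma(x) = bx$, so $\sigma'(x) = b$.

```python
import numpy as np

def compare_endpoint_errors(a, b, x0, T, n_steps, n_paths, rng):
    """Mean absolute endpoint error of Euler-Maruyama and Milstein for GBM."""
    dt = T / n_steps
    x_em = np.full(n_paths, x0)
    x_mil = np.full(n_paths, x0)
    w = np.zeros(n_paths)  # running Wiener value W_t for each path
    for _ in range(n_steps):
        dW = rng.normal(0.0, np.sqrt(dt), size=n_paths)
        # Euler-Maruyama update.
        x_em = x_em + a * x_em * dt + b * x_em * dW
        # Milstein update: adds 0.5 * sigma(x) * sigma'(x) * (dW^2 - dt).
        x_mil = (x_mil + a * x_mil * dt + b * x_mil * dW
                 + 0.5 * (b * x_mil) * b * (dW**2 - dt))
        w += dW
    exact = x0 * np.exp((a - 0.5 * b**2) * T + b * w)
    return np.mean(np.abs(x_em - exact)), np.mean(np.abs(x_mil - exact))

rng = np.random.default_rng(4)  # fixed seed, chosen arbitrarily
err_em, err_mil = compare_endpoint_errors(
    a=0.05, b=0.5, x0=1.0, T=1.0, n_steps=200, n_paths=5000, rng=rng)
print("Euler-Maruyama error:", err_em)
print("Milstein error:      ", err_mil)
```

With these settings the Milstein error should come out noticeably smaller, in line with its higher strong order. Repeating the run with several step sizes would be needed to estimate the orders themselves.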

Runge-Kutta methods for SDEs

Stochastic Runge-Kutta methods adapt the classical ODE approach by using multiple evaluations of $\mu$ and $\sigma$ within each time step. Higher-order methods (e.g., order 1.5) provide better accuracy but at greater computational cost. They're most useful when high precision is needed and the SDE has smooth coefficients.

Applications of stochastic differential equations

Financial mathematics

  • The Black-Scholes model uses GBM ($dS = \mu S\,dt + \sigma S\,dW$) to price European options
  • Stochastic volatility models like the Heston model couple an SDE for the asset price with a separate SDE for the variance: $dv_t = \kappa(\theta - v_t)\,dt + \xi\sqrt{v_t}\,dW_t^v$
  • Interest rate models (Vasicek, Cox-Ingersoll-Ross) use mean-reverting SDEs to capture the dynamics of short rates

Physics and engineering

  • The Langevin equation $m\,dv = -\gamma v\,dt + \sigma\,dW_t$ describes Brownian particle motion in a fluid, balancing friction against thermal noise
  • In control engineering, SDEs model systems with stochastic disturbances and form the basis for robust controller design
  • Stochastic PDEs (extensions of SDEs to spatial domains) appear in turbulence modeling and quantum field theory

Biology and ecology

  • Stochastic Lotka-Volterra equations add noise to predator-prey dynamics, capturing environmental variability and demographic stochasticity
  • Epidemic models (stochastic SIR) use SDEs to account for randomness in disease transmission
  • In neuroscience, SDEs model the stochastic firing patterns of neurons, where membrane potential fluctuates due to random synaptic inputs