Fiveable

🔀 Stochastic Processes Unit 2 Review


2.5 Transformations of random variables

Written by the Fiveable Content Team • Last updated August 2025

Transformations of random variables

Transforming a random variable means applying a function to it, producing a new random variable with a different distribution. This is one of the core skills in stochastic processes: if you know the distribution of X and you define Y = g(X), you need to figure out the distribution of Y. The main techniques for doing this are the CDF method, the MGF method, and, for sums, convolution.

Functions of Random Variables

Discrete vs. continuous functions

The approach you take depends on whether you're working with discrete or continuous random variables.

  • Discrete case: If X takes values in a countable set, then Y = g(X) is also discrete. You find its PMF by collecting all input values that map to the same output and summing their probabilities.
  • Continuous case: If X has a PDF, then Y = g(X) is typically continuous (though not always). You'll use the CDF technique or the change-of-variables formula to find the PDF of Y.

Probability distribution of functions

For discrete random variables, the PMF of Y = g(X) is:

P(Y = y) = \sum_{x:\, g(x) = y} P(X = x)

You're grouping together every x value that lands on the same y, then adding up their probabilities.
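The grouping rule translates directly into code. Below is a minimal sketch, not from the guide: the fair-die PMF and the map g(x) = (x - 3)^2 are illustrative choices.

```python
from collections import defaultdict

def pmf_of_g(pmf_x, g):
    """PMF of Y = g(X): group the x-values by their image under g
    and sum their probabilities."""
    pmf_y = defaultdict(float)
    for x, p in pmf_x.items():
        pmf_y[g(x)] += p
    return dict(pmf_y)

# Illustrative example: X is a fair die, Y = (X - 3)^2.
# x = 2 and x = 4 both land on y = 1, so their probabilities add.
die = {x: 1 / 6 for x in range(1, 7)}
pmf_y = pmf_of_g(die, lambda x: (x - 3) ** 2)
```

Note how y = 1 ends up with probability 2/6 (from x = 2 and x = 4) while y = 9 keeps just 1/6 (only x = 6 maps there).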

For continuous random variables, you generally can't just "plug in" to the PDF. Instead, you work through the CDF first (described next) or use the MGF approach.

Cumulative Distribution Function Technique

Deriving CDFs from transformations

The CDF method is the most general approach and works for virtually any transformation. Here's the procedure:

  1. Start with Y = g(X) and write the CDF definition: F_Y(y) = P(Y \leq y) = P(g(X) \leq y).
  2. Manipulate the inequality g(X) \leq y to isolate X. For example, if g is strictly increasing, this becomes P(X \leq g^{-1}(y)).
  3. Express the result in terms of F_X, the CDF of X.

If g is strictly decreasing, the inequality flips: P(g(X) \leq y) = P(X \geq g^{-1}(y)) = 1 - F_X(g^{-1}(y)).

When g is not monotone (e.g., Y = X^2), you need to split into cases and account for all regions of X that satisfy g(X) \leq y.
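For the Y = X^2 case with X standard normal, splitting into cases gives F_Y(y) = \Phi(\sqrt{y}) - \Phi(-\sqrt{y}), and differentiating yields the chi-square(1) density. A quick numerical sketch (using the error function for \Phi; the test point y = 1.7 is arbitrary):

```python
import math

def Phi(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def F_Y(y):
    """CDF of Y = X^2 for X ~ N(0, 1): P(-sqrt(y) <= X <= sqrt(y))."""
    if y <= 0.0:
        return 0.0
    r = math.sqrt(y)
    return Phi(r) - Phi(-r)

def f_Y(y):
    """PDF from differentiating F_Y: the chi-square(1) density."""
    return math.exp(-y / 2.0) / math.sqrt(2.0 * math.pi * y)

# A central-difference derivative of F_Y should match the closed form
y0, h = 1.7, 1e-6
numeric = (F_Y(y0 + h) - F_Y(y0 - h)) / (2.0 * h)
```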

Inverting CDFs to find distributions

Once you have F_Y(y), differentiate with respect to y to get the PDF:

f_Y(y) = \frac{d}{dy} F_Y(y)

For a monotone, differentiable transformation Y = g(X) with inverse X = g^{-1}(Y), this yields the change-of-variables formula:

f_Y(y) = f_X(g^{-1}(y)) \cdot \left| \frac{d}{dy} g^{-1}(y) \right|

The absolute value accounts for both increasing and decreasing transformations. This single formula handles most one-variable continuous problems you'll encounter.
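As a sketch of the formula in action (the choice X ~ Exponential(1) with Y = \sqrt{X} is illustrative, not from the guide): the inverse is x = y^2 with derivative 2y, so f_Y(y) = e^{-y^2} \cdot 2y, which should integrate to 1.

```python
import math

# Illustrative choice: X ~ Exponential(1) and Y = g(X) = sqrt(X),
# which is strictly increasing on (0, inf). The inverse is x = y^2,
# with derivative d/dy g^{-1}(y) = 2y.

def f_X(x):
    return math.exp(-x) if x > 0.0 else 0.0

def f_Y(y):
    """Change of variables: f_Y(y) = f_X(y^2) * |2y| = 2y * e^{-y^2}."""
    if y <= 0.0:
        return 0.0
    return f_X(y * y) * 2.0 * y

# Sanity check: the transformed density integrates to 1 (midpoint rule)
n, hi = 100_000, 10.0
step = hi / n
total = sum(f_Y((k + 0.5) * step) for k in range(n)) * step
```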

Moment-Generating Function Technique

Uniqueness of moment-generating functions

The MGF of a random variable X is M_X(t) = E[e^{tX}], defined for t in some neighborhood of zero. The key property: if two random variables have the same MGF in a neighborhood of zero, they have the same distribution. This uniqueness theorem is what makes the MGF method work.

Finding distributions using MGFs

The strategy is to compute the MGF of the transformed variable and then recognize it as belonging to a known distribution family.

  1. Define Y = g(X) and write M_Y(t) = E[e^{tY}] = E[e^{t\,g(X)}].
  2. Evaluate this expectation using the distribution of X.
  3. If the resulting expression matches the MGF of a known distribution (normal, gamma, Poisson, etc.), you've identified the distribution of Y.

Example: If X \sim N(\mu, \sigma^2) and Y = aX + b, then M_Y(t) = e^{bt} M_X(at) = e^{bt} e^{a\mu t + a^2\sigma^2 t^2/2} = e^{(a\mu+b)t + a^2\sigma^2 t^2/2}. This is the MGF of N(a\mu + b,\, a^2\sigma^2), confirming that a linear transformation of a normal is still normal.

The MGF method is especially powerful for sums of independent random variables, since M_{X+Y}(t) = M_X(t) \cdot M_Y(t) when X and Y are independent.
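You can sanity-check the linear-transformation example by comparing the closed-form MGF against a Monte Carlo estimate of E[e^{tY}]. The parameter values, seed, and sample size below are arbitrary choices for illustration:

```python
import math
import random

random.seed(42)

# Illustrative parameters: X ~ N(mu, sigma^2), Y = aX + b, MGF evaluated at t
mu, sigma = 0.0, 1.0
a, b, t = 0.5, 0.2, 1.0

# Closed form from the example: M_Y(t) = exp((a*mu + b)*t + a^2 sigma^2 t^2 / 2)
mgf_closed = math.exp((a * mu + b) * t + (a * sigma * t) ** 2 / 2.0)

# Monte Carlo estimate of E[exp(t * Y)]
n = 200_000
mgf_mc = sum(math.exp(t * (a * random.gauss(mu, sigma) + b)) for _ in range(n)) / n
```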

Convolutions of Independent Random Variables

Sums of independent random variables

When X and Y are independent and you want the distribution of Z = X + Y, the result is a convolution.

  • Continuous case: f_Z(z) = \int_{-\infty}^{\infty} f_X(x)\, f_Y(z - x)\, dx
  • Discrete case: P(Z = z) = \sum_{x} P(X = x)\, P(Y = z - x)

You're summing (or integrating) over all the ways the two variables can combine to give the total z. In practice, the MGF method is often faster for sums: compute M_Z(t) = M_X(t) \cdot M_Y(t) and recognize the result.
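The discrete convolution sum is a short nested loop. A minimal sketch using two fair dice (an illustrative choice, not from the guide):

```python
from collections import defaultdict

def convolve_pmf(pmf_x, pmf_y):
    """PMF of Z = X + Y for independent X, Y:
    P(Z = z) = sum over x of P(X = x) * P(Y = z - x)."""
    pmf_z = defaultdict(float)
    for x, px in pmf_x.items():
        for y, py in pmf_y.items():
            pmf_z[x + y] += px * py
    return dict(pmf_z)

# Illustrative example: the sum of two fair dice
die = {k: 1 / 6 for k in range(1, 7)}
two_dice = convolve_pmf(die, die)
```

The familiar triangular shape falls out: P(Z = 7) = 6/36 is the peak, and P(Z = 2) = P(Z = 12) = 1/36 are the extremes.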

Products of independent random variables

For the product W = XY of two independent continuous random variables, the PDF can be derived using a change of variables. One standard approach:

  1. Define W = XY and V = X (an auxiliary variable).
  2. Compute the joint PDF of (W, V) using the Jacobian.
  3. Integrate out V to get the marginal PDF of W.

Note: unlike sums, the MGF of a product is not simply the product of the individual MGFs. The factoring property M_{X+Y}(t) = M_X(t) M_Y(t) applies only to sums.
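Carrying the three steps through for two independent Uniform(0, 1) variables (a standard illustrative case, not from the guide) gives f_W(w) = -\ln w on (0, 1), hence F_W(w) = w - w \ln w. A Monte Carlo sketch checking that CDF at one arbitrary point:

```python
import math
import random

random.seed(7)

# Illustrative example: W = X * Y with X, Y ~ iid Uniform(0, 1).
# Steps 1-3 give f_W(w) = -ln(w) on (0, 1), hence F_W(w) = w - w*ln(w).

def F_W(w):
    return w - w * math.log(w)

# Monte Carlo check of the CDF at one (arbitrary) point w = 0.3
n, w0 = 200_000, 0.3
hits = sum(1 for _ in range(n) if random.random() * random.random() <= w0)
empirical = hits / n
```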


Transformations of Multiple Random Variables

Joint cumulative distribution functions

For a vector of random variables (X_1, X_2, \ldots, X_n), the joint CDF is:

F_{X_1, \ldots, X_n}(x_1, \ldots, x_n) = P(X_1 \leq x_1, X_2 \leq x_2, \ldots, X_n \leq x_n)

When you apply a transformation (Y_1, \ldots, Y_n) = \mathbf{g}(X_1, \ldots, X_n), you can use the multivariate CDF method: express events about the Y_i in terms of the X_i and use the joint distribution of \mathbf{X}.

Jacobian matrix for transformations

For an invertible transformation of continuous random variables, the multivariate change-of-variables formula is:

f_{Y_1, \ldots, Y_n}(y_1, \ldots, y_n) = f_{X_1, \ldots, X_n}(x_1, \ldots, x_n) \cdot \left| \det(J) \right|^{-1}

where J is the Jacobian matrix with entries J_{ij} = \frac{\partial y_i}{\partial x_j}, and (x_1, \ldots, x_n) is expressed in terms of (y_1, \ldots, y_n) via the inverse transformation.

Equivalently, if you write the inverse transformation and define the Jacobian of the inverse, you get |\det(J^{-1})| directly. Either way, the determinant corrects for how the transformation stretches or compresses volume in probability space.
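As an illustrative sketch (a standard textbook example, not from the guide): for X_1, X_2 iid Exponential(1), the map Y_1 = X_1 + X_2, Y_2 = X_1/(X_1 + X_2) has inverse x_1 = y_1 y_2, x_2 = y_1(1 - y_2) with |\det(J^{-1})| = y_1, giving f_{Y_1, Y_2}(y_1, y_2) = y_1 e^{-y_1} on y_1 > 0, 0 < y_2 < 1. A Monte Carlo check of one rectangle probability:

```python
import math
import random

random.seed(11)

# X1, X2 ~ iid Exponential(1). Transform: Y1 = X1 + X2, Y2 = X1 / (X1 + X2).
# Inverse: x1 = y1*y2, x2 = y1*(1 - y2), so |det(J^{-1})| = y1 and
# f_{Y1,Y2}(y1, y2) = y1 * e^{-y1}  for y1 > 0, 0 < y2 < 1.

def prob_box(y1_max, y2_max):
    """P(Y1 <= y1_max, Y2 <= y2_max) from the transformed joint density."""
    # integral of y1 * e^{-y1} over (0, y1_max), times y2_max
    return (1.0 - (1.0 + y1_max) * math.exp(-y1_max)) * y2_max

# Monte Carlo check in the original (X1, X2) coordinates
n, hits = 200_000, 0
for _ in range(n):
    x1, x2 = random.expovariate(1.0), random.expovariate(1.0)
    if x1 + x2 <= 1.0 and x1 / (x1 + x2) <= 0.5:
        hits += 1
empirical = hits / n
```

The density also factors into y_1 e^{-y_1} times the constant 1, confirming that Y_1 ~ Gamma(2, 1) and Y_2 ~ Uniform(0, 1) are independent.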

Common Transformations and Distributions

Linear transformations

For Y = aX + b (with a \neq 0):

  • E[Y] = aE[X] + b
  • \text{Var}(Y) = a^2 \, \text{Var}(X)
  • The PDF transforms as: f_Y(y) = \frac{1}{|a|} f_X\!\left(\frac{y - b}{a}\right)

Linear transformations preserve distribution families in many cases. Normals stay normal, and Cauchy random variables stay Cauchy, for instance.
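A numerical sketch of the PDF formula with a negative slope (X ~ Exponential(1) and Y = -2X + 3 are illustrative choices): integrating f_Y up to y = 1 should reproduce P(Y \leq 1) = P(X \geq 1) = e^{-1}.

```python
import math

# Illustrative choice: X ~ Exponential(1), so f_X(x) = e^{-x} for x > 0,
# transformed by Y = aX + b with a = -2, b = 3 (note a < 0).
a, b = -2.0, 3.0

def f_X(x):
    return math.exp(-x) if x > 0.0 else 0.0

def f_Y(y):
    """f_Y(y) = (1/|a|) * f_X((y - b) / a); supported on y < 3 here."""
    return f_X((y - b) / a) / abs(a)

# Check: P(Y <= 1) = P(X >= 1) = e^{-1}, via midpoint integration of f_Y
n, lo, hi = 200_000, -30.0, 1.0
step = (hi - lo) / n
prob = sum(f_Y(lo + (k + 0.5) * step) for k in range(n)) * step
```

The absolute value matters here: dropping it would produce a negative "density" because a < 0.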

Exponential and logarithmic transformations

Exponential: If Y = e^X, apply the CDF method. Since e^X is strictly increasing:

f_Y(y) = f_X(\ln y) \cdot \frac{1}{y}, \quad y > 0

A classic application: if X \sim N(\mu, \sigma^2), then Y = e^X follows a lognormal distribution.

Logarithmic: If Y = \ln X for X > 0, then:

f_Y(y) = f_X(e^y) \cdot e^y

These transformations are useful for converting multiplicative relationships into additive ones.
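A quick check of the lognormal claim via the CDF method (a sketch assuming \mu = 0, \sigma = 1; the test point y = 2, seed, and sample size are arbitrary): F_Y(y) = \Phi((\ln y - \mu)/\sigma), which a Monte Carlo estimate of P(e^X \leq y) should match.

```python
import math
import random

random.seed(3)

mu, sigma = 0.0, 1.0  # illustrative: X ~ N(0, 1), so Y = e^X is lognormal

def Phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def F_Y(y):
    """CDF method: P(e^X <= y) = P(X <= ln y) = Phi((ln y - mu) / sigma)."""
    return Phi((math.log(y) - mu) / sigma)

# Monte Carlo check at an arbitrary point y = 2
n = 200_000
empirical = sum(1 for _ in range(n) if math.exp(random.gauss(mu, sigma)) <= 2.0) / n
```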

Normal to standard normal transformation

Any normal random variable X \sim N(\mu, \sigma^2) can be standardized:

Z = \frac{X - \mu}{\sigma}

This gives Z \sim N(0, 1). The transformation lets you use standard normal tables or software to compute probabilities for any normal distribution. It's a special case of the linear transformation with a = 1/\sigma and b = -\mu/\sigma.
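In code, standardization plus the standard normal CDF handles any normal probability. A minimal sketch (the N(100, 15^2) example and the cutoff 115 are illustrative; \Phi is computed from the error function rather than a table):

```python
import math

def Phi(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Illustrative example: X ~ N(100, 15^2). Standardize to get P(X <= 115):
mu, sigma = 100.0, 15.0
p = Phi((115.0 - mu) / sigma)  # z = (115 - 100) / 15 = 1.0
```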

Chi-square and gamma distributions

If Z_1, Z_2, \ldots, Z_n are independent N(0,1) variables, then:

Y = \sum_{i=1}^{n} Z_i^2 \sim \chi^2(n)

The chi-square distribution with n degrees of freedom is actually a special case of the gamma distribution: \chi^2(n) = \text{Gamma}(n/2,\, 1/2) (using the rate parameterization).

More generally, the gamma family is closed under summation of independent variables: if X_i \sim \text{Gamma}(\alpha_i, \beta) are independent with the same rate \beta, then \sum X_i \sim \text{Gamma}(\sum \alpha_i, \beta). This is easy to verify using MGFs.
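A Monte Carlo sketch of the n = 4 case (the test point, seed, and sample size are arbitrary): summing four squared standard normals should match the Gamma(2, 1/2) CDF, which for \chi^2(4) works out to F(x) = 1 - e^{-x/2}(1 + x/2).

```python
import math
import random

random.seed(5)

# Y = Z1^2 + ... + Z4^2 for Zi ~ iid N(0, 1) should be chi-square(4),
# i.e. Gamma(2, 1/2), whose CDF is F(x) = 1 - e^{-x/2} * (1 + x/2).

def chi2_4_cdf(x):
    return 1.0 - math.exp(-x / 2.0) * (1.0 + x / 2.0)

# Monte Carlo check at an arbitrary point x = 4
n, hits = 100_000, 0
for _ in range(n):
    y = sum(random.gauss(0.0, 1.0) ** 2 for _ in range(4))
    if y <= 4.0:
        hits += 1
empirical = hits / n
```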

Applications of Transformations

Signal processing and filtering

In signal processing, random signals pass through systems (filters) that transform their distributions. If the input to a linear time-invariant system is a random process, the output distribution depends on the system's transfer function. Fourier and Laplace transforms are used to move between time and frequency domains, simplifying the analysis of how noise and signals interact.

Reliability analysis and failure rates

Reliability engineering models component lifetimes as random variables. The exponential distribution models constant failure rates (memoryless property), while the Weibull distribution handles increasing or decreasing failure rates. A logarithmic transformation of Weibull data linearizes the survival function, making it easier to estimate parameters from observed failure data.

Stochastic modeling in physics and engineering

Transformations underpin many physical models. Brownian motion (particle diffusion) involves Gaussian random variables whose distributions evolve over time. Birth-death processes use transformations to derive steady-state distributions. In each case, knowing how to transform distributions lets you move from a simple model to the quantities you actually care about, like hitting times, equilibrium concentrations, or system reliability.