🎲 Intro to Probability Unit 6 Review

6.2 Probability density functions

Written by the Fiveable Content Team • Last updated August 2025

Probability density functions (PDFs) are the main tool for working with continuous random variables. Unlike discrete distributions where you can assign a probability to each individual outcome, continuous variables require a different approach: you use the area under a curve to find probabilities. PDFs define that curve.

This section covers how PDFs work, their key properties, how to use them to calculate probabilities, and the most common distributions you'll encounter.

Probability Density Functions

Fundamentals of PDFs

A PDF describes how probability is spread across the possible values of a continuous random variable. The critical idea: the PDF's value at a single point is not a probability. Instead, probability comes from the area under the PDF curve over an interval.

Two requirements define a valid PDF:

  • Non-negativity: $f(x) \geq 0$ for all $x$
  • Total area equals 1: $\int_{-\infty}^{\infty} f(x) \, dx = 1$
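These two requirements can be checked numerically. Below is a minimal Python sketch (the helper `is_valid_pdf`, the tolerance, and the choice of truncation point are illustrative assumptions, not from the text) that tests a candidate PDF with the trapezoidal rule:

```python
import math

def is_valid_pdf(f, lo, hi, n=100_000, tol=1e-3):
    """Numerically check the two PDF requirements on [lo, hi]:
    non-negativity, and total area approximately 1 (trapezoidal rule)."""
    h = (hi - lo) / n
    ys = [f(lo + i * h) for i in range(n + 1)]
    if any(y < 0 for y in ys):
        return False
    area = h * (sum(ys) - 0.5 * (ys[0] + ys[-1]))
    return abs(area - 1.0) < tol

# Exponential PDF with rate 2; its support [0, inf) is truncated at 50,
# where the remaining tail mass e^(-100) is negligible.
exp_pdf = lambda x: 2.0 * math.exp(-2.0 * x)
print(is_valid_pdf(exp_pdf, 0.0, 50.0))   # True

# A constant 1 on [0, 2] fails: it is non-negative but has area 2.
print(is_valid_pdf(lambda x: 1.0, 0.0, 2.0))   # False
```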

The cumulative distribution function (CDF), written $F(x)$, gives the probability that the random variable is less than or equal to $x$. You get it by integrating the PDF from $-\infty$ up to $x$:

$$F(x) = \int_{-\infty}^{x} f(t) \, dt$$

Going the other direction, the PDF is the derivative of the CDF: $f(x) = F'(x)$.
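This relationship is easy to verify numerically. The sketch below (the exponential example and the step size are illustrative choices) differentiates an exponential CDF with a central difference and compares the result to the known PDF:

```python
import math

lam = 1.5
F = lambda x: 1.0 - math.exp(-lam * x)      # exponential CDF
f = lambda x: lam * math.exp(-lam * x)      # exponential PDF

def cdf_derivative(F, x, h=1e-6):
    """Central-difference derivative of the CDF; should recover the PDF."""
    return (F(x + h) - F(x - h)) / (2 * h)

for x in (0.5, 1.0, 2.0):
    # The two columns agree to many decimal places.
    print(round(cdf_derivative(F, x), 6), round(f(x), 6))
```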

Characteristics and Interpretation

The shape of a PDF tells you a lot about the distribution:

  • Central tendency: Where the peak(s) sit. A PDF can be unimodal (one peak, like the normal distribution), bimodal (two peaks), or multimodal.
  • Spread: How wide the curve is. A narrow PDF means the variable clusters tightly around the center; a wide PDF means more variability.
  • Skewness: Whether the distribution is symmetric or lopsided. A right-skewed PDF has a longer tail stretching to the right (meaning occasional large values), while a left-skewed PDF tails off to the left.

Calculating Probabilities with PDFs


Integration and Probability Calculation

To find the probability that a continuous random variable $X$ falls in the interval $[a, b]$, integrate the PDF over that interval:

$$P(a \leq X \leq b) = \int_a^b f(x) \, dx$$

Equivalently, you can use the CDF: $P(a \leq X \leq b) = F(b) - F(a)$.

One consequence that trips people up: the probability that $X$ equals any exact value is zero. That's because a single point has no width, so the area under the curve is zero. This means $P(X = a) = 0$, and it also means $P(a \leq X \leq b) = P(a < X < b)$. Whether you include the endpoints doesn't matter.
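Both routes give the same answer. Here is a hedged Python sketch (the helper `prob_interval` and the standard normal example are my choices, not from the text) that computes an interval probability once by integrating the PDF and once via the CDF:

```python
import math

# Standard normal PDF, and CDF expressed through the error function.
phi = lambda x: math.exp(-x * x / 2) / math.sqrt(2 * math.pi)
Phi = lambda x: 0.5 * (1 + math.erf(x / math.sqrt(2)))

def prob_interval(pdf, a, b, n=10_000):
    """P(a <= X <= b) by integrating the PDF with the trapezoidal rule."""
    h = (b - a) / n
    total = 0.5 * (pdf(a) + pdf(b))
    total += sum(pdf(a + i * h) for i in range(1, n))
    return h * total

a, b = -1.0, 1.0
print(round(prob_interval(phi, a, b), 4))   # ≈ 0.6827
print(round(Phi(b) - Phi(a), 4))            # same value via F(b) - F(a)
```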

Expected value and variance are calculated by integrating against the PDF:

  • Expected value (mean): $E[X] = \int_{-\infty}^{\infty} x \cdot f(x) \, dx$
  • Variance: $\text{Var}(X) = \int_{-\infty}^{\infty} (x - E[X])^2 \cdot f(x) \, dx$

The variance can also be computed as $E[X^2] - (E[X])^2$, which is often easier to work with.
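Both integrals, and the shortcut formula, can be checked numerically. This sketch (the helper names and the uniform-on-[2, 5] example are illustrative) integrates $x f(x)$ and $x^2 f(x)$ with the trapezoidal rule:

```python
def mean_var(pdf, lo, hi, n=100_000):
    """E[X] and Var(X) by integrating x*f(x) and x^2*f(x), then
    applying Var(X) = E[X^2] - (E[X])^2."""
    h = (hi - lo) / n
    def integrate(g):
        total = 0.5 * (g(lo) + g(hi))
        total += sum(g(lo + i * h) for i in range(1, n))
        return h * total
    m = integrate(lambda x: x * pdf(x))
    ex2 = integrate(lambda x: x * x * pdf(x))
    return m, ex2 - m * m

# Uniform on [2, 5]: f(x) = 1/3, so E[X] = 3.5 and Var(X) = 3^2/12 = 0.75.
uniform_pdf = lambda x: 1.0 / 3.0
m, v = mean_var(uniform_pdf, 2.0, 5.0)
print(round(m, 4), round(v, 4))   # 3.5 0.75
```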

Advanced Techniques

  • Change of variables: If you know the PDF of $X$ and want the PDF of $Y = g(X)$, you can use transformation techniques to derive it.
  • Numerical integration: When you can't solve an integral in closed form, methods like Simpson's rule or the trapezoidal rule give approximate answers.
  • Moment-generating functions (MGFs): The MGF $M_X(t) = E[e^{tX}]$ provides a compact way to find moments and to identify the distribution of sums of independent random variables.
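Simpson's rule is a natural fit here because the normal PDF has no elementary antiderivative. A minimal sketch (the function `simpson` and the 1.96 cutoffs are my illustrative choices):

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b]; n must be even."""
    if n % 2:
        n += 1
    h = (b - a) / n
    s = f(a) + f(b)
    s += 4 * sum(f(a + i * h) for i in range(1, n, 2))
    s += 2 * sum(f(a + i * h) for i in range(2, n, 2))
    return s * h / 3

# Standard normal PDF: P(-1.96 <= Z <= 1.96) is the familiar 95%.
phi = lambda x: math.exp(-x * x / 2) / math.sqrt(2 * math.pi)
print(round(simpson(phi, -1.96, 1.96), 4))   # 0.95
```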

Common Probability Density Functions


Uniform and Normal Distributions

The uniform distribution on $[a, b]$ assigns equal likelihood to every value in the interval. Its PDF is constant:

$$f(x) = \frac{1}{b - a} \quad \text{for } a \leq x \leq b$$

Think of a random number generator that picks any value between 0 and 1 with no preference.

The normal (Gaussian) distribution is the classic bell curve, defined by its mean $\mu$ and standard deviation $\sigma$:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x - \mu)^2}{2\sigma^2}}$$

It shows up constantly: heights in a population, measurement errors, test scores. The standard normal distribution is the special case where $\mu = 0$ and $\sigma = 1$, and you convert any normal variable to it using $Z = \frac{X - \mu}{\sigma}$.
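Standardization is how normal probabilities are actually computed. The sketch below (the `normal_cdf` helper and the test-score parameters are illustrative assumptions) converts to $Z$ and evaluates the standard normal CDF with `math.erf`:

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """P(X <= x) for X ~ Normal(mu, sigma): standardize to
    Z = (x - mu) / sigma, then use the standard normal CDF via erf."""
    z = (x - mu) / sigma
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Test scores ~ Normal(mu=500, sigma=100): what fraction scores below 650?
# This is the same as P(Z <= 1.5).
p = normal_cdf(650, mu=500, sigma=100)
print(round(p, 4))   # 0.9332
```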

Other Important Distributions

  • Exponential distribution: Models the time between events in a Poisson process, with rate parameter $\lambda$. Example: time between customer arrivals at a store. Its PDF is $f(x) = \lambda e^{-\lambda x}$ for $x \geq 0$.
  • Gamma distribution: Generalizes the exponential. Useful for modeling waiting times for multiple events or positive continuous quantities like rainfall amounts.
  • Beta distribution: Defined on the interval $[0, 1]$, making it natural for modeling proportions and probabilities, such as success rates.
  • Weibull distribution: Common in reliability engineering for modeling product lifetimes. It generalizes the exponential by allowing the failure rate to increase or decrease over time.
  • Lognormal distribution: If $\ln(X)$ is normally distributed, then $X$ is lognormal. Often used in finance for stock prices and in economics for income distributions.
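The exponential case from the list above is simple enough to work end to end. Integrating its PDF gives the CDF $F(x) = 1 - e^{-\lambda x}$; this sketch (the rate of 3 arrivals per hour is an illustrative assumption) uses it to find the chance of a long wait:

```python
import math

def exponential_cdf(x, lam):
    """P(X <= x) for X ~ Exponential(lam): integrating lam*e^(-lam*t)
    from 0 to x gives 1 - e^(-lam*x)."""
    return 1.0 - math.exp(-lam * x) if x >= 0 else 0.0

# Customers arrive at rate lam = 3 per hour. Probability the next
# arrival takes more than half an hour: P(X > 0.5) = e^(-1.5).
lam = 3.0
p_wait = 1.0 - exponential_cdf(0.5, lam)
print(round(p_wait, 4))   # 0.2231
```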

Applying PDFs to Continuous Variables

Transformation and Convolution Techniques

  • Method of transformations: Derive the PDF of $Y = g(X)$ when you know the distribution of $X$. For example, if $X$ is standard normal, you can find the PDF of $Y = X^2$ (which turns out to be a chi-square distribution with 1 degree of freedom).
  • Convolution: To find the PDF of the sum of two independent continuous random variables, you convolve their PDFs. For instance, the sum of two independent normal random variables is itself normal.
  • Multivariate change of variables: Extends the transformation method to multiple dimensions, such as converting from Cartesian to polar coordinates.
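The convolution fact about normals can be checked by simulation rather than by doing the integral. A Monte Carlo sketch (the parameters, sample size, and seed are all illustrative choices): the sum of independent Normal($\mu_1$, $\sigma_1$) and Normal($\mu_2$, $\sigma_2$) draws should have mean $\mu_1 + \mu_2$ and variance $\sigma_1^2 + \sigma_2^2$.

```python
import random
import statistics

random.seed(42)
mu1, s1 = 1.0, 2.0     # first normal: mean 1, sd 2
mu2, s2 = -0.5, 1.5    # second normal: mean -0.5, sd 1.5

# Sample the sum of two independent normals many times.
sums = [random.gauss(mu1, s1) + random.gauss(mu2, s2)
        for _ in range(200_000)]

# Theory: mean = mu1 + mu2 = 0.5, variance = 4 + 2.25 = 6.25.
print(round(statistics.mean(sums), 2))      # close to 0.5
print(round(statistics.variance(sums), 1))  # close to 6.25
```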

Advanced Applications

  • Central limit theorem (CLT): The sum (or average) of a large number of independent random variables is approximately normal, regardless of the original distribution. This is why the normal distribution appears so often in practice.
  • Order statistics: These describe the distribution of the minimum, maximum, or any ranked value from a sample. Useful for analyzing extreme values.
  • Kernel density estimation: A nonparametric technique for estimating an unknown PDF from observed data. Instead of assuming a specific distribution, you build a smooth curve from the data points themselves.
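Kernel density estimation can be sketched in a few lines: place a small Gaussian bump at each data point and average them. The implementation below (function name, bandwidth, and sample are all illustrative choices, not from the text) estimates the density of a standard normal sample:

```python
import math
import random

def gaussian_kde(data, bandwidth):
    """Return a PDF estimate: the average of Gaussian bumps of width
    `bandwidth` centered at each observed data point."""
    n = len(data)
    c = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))
    def pdf(x):
        return c * sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2)
                       for xi in data)
    return pdf

random.seed(0)
sample = [random.gauss(0, 1) for _ in range(2000)]
kde = gaussian_kde(sample, bandwidth=0.3)

# Near 0 the estimate should be close to the true N(0, 1) density,
# 1/sqrt(2*pi) ≈ 0.399 (slightly smoothed downward by the bandwidth).
print(round(kde(0.0), 3))
```

Larger bandwidths give smoother but flatter estimates; smaller ones track the data more closely but get noisy.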