Probability density functions (PDFs) are how you model uncertainty and predict outcomes when results aren't deterministic. Mastering PDFs means being able to recognize which distribution fits which scenario, understand how parameters shape behavior, and apply the right PDF to problems involving reliability, quality control, hypothesis testing, and more.
Don't just memorize the formulas. Know why each distribution exists, what real-world processes it models, and how changing parameters affects the shape. When you see a problem describing waiting times, failure rates, or sample statistics, you should immediately recognize which distribution family applies.
Foundational Continuous Distributions
These distributions form the building blocks of probability theory. They model idealized scenarios and serve as the basis for more complex distributions.
Uniform Distribution
Equal probability across a bounded interval: every value between a and b is equally likely, with PDF f(x) = 1/(b − a) for a ≤ x ≤ b
Two parameters define the support: minimum a and maximum b, giving mean (a + b)/2 and variance (b − a)²/12
Foundation for random number generation: transforming uniform samples into other distributions is a core simulation technique (this is called the inverse transform method)
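The inverse transform method mentioned above can be sketched in a few lines: feed Uniform(0,1) draws through a target distribution's inverse CDF. This sketch targets an exponential; the rate value is illustrative.

```python
# Inverse transform sampling: turn Uniform(0,1) draws into Exp(lam) draws
# via the inverse CDF x = -ln(1 - u) / lam. The rate lam is illustrative.
import math
import random

random.seed(0)
lam = 2.0  # rate of the target exponential (illustrative value)

u = [random.random() for _ in range(100_000)]   # Uniform(0,1) samples
x = [-math.log(1 - ui) / lam for ui in u]       # inverse CDF of Exp(lam)

mean_x = sum(x) / len(x)
print(round(mean_x, 2))  # should be close to the theoretical mean 1/lam = 0.5
```

The same recipe works for any distribution whose inverse CDF you can write down, which is why uniform generators sit underneath most simulation libraries.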
Normal (Gaussian) Distribution
Bell-shaped and symmetric around the mean μ: the PDF is f(x) = (1/(σ√(2π))) e^(−(x − μ)²/(2σ²))
Defined by mean μ and standard deviation σ, where approximately 68% of values fall within ±1σ, 95% within ±2σ, and 99.7% within ±3σ of the mean
The Central Limit Theorem makes this universal: sums (and averages) of independent random variables converge to normal as the sample size grows, which explains why measurement errors and natural phenomena so often follow this distribution
Compare: Uniform vs. Normal: both are symmetric, but uniform has bounded support with constant density while normal has unbounded support with density concentrated near the mean. If a problem describes "equally likely outcomes in a range," use uniform; for "accumulated random effects," use normal.
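The Central Limit Theorem claim above is easy to check empirically: averages of uniform draws concentrate around the uniform mean with variance shrinking like 1/n. The sample sizes below are illustrative.

```python
# CLT sketch: averages of n independent Uniform(0,1) draws have mean 0.5
# and variance (1/12)/n, and their histogram approaches a normal shape.
# n and reps are illustrative choices.
import random
import statistics

random.seed(1)
n = 48          # draws per average
reps = 20_000   # number of averages simulated

means = [sum(random.random() for _ in range(n)) / n for _ in range(reps)]

print(round(statistics.mean(means), 2))          # ≈ 0.5, the uniform mean
print(round(statistics.variance(means) * n, 3))  # ≈ 1/12 ≈ 0.083
```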
Time-to-Event and Reliability Distributions
These distributions model when something happens: failure times, arrival processes, and system lifetimes. A key concept here is the hazard rate, which represents the instantaneous probability of an event occurring given that it hasn't occurred yet.
Exponential Distribution
Models time until a single event with constant hazard rate λ. The "memoryless" property means the probability of the event occurring in the next interval doesn't depend on how much time has already passed. Mathematically: P(X > s + t | X > s) = P(X > t).
Single parameter λ (rate) gives mean 1/λ and PDF f(x) = λe^(−λx) for x ≥ 0
Fundamental to queuing theory and reliability: use this when the failure rate doesn't change with age (think electronic components, not mechanical wear)
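The memoryless identity above can be verified directly from the exponential survival function S(x) = e^(−λx); the parameter values here are illustrative.

```python
# Memorylessness check for the exponential: P(X > s+t | X > s) = P(X > t).
# Survival function S(x) = exp(-lam * x); lam, s, t are illustrative values.
import math

lam, s, t = 0.5, 2.0, 3.0

def survival(x):
    return math.exp(-lam * x)

conditional = survival(s + t) / survival(s)  # P(X > s+t | X > s)
unconditional = survival(t)                  # P(X > t)

print(abs(conditional - unconditional) < 1e-12)  # True: elapsed time s cancels
```

Algebraically, e^(−λ(s+t)) / e^(−λs) = e^(−λt), so the age s drops out entirely; that cancellation is the memoryless property.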
Weibull Distribution
Generalizes exponential to handle varying failure rates: the shape parameter k determines whether hazard increases (k>1), decreases (k<1), or stays constant (k=1, which reduces to exponential)
Two parameters: shape k and scale λ, with PDF f(x) = (k/λ)(x/λ)^(k−1) e^(−(x/λ)^k) for x ≥ 0
Industry standard for reliability engineering: models infant mortality (k<1), random failures (k=1), and wear-out (k>1) in a single framework
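The three failure regimes above follow directly from the Weibull hazard rate h(x) = (k/λ)(x/λ)^(k−1); this sketch evaluates it at illustrative shape values.

```python
# Weibull hazard sketch: h(x) = (k/lam) * (x/lam)**(k-1). Whether the hazard
# rises, falls, or stays flat in x is set entirely by the shape k.
# The evaluation points and parameter values are illustrative.
def hazard(x, k, lam=1.0):
    return (k / lam) * (x / lam) ** (k - 1)

wearout = hazard(2.0, k=2.0) > hazard(1.0, k=2.0)    # k > 1: hazard increases
infant = hazard(2.0, k=0.5) < hazard(1.0, k=0.5)     # k < 1: hazard decreases
constant = hazard(2.0, k=1.0) == hazard(1.0, k=1.0)  # k = 1: flat (exponential)

print(wearout, infant, constant)  # True True True
```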
Gamma Distribution
Models waiting time for multiple events: if exponential gives the time to one event, gamma gives the time to the k-th event
Two parameters: shape k and scale θ (or equivalently rate β = 1/θ), with mean kθ and variance kθ²
Connects to other distributions: it reduces to exponential when k = 1, and to chi-square when θ = 2 and k = ν/2. These connections come up often in problems.
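The gamma-as-waiting-time idea can be checked by simulation: summing k independent exponential waits should reproduce the gamma mean kθ and variance kθ². Parameter values are illustrative.

```python
# Gamma sketch: the sum of k independent Exp(rate = 1/theta) waiting times
# is Gamma(shape=k, scale=theta), with mean k*theta and variance k*theta**2.
# k and theta are illustrative values.
import random
import statistics

random.seed(2)
k, theta = 3, 2.0

sums = [
    sum(random.expovariate(1 / theta) for _ in range(k))  # k exponential waits
    for _ in range(50_000)
]

print(round(statistics.mean(sums), 1))      # ≈ k * theta = 6.0
print(round(statistics.variance(sums), 1))  # ≈ k * theta**2 = 12.0
```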
Compare: Exponential vs. Weibull: exponential assumes a constant failure rate (memoryless), while Weibull allows the failure rate to change with time. On reliability problems, ask yourself: "Does age affect failure probability?" If yes, use Weibull.
Bounded and Proportion Distributions
When your random variable is constrained to a specific interval, these distributions apply. They're essential for modeling probabilities, percentages, and ratios.
Beta Distribution
Defined only on [0,1]: perfect for modeling probabilities, proportions, and Bayesian prior distributions
Two shape parameters α and β control asymmetry: α > β concentrates density toward 1 (left-skewed), α < β concentrates density toward 0 (right-skewed), and α = β is symmetric
Extremely flexible: it becomes uniform when α = β = 1, U-shaped when α, β < 1, or unimodal when α, β > 1. It's also the conjugate prior for the binomial likelihood in Bayesian inference, which makes updating beliefs mathematically clean.
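The conjugacy mentioned above means a Bayesian update is just addition: a Beta(α, β) prior combined with s successes and f failures yields a Beta(α + s, β + f) posterior. The prior and data values below are illustrative.

```python
# Beta-binomial conjugacy sketch: prior Beta(a, b) plus binomial data with
# s successes and f failures gives posterior Beta(a + s, b + f).
# The prior (2, 5) encodes a belief that the proportion is typically low;
# all numbers here are illustrative.
a, b = 2.0, 5.0   # prior shape parameters
s, f = 3, 17      # observed: 3 defectives out of 20 items

a_post, b_post = a + s, b + f
posterior_mean = a_post / (a_post + b_post)  # mean of a Beta(a, b) is a/(a+b)

print(round(posterior_mean, 3))  # (2+3)/(2+3+5+17) = 5/27 ≈ 0.185
```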
Lognormal Distribution
Models positive-only variables where multiplicative effects dominate: if ln(X) is normally distributed, then X is lognormal
Parameters μ and σ are the mean and standard deviation of ln(X), not of X itself. This is a common source of mistakes on problems. The actual mean of X is e^(μ + σ²/2).
Right-skewed with a heavy tail: models income distributions, stock prices, particle sizes, and any quantity that grows by percentages rather than by fixed amounts
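The parameter pitfall above is worth seeing numerically: exponentiating normal draws gives a sample mean near e^(μ + σ²/2), not e^(μ). The μ and σ values here are illustrative.

```python
# Lognormal mean sketch: if ln(X) ~ N(mu, sigma), then E[X] = exp(mu + sigma^2/2),
# NOT exp(mu). mu and sigma are illustrative values.
import math
import random
import statistics

random.seed(3)
mu, sigma = 0.0, 0.5

xs = [math.exp(random.gauss(mu, sigma)) for _ in range(200_000)]

theory = math.exp(mu + sigma ** 2 / 2)  # correct mean: e^0.125 ≈ 1.133
naive = math.exp(mu)                    # the tempting wrong answer: 1.0
print(round(statistics.mean(xs), 2), round(theory, 2), round(naive, 2))
```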
Compare: Beta vs. Lognormal: both can be right-skewed, but beta is bounded on [0,1] while lognormal is unbounded above. Use beta for proportions (market share, defect rates); use lognormal for positive quantities with multiplicative growth.
Sampling and Inference Distributions
These distributions arise from sampling processes and are essential for hypothesis testing, confidence intervals, and ANOVA. They're all derived from normal distributions.
Chi-Square Distribution
Sum of squared standard normal variables: if Z₁, ..., Z_k ~ N(0, 1) independently, then Z₁² + Z₂² + ... + Z_k² ~ χ²(k)
Single parameter: degrees of freedom k, with mean k and variance 2k. The distribution is right-skewed for small k and approaches normal as k → ∞.
Primary use is variance testing: the sample variance s² follows a scaled chi-square distribution, specifically (n − 1)s²/σ² ~ χ²(n − 1). This makes it essential for confidence intervals on σ² and goodness-of-fit tests.
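The scaled-variance result above can be checked by simulation: (n − 1)s²/σ² computed from normal samples should average about n − 1, the mean of a χ²(n − 1) variable. Sample sizes here are illustrative.

```python
# Sampling-distribution sketch: for normal samples of size n, the statistic
# (n - 1) * s^2 / sigma^2 follows chi-square with n - 1 df, whose mean is n - 1.
# n, sigma, and the repetition count are illustrative.
import random
import statistics

random.seed(4)
n, sigma = 10, 2.0

stats = []
for _ in range(20_000):
    sample = [random.gauss(0.0, sigma) for _ in range(n)]
    s2 = statistics.variance(sample)        # sample variance (divides by n - 1)
    stats.append((n - 1) * s2 / sigma ** 2)

print(round(statistics.mean(stats), 1))  # ≈ n - 1 = 9.0
```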
Student's t-Distribution
Ratio of a standard normal to the square root of a chi-square divided by its df: this arises naturally when you estimate a population mean but have to use the sample standard deviation s instead of the true σ
Degrees of freedom ν control tail heaviness: smaller ν means heavier tails (more probability in the extremes), and as ν → ∞ the t-distribution approaches the standard normal
Critical for small-sample inference: use the t-distribution instead of the normal when σ is unknown and you're relying on s. The common rule of thumb is that this matters most when n < 30, though technically you should use t whenever σ is unknown.
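The heavy-tails claim above can be demonstrated by building t draws from their definition, Z / √(V/ν) with V chi-square, and comparing tail mass beyond ±2 with a standard normal. The df and cutoff are illustrative.

```python
# Heavy-tails sketch: construct t-distributed draws as Z / sqrt(V / df), where
# V is a sum of df squared standard normals (chi-square), then compare the
# probability of landing beyond ±2 against N(0, 1). df and reps are illustrative.
import math
import random

random.seed(5)
df, reps = 3, 100_000

def t_draw(df):
    z = random.gauss(0.0, 1.0)
    v = sum(random.gauss(0.0, 1.0) ** 2 for _ in range(df))  # chi-square(df)
    return z / math.sqrt(v / df)

t_tail = sum(abs(t_draw(df)) > 2 for _ in range(reps)) / reps
z_tail = sum(abs(random.gauss(0.0, 1.0)) > 2 for _ in range(reps)) / reps

print(t_tail > z_tail)  # True: small-df t puts more probability past ±2
```

This is exactly why t critical values exceed normal critical values at the same confidence level, and why the gap shrinks as ν grows.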
F-Distribution
Ratio of two independent chi-square variables, each divided by its df: used to compare two variances or mean squares in ANOVA
Two parameters: d₁ (numerator df) and d₂ (denominator df). Order matters, so F(d₁, d₂) ≠ F(d₂, d₁)
Right-skewed and positive-only: the test statistic answers "is the variance ratio significantly different from 1?"
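The order-matters point above shows up even in the mean: E[F(d₁, d₂)] = d₂/(d₂ − 2), so swapping the degrees of freedom changes the distribution. This sketch builds F draws from their chi-square definition; the df values are illustrative.

```python
# F-distribution sketch: F = (chi2(d1)/d1) / (chi2(d2)/d2) for independent
# chi-squares. Its mean is d2/(d2 - 2), so F(d1, d2) and F(d2, d1) differ.
# d1, d2, and reps are illustrative.
import random
import statistics

random.seed(6)
d1, d2, reps = 5, 10, 40_000

def chi2(df):
    return sum(random.gauss(0.0, 1.0) ** 2 for _ in range(df))

f12 = [(chi2(d1) / d1) / (chi2(d2) / d2) for _ in range(reps)]
f21 = [(chi2(d2) / d2) / (chi2(d1) / d1) for _ in range(reps)]

print(round(statistics.mean(f12), 2))  # ≈ d2/(d2-2) = 1.25
print(round(statistics.mean(f21), 2))  # ≈ d1/(d1-2) ≈ 1.67
```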
Compare: Chi-square vs. t vs. F: all three derive from normal samples. Chi-square tests one variance, t tests one mean (with unknown variance), and F tests two variances or multiple means (via ANOVA). Know which degrees of freedom formula applies to each test type.
Quick Reference Table
Concept → Best Distribution
Bounded, equal probability → Uniform
Symmetric, sum of random effects → Normal
Time to single event (constant hazard) → Exponential
Time to multiple events → Gamma
Variable failure rates over time → Weibull
Proportions and probabilities on [0, 1] → Beta
Positive-only, multiplicative growth → Lognormal
Variance testing → Chi-square
Mean testing (small samples, unknown σ) → Student's t
Comparing variances / ANOVA → F-distribution
Self-Check Questions
Which two distributions are memoryless, and what does this property mean mathematically? (Hint: one is continuous, one is discrete.)
You're modeling the proportion of defective items in a batch (values between 0 and 1). Which distribution is most appropriate, and what parameters would you adjust to reflect a prior belief that defect rates are typically low?
Compare and contrast the chi-square and F-distributions: how are they mathematically related, and when would you use each in hypothesis testing?
A reliability engineer observes that component failure rates increase with age due to wear. Which distribution should they use, and what constraint on the shape parameter reflects this behavior?
You have sample data and need to construct a confidence interval for the population mean with unknown variance. Which distribution do you use for the critical value, and how does your answer change as sample size grows large?