Moment generating functions (MGFs) are one of the most elegant tools in probability theory because they transform complex integration problems into straightforward differentiation. You're being tested on your ability to use MGFs to extract moments, identify distributions, and analyze sums of independent random variables—skills that connect directly to the Central Limit Theorem and statistical inference. When an exam asks you to prove two random variables have the same distribution or to find the variance of a sum, MGFs are often your fastest path to the answer.
The power of MGFs lies in their uniqueness property and their multiplicative behavior for independent variables. These aren't just abstract mathematical facts—they're the foundation for understanding why normal distributions appear everywhere and how we can characterize complex random phenomena from simple building blocks. Don't just memorize the formulas; know why each property matters and when to deploy MGFs versus other techniques.
Definition and Core Mechanics
The MGF encodes all information about a distribution into a single function. By expanding $e^{tX}$ as a power series, the MGF packages every moment into its Taylor coefficients.
Definition of the MGF
$M_X(t) = E[e^{tX}]$: the expected value of the exponential function of the random variable, defined for the values of $t$ where this expectation exists
Existence requirement: the MGF must be finite in some open interval $(-h, h)$ around $t = 0$; distributions with heavy tails may fail this condition
Power series connection: since $e^{tX} = \sum_{n=0}^{\infty} \frac{(tX)^n}{n!}$, the MGF naturally "generates" moments through its derivatives, as the sketch below illustrates
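To see the generating in action, here is a minimal sketch, assuming Python with sympy (our tooling choice, not part of the original notes): expand a known MGF as a power series and read the moments off the Taylor coefficients.

```python
# A sketch, assuming sympy. Expand the standard normal MGF exp(t^2/2);
# the coefficient of t^n is E[X^n]/n!, so each moment falls out directly.
import sympy as sp

t = sp.symbols('t')
M = sp.exp(t**2 / 2)  # known MGF of N(0, 1)

poly = sp.series(M, t, 0, 7).removeO()
for n in range(7):
    print(f"E[X^{n}] =", sp.factorial(n) * poly.coeff(t, n))
# Prints 1, 0, 1, 0, 3, 0, 15: the first standard normal moments.
```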
Extracting Moments via Differentiation
$E[X^n] = M_X^{(n)}(0)$: the $n$-th moment equals the $n$-th derivative of the MGF evaluated at $t = 0$
First derivative gives the mean: $M_X'(0) = E[X]$, providing a direct route to the expected value without integration
Systematic computation: higher moments (variance, skewness, kurtosis) follow the same pattern, making MGFs far more efficient than repeated integration
Compare: Direct integration vs. MGF differentiation. Both yield moments, but MGFs reduce the problem to calculus on a known function. If an FRQ gives you an MGF and asks for variance, differentiate twice and use $\mathrm{Var}(X) = M_X''(0) - [M_X'(0)]^2$.
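Here is that FRQ workflow as a sketch, again assuming sympy; the exponential MGF is our illustrative choice.

```python
# A sketch, assuming sympy. Given the exponential MGF M(t) = lam/(lam - t),
# extract the mean and variance by differentiating at t = 0.
import sympy as sp

t, lam = sp.symbols('t lam', positive=True)
M = lam / (lam - t)  # MGF of Exponential(rate lam), valid for t < lam

m1 = sp.diff(M, t).subs(t, 0)     # E[X]   = M'(0)
m2 = sp.diff(M, t, 2).subs(t, 0)  # E[X^2] = M''(0)

print(m1)                         # 1/lam
print(sp.simplify(m2 - m1**2))    # Var(X) = M''(0) - [M'(0)]^2 = 1/lam^2
```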
Uniqueness and Distribution Identification
The MGF serves as a "fingerprint" for probability distributions. If two random variables share the same MGF in a neighborhood of zero, they must have identical distributions.
The Uniqueness Theorem
Same MGF implies same distribution: this is the theoretical foundation for using MGFs to identify unknown distributions
Interval requirement: equality must hold in some open interval around $t = 0$, not just at isolated points
Proof strategy: when asked to show two random variables are identically distributed, compute both MGFs and demonstrate they match (sketched below)
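As a sketch of this strategy, assuming sympy (the standard normal is our illustrative choice): compute $E[e^{tX}]$ by direct integration and confirm it matches the known normal MGF, so by uniqueness the variable must be standard normal.

```python
# A sketch, assuming sympy. Integrate e^{tx} against the N(0,1) density and
# compare with the known fingerprint e^{t^2/2}.
import sympy as sp

t, x = sp.symbols('t x', real=True)
pdf = sp.exp(-x**2 / 2) / sp.sqrt(2 * sp.pi)        # N(0, 1) density
mgf = sp.integrate(sp.exp(t * x) * pdf, (x, -sp.oo, sp.oo))

print(sp.simplify(mgf - sp.exp(t**2 / 2)))          # 0: the MGFs match
```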
MGFs of Common Distributions
Normal $N(\mu, \sigma^2)$: $M_X(t) = e^{\mu t + \sigma^2 t^2/2}$; the quadratic term in the exponent is the signature of normality
Exponential with rate $\lambda$: $M_X(t) = \frac{\lambda}{\lambda - t}$ for $t < \lambda$; note the restricted domain due to the distribution's tail behavior
Poisson with parameter $\lambda$: $M_X(t) = e^{\lambda(e^t - 1)}$; the nested exponential structure reflects the discrete, count-based nature
Compare: Normal vs. Exponential MGFs. The normal MGF exists for all real $t$, while the exponential MGF is restricted to $t < \lambda$. This reflects the exponential's heavier right tail. Exam questions often test whether you recognize domain restrictions.
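A quick numerical sanity check of these closed forms, assuming Python with numpy (the parameter values are arbitrary illustrations): the sample average of $e^{tX}$ should approximate the MGF wherever it exists.

```python
# A sketch, assuming numpy. Monte Carlo estimates of E[e^{tX}] vs. the
# closed-form MGFs; parameter choices are illustrative only.
import numpy as np

rng = np.random.default_rng(0)
n, t = 1_000_000, 0.5

# Normal N(1, 4): M(t) = exp(mu*t + sigma^2 * t^2 / 2)
x = rng.normal(loc=1.0, scale=2.0, size=n)
print(np.exp(t * x).mean(), np.exp(1.0 * t + 4.0 * t**2 / 2))

# Exponential with rate 2: M(t) = lam/(lam - t), valid only for t < lam
y = rng.exponential(scale=0.5, size=n)
print(np.exp(t * y).mean(), 2 / (2 - t))
# Rerunning with t >= 2 makes the exponential estimate blow up: the domain
# restriction is real, not a technicality.
```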
Sums of Independent Random Variables
One of the most powerful applications of MGFs is analyzing sums. The MGF converts convolution (a difficult integral operation) into simple multiplication.
The Multiplication Property
$M_{X+Y}(t) = M_X(t) \cdot M_Y(t)$ for independent $X$ and $Y$; this follows because $e^{t(X+Y)} = e^{tX} e^{tY}$, and independence lets the expectation factor: $E[e^{t(X+Y)}] = E[e^{tX}]\,E[e^{tY}]$
Extends to $n$ variables: for independent $X_1, \dots, X_n$, the MGF of the sum is the product of the individual MGFs
Distribution identification: multiply the MGFs, then use uniqueness to identify what distribution the sum follows, as in the sketch after this list
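A sketch of that identification workflow, assuming sympy (the Poisson pair is our illustrative choice): multiply the two MGFs and match the product against a known fingerprint.

```python
# A sketch, assuming sympy. The product of two Poisson MGFs equals the
# Pois(lambda_1 + lambda_2) MGF, so by uniqueness the sum is Poisson.
import sympy as sp

t, l1, l2 = sp.symbols('t lambda_1 lambda_2', positive=True)
M_X = sp.exp(l1 * (sp.exp(t) - 1))              # Pois(lambda_1)
M_Y = sp.exp(l2 * (sp.exp(t) - 1))              # Pois(lambda_2)
M_target = sp.exp((l1 + l2) * (sp.exp(t) - 1))  # Pois(lambda_1 + lambda_2)

print(sp.simplify(M_X * M_Y - M_target))        # 0: the MGFs agree
```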
Central Limit Theorem Connection
Standardized sums converge: the MGF approach provides one of the cleanest proofs of the CLT by showing MGFs converge to the normal MGF
Practical application: when summing many independent variables, the product of MGFs often simplifies to a recognizable form
Exam relevance: problems asking "what is the distribution of $X_1 + X_2 + \cdots + X_n$?" are prime MGF territory
Compare: Sum of independent normals vs. sum of independent Poissons. Both yield closed-form MGFs after multiplication. For normals, $N(\mu_1, \sigma_1^2) + N(\mu_2, \sigma_2^2) = N(\mu_1 + \mu_2, \sigma_1^2 + \sigma_2^2)$; for Poissons, $\mathrm{Pois}(\lambda_1) + \mathrm{Pois}(\lambda_2) = \mathrm{Pois}(\lambda_1 + \lambda_2)$. Know these results cold.
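To watch the CLT convergence numerically, here is a sketch assuming numpy, with iid Exp(1) summands as our illustrative choice. For $S_n = X_1 + \cdots + X_n$, the standardized sum $Z_n = (S_n - n)/\sqrt{n}$ has exact MGF $e^{-t\sqrt{n}}(1 - t/\sqrt{n})^{-n}$ for $t < \sqrt{n}$, which should approach the normal fingerprint $e^{t^2/2}$.

```python
# A sketch, assuming numpy. The exact MGF of the standardized sum of n iid
# Exp(1) variables approaches the standard normal MGF exp(t^2/2) as n grows.
import numpy as np

t = 0.8
for n in [5, 50, 500, 5000]:
    m = np.exp(-t * np.sqrt(n)) * (1 - t / np.sqrt(n)) ** (-n)
    print(n, m)                      # decreasing toward the normal limit
print("limit:", np.exp(t**2 / 2))    # about 1.377
```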
Limitations and Alternatives
MGFs don't always exist, so you need to know when to switch tools. Characteristic functions offer a universally applicable alternative at the cost of working with complex numbers.
When MGFs Fail
Heavy-tailed distributions: the Cauchy distribution has no MGF because $E[e^{tX}]$ diverges for every $t \neq 0$
Existence condition: $E[e^{tX}] < \infty$ must hold for some interval around zero; this fails when the tails decay more slowly than exponentially
Diagnostic value: if you compute an MGF and get divergence, this tells you something important about the distribution's tail behavior (see the sketch after this list)
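A numeric illustration of the failure, assuming numpy (the choice $t = 0.1$ is arbitrary): Monte Carlo estimates of $E[e^{tX}]$ for Cauchy samples never settle down.

```python
# A sketch, assuming numpy. For Cauchy samples, the Monte Carlo estimate of
# E[e^{tX}] never stabilizes: e^{tx} grows faster than the density
# 1/(pi*(1 + x^2)) decays, so the expectation is infinite.
import numpy as np

rng = np.random.default_rng(0)
t = 0.1
for n in [10**2, 10**3, 10**4]:
    x = rng.standard_cauchy(size=n)
    print(n, np.exp(t * x).mean())
# Estimates jump by orders of magnitude (and can overflow to inf) whenever
# one sample lands far enough out in the right tail.
```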
Characteristic Functions as the Universal Alternative
Definition: $\varphi_X(t) = E[e^{itX}]$ where $i = \sqrt{-1}$; it always exists because $|e^{itX}| = 1$
MGF as special case: when the MGF exists, $M_X(t) = \varphi_X(-it)$, connecting the two approaches
Same uniqueness property: characteristic functions also uniquely determine distributions, making them the go-to tool when MGFs don't exist
Compare: MGF vs. Characteristic Function. MGFs are easier to work with (real numbers, straightforward differentiation) but don't always exist. Characteristic functions always exist but require complex numbers. On exams, use MGFs when possible; mention characteristic functions when asked about heavy-tailed cases.
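To make the "always exists" claim concrete, here is a sketch assuming numpy and scipy: numerically integrating $E[e^{itX}]$ for the standard Cauchy recovers its known characteristic function $\varphi(t) = e^{-|t|}$, even though its MGF does not exist.

```python
# A sketch, assuming numpy and scipy. The Cauchy density is symmetric, so
# the sine part of E[e^{itX}] vanishes and
#   phi(t) = 2 * integral_0^infinity cos(t x) / (pi * (1 + x^2)) dx.
import numpy as np
from scipy.integrate import quad

def cauchy_cf(t):
    density = lambda x: 1 / (np.pi * (1 + x**2))
    val, _ = quad(density, 0, np.inf, weight='cos', wvar=t)
    return 2 * val

for t in [0.5, 1.0, 2.0]:
    print(t, cauchy_cf(t), np.exp(-abs(t)))  # numeric vs. closed form
```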
Quick Reference Table
| Concept | Best Examples |
| --- | --- |
| MGF definition and existence | $M_X(t) = E[e^{tX}]$, existence near $t = 0$ |
| Moment extraction | $E[X^n] = M_X^{(n)}(0)$, mean from first derivative |
| Uniqueness theorem | Same MGF $\Rightarrow$ same distribution |
| Common MGFs | Normal, Exponential, Poisson |
| Independence and sums | $M_{X+Y}(t) = M_X(t) \cdot M_Y(t)$ |
| CLT connection | Product of MGFs converges to normal MGF |
| Existence failures | Cauchy, other heavy-tailed distributions |
| Characteristic functions | $\varphi_X(t) = E[e^{itX}]$, always exists |
Self-Check Questions
If $M_X(t) = e^{3t + 2t^2}$, what distribution does $X$ follow, and what are its mean and variance?
You're given two independent random variables with MGFs $M_X(t) = \frac{2}{2 - t}$ and $M_Y(t) = \frac{2}{2 - t}$. What is the MGF of $X + Y$, and can you identify the resulting distribution?
Compare and contrast the conditions under which MGFs and characteristic functions exist. Why might you prefer one over the other in a given problem?
Explain why the MGF of the Cauchy distribution does not exist, and describe what alternative approach you would use to analyze sums of Cauchy random variables.
An FRQ asks you to prove that the sum of $n$ independent $\mathrm{Pois}(\lambda)$ random variables is $\mathrm{Pois}(n\lambda)$. Outline your MGF-based proof in three steps.