📈 Theoretical Statistics Unit 3 – Expectation and moments
Expectation and moments are fundamental concepts in probability theory and statistics. They provide powerful tools for analyzing random variables and their distributions, allowing us to quantify average values, spread, and other important characteristics.
From basic definitions to advanced applications, this topic covers a wide range of ideas. We'll explore probability foundations, random variables, moment generating functions, and their roles in statistical inference, giving you a solid understanding of these essential concepts.
Key Concepts and Definitions
Expectation represents the average value of a random variable over its entire range of possible outcomes
Moments measure different aspects of a probability distribution, such as central tendency, dispersion, and shape
First moment is the mean or expected value, denoted as E[X] for a random variable X
Second moment is the expected value of the squared random variable, E[X^2], related to the variance
Variance measures the spread of a distribution around its mean, defined as Var(X) = E[(X − E[X])^2]
Higher moments (third, fourth, etc.) capture additional characteristics of a distribution, such as skewness and kurtosis
Moment generating functions (MGFs) are a tool for generating moments of a random variable through differentiation
MGFs uniquely characterize a probability distribution and can be used to derive its properties
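The first two moments and the standardized third and fourth moments can be checked numerically. The minimal sketch below, assuming NumPy and SciPy are available and using an arbitrary exponential sample purely for illustration, estimates the mean, variance, skewness, and excess kurtosis from data:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    x = rng.exponential(scale=2.0, size=100_000)   # illustrative sample: Exponential with mean 2

    mean = x.mean()                   # first moment, E[X]                 (theory: 2)
    variance = x.var()                # second central moment, Var(X)      (theory: 4)
    skewness = stats.skew(x)          # third standardized moment          (theory: 2)
    excess_kurt = stats.kurtosis(x)   # fourth standardized moment minus 3 (theory: 6)
    print(mean, variance, skewness, excess_kurt)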
Probability Foundations
Probability is a measure of the likelihood of an event occurring, ranging from 0 (impossible) to 1 (certain)
Sample space Ω is the set of all possible outcomes of a random experiment
Events are subsets of the sample space, and the probability of an event A is denoted as P(A)
Probability axioms: non-negativity (P(A) ≥ 0), normalization (P(Ω) = 1), and countable additivity (P(⋃_{i=1}^∞ A_i) = ∑_{i=1}^∞ P(A_i) for disjoint events A_i)
Conditional probability P(A∣B) is the probability of event A given that event B has occurred, defined as P(A∣B) = P(A∩B)/P(B) when P(B) > 0 (see the simulation sketch after this list)
Independence of events: Two events A and B are independent if P(A∩B)=P(A)P(B), meaning the occurrence of one does not affect the probability of the other
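As a quick illustration of conditional probability and independence, the following sketch (a NumPy simulation; the two-dice events are an illustrative assumption, not part of the notes above) estimates P(A∣B) by restricting attention to the trials where B occurred:

    import numpy as np

    rng = np.random.default_rng(1)
    n = 1_000_000
    die1 = rng.integers(1, 7, size=n)    # fair six-sided die
    die2 = rng.integers(1, 7, size=n)

    A = (die1 + die2 == 7)               # event A: the dice sum to 7
    B = (die1 == 3)                      # event B: the first die shows 3

    p_cond = (A & B).mean() / B.mean()   # estimate of P(A | B) = P(A ∩ B) / P(B)
    print(p_cond, A.mean())              # both should be close to 1/6

Since the estimated P(A∣B) matches P(A), the two events behave as independent, consistent with the product rule P(A∩B) = P(A)P(B).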
Random Variables and Distributions
A random variable is a function that assigns a numerical value to each outcome in a sample space
Discrete random variables take values in a countable set (such as the integers), while continuous random variables take values in an uncountable set (such as an interval of the real numbers)
Probability mass function (PMF) for a discrete random variable X is denoted as p_X(x) = P(X = x), giving the probability of X taking a specific value x
Probability density function (PDF) for a continuous random variable X is denoted as f_X(x), satisfying P(a ≤ X ≤ b) = ∫_a^b f_X(x) dx
Cumulative distribution function (CDF) F_X(x) = P(X ≤ x) gives the probability of a random variable being less than or equal to a given value x
For discrete random variables, F_X(x) = ∑_{y ≤ x} p_X(y)
For continuous random variables, F_X(x) = ∫_{−∞}^x f_X(y) dy
Common discrete distributions include Bernoulli, Binomial, Poisson, and Geometric
Common continuous distributions include Uniform, Normal (Gaussian), Exponential, and Beta
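The PMF, PDF, and CDF of the common distributions listed above are available in scipy.stats; the short sketch below (assuming SciPy is installed, with arbitrary parameter choices for illustration) evaluates a Binomial PMF/CDF and a standard Normal PDF/CDF:

    import numpy as np
    from scipy import stats

    # Discrete example: Binomial(n=10, p=0.3)
    k = np.arange(0, 11)
    pmf = stats.binom.pmf(k, n=10, p=0.3)    # p_X(k) = P(X = k)
    cdf = stats.binom.cdf(k, n=10, p=0.3)    # F_X(k) = P(X <= k)
    print(pmf.sum(), cdf[-1])                # both equal 1

    # Continuous example: standard Normal
    print(stats.norm.pdf(0.0))                           # density at 0, about 0.3989
    print(stats.norm.cdf(1.96) - stats.norm.cdf(-1.96))  # P(-1.96 <= X <= 1.96), about 0.95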
Expectation: Basics and Properties
Expectation is a linear operator, meaning E[aX+bY]=aE[X]+bE[Y] for constants a and b and random variables X and Y
For a discrete random variable X with PMF p_X(x), the expectation is calculated as E[X] = ∑_x x·p_X(x)
For a continuous random variable X with PDF f_X(x), the expectation is calculated as E[X] = ∫_{−∞}^∞ x·f_X(x) dx
Law of the unconscious statistician (LOTUS): For a function g(X) of a random variable X, E[g(X)] = ∑_x g(x)·p_X(x) (discrete case) or E[g(X)] = ∫_{−∞}^∞ g(x)·f_X(x) dx (continuous case); a numerical check appears after this list
Expectation of a constant: E[c]=c for any constant c
Expectation of a sum: E[X+Y]=E[X]+E[Y] for random variables X and Y
Expectation of a product: E[XY]=E[X]⋅E[Y] for independent random variables X and Y
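A small discrete example makes these formulas concrete. The sketch below (plain NumPy; the four-point distribution is made up for illustration) computes E[X] from the PMF, applies LOTUS with g(x) = x^2, and checks that linearity gives E[3X + 5] = 3E[X] + 5:

    import numpy as np

    # An illustrative four-point discrete distribution (values and probabilities)
    values = np.array([0, 1, 2, 3])
    probs = np.array([0.1, 0.2, 0.3, 0.4])

    e_x = np.sum(values * probs)              # E[X] = sum_x x * p_X(x)        -> 2.0
    e_x2 = np.sum(values**2 * probs)          # LOTUS with g(x) = x^2          -> 5.0
    var_x = e_x2 - e_x**2                     # Var(X) = E[X^2] - (E[X])^2     -> 1.0

    print(np.sum((3 * values + 5) * probs))   # E[3X + 5] computed directly    -> 11.0
    print(3 * e_x + 5)                        # linearity gives the same value -> 11.0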
Moments and Their Significance
Raw moments: The k-th raw moment of a random variable X is defined as E[X^k]
First raw moment is the mean, E[X]
Second raw moment is E[X^2], used to calculate variance
Central moments: The k-th central moment of a random variable X is defined as E[(X − E[X])^k]
First central moment is always 0
Second central moment is the variance, Var(X) = E[(X − E[X])^2]
Standardized moments: The k-th standardized moment of a random variable X is defined as E[((X − E[X])/√Var(X))^k]
Third standardized moment measures skewness, the asymmetry of a distribution
Fourth standardized moment measures kurtosis, the heaviness of the tails of a distribution
Moments can be used to characterize and compare different probability distributions
Higher moments provide additional information about the shape and properties of a distribution
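Raw, central, and standardized moments can all be estimated from a sample with the same pattern. The sketch below (NumPy only; the Exponential(1) sample is an arbitrary choice, whose theoretical skewness is 2 and kurtosis is 9) implements the three definitions directly:

    import numpy as np

    rng = np.random.default_rng(2)
    x = rng.exponential(scale=1.0, size=200_000)   # Exponential(1) sample, for illustration

    def raw_moment(x, k):
        return np.mean(x**k)                        # estimate of E[X^k]

    def central_moment(x, k):
        return np.mean((x - x.mean())**k)           # estimate of E[(X - E[X])^k]

    def standardized_moment(x, k):
        return central_moment(x, k) / x.std()**k    # divide by sigma^k

    print(raw_moment(x, 1), raw_moment(x, 2))       # about 1 and 2
    print(central_moment(x, 2))                     # variance, about 1
    print(standardized_moment(x, 3))                # skewness, about 2
    print(standardized_moment(x, 4))                # kurtosis, about 9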
Moment Generating Functions
The moment generating function (MGF) of a random variable X is defined as M_X(t) = E[e^{tX}], where t is a real number and the expectation is finite for t in an open interval around 0
MGFs uniquely determine a probability distribution, meaning two random variables whose MGFs exist and agree on an open interval around 0 have the same distribution
The k-th moment of X can be found by differentiating the MGF k times and evaluating at t = 0: E[X^k] = M_X^{(k)}(0) (illustrated in the sketch at the end of this section)
MGFs can be used to derive the mean, variance, and other properties of a distribution
For independent random variables X and Y, the MGF of their sum is the product of their individual MGFs: M_{X+Y}(t) = M_X(t)·M_Y(t)
MGFs can be used to prove various results in probability theory, such as the Central Limit Theorem
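To see moment generation by differentiation in action, the following SymPy sketch starts from the known closed-form MGF of an Exponential(λ) distribution, M_X(t) = λ/(λ − t) for t < λ, and recovers the mean and variance; SymPy and the exponential example are assumptions chosen for illustration, not part of the notes above:

    import sympy as sp

    t = sp.symbols('t')
    lam = sp.symbols('lambda', positive=True)

    # Known closed form: MGF of Exponential(lambda), valid for t < lambda
    M = lam / (lam - t)

    m1 = sp.diff(M, t, 1).subs(t, 0)       # first raw moment  -> 1/lambda
    m2 = sp.diff(M, t, 2).subs(t, 0)       # second raw moment -> 2/lambda**2
    variance = sp.simplify(m2 - m1**2)     # Var(X)            -> 1/lambda**2
    print(m1, m2, variance)

The same pattern works for any distribution whose MGF has a closed form: differentiate symbolically, then substitute t = 0.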
Applications in Statistical Inference
Moments and MGFs play a crucial role in parameter estimation and hypothesis testing
Method of moments estimators are obtained by equating sample moments to population moments and solving for the parameters (a short sketch after this list works this out for a Gamma distribution)
For example, the sample mean X̄ is an estimator for the population mean μ
Maximum likelihood estimation (MLE) is another common approach, which finds the parameter values that maximize the likelihood function
MGFs can be used to derive the sampling distributions of estimators and test statistics
Moments and MGFs are also used in Bayesian inference to specify prior and posterior distributions for parameters
Higher moments, such as skewness and kurtosis, can be used to assess the normality assumption in various statistical tests
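A concrete method-of-moments example: for a Gamma distribution with shape α and scale θ, E[X] = αθ and Var(X) = αθ², so matching the sample mean and variance gives θ̂ = s²/X̄ and α̂ = X̄/θ̂. The sketch below (NumPy, with arbitrary true parameters chosen for illustration) applies these formulas to simulated data:

    import numpy as np

    rng = np.random.default_rng(3)
    x = rng.gamma(shape=3.0, scale=2.0, size=100_000)   # simulated data, true shape 3 and scale 2

    # Match sample mean and variance to E[X] = shape*scale and Var(X) = shape*scale^2
    xbar = x.mean()
    s2 = x.var()
    scale_hat = s2 / xbar          # should be close to 2
    shape_hat = xbar / scale_hat   # should be close to 3
    print(shape_hat, scale_hat)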
Advanced Topics and Extensions
Multivariate moments and MGFs extend the concepts to random vectors and joint distributions
Conditional expectation E[X∣Y] is the expected value of X given the value of another random variable Y
Moment inequalities, such as Markov's inequality (P(X ≥ a) ≤ E[X]/a for nonnegative X and a > 0) and Chebyshev's inequality (P(|X − E[X]| ≥ kσ) ≤ 1/k^2, where σ is the standard deviation), provide bounds on the probability of a random variable deviating from its mean; the sketch at the end of this section checks the Chebyshev bound by simulation
Characteristic functions, defined as φ_X(t) = E[e^{itX}] for real t, are another tool for uniquely characterizing distributions, and unlike MGFs they exist for every distribution
Cumulants are an alternative to moments, with the k-th cumulant defined as the k-th derivative of the logarithm of the MGF evaluated at 0
Empirical moments and MGFs can be used to estimate population moments and MGFs from sample data
Robust moments, such as trimmed means and winsorized means, are less sensitive to outliers and heavy-tailed distributions
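As a final check of the moment inequalities mentioned above, the sketch below (NumPy, with an arbitrary Exponential(1) sample chosen for illustration) compares the empirical tail probability P(|X − μ| ≥ kσ) against the Chebyshev bound 1/k²; the bound holds but is typically loose:

    import numpy as np

    rng = np.random.default_rng(4)
    x = rng.exponential(scale=1.0, size=1_000_000)   # mean 1, standard deviation 1

    mu, sigma = x.mean(), x.std()
    for k in (2, 3, 4):
        tail = np.mean(np.abs(x - mu) >= k * sigma)  # empirical P(|X - mu| >= k*sigma)
        print(k, tail, 1 / k**2)                     # Chebyshev bound 1/k^2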