Fiveable

🎲Intro to Probability Unit 14 Review

14.2 Central limit theorem

Written by the Fiveable Content Team • Last updated August 2025

The Central Limit Theorem is a cornerstone of probability and statistics. It tells us that when we repeatedly draw large samples from almost any distribution (one with a finite mean and variance), the sample averages approximately follow a normal distribution. This property lets us make predictions and draw conclusions about populations.

Understanding the CLT is crucial for grasping how sample means behave. It's the foundation for many statistical techniques, from confidence intervals to hypothesis testing. Knowing when and how to apply it can make complex data analysis feel like a breeze.

The Central Limit Theorem

Fundamental Principles and Importance

  • Central limit theorem (CLT) describes behavior of sample means for large sample sizes
  • Distribution of sample means approximates normal distribution as sample size increases
    • Occurs regardless of underlying population distribution
  • Applies to sum of random variables and their average
  • Bridges properties of individual random variables with behavior of aggregates
  • Enables statistical inference and hypothesis testing
  • Speed of convergence to normality varies with the population distribution
    • Faster for bell-shaped populations
    • Slower for highly skewed distributions (requires larger sample sizes)
  • Crucial for constructing confidence intervals and performing statistical tests
    • Used in various real-world applications (finance, quality control, social sciences)
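
The principles above can be checked with a small simulation. The sketch below is only an illustration: the Exponential(1) population, the sample size of 50, and the 5000 repetitions are assumptions chosen for the example, not values from this guide. It draws many samples from a skewed population and looks at how the sample means behave.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

def sample_mean(n):
    """Mean of n draws from an Exponential(rate=1) population (mean 1, sd 1)."""
    return statistics.fmean(random.expovariate(1.0) for _ in range(n))

n = 50        # observations per sample (illustrative choice)
reps = 5000   # number of repeated samples (illustrative choice)
means = [sample_mean(n) for _ in range(reps)]

center = statistics.fmean(means)   # should be close to the population mean, 1
spread = statistics.stdev(means)   # should be close to 1 / sqrt(50), about 0.141

# Share of sample means within one standard error of the population mean.
# For a normal distribution this share is about 68%.
se = 1 / n ** 0.5
within_one_se = sum(abs(m - 1.0) <= se for m in means) / reps

print(center, spread, within_one_se)
```

Even though the exponential population is strongly right-skewed, the sample means cluster around 1 with a spread near σ/√n, and roughly 68% of them fall within one standard error, as a normal distribution would predict.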

Mathematical Representation and Properties

  • CLT mathematically expressed as $(\bar{X}_n - \mu) / (\sigma / \sqrt{n}) \to N(0,1)$ as $n \to \infty$
    • $\bar{X}_n$ represents sample mean of n observations
    • $\mu$ represents population mean
    • $\sigma$ represents population standard deviation
  • Standardized sample mean converges to standard normal distribution
  • Holds when mean and variance of original population exist and are finite
  • Approximation often considered sufficient when sample size $n \geq 30$
    • Can vary based on underlying distribution characteristics
  • Rate of convergence to normality depends on original distribution
    • Distributions closer to normal converge faster
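
It may help to see where the scaling by σ/√n comes from. The short derivation below is a sketch that assumes independent observations $X_1, \dots, X_n$, each with mean $\mu$ and variance $\sigma^2$. It shows why the standardized sample mean has mean 0 and variance 1 for every n; the CLT adds that its shape approaches the standard normal.

```latex
\begin{aligned}
E[\bar{X}_n] &= \frac{1}{n}\sum_{i=1}^{n} E[X_i] = \mu, \\
\operatorname{Var}(\bar{X}_n) &= \frac{1}{n^2}\sum_{i=1}^{n} \operatorname{Var}(X_i) = \frac{\sigma^2}{n}
\quad \text{(uses independence)}, \\
Z_n &= \frac{\bar{X}_n - \mu}{\sigma/\sqrt{n}}
\quad \Rightarrow \quad E[Z_n] = 0,\ \operatorname{Var}(Z_n) = 1 .
\end{aligned}
```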

Central Limit Theorem for IID Variables

IID Assumption and Its Implications

  • Applies to sequence of independent and identically distributed (i.i.d.) random variables
  • Independence requirement means value of one variable does not influence others
  • Identical distribution implies shared probability distribution and parameters
  • Violations of i.i.d. assumption can affect theorem applicability
    • Examples: time series data, clustered observations
  • Understanding i.i.d. assumption crucial for proper application of CLT
    • Helps identify situations where modifications or alternative approaches needed
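
To see why the independence assumption matters, the following sketch compares sample means from i.i.d. data with sample means from a simple time series. The AR(1) model, the coefficient 0.8, and the sample sizes are assumptions made for illustration only. Both data sources have the same variance for each single observation, so any difference comes from dependence alone.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

def iid_sample(n):
    """n independent standard normal draws."""
    return [random.gauss(0.0, 1.0) for _ in range(n)]

def ar1_sample(n, phi=0.8):
    """n draws from a stationary AR(1) series x[t] = phi * x[t-1] + noise,
    scaled so every x[t] has variance 1 (same marginal as the i.i.d. case)."""
    noise_sd = (1 - phi ** 2) ** 0.5
    x = random.gauss(0.0, 1.0)   # start in the stationary distribution
    out = [x]
    for _ in range(n - 1):
        x = phi * x + random.gauss(0.0, noise_sd)
        out.append(x)
    return out

n, reps = 100, 3000
naive_se = 1 / n ** 0.5   # sigma / sqrt(n), valid under the i.i.d. assumption

sd_iid = statistics.stdev([statistics.fmean(iid_sample(n)) for _ in range(reps)])
sd_ar1 = statistics.stdev([statistics.fmean(ar1_sample(n)) for _ in range(reps)])

print("naive standard error:", naive_se)
print("sd of means, i.i.d. data:", sd_iid)
print("sd of means, AR(1) data:", sd_ar1)
```

With i.i.d. data the spread of the sample means matches σ/√n, while the positively correlated series gives a much larger spread. Applying the usual formula to such data would understate the uncertainty in the sample mean.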

Convergence and Sample Size Considerations

  • CLT holds regardless of original population distribution shape
  • Requires finite mean $\mu$ and variance $\sigma^2$
  • Practical applications often use sample size $n \geq 30$ as rule of thumb
    • Not a strict threshold, varies based on underlying distribution
  • Larger sample sizes needed for highly skewed or heavy-tailed distributions
    • Examples: exponential distribution, Pareto distribution (when its variance is finite)
  • Rate of convergence influenced by original distribution characteristics
    • Distributions closer to normal converge faster (normal, uniform)
    • Highly skewed distributions converge slower (chi-squared with low degrees of freedom)
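
One way to watch convergence happen is to track the skewness of the sample means as n grows. The sketch below assumes an Exponential(1) population and uses a simple moment-based skewness estimate; the helper names and the sample sizes 5, 30, and 200 are illustrative choices. For this population the skewness of the sample mean is known to be 2/√n, which the simulation should roughly reproduce.

```python
import random
import statistics

random.seed(2)  # fixed seed so the sketch is reproducible

def skewness(values):
    """Sample skewness: the average cubed z-score."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)

def means_of_exponential(n, reps=4000):
    """reps sample means, each computed from n Exponential(1) draws."""
    return [statistics.fmean(random.expovariate(1.0) for _ in range(n))
            for _ in range(reps)]

skew_by_n = {n: skewness(means_of_exponential(n)) for n in (5, 30, 200)}

for n, g in skew_by_n.items():
    print(n, "simulated:", round(g, 2), "theory:", round(2 / n ** 0.5, 2))
```

The skewness shrinks toward 0 (the value for a normal distribution) as n increases, but it is still visible at n = 30, which is why the rule of thumb should be treated with caution for skewed populations.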

Applying the Central Limit Theorem

Approximating Sampling Distributions

  • CLT allows approximation of sampling distribution of mean using normal distribution
  • For large sample sizes, sample mean $\bar{X}$ approximately normally distributed
    • Mean: $\mu$ (population mean)
    • Standard deviation: $\sigma / \sqrt{n}$ (standard error of the mean)
  • Enables probability calculations related to sample means
    • Uses standard normal distribution tables or z-score calculations
  • Important to distinguish between standard error of mean ($\sigma / \sqrt{n}$) and population standard deviation ($\sigma$)
  • Applicable even when population distribution non-normal
    • Examples: binomial distribution for large n, Poisson distribution for large λ
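
A worked probability calculation shows how the approximation is used in practice. The numbers below (population mean 50, standard deviation 12, sample size 36, threshold 53) are hypothetical values chosen for the example. The sketch uses Python's standard-library `statistics.NormalDist` in place of a z-table.

```python
from statistics import NormalDist

mu, sigma, n = 50.0, 12.0, 36   # hypothetical population values and sample size
se = sigma / n ** 0.5           # standard error of the mean: 12 / 6 = 2

# CLT approximation to the sampling distribution of the sample mean
sampling_dist = NormalDist(mu, se)

z = (53.0 - mu) / se                    # z-score of a sample mean of 53
p_above = 1 - sampling_dist.cdf(53.0)   # P(sample mean > 53)

print(round(z, 2), round(p_above, 4))   # 1.5 0.0668
```

Note that the z-score divides by the standard error (2), not by the population standard deviation (12). Using σ here would answer a different question: the chance that a single observation exceeds 53.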

Statistical Inference and Hypothesis Testing

  • CLT used to construct confidence intervals for population means
    • Formula: $\bar{X} \pm z_{\alpha/2} \cdot (\sigma / \sqrt{n})$, where $z_{\alpha/2}$ is the critical value
  • Enables hypothesis tests about population parameters
    • Examples: t-tests, z-tests for means
  • When population standard deviation unknown, sample standard deviation used as estimate
    • Particularly effective for large sample sizes
  • Facilitates comparison of sample means from different populations
    • Used in ANOVA, regression analysis
  • Allows for approximation of other sampling distributions
    • Examples: sampling distribution of proportions, differences between means
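
The confidence interval formula above can be turned into a few lines of code. This is a minimal sketch that assumes σ is known; the function name `z_confidence_interval` and the sample values (mean 72.4, σ = 10, n = 100) are made up for illustration.

```python
from statistics import NormalDist

def z_confidence_interval(xbar, sigma, n, confidence=0.95):
    """CLT-based interval for a population mean with known sigma."""
    alpha = 1 - confidence
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # critical value z_(alpha/2)
    margin = z_crit * sigma / n ** 0.5             # z times the standard error
    return xbar - margin, xbar + margin

low, high = z_confidence_interval(xbar=72.4, sigma=10.0, n=100)
print(round(low, 2), round(high, 2))   # 70.44 74.36
```

When σ is unknown and replaced by the sample standard deviation, a t critical value is the usual choice for small samples. For large n the t and z critical values are nearly the same.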

Conditions for Central Limit Theorem

Sample Size and Distribution Characteristics

  • Primary condition is a sufficiently large sample size, typically $n \geq 30$
    • Not a strict cutoff, depends on underlying distribution
  • Larger sample sizes required for highly skewed or heavy-tailed distributions
    • Examples: lognormal distribution, Pareto distribution with finite variance
  • Population must have finite mean and variance for CLT to apply
    • Excludes certain distributions (Cauchy distribution)
  • CLT approximation accuracy improves with increasing sample size
    • Particularly important for distributions far from normal
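
The finite-variance condition is easy to overlook, so here is a sketch of what goes wrong without it. It compares sample means from a standard Cauchy population (no finite mean or variance) with sample means from a standard normal population. The helper names, the sample sizes 10 and 500, and the use of the interquartile range as a measure of spread are choices made for this example.

```python
import math
import random
import statistics

random.seed(3)  # fixed seed so the sketch is reproducible

def cauchy():
    """One standard Cauchy draw via the inverse-CDF method."""
    return math.tan(math.pi * (random.random() - 0.5))

def normal():
    """One standard normal draw."""
    return random.gauss(0.0, 1.0)

def iqr_of_means(draw, n, reps=1000):
    """Interquartile range of reps sample means, each from n draws."""
    means = [statistics.fmean(draw() for _ in range(n)) for _ in range(reps)]
    q1, _, q3 = statistics.quantiles(means, n=4)
    return q3 - q1

iqr_cauchy = {n: iqr_of_means(cauchy, n) for n in (10, 500)}
iqr_normal = {n: iqr_of_means(normal, n) for n in (10, 500)}

print("Cauchy:", iqr_cauchy)
print("Normal:", iqr_normal)
```

For the normal population the spread of the sample means shrinks as n grows, as σ/√n predicts. For the Cauchy population it stays about the same no matter how large n is, so averaging more observations does not help and the CLT does not apply.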

Independence and Sampling Considerations

  • Random variables must be independent
    • Value of one variable should not influence others in sample
  • Random variables should be identically distributed
    • Share same probability distribution and parameters
  • CLT may require modification for dependent random variables
    • Examples: time series data, spatial data
  • May not hold, or may need adjustment, when sampling without replacement from a finite population
    • Particularly important when sample size is large relative to population size
  • Understanding these conditions crucial for determining CLT applicability
    • Helps recognize potential limitations in statistical analyses
    • Guides choice of alternative methods when conditions not met (bootstrapping, permutation tests)
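
For sampling without replacement from a finite population, one common adjustment is the finite population correction, which multiplies the standard error by √((N − n)/(N − 1)). The sketch below shows its effect; the function name `standard_error` and the values σ = 15, n = 100, and the population sizes are hypothetical.

```python
def standard_error(sigma, n, population_size=None):
    """Standard error of the sample mean.

    If population_size (N) is given, apply the finite population
    correction sqrt((N - n) / (N - 1)) for sampling without replacement.
    """
    se = sigma / n ** 0.5
    if population_size is not None:
        se *= ((population_size - n) / (population_size - 1)) ** 0.5
    return se

sigma = 15.0   # hypothetical population standard deviation

print(standard_error(sigma, n=100))                             # 1.5
print(standard_error(sigma, n=100, population_size=500))        # sample is 20% of population
print(standard_error(sigma, n=100, population_size=1_000_000))  # close to 1.5
```

The correction matters when the sample is a sizable fraction of the population (here 100 out of 500) and becomes negligible when the population is much larger than the sample.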