The Central Limit Theorem for Sample Means (Averages)
The Central Limit Theorem (CLT) answers a critical question in statistics: how do sample averages behave when you draw repeated samples from a population? It turns out that no matter what the original population looks like, the distribution of sample means will approximate a normal distribution as the sample size grows.
This matters because it's the foundation of statistical inference. Even if your population data is skewed, bimodal, or oddly shaped, the CLT lets you use normal distribution tools to make conclusions about the population mean based on sample data.
The Central Limit Theorem for Sample Means
The CLT says that if you take many random samples of the same size from a population and calculate the mean of each sample, the distribution of those sample means will be approximately normal, regardless of the population's original shape.
Three conditions need to hold:
- The samples must be independent (one observation doesn't influence another).
- The sample size must be sufficiently large. The common rule of thumb is $n \geq 30$, though if the population is already roughly normal, smaller samples work fine.
- If sampling without replacement, the sample should be less than 10% of the population.
As sample size increases, three things happen to the sampling distribution of sample means:
- It becomes more symmetric and bell-shaped.
- Its mean equals the population mean: $\mu_{\bar{x}} = \mu$.
- Its spread shrinks. The standard deviation of the sampling distribution (called the standard error) equals $\sigma/\sqrt{n}$, so it decreases by a factor of $\sqrt{n}$ relative to the population standard deviation.
This is what makes the CLT so powerful: you don't need to know the shape of the population distribution to draw conclusions about the population mean.
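A quick simulation makes this concrete. The sketch below (plain Python; the exponential population is an arbitrary choice of a skewed distribution) draws many samples, records each sample mean, and checks that the means cluster near the population mean with a spread close to $\sigma/\sqrt{n}$:

```python
import random
import statistics

random.seed(42)

# Population: a heavily right-skewed exponential distribution
# with rate 0.1, so its mean and standard deviation are both 10.
def sample_mean(n):
    """Draw one sample of size n and return its mean."""
    return statistics.mean(random.expovariate(0.1) for _ in range(n))

# Collect 2000 sample means, each from a sample of size n = 50.
means = [sample_mean(50) for _ in range(2000)]

print(round(statistics.mean(means), 2))   # close to the population mean, 10
print(round(statistics.stdev(means), 2))  # close to sigma/sqrt(n) = 10/sqrt(50) ≈ 1.41
```

Even though the population is strongly skewed, a histogram of `means` would already look bell-shaped at $n = 50$.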
Standard Error Calculation
The standard error of the mean ($\sigma_{\bar{x}}$) measures how much sample means vary from sample to sample. It tells you how tightly your sample means cluster around the true population mean.
The formula is:
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$
where $\sigma$ is the population standard deviation and $n$ is the sample size.
Notice what happens as $n$ gets larger: you're dividing by a bigger number, so the standard error gets smaller. That means larger samples give you more precise estimates of the population mean.
Example: Suppose a population has $\sigma = 20$. With a sample of $n = 25$, the standard error is $20/\sqrt{25} = 4$. If you increase the sample to $n = 100$, the standard error drops to $20/\sqrt{100} = 2$. Quadrupling the sample size cut the standard error in half.
When the population standard deviation $\sigma$ is unknown (which is common), you can estimate the standard error using the sample standard deviation $s$:
$$\sigma_{\bar{x}} \approx \frac{s}{\sqrt{n}}$$
This estimate works well for large sample sizes.
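As a sketch, the standard error calculation is a one-liner in Python (the numbers below, $\sigma = 20$ with $n = 25$ and $n = 100$, are illustrative):

```python
import math

def standard_error(sigma, n):
    """Standard error of the mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

print(standard_error(20, 25))   # 4.0
print(standard_error(20, 100))  # 2.0 -- quadrupling n halves the standard error
```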

Z-Scores in Sampling Distributions
A z-score tells you how many standard errors a particular sample mean is from the population mean. It converts your sample mean into a standardized value you can look up on a normal distribution table.
The formula is:
$$z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}}$$
where $\bar{x}$ is the sample mean, $\mu$ is the population mean, and $\sigma_{\bar{x}} = \sigma/\sqrt{n}$ is the standard error.
- A positive z-score means the sample mean is above the population mean.
- A negative z-score means the sample mean is below the population mean.
- The magnitude tells you how far away it is in standard-error units.
Example: A population has $\mu = 500$ and $\sigma = 50$. You take a sample of $n = 100$ and get $\bar{x} = 510$.
- Calculate the standard error: $\sigma_{\bar{x}} = 50/\sqrt{100} = 5$.
- Calculate the z-score: $z = (510 - 500)/5 = 2$.
- Interpret: the sample mean of 510 is 2 standard errors above the population mean.
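The same arithmetic in Python, with hypothetical values ($\mu = 500$, $\sigma = 50$, $n = 100$) chosen so a sample mean of 510 lands exactly 2 standard errors above the population mean:

```python
import math

def z_score(sample_mean, mu, sigma, n):
    """How many standard errors the sample mean lies from mu."""
    se = sigma / math.sqrt(n)
    return (sample_mean - mu) / se

# Hypothetical values: mu = 500, sigma = 50, n = 100, sample mean = 510.
print(z_score(510, 500, 50, 100))  # 2.0
```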
Z-scores in sampling distributions are used to:
- Find the probability of getting a sample mean at least as extreme as the one observed (this connects to p-values later in the course).
- Build confidence intervals for the population mean. For a 95% confidence interval: $\bar{x} \pm 1.96\,\sigma_{\bar{x}}$.
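A minimal sketch of the 95% interval calculation, again with hypothetical values (sample mean 510, $\sigma = 50$, $n = 100$, so a standard error of 5):

```python
import math

def confidence_interval_95(sample_mean, sigma, n):
    """95% CI for the population mean: sample_mean ± 1.96 * sigma/sqrt(n)."""
    se = sigma / math.sqrt(n)
    margin = 1.96 * se
    return (sample_mean - margin, sample_mean + margin)

lo, hi = confidence_interval_95(510, 50, 100)
print(round(lo, 1), round(hi, 1))  # 500.2 519.8
```

The 1.96 multiplier is the z-value that captures the middle 95% of a standard normal distribution.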
Additional Concepts in Sampling Theory
- Law of Large Numbers: As sample size increases, the sample mean converges to the true population mean. This is related to but different from the CLT. The Law of Large Numbers is about a single sample mean getting more accurate; the CLT is about the distribution of many sample means becoming normal.
- Random Variable: A variable whose value is determined by the outcome of a random process. The sample mean is itself a random variable because it changes from sample to sample.
- Sampling Bias: A systematic error in how a sample is selected that leads to a non-representative sample. The CLT assumes your samples are randomly and independently drawn. If there's sampling bias, the theorem's guarantees don't apply.
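The Law of Large Numbers can be watched directly: a single running sample mean settles toward the population mean as more observations accumulate. A sketch using a uniform population on [0, 1], whose mean is 0.5:

```python
import random
import statistics

random.seed(0)

# 100,000 draws from a uniform [0, 1] population (population mean = 0.5).
draws = [random.random() for _ in range(100_000)]

# The running sample mean drifts toward 0.5 as n grows.
for n in (10, 1_000, 100_000):
    print(n, round(statistics.mean(draws[:n]), 3))
```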