Central Limit Theorem (Pocket Change)
The Central Limit Theorem (CLT) explains why we can use normal distribution techniques on data that isn't normally distributed. It states that the sampling distribution of sample means will be approximately normal, regardless of the shape of the population distribution, as long as the sample size is large enough (typically $n \geq 30$). The pocket change example makes this concrete: the amount of change people carry is almost certainly skewed, but if you repeatedly sample groups of people and plot the means of those samples, you get a bell curve anyway.

Simulating the Distribution of Sample Means
Pocket change data works well here because the population distribution is clearly non-normal. Most people carry small amounts of change (lots of values near zero), with a few people carrying much more, creating a right-skewed distribution.
To simulate the CLT with pocket change data:
- Define your population (e.g., the amount of change carried by all students at your school).
- Draw a large number of random samples (100 or more), each of the same fixed size $n$.
- Calculate the mean pocket change for each sample.
- Plot all those sample means in a histogram.
Even though the original pocket change distribution is skewed right, the histogram of sample means will look approximately normal and centered on the true population mean. That's the CLT at work.
Note: the guide's original text mentioned "the law of large numbers" here. That's a related but different concept. The law of large numbers says a single sample mean gets closer to the population mean as that sample gets larger. The CLT is specifically about the shape of the sampling distribution: the distribution of sample means becomes approximately normal as the sample size grows, and drawing many samples is simply how you get to see that shape.
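The simulation steps above can be sketched in Python. This is a minimal illustration, not part of the original guide: the population is a made-up, right-skewed set of pocket change amounts drawn from an exponential distribution with an assumed mean of about $0.75, and the sample size of 30 and the helper name `simulate_sample_means` are assumptions chosen for the example.

```python
import random
import statistics

def simulate_sample_means(population, sample_size, num_samples, seed=0):
    """Draw num_samples simple random samples and return the mean of each."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.sample(population, sample_size))
        for _ in range(num_samples)
    ]

# Step 1: a hypothetical right-skewed population (dollars of pocket change).
# The exponential shape and the $0.75 mean are assumptions for illustration.
pop_rng = random.Random(42)
population = [round(pop_rng.expovariate(1 / 0.75), 2) for _ in range(10_000)]

# Steps 2 and 3: 1000 samples of size 30, one mean per sample.
sample_means = simulate_sample_means(population, sample_size=30, num_samples=1000)

print(f"population mean:      {statistics.fmean(population):.3f}")
print(f"mean of sample means: {statistics.fmean(sample_means):.3f}")

# Step 4: a crude text histogram of the sample means (bin width $0.05).
bins = {}
for m in sample_means:
    key = round(m // 0.05 * 0.05, 2)
    bins[key] = bins.get(key, 0) + 1
for key in sorted(bins):
    print(f"{key:5.2f} | {'#' * (bins[key] // 5)}")
```

The printed histogram should look roughly bell-shaped and centered near the population mean, even though the population itself is strongly skewed.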

Sample Size Effects on the Sampling Distribution
Shape: As sample size increases, the sampling distribution of the mean becomes more normal. With small samples (small $n$), the distribution of sample means may still reflect the skewness of the population. By $n = 30$ or so, the distribution is usually close to normal. The more skewed the population, the larger $n$ needs to be.
Center: The mean of the sampling distribution equals the population mean $\mu$, regardless of sample size. Whether you're sampling 5 people or 50, the average of all your sample means will land on $\mu$.
Spread: The standard deviation of the sampling distribution is called the standard error, calculated as:
$$SE = \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$
As $n$ increases, the standard error decreases, so the sampling distribution gets narrower. This means larger samples produce sample means that cluster more tightly around the true population mean. In practical terms, a sample of 100 people's pocket change will give you a much more reliable estimate of the population mean than a sample of 10.
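To see the center and spread claims numerically, the sketch below compares the simulated standard deviation of sample means with $\sigma/\sqrt{n}$ for several sample sizes. The population (exponential, mean about $0.75), the sample sizes, and the helper name `sample_mean_summary` are all assumptions made for this illustration.

```python
import math
import random
import statistics

def sample_mean_summary(population, sample_size, num_samples=2000, seed=1):
    """Return (mean, standard deviation) of simulated sample means."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.sample(population, sample_size))
        for _ in range(num_samples)
    ]
    return statistics.fmean(means), statistics.stdev(means)

# Hypothetical right-skewed population; shape and mean are assumed values.
pop_rng = random.Random(7)
population = [pop_rng.expovariate(1 / 0.75) for _ in range(20_000)]
mu = statistics.fmean(population)       # population mean
sigma = statistics.pstdev(population)   # population standard deviation

results = {}
for n in (5, 10, 50, 100):
    center, spread = sample_mean_summary(population, n)
    theoretical_se = sigma / math.sqrt(n)
    results[n] = (center, spread, theoretical_se)
    print(f"n={n:3d}  mean of means={center:.3f}  "
          f"SD of means={spread:.3f}  sigma/sqrt(n)={theoretical_se:.3f}")
```

In each row the mean of the sample means should sit close to the population mean, while the simulated spread should track $\sigma/\sqrt{n}$ and shrink as $n$ grows.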

Probability Calculations with the Central Limit Theorem
Because the CLT tells you the sampling distribution is approximately normal, you can use z-scores to find probabilities about sample means. Here's the process:
- Identify the population parameters: the population mean $\mu$ and population standard deviation $\sigma$ of the pocket change data.
- Calculate the standard error: $SE = \sigma / \sqrt{n}$, where $n$ is your sample size.
- Compute the z-score for your sample mean:
$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$
where $\bar{x}$ is the sample mean you're interested in.
- Use the z-table (or calculator) to find the probability associated with that z-score.
For example, if the population mean pocket change is $\mu$ with standard deviation $\sigma$, and you take a sample of size $n$, the standard error is $\sigma / \sqrt{n}$. To find the probability that a sample mean exceeds \$1.00, you'd calculate $z = \frac{1.00 - \mu}{\sigma / \sqrt{n}}$, then look up $P(Z > z)$ to get the corresponding probability.
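The calculation above can be sketched in a few lines of Python. The numbers used here ($\mu = 0.85$, $\sigma = 0.60$, $n = 36$) are hypothetical values chosen so the arithmetic is easy to follow; they are not taken from the guide. The function names are likewise made up for this example.

```python
import math

def normal_cdf(z):
    """Standard normal CDF, P(Z <= z), computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_sample_mean_exceeds(threshold, mu, sigma, n):
    """Approximate P(x-bar > threshold) using the CLT normal approximation."""
    standard_error = sigma / math.sqrt(n)
    z = (threshold - mu) / standard_error
    return z, 1.0 - normal_cdf(z)

# Hypothetical population parameters and sample size (assumed, not from the guide).
mu, sigma, n = 0.85, 0.60, 36

z, p = prob_sample_mean_exceeds(1.00, mu, sigma, n)
print(f"z = {z:.2f}, P(sample mean > $1.00) = {p:.4f}")
```

With these assumed values the standard error is $0.60 / \sqrt{36} = 0.10$, so $z = (1.00 - 0.85) / 0.10 = 1.5$ and the upper-tail probability is about 0.067, which matches what a z-table gives for $z = 1.5$.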

Statistical Inference and the CLT
Statistical inference means drawing conclusions about a population based on sample data. The CLT is what makes most inference techniques work, because it justifies treating the sampling distribution as normal.
Once you know the sampling distribution is approximately normal, you can:
- Construct confidence intervals for the population mean:
$$\bar{x} \pm z^* \frac{\sigma}{\sqrt{n}}$$
where $z^*$ is the critical value for your chosen confidence level (e.g., 1.96 for 95% confidence). If $\sigma$ is unknown and estimated by the sample standard deviation $s$, you use a $t$ critical value $t^*$ instead of $z^*$.
- Conduct hypothesis tests about the population mean using the test statistic:
$$z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$$
where $\mu_0$ is the hypothesized population mean.
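Both procedures can be sketched with Python's standard library. This is an illustration under stated assumptions: the sample is simulated rather than real data, $\sigma = 0.75$ is treated as known, $\mu_0 = 0.75$ is a made-up hypothesized mean, and the function names are hypothetical. It covers only the known-$\sigma$ ($z$) case, not the $t$ case.

```python
import math
import random
import statistics
from statistics import NormalDist

def z_confidence_interval(sample, sigma, z_star):
    """Confidence interval for the population mean when sigma is known."""
    x_bar = statistics.fmean(sample)
    margin = z_star * sigma / math.sqrt(len(sample))
    return x_bar - margin, x_bar + margin

def z_test_statistic(sample, mu_0, sigma):
    """z statistic for testing H0: population mean equals mu_0."""
    x_bar = statistics.fmean(sample)
    return (x_bar - mu_0) / (sigma / math.sqrt(len(sample)))

def two_sided_p_value(z):
    """Two-sided p-value from the standard normal distribution."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))

# Simulated sample of 40 pocket change amounts (assumed, not real data).
rng = random.Random(3)
sample = [round(rng.expovariate(1 / 0.75), 2) for _ in range(40)]
sigma = 0.75   # population SD, assumed known for this sketch
mu_0 = 0.75    # hypothesized population mean (assumed)

z_star = NormalDist().inv_cdf(0.975)   # about 1.96 for 95% confidence
low, high = z_confidence_interval(sample, sigma, z_star)
z = z_test_statistic(sample, mu_0, sigma)

print(f"sample mean = {statistics.fmean(sample):.3f}")
print(f"95% CI for the mean: ({low:.3f}, {high:.3f})")
print(f"z = {z:.3f}, two-sided p-value = {two_sided_p_value(z):.3f}")
```

When $\sigma$ is not known, the same structure applies with $s$ in place of $\sigma$ and a $t$ critical value in place of `z_star`; the standard library has no $t$ distribution, so that case would need a table or an external package.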
A random variable is a variable whose value depends on the outcome of a random process. The amount of pocket change a randomly selected person carries is a random variable, and its probability distribution describes how likely each possible value is. The CLT connects individual random variables to the predictable behavior of sample means, which is why it's so central to the rest of the course.