
7.4 Central Limit Theorem (Pocket Change)


Written by the Fiveable Content Team • Last updated August 2025

The Central Limit Theorem (CLT) explains how sample means behave when you repeatedly sample from any population. Even if the original data is skewed or oddly shaped, the distribution of sample means approaches a normal curve as the sample size grows, and taking many repeated samples lets you watch that distribution take shape.

The "pocket change" simulation makes this visible. By collecting coins, recording their mint years, and repeatedly sampling from that collection, you can watch a normal distribution emerge from non-normal data. This section covers how to run that simulation and what the results tell you about shape, center, spread, and sample size.

Central Limit Theorem Simulation

The pocket change activity walks you through the CLT using real data. Here's how it works:

Step 1: Collect population data

Gather a large number of coins (pennies, nickels, dimes, quarters) and record the year minted on each one. This full set of mint years is your population. It probably won't look normal since certain years tend to show up more than others.

Step 2: Draw repeated samples

  • Randomly select a fixed number of coins (say, 5) and calculate the mean mint year for that sample.
  • Put the coins back (sampling with replacement) and repeat.
  • Do this many times, like 1,000 rounds, so you end up with 1,000 sample means.

Step 3: Plot the sampling distribution

Create a histogram of those 1,000 sample means. This histogram is your sampling distribution of the sample mean.

Step 4: Compare to the population

Plot a histogram of the original population data (all the mint years) next to the sampling distribution. Notice the differences in shape, center, and spread. The population histogram might be skewed or lumpy, but the sampling distribution should look much closer to a bell curve.
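The four steps above can be sketched in a few lines of Python with NumPy. The mint years below are made-up stand-ins for a real coin collection (a skewed, non-normal population); everything else follows the simulation as described:

```python
import numpy as np

rng = np.random.default_rng(0)

# Step 1: a skewed, non-normal "population" of mint years
# (hypothetical data standing in for a real coin collection)
population = (2025 - rng.exponential(scale=8, size=500)).astype(int)

# Steps 2-3: draw 1,000 samples of 5 coins each, with replacement,
# and record each sample's mean mint year
sample_means = [rng.choice(population, size=5, replace=True).mean()
                for _ in range(1000)]

# Step 4: compare the population to the sampling distribution
print(f"population mean:        {population.mean():.2f}")
print(f"mean of sample means:   {np.mean(sample_means):.2f}")
print(f"population sd:          {population.std():.2f}")
print(f"sd of sample means:     {np.std(sample_means):.2f}")
```

Swapping the `print` calls for two histograms (e.g. with `matplotlib.pyplot.hist`) reproduces the side-by-side comparison in Step 4: the population stays skewed while the sample means bunch into a bell shape.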

[Figure: Central Limit Theorem simulation (Sampling Distributions, Boundless Statistics)]

Interpretation of Sampling Distributions

Once you've built the sampling distribution, there are four things to examine:

  • Shape
    • As the sample size n increases, the sampling distribution approaches a normal (bell-shaped) curve, regardless of the population's shape. The population could be skewed, uniform, or bimodal, and the sampling distribution still trends toward normal. (Drawing more repeated samples doesn't change the shape; it just gives a clearer picture of it.)
    • This is the core idea of the CLT: normality emerges from the sampling process itself, not from the original data.
  • Center
    • The mean of the sampling distribution equals the population mean. This means sample means are unbiased estimators of the population mean. If the true average mint year of all your coins is 2003, the average of your 1,000 sample means will be very close to 2003.
  • Spread
    • The standard deviation of the sampling distribution is called the standard error, and it equals:

      Standard Error = σ / √n

    • Here, σ is the population standard deviation and n is the sample size. As n increases, the standard error shrinks, meaning your sample means cluster more tightly around the true population mean. Larger samples give more precise estimates.
  • Sample size and normality
    • Larger samples produce a more normal-looking sampling distribution. A common rule of thumb is that n ≥ 30 is usually enough for the sampling distribution to be approximately normal, even when the population is clearly non-normal.
    • If the population is already close to normal, smaller sample sizes work fine. If the population is heavily skewed, you may need a larger n.
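The shrinking spread is easy to check numerically. This sketch (using an arbitrary, heavily skewed population, not coin data) compares the observed standard deviation of sample means at several sample sizes against the σ/√n prediction:

```python
import numpy as np

rng = np.random.default_rng(1)
population = rng.exponential(scale=10, size=10_000)  # heavily skewed
sigma = population.std()

# For each sample size, estimate the standard error empirically
results = {}
for n in (5, 30, 100):
    means = [rng.choice(population, size=n).mean() for _ in range(2000)]
    results[n] = np.std(means)
    print(f"n={n:3d}  observed SE={results[n]:.3f}  "
          f"predicted sigma/sqrt(n)={sigma / np.sqrt(n):.3f}")
```

The observed and predicted values track closely, and both shrink as n grows, which is the "Spread" bullet in action.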
[Figure: A simulation showing the role of the Central Limit Theorem in handling non-normal distributions]

Sampling Methods and Population Considerations

  • Sampling with replacement: Each coin is returned before the next draw, so the population stays the same size throughout. This is what the simulation uses, and it keeps each draw independent.
  • Sampling without replacement: Coins aren't returned after selection. This changes the probabilities slightly for each subsequent draw because the pool shrinks.
  • Finite population correction: When you sample without replacement and your sample is a large fraction of the population (roughly more than 5-10%), you apply a correction factor to reduce the standard error. For this intro-level simulation, sampling with replacement avoids this issue.
  • Law of large numbers: As sample size grows, the sample mean gets closer and closer to the true population mean. This is a separate idea from the CLT but reinforces why larger samples give better estimates.
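A minimal helper makes the with/without-replacement distinction concrete. The function name and signature here are illustrative, not from any particular library; the finite population correction factor √((N − n)/(N − 1)) is the standard one:

```python
import math

def standard_error(sigma, n, N=None, with_replacement=True):
    """Standard error of the sample mean.

    Applies the finite population correction sqrt((N - n) / (N - 1))
    when sampling without replacement from a population of size N.
    """
    se = sigma / math.sqrt(n)
    if not with_replacement and N is not None:
        se *= math.sqrt((N - n) / (N - 1))
    return se

# With replacement (the pocket-change setup): no correction needed
print(standard_error(sigma=10, n=25))  # 2.0

# Without replacement, drawing 25 coins from a jar of only 100:
# the sample is 25% of the population, so the correction matters
print(standard_error(sigma=10, n=25, N=100, with_replacement=False))
```

Because the pocket-change simulation samples with replacement, the plain σ/√n formula applies throughout and the correction never comes up.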

Inferences Using Sample Means

The CLT is what makes many common inference methods work. Because the sampling distribution is approximately normal, you can use it to build confidence intervals and run hypothesis tests.

Confidence intervals

A confidence interval gives a range of plausible values for the population mean. The formula is:

x̄ ± z* · σ / √n

  • x̄ = sample mean
  • z* = critical value from the standard normal distribution (for example, 1.96 for 95% confidence)
  • σ = population standard deviation (use the sample standard deviation s if σ is unknown)
  • n = sample size

For instance, if your sample of 25 coins has a mean mint year of 2005, σ = 10, and you want 95% confidence, the interval is 2005 ± 1.96 · 10/√25 = 2005 ± 3.92, giving you roughly (2001.08, 2008.92).
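The worked example above can be reproduced with a few lines of stdlib Python (the function is a sketch, not a library API):

```python
import math

def confidence_interval(xbar, sigma, n, z_star=1.96):
    """Confidence interval for the population mean
    (z_star = 1.96 gives 95% confidence)."""
    margin = z_star * sigma / math.sqrt(n)
    return xbar - margin, xbar + margin

# The example from the text: 25 coins, mean mint year 2005, sigma = 10
lo, hi = confidence_interval(xbar=2005, sigma=10, n=25)
print(f"({lo:.2f}, {hi:.2f})")  # (2001.08, 2008.92)
```

Changing `z_star` (e.g. 1.645 for 90%, 2.576 for 99%) widens or narrows the interval, which is the usual confidence/precision trade-off.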

Hypothesis testing

You can also test a claim about the population mean using a z-test. The test statistic is:

z = (x̄ - μ₀) / (σ / √n)

  • μ₀ = the hypothesized population mean
  • The other terms are the same as above

This z-score tells you how many standard errors the sample mean falls from the hypothesized value. You then compare it to a critical value or calculate a p-value to decide whether to reject or fail to reject the null hypothesis, typically at a significance level of 0.05.
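A small sketch ties the test statistic to its p-value, using only the standard library. The numbers below are a hypothetical example consistent with the coin scenario, not data from the source:

```python
import math
from statistics import NormalDist

def z_test(xbar, mu0, sigma, n):
    """Two-sided one-sample z-test for the population mean."""
    z = (xbar - mu0) / (sigma / math.sqrt(n))
    p = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided p-value
    return z, p

# Hypothetical: is the true mean mint year 2003, given a sample of
# 25 coins with mean 2005 and sigma = 10?
z, p = z_test(xbar=2005, mu0=2003, sigma=10, n=25)
print(f"z = {z:.2f}, p = {p:.4f}")
# |z| = 1.00 < 1.96 (and p > 0.05), so fail to reject H0
```

Here the sample mean sits just one standard error from the hypothesized value, so the data are consistent with the null hypothesis at the 0.05 level.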