Topic 8.1 sets up Unit 8 by asking a simple question: when the counts you observe differ from the counts you expected, is that gap just random chance or a real effect? Learning to frame that question is the first step toward chi-square tests, which you will use to decide whether observed categorical data fits a model or shows a real association.
Why This Matters for the AP Statistics Exam
Unit 8 is worth about 2 to 5 percent of the exam, and every chi-square test starts with the idea introduced here: comparing observed counts to expected counts. Before you can calculate a chi-square statistic or interpret a p-value, you need to recognize when a difference between what you found and what you expected is worth investigating. This thinking shows up in multiple-choice questions about categorical data and in free-response questions where you set up and justify a test.
The big idea is that variation between observed and expected counts can come from random chance or from a real pattern. Your job in later topics is to use evidence to decide which explanation is more reasonable.

Key Takeaways
- Differences between observed and expected counts can be caused by random variation or by a real effect, and statistics helps you tell them apart.
- A small p-value means a difference as large as the one observed would rarely happen by chance alone; a large p-value means chance is a reasonable explanation.
- Larger sample sizes make results more stable because variability shrinks as sample size grows.
- The law of large numbers explains why sample proportions get closer to the true probability as you collect more data.
- A noticeable gap in a small sample may not mean much, but the same gap in a large sample is harder to explain by chance.
- This topic is about asking the right question, not yet running calculations.
Random Chance or a Real Effect?
Anytime you collect categorical data, the counts you get will rarely match your expectations exactly. Some difference is normal. The challenge is deciding whether the gap between observed and expected counts is just random variation or a sign that your original assumption was wrong.
When you run a statistical test later in this unit, the p-value measures how likely it would be to see a difference at least as large as yours if nothing unusual were going on. A low p-value (often below 0.05) means a difference that large would rarely happen by chance alone, which points to a real effect. A high p-value means chance is a believable explanation, so you do not have strong evidence of a real difference.
Example: Flipping a Coin
If you flip a fair coin 10 times, you expect about 5 heads and 5 tails. You will not always get exactly that. Flipping is random, so a range of outcomes is normal.
Getting 4 heads and 6 tails would not make you doubt the coin. That result is close to what you expected. But getting 10 heads and 0 tails is a much bigger gap, and that would make you question whether the coin is really fair.
This is the core question of Topic 8.1: is the difference small enough to be random, or big enough to suggest something real?
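To put numbers on the coin example, here is a minimal sketch that assumes a fair coin and uses the binomial distribution. The helper names `prob_heads` and `prob_at_least_as_extreme` are made up for this illustration. It computes how often 10 flips of a fair coin land at least as far from 5 heads as the result you observed.

```python
from math import comb

def prob_heads(k, n=10, p=0.5):
    """Binomial probability of exactly k heads in n flips with P(heads) = p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def prob_at_least_as_extreme(k, n=10):
    """Chance that a fair coin lands at least as far from n/2 heads as k is."""
    gap = abs(k - n / 2)
    return sum(prob_heads(j, n) for j in range(n + 1) if abs(j - n / 2) >= gap)

print(round(prob_at_least_as_extreme(4), 3))   # 4 heads, 6 tails  -> 0.754
print(round(prob_at_least_as_extreme(10), 4))  # 10 heads, 0 tails -> 0.002
```

A fair coin misses 5 heads by at least one about 75 percent of the time, so 4 heads is ordinary. All heads or all tails happens only about 0.2 percent of the time, which is why that result raises doubts.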
How Sample Size Changes the Picture
Sample size matters a lot. Getting 4 heads and 6 tails in 10 flips is no big deal. But getting 400 heads and 600 tails in 1000 flips is much more surprising, even though both are a 40/60 split.
The reason is that the standard deviation of a statistic decreases as sample size increases. This inverse relationship between sample size and variability shows up across every inference procedure in the course, so it is worth getting comfortable with now.
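The two 40/60 splits can be compared with a short calculation. This sketch assumes a fair coin (p = 0.5) and uses the standard formula for the standard deviation of a sample proportion, sqrt(p(1 - p)/n). The function name is chosen only for this illustration.

```python
from math import sqrt

def sd_of_sample_proportion(p, n):
    """Standard deviation of a sample proportion when the true proportion is p."""
    return sqrt(p * (1 - p) / n)

# Same 40/60 split, two very different sample sizes.
for n, heads in [(10, 4), (1000, 400)]:
    p_hat = heads / n
    sd = sd_of_sample_proportion(0.5, n)
    z = (p_hat - 0.5) / sd  # how many standard deviations below 0.5
    print(f"n={n}: SD={sd:.4f}, z={z:.2f}")
# n=10: SD=0.1581, z=-0.63
# n=1000: SD=0.0158, z=-6.32
```

With 10 flips, 40 percent heads is less than one standard deviation from 0.5. With 1000 flips, the same 40 percent is more than six standard deviations away.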
Power in Statistical Tests
Sample size also affects the power of a test, which is the probability of correctly detecting a real difference when one exists. In general, a larger sample size gives a test more power.
One reason is that a larger sample leads to a smaller standard deviation of the sample statistic, such as a sample proportion. The test becomes more precise, so it is better at catching a true difference between what you observe and what you expect.
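The effect of sample size on power can be sketched with a hypothetical unfair coin. The assumptions here are not from the course materials: the coin's true P(heads) is 0.4, the test is a two-sided z-test against 0.5, and it rejects when |z| > 1.96 (roughly a 0.05 significance level). The code adds up the exact binomial probability of every outcome that would lead to rejection.

```python
from math import exp, lgamma, log, sqrt

def binom_pmf(k, n, p):
    """P(exactly k heads in n flips), computed in log space to avoid overflow."""
    log_pmf = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
               + k * log(p) + (n - k) * log(1 - p))
    return exp(log_pmf)

def power(n, true_p, null_p=0.5, z_cutoff=1.96):
    """Chance the test rejects null_p when the coin's real P(heads) is true_p."""
    sd = sqrt(null_p * (1 - null_p) / n)
    total = 0.0
    for k in range(n + 1):
        z = (k / n - null_p) / sd
        if abs(z) > z_cutoff:  # this outcome would lead to rejection
            total += binom_pmf(k, n, true_p)
    return total

print(round(power(10, 0.4), 3))
print(round(power(1000, 0.4), 3))
```

Under these assumptions, the power is only about 0.05 with 10 flips and essentially 1 with 1000 flips. The coin is equally unfair in both cases; only the sample size changed.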
Law of Large Numbers
The law of large numbers says that as sample size increases, the sample mean (or sample proportion) tends to get closer to the true population value. With more data, your sample is a better estimate of the population.
For coin flips, this means the proportion of heads moves toward 0.5 as you flip more times. A 40/60 split after 10 flips is normal. A 40/60 split after 1000 flips is suspicious, because by then the proportion should be very close to the true probability. If you flip 1000 times and get 500 heads, you would have no reason to doubt that the coin is fair.
This principle is the foundation for making reliable conclusions about a population from a sample.
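The law of large numbers is easy to watch in a simulation. The sketch below flips a simulated fair coin and reports the proportion of heads at a few checkpoints. The function name, the checkpoints, and the seed are arbitrary choices for this illustration, and a different seed gives different numbers.

```python
import random

def running_proportion_of_heads(checkpoints, p=0.5, seed=1):
    """Flip a simulated coin and record the proportion of heads at each checkpoint."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = 0
    results = {}
    for flip in range(1, max(checkpoints) + 1):
        heads += rng.random() < p  # True counts as 1 head
        if flip in checkpoints:
            results[flip] = heads / flip
    return results

for n, prop in running_proportion_of_heads({10, 100, 1000, 100000}).items():
    print(f"{n:>6} flips: proportion of heads = {prop:.3f}")
```

Early checkpoints can land well away from 0.5, while the proportion after 100,000 flips should sit very close to it.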
How to Use This on the AP Statistics Exam
MCQ
Expect questions that ask you to recognize when a gap between observed and expected counts is meaningful. Watch for sample size: the same percentage split can be unremarkable in a small sample and surprising in a large one.
Free Response
Later free-response questions will ask you to set up and justify a chi-square test. The reasoning starts here: you are deciding whether observed differences from expected counts could plausibly be random. When you write conclusions, connect your decision to the p-value and the context of the problem.
Common Trap
Do not jump to calculations before framing the question. Topic 8.1 is about identifying that a difference exists and asking whether it is random or real. That framing guides every step that follows.
Common Misconceptions
- A difference between observed and expected counts does not automatically mean something is wrong. Some variation is always expected by chance.
- A large p-value does not prove there is "no effect" or "no association." It only means you lack strong evidence of one. Avoid language that accepts the null hypothesis as true.
- A larger sample does not make your results "more correct." It makes your estimates more stable and your test more able to detect a real difference.
- The same proportion split is not equally meaningful at every sample size. A 40/60 result means much more in 1000 trials than in 10.
- A small p-value shows statistical significance, but that is not the same as practical importance. Significant and important are different ideas.
Vocabulary
The following words are mentioned explicitly in the College Board Course and Exam Description for this topic.

| Term | Definition |
|---|---|
| categorical data | Data that represents categories or groups rather than numerical measurements, such as colors, types, or classifications. |
| expected count | The theoretical frequency in each cell of a contingency table that would be expected if the null hypothesis of independence or homogeneity were true. |
| observed count | The actual frequency or number of observations in each cell of a contingency table from the collected data. |
| variation | Differences in data that occur by chance due to the random nature of sampling, rather than from systematic causes. |
Frequently Asked Questions
What does observed vs expected mean in statistics?
Observed counts are what your data actually show. Expected counts are what you would predict if a null model or assumption were true.
What is a p-value in hypothesis testing?
A p-value is the probability of getting results at least as unusual as the observed results, assuming the null hypothesis is true. Smaller p-values are stronger evidence against the null.
How do you know if results are unexpected?
Results are unexpected when the gap between observed and expected counts is too large to be easily explained by random variation under the null model.
How does sample size affect whether a result is surprising?
With larger samples, the random variation in a sample proportion tends to shrink, so the same percentage difference can be more convincing in a large sample than in a small sample.
How does Topic 8.1 connect to chi-square tests?
Topic 8.1 introduces the reasoning behind chi-square tests: compare observed counts to expected counts and decide whether the differences are plausible by chance.
What is the common mistake in interpreting a large p-value?
A large p-value does not prove the null hypothesis is true. It means the observed difference is reasonably consistent with random variation under the null.