Statistical errors are at the heart of what makes hypothesis testing both powerful and risky. When you conduct a significance test, you're making a decision under uncertainty—and that means you can be wrong in predictable ways. The AP Statistics exam tests your ability to distinguish between Type I errors (false positives) and Type II errors (false negatives), understand what factors influence their probabilities, and explain the real-world consequences of each in context. You'll also need to recognize how sampling variability, bias, and confounding can undermine the validity of statistical conclusions.
Don't just memorize that "Type I = rejecting a true null"—understand why we set significance levels, how sample size affects power, and what trade-offs researchers face when designing studies. The exam loves asking you to identify which error is more serious in a given context or to explain what would happen if you changed α. Master the underlying logic, and you'll be ready for both multiple-choice questions and FRQs that ask you to interpret errors in real scenarios.
When you perform a hypothesis test, you're making a binary decision: reject H₀ or fail to reject H₀. Since you never know the true state of reality, either decision could be wrong.
Compare: Type I vs. Type II errors—both involve incorrect conclusions, but Type I means seeing something that isn't there while Type II means missing something that is. On FRQs, always connect the error type to the specific context (e.g., "concluding the new drug works when it doesn't" for Type I).
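To see both definitions in action, here's a minimal simulation sketch (the mean, standard deviation, sample size, and trial count are all illustrative assumptions, not from an AP problem): it estimates how often a two-sided one-sample t-test of H₀: μ = 0 commits each type of error at α = 0.05.

```python
# A minimal simulation sketch (mu, sigma, n, and trial count are assumed):
# estimate P(Type I) and P(Type II) for a two-sided one-sample t-test
# of H0: mu = 0 at alpha = 0.05.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
alpha, n, trials = 0.05, 30, 10_000

# Type I error: H0 is actually true (mu = 0), but we reject it anyway.
type1 = sum(
    stats.ttest_1samp(rng.normal(0, 1, size=n), 0).pvalue < alpha
    for _ in range(trials)
)

# Type II error: H0 is actually false (mu = 0.3), but we fail to reject it.
type2 = sum(
    stats.ttest_1samp(rng.normal(0.3, 1, size=n), 0).pvalue >= alpha
    for _ in range(trials)
)

print(f"Estimated P(Type I)  = {type1 / trials:.3f}  (should sit near alpha)")
print(f"Estimated P(Type II) = {type2 / trials:.3f}  (this is beta; power = 1 - beta)")
```

By design, the estimated Type I rate lands near α, while the Type II rate (β) depends on the true effect size and the sample size.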
The probabilities of Type I and Type II errors aren't fixed—they depend on choices you make and characteristics of your study.
Compare: Increasing α vs. increasing n—both increase power, but increasing α also increases Type I error risk while increasing n reduces uncertainty without that trade-off. If an FRQ asks how to increase power without raising false positive risk, sample size is your answer.
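A quick way to convince yourself of this trade-off is simulation. The sketch below (the effect size μ = 0.3 and all other settings are assumptions for illustration) estimates power under different combinations of α and n.

```python
# A hedged sketch (effect size mu = 0.3 and all settings are assumptions):
# estimate power = P(reject H0 | H0 false) under different alpha and n.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

def estimated_power(n, alpha, mu=0.3, trials=5_000):
    """Fraction of simulated samples (drawn with H0 false) that reject H0."""
    rejections = sum(
        stats.ttest_1samp(rng.normal(mu, 1, size=n), 0).pvalue < alpha
        for _ in range(trials)
    )
    return rejections / trials

# Raising alpha increases power, but only by accepting more Type I risk...
print(estimated_power(n=30, alpha=0.05), estimated_power(n=30, alpha=0.10))
# ...while raising n increases power with no Type I trade-off at all.
print(estimated_power(n=30, alpha=0.05), estimated_power(n=100, alpha=0.05))
```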
These errors occur before you even run a hypothesis test—they threaten the validity of your entire study.
Compare: Sampling error vs. selection bias—sampling error is random variation that decreases with larger n, while selection bias is systematic distortion that persists regardless of sample size. The AP exam frequently tests whether students understand this distinction.
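A short simulation makes the contrast concrete (the population and the biased selection rule below are invented for illustration): the simple random sample's error shrinks as n grows, while the biased sample stays off target no matter how large n gets.

```python
# A toy illustration (population and selection rule are invented):
# sampling error fades as n grows, but selection bias does not.
import numpy as np

rng = np.random.default_rng(0)
population = rng.normal(loc=100, scale=15, size=100_000)   # true mean near 100

# Biased frame: only units above the population median can ever be sampled.
eligible = population[population > np.median(population)]

for n in (25, 400, 10_000):
    srs = rng.choice(population, size=n, replace=False)    # simple random sample
    biased = rng.choice(eligible, size=n, replace=False)   # biased "sample"
    print(f"n={n:>6}: SRS mean = {srs.mean():6.2f}, "
          f"biased mean = {biased.mean():6.2f}")
```

The SRS mean settles toward 100 as n increases; the biased mean settles near the mean of the eligible half (about 112 here), and no amount of additional data pulls it back.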
Even with good data and correct calculations, you can draw wrong conclusions about what the results mean.
Compare: Confounding vs. Simpson's Paradox—both involve hidden variables distorting conclusions, but confounding obscures a relationship while Simpson's Paradox can actually reverse an apparent relationship when you look at subgroups.
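Here's a worked numerical illustration of the reversal, using hypothetical counts patterned on the classic kidney-stone example: treatment A has the higher success rate in both subgroups, yet the lower rate once the subgroups are pooled.

```python
# A worked illustration with hypothetical counts (patterned on the classic
# kidney-stone data): treatment A wins in every subgroup yet loses overall.
counts = {  # (treatment, subgroup): (successes, patients)
    ("A", "mild"):   (81, 87),
    ("A", "severe"): (192, 263),
    ("B", "mild"):   (234, 270),
    ("B", "severe"): (55, 80),
}

for subgroup in ("mild", "severe"):
    a_s, a_n = counts[("A", subgroup)]
    b_s, b_n = counts[("B", subgroup)]
    print(f"{subgroup:>6}: A = {a_s / a_n:.0%}, B = {b_s / b_n:.0%}")  # A wins both

a_succ = counts[("A", "mild")][0] + counts[("A", "severe")][0]
a_tot  = counts[("A", "mild")][1] + counts[("A", "severe")][1]
b_succ = counts[("B", "mild")][0] + counts[("B", "severe")][0]
b_tot  = counts[("B", "mild")][1] + counts[("B", "severe")][1]
print(f"pooled: A = {a_succ / a_tot:.0%}, B = {b_succ / b_tot:.0%}")   # B wins pooled
```

The reversal happens because treatment A was assigned mostly to severe cases: the hidden subgroup variable, not the treatment itself, drives the pooled comparison.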
Running many tests on the same data inflates your overall error rate in ways that single-test reasoning doesn't capture.
Compare: Multiple comparison error vs. survivorship bias—multiple comparison error inflates false positives from running too many tests, while survivorship bias creates false conclusions by examining an incomplete dataset. Both require thinking beyond the data you can see.
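The arithmetic behind multiple-comparison inflation is short enough to show directly (the setup of 20 independent tests at α = 0.05 is assumed for illustration): across k tests of true nulls, the chance of at least one false positive is 1 − (1 − α)^k, far above α.

```python
# Quick arithmetic sketch (20 tests at alpha = 0.05 is an assumed setup):
# the chance of at least one false positive across k independent tests of
# true nulls is 1 - (1 - alpha)^k, not alpha.
alpha, k = 0.05, 20

familywise = 1 - (1 - alpha) ** k
print(f"P(at least one false positive) = {familywise:.2f}")   # about 0.64
print(f"Expected # of false positives  = {alpha * k:.1f}")    # 1.0

# One common fix: the Bonferroni correction tests each hypothesis at alpha/k.
bonferroni = alpha / k
print(f"Bonferroni per-test level      = {bonferroni:.4f}")
print(f"Familywise rate after fix     <= {1 - (1 - bonferroni) ** k:.3f}")  # about 0.049
```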
| Concept | Best Examples |
|---|---|
| Hypothesis testing decisions | Type I Error, Type II Error |
| Error probability factors | Significance level (α), Power (1 − β), Sample size (n) |
| Sampling and measurement | Sampling error, Selection bias, Measurement error |
| Relationship interpretation | Confounding, Simpson's Paradox, Regression to the mean |
| Multiple testing issues | Multiple comparison error, Survivorship bias |
| Increases power | Larger n, larger effect size, larger α, smaller variability |
| Cannot be fixed by larger n | Selection bias, Confounding, Measurement error |
| Requires randomization to address | Confounding (for causal claims) |
A researcher sets α = 0.01 instead of α = 0.05. How does this change affect the probability of Type I error? The probability of Type II error? Explain the trade-off.
Which two errors both involve systematic problems that cannot be reduced by increasing sample size? What distinguishes them from sampling error?
A study finds that a tutoring program improves test scores, but students were enrolled in the program after scoring below average. What statistical phenomenon might explain the improvement even if the program has no effect?
Compare and contrast confounding and Simpson's Paradox. In what way do both involve "hidden" variables, and how do their effects on conclusions differ?
FRQ-style: A pharmaceutical company tests 20 different drug compounds for effectiveness, using α = 0.05 for each test. If none of the drugs actually work, approximately how many false positives would you expect? What adjustment could the company make to control the overall Type I error rate?