Every AP Statistics exam tests your ability to identify what went wrong in a study—and sampling bias is the most common culprit. When you're analyzing an experimental design or critiquing a survey in an FRQ, you need to pinpoint exactly which bias is present and explain why it threatens the validity of the conclusions. The good news? These biases follow predictable patterns based on how samples are selected, who responds, and how questions are asked.
Understanding sampling biases isn't just about memorizing a list of terms. You're being tested on your ability to recognize how bias enters a study and what effect it has on generalizability. Can the results be extended to the entire population, or did something in the sampling process systematically favor certain groups? Master the underlying mechanisms—who's missing, who's overrepresented, and why—and you'll be able to tackle any scenario the exam throws at you.
Sample selection problems occur before any data is collected—something goes wrong in how, or from whom, the sample is drawn. The core issue is that the sampling process itself creates systematic differences between your sample and the target population.
Compare: Undercoverage bias vs. Sampling frame bias—both involve missing groups, but undercoverage can occur even with a good frame (through bad sampling), while frame bias means the list itself is flawed. On FRQs, identify where the problem originates: the list or the selection process.
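To make the mechanism concrete, here is a minimal simulation (all numbers are hypothetical, chosen only for illustration) of a flawed frame: a phone directory that omits residents without landlines can never produce a representative sample, no matter how carefully you draw from it.

```python
import random

random.seed(1)

# Hypothetical city of 10,000: 70% own landlines, 30% do not,
# and policy support differs sharply between the two groups.
population = (
    [("landline", 1) for _ in range(5600)] +    # 80% of landline owners support
    [("landline", 0) for _ in range(1400)] +
    [("no_landline", 1) for _ in range(900)] +  # 30% of non-owners support
    [("no_landline", 0) for _ in range(2100)]
)

true_support = sum(s for _, s in population) / len(population)

# Flawed frame: a phone directory lists only landline owners, so even a
# perfect SRS from this frame can never reach the other 30% of residents.
frame = [p for p in population if p[0] == "landline"]
sample = random.sample(frame, 500)
est = sum(s for _, s in sample) / len(sample)

print(f"True support:        {true_support:.2f}")  # 0.65
print(f"Frame-based estimate: {est:.2f}")  # well above 0.65: non-owners never appear
```

The sampling *from* the frame is unbiased; the frame itself is what excludes a group, which is exactly the frame-vs-selection distinction the FRQ answer should pinpoint.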
Participation-related biases can enter even with a perfectly selected sample, whenever who actually participates differs systematically from who was selected. These biases emerge from the gap between your intended sample and your actual respondents.
Compare: Non-response bias vs. Voluntary response bias—both involve who chooses to participate, but non-response starts with a random sample where some don't respond, while voluntary response never had random selection to begin with. Non-response can sometimes be addressed; voluntary response is fundamentally flawed by design.
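The distinction shows up clearly in a short simulation (the population and response probabilities below are invented for illustration): selection is a genuine SRS, yet the estimate drifts because willingness to respond is correlated with the very variable being measured.

```python
import random

random.seed(2)

# Hypothetical population: each person's hours of exercise per week.
population = [random.gauss(4, 2) for _ in range(100_000)]
true_mean = sum(population) / len(population)

# Selection is a genuine SRS, so the sampling step itself is unbiased...
selected = random.sample(population, 2000)

# ...but suppose frequent exercisers are more likely to answer an
# exercise survey: response probability rises with hours exercised.
respondents = [h for h in selected
               if random.random() < min(1.0, 0.1 + 0.1 * max(h, 0.0))]
observed_mean = sum(respondents) / len(respondents)

print(f"True mean:     {true_mean:.2f}")      # about 4.0
print(f"Observed mean: {observed_mean:.2f}")  # systematically above the true mean
```

This is non-response bias: the randomization was sound, and the fix (follow-ups, incentives) targets the response step. A voluntary response sample has no sound randomization to recover.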
Response accuracy biases occur during data collection itself. The sample may be perfectly representative, but how people respond introduces systematic errors. The issue shifts from "who's in the sample" to "what are they telling us."
Compare: Social desirability bias vs. Response bias—social desirability is a specific type of response bias driven by wanting to appear favorable. On exams, use the more specific term when the scenario clearly involves sensitive topics or impression management.
Missing data patterns involve drawing conclusions from a sample that, while real, doesn't represent what we think it does. The data is accurate for those we measured, but we're missing crucial information about those we didn't.
Compare: Survivorship bias vs. Undercoverage bias—both involve missing groups, but undercoverage means certain groups were never sampled, while survivorship means they were part of the original group but "disappeared" before measurement. Survivorship often occurs in longitudinal studies or historical analyses.
| Concept | Best Examples |
|---|---|
| Sample selection problems | Selection bias, Undercoverage, Sampling frame bias, Convenience sampling |
| Participation-related | Non-response bias, Voluntary response bias |
| Response accuracy | Response bias, Social desirability bias, Recall bias |
| Missing data patterns | Survivorship bias, Non-response bias |
| Flawed sampling frame | Sampling frame bias, Undercoverage bias |
| Self-selection issues | Voluntary response bias, Convenience sampling bias |
| Threatens generalizability | All of them—but especially selection, undercoverage, and voluntary response |
| Threatens accuracy of responses | Response bias, Social desirability bias, Recall bias |
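One more point worth internalizing before the practice questions: a large sample does not repair a biased sampling method. The sketch below (all proportions hypothetical) simulates a voluntary-response online poll in which opponents self-select at a much higher rate—the estimate stays wrong at every sample size.

```python
import random

random.seed(3)

# Hypothetical city: 60% of residents actually SUPPORT the policy,
# but opponents are far more motivated to click on an online poll.
def poll(n_respondents: int) -> float:
    """Proportion of poll respondents who support, under self-selection."""
    votes = []
    while len(votes) < n_respondents:
        supports = random.random() < 0.60          # true population opinion
        respond_prob = 0.05 if supports else 0.40  # opponents self-select
        if random.random() < respond_prob:
            votes.append(supports)
    return sum(votes) / len(votes)

for n in (200, 2_000, 20_000):
    print(f"n={n:>6}: support = {poll(n):.2f}")  # stays far below 0.60 at every size
```

Increasing n only makes the estimate more precisely wrong: variability shrinks, but the systematic gap between respondents and the population does not.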
A researcher surveys patients currently enrolled in a weight-loss program about their eating habits from the past year. Identify two distinct biases that could affect these results and explain how each would distort the findings.
Which two biases both involve people choosing whether to participate, and what is the key difference in how the original sample was selected?
An online poll asks readers whether they support a new city policy, and 89% of 2,000 respondents say "no." A city council member claims this proves residents oppose the policy. What bias is present, and why does the large sample size not fix the problem?
Compare sampling frame bias and undercoverage bias. If a researcher uses a 2015 phone directory to survey a city's current residents, which bias (or both) is present? Explain your reasoning.
A study finds that entrepreneurs who took a specific business course have higher success rates than average. Before concluding the course is effective, what bias should you consider, and what additional information would you need to evaluate the claim?