Sampling Bias

Sampling bias occurs when a sampling method makes some individuals in a population more likely to be selected than others, producing an unrepresentative sample whose statistics systematically misestimate the population parameter, no matter how large the sample is.

Verified for the 2027 AP Statistics exam • Last updated June 2026

What is Sampling Bias?

Sampling bias is a flaw in HOW you select your sample, not in the data itself. If your method gives certain individuals a better chance of being chosen (say, you survey shoppers at one mall to estimate opinions for an entire city), the sample will systematically lean in one direction. Your statistic, like a sample proportion p̂ or a sample mean x̄, will consistently over- or under-shoot the true parameter.

Here's the part that trips people up. Sampling bias is not fixed by collecting more data. A biased method with 10,000 responses is still biased, just precisely wrong instead of vaguely wrong. That's why random sampling matters so much in AP Stats. Randomness gives every individual a known, fair chance of selection, which is exactly what the inference conditions in Units 6 and 7 demand. Every confidence interval and significance test you build assumes the data came from a random sample or randomized experiment. Sampling bias is what happens when that assumption fails before you've even calculated anything.
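
To see why a larger sample does not rescue a biased method, here is a minimal simulation sketch of the mall survey. All of the numbers are assumptions made up for illustration: a hypothetical city of 100,000 people in which 30% are regular mall shoppers, 70% of mall shoppers support some proposal, and 40% of everyone else does, so the true proportion is 0.49.

```python
import random

def build_population():
    """Hypothetical city (assumed numbers, not real data)."""
    mall = [1] * 21_000 + [0] * 9_000     # 30,000 mall shoppers, 70% support
    other = [1] * 28_000 + [0] * 42_000   # 70,000 others, 40% support
    return mall, other

def sample_proportion(group, n, rng):
    """Draw n people without replacement from group and return p-hat."""
    picks = rng.sample(group, n)
    return sum(picks) / n

rng = random.Random(0)  # fixed seed so the run is repeatable
mall, other = build_population()
everyone = mall + other
true_p = sum(everyone) / len(everyone)  # 0.49 by construction

results = {}
for n in (100, 10_000):
    results[n] = {
        # Simple random sample: everyone in the city can be chosen.
        "srs": sample_proportion(everyone, n, rng),
        # Biased method: only mall shoppers can be chosen.
        "mall_only": sample_proportion(mall, n, rng),
    }
    print(f"n={n:>6}  SRS p-hat={results[n]['srs']:.3f}  "
          f"mall-only p-hat={results[n]['mall_only']:.3f}  true p={true_p:.2f}")
```

Under these assumptions, the mall-only estimate settles near 0.70 as n grows, not near 0.49. The bigger sample only tightens the estimate around the wrong value.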

Why Sampling Bias matters in AP Statistics

Sampling bias is the reason the "random" condition exists in every inference procedure. When you verify conditions for a two-sample z-interval for a difference in proportions (AP Stats 6.8.B), a two-sample z-test for proportions (AP Stats 6.10.C), or a two-sample t-test for means (AP Stats 7.8.C), the first thing you check is that data came from independent random samples or a randomized experiment. That check is your guarantee against sampling bias. If the sampling method is biased, the sampling distribution of p̂₁ - p̂₂ or x̄₁ - x̄₂ isn't centered at the true difference, so your interval can miss the parameter and your p-value (AP Stats 6.5.B) describes a hypothetical world that doesn't match your actual data. In short, sampling bias breaks the bridge between sample and population. Inference becomes math with no meaning.

How Sampling Bias connects across the course

Random Sampling (Unit 3)

Random sampling is the cure for sampling bias. Giving every individual a known chance of selection removes the systematic favoritism that biased methods create. This is the single most important connection, because the random condition you verify in inference is just Unit 3's anti-bias rule showing up again.

Conditions for Two-Sample Inference (Units 6 & 7)

Topics 6.8, 6.10, and 7.8 all require independent random samples before you compute anything. When an FRQ asks you to verify conditions, you're really certifying that sampling bias isn't poisoning the comparison between the two groups.

Nonresponse Bias (Unit 3)

Sampling bias and nonresponse bias both wreck representativeness, but at different stages. Sampling bias happens when you choose who gets asked. Nonresponse bias happens when chosen people don't answer. A perfectly random sample can still suffer nonresponse bias.

Confidence Interval (Units 6 & 7)

A confidence interval's coverage guarantee (like 95%) only holds if the sample was collected without bias. With sampling bias, the interval is centered in the wrong place, so the stated confidence level is fiction.
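
A rough way to check this claim is to build many 95% intervals and count how often they capture the parameter. The sketch below uses a one-sample z-interval for a proportion to keep things short. The true proportion of 0.49 and the biased method that effectively samples from a group with proportion 0.70 are hypothetical values chosen for illustration.

```python
import math
import random

def z_interval(successes, n, z=1.96):
    """One-sample z-interval for a proportion at roughly 95% confidence."""
    p_hat = successes / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

def coverage(sampled_p, true_p, n, reps, rng):
    """Fraction of intervals capturing true_p when each sampled person
    is a 'success' with probability sampled_p."""
    hits = 0
    for _ in range(reps):
        successes = sum(rng.random() < sampled_p for _ in range(n))
        low, high = z_interval(successes, n)
        if low <= true_p <= high:
            hits += 1
    return hits / reps

rng = random.Random(1)  # fixed seed so the run is repeatable
TRUE_P = 0.49           # assumed population proportion

# Fair method: the sample reflects the whole population.
fair_coverage = coverage(0.49, TRUE_P, n=200, reps=2000, rng=rng)
# Biased method: the sample is effectively drawn from a 70% subgroup.
biased_coverage = coverage(0.70, TRUE_P, n=200, reps=2000, rng=rng)

print(f"coverage with random sampling: {fair_coverage:.3f}")
print(f"coverage with biased sampling: {biased_coverage:.3f}")
```

With random sampling the capture rate should land close to the advertised 95%. With the biased method almost no intervals contain 0.49, even though every interval was computed correctly.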

Is Sampling Bias on the AP Statistics exam?

Sampling bias shows up in two main ways. In multiple choice, you'll see a sampling scenario (volunteers, convenience samples, people surveyed outside a gym about exercise habits) and you have to identify why the method is biased and which direction the estimate likely shifts. On FRQs, it lives inside scope-of-inference and conditions questions. When a problem says "verify the conditions" for a two-sample interval or test, naming the random sample requirement is partly about ruling out sampling bias. And when a question asks whether results can be generalized to a larger population, the expected answer hinges on whether the sample was randomly selected. No released FRQ requires the exact phrase "sampling bias," but the underlying idea is graded constantly. One key warning for free response: never claim that a larger sample fixes bias. That answer loses credit every time.

Sampling Bias vs Nonresponse Bias

Sampling bias comes from the selection method itself, where some individuals never had a fair chance of being chosen. Nonresponse bias comes after selection, when chosen individuals fail to respond and the responders differ from the non-responders. Quick test: ask where the problem starts. Bad method of choosing means sampling bias. People refusing or unreachable after a fair selection means nonresponse bias.
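
The "where does the problem start" test can be sketched in a few lines. Here the selection step is fair, but the response step is not. The 40% satisfaction rate and the two response rates are hypothetical numbers chosen only to make the effect visible.

```python
import random

rng = random.Random(2)  # fixed seed so the run is repeatable

N_SELECTED = 20_000
P_SATISFIED = 0.40                        # assumed true proportion satisfied
RESPONSE_RATE = {True: 0.30, False: 0.80}  # assumed: unhappy people reply more

# Step 1: fair selection. Each selected person is satisfied with
# probability 0.40, as in a random sample from the population.
selected = [rng.random() < P_SATISFIED for _ in range(N_SELECTED)]

# Step 2: nonresponse. Whether a person answers depends on their opinion.
responders = [s for s in selected if rng.random() < RESPONSE_RATE[s]]

selected_p = sum(selected) / len(selected)
responder_p = sum(responders) / len(responders)

print(f"satisfied among everyone selected: {selected_p:.3f}")
print(f"satisfied among those who replied: {responder_p:.3f}")
```

The selected group tracks the assumed 40%, so there is no sampling bias. The responders sit near 20%, which is nonresponse bias entering after a fair selection.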

Key things to remember about Sampling Bias

  • Sampling bias occurs when the selection method makes some individuals more likely to be chosen, so the sample systematically misrepresents the population.

  • Increasing the sample size does not reduce sampling bias; a biased method just produces a more precise wrong answer.

  • Random sampling is the fix, which is why every inference condition check in Units 6 and 7 starts with confirming a random sample or randomized experiment.

  • If a sample is biased, confidence intervals are centered in the wrong place and p-values are meaningless, even if every calculation is correct.

  • Sampling bias happens during selection, while nonresponse bias happens after selection when chosen individuals don't respond.

  • On scope-of-inference questions, you can only generalize results to the population the sample was randomly drawn from.

Frequently asked questions about Sampling Bias

What is sampling bias in AP Stats?

Sampling bias is a flaw in the sampling method that makes certain individuals more likely to be selected, producing an unrepresentative sample. It causes statistics like p̂ and x̄ to systematically over- or under-estimate the population parameter.

Does a bigger sample size fix sampling bias?

No. Sample size affects variability, not bias. A biased method with 10,000 people still misses the parameter in the same direction, just with a smaller margin of error around the wrong answer. This is a classic AP trap answer.

How is sampling bias different from nonresponse bias?

Sampling bias happens when the selection method itself is unfair, like surveying only people at a mall. Nonresponse bias happens when fairly selected individuals don't respond, and responders differ from non-responders. A random sample avoids the first but can still suffer the second.

Why does sampling bias matter for confidence intervals and hypothesis tests?

Inference procedures like the two-sample z-interval (Topic 6.8) and two-sample t-test (Topic 7.8) assume data come from random samples. With sampling bias, the sampling distribution isn't centered at the true parameter, so confidence levels and p-values no longer mean what they claim.

Is a convenience sample the same thing as sampling bias?

Not exactly. A convenience sample is a method (sampling whoever is easiest to reach), and sampling bias is the result it usually produces. On the exam, identifying that a convenience or voluntary response sample likely creates sampling bias is the expected reasoning.