In AP Statistics, bias is a systematic error in how data is collected or how a statistic is calculated that causes results to consistently favor certain outcomes, so estimates miss the true population value in the same direction no matter how many times you repeat the process.
Bias is the statistician's word for "wrong in a predictable direction." The CED defines it directly in Topic 3.4: bias occurs when certain responses are systematically favored over others. The key word is systematically. Random error bounces around the true value and averages out. Bias doesn't. If your sampling method skips part of the population (undercoverage), relies on volunteers (voluntary response), loses certain people who refuse to answer (nonresponse), or asks leading questions (response bias), you'll get answers that are consistently too high or too low.
Bias shows up in two flavors on the AP exam. The first is bias in data collection, which lives in Unit 3 and covers all the bad-sampling scenarios above. The second is bias in an estimator, which lives in Unit 5. An estimator is unbiased if, on average across all possible samples, its value equals the population parameter it's estimating. That's exactly what it means to say the sampling distribution of p̂ has mean p and the sampling distribution of x̄ has mean μ. The crucial thing both flavors share is that you cannot fix bias by taking a bigger sample. A bigger biased sample just gives you a more precise wrong answer.
Bias is the thread that ties Unit 3 (Collecting Data) to everything that comes after it. Learning objective AP Stats 3.4.A asks you to identify potential sources of bias in sampling methods, and 3.1.A states the stakes bluntly: its essential knowledge says that methods of data collection that don't rely on chance result in untrustworthy conclusions. Then in Unit 5, AP Stats 5.4.A asks you to explain why an estimator is or is not unbiased, which is the same idea translated from study design into sampling distributions. Finally, every inference procedure in Units 6 and 7 quietly assumes bias isn't a problem. The random condition you check under AP Stats 6.2.B before building a confidence interval exists specifically to rule out bias. If the data came from a voluntary response sample, no formula on the formula sheet can rescue the interval. Understanding bias is understanding why all the conditions exist in the first place.
Sampling Bias (Unit 3)
Sampling bias is the specific family of biases covered in Topic 3.4, including voluntary response, undercoverage, and nonresponse bias. Each one means some part of the population gets over- or under-represented before anyone even answers a question. This is the term the exam expects you to name and explain in context.
Biased and Unbiased Point Estimates (Unit 5)
Topic 5.4 reuses the word bias for estimators instead of samples. An estimator is unbiased when the mean of its sampling distribution equals the parameter, which is why μ_p̂ = p and μ_x̄ = μ are such a big deal in Topics 5.5 and 5.7. Same idea as sampling bias, just one level up. The question becomes whether the statistic itself, not the sample, systematically misses.
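To make "on average across all possible samples" concrete, here is a minimal sketch that lists every possible sample of size 2 from a tiny population and averages a statistic over all of them. The five population values are made up for illustration, and the sample maximum is included only as a hypothetical contrast: it is one example of a statistic that systematically misses the thing it estimates.

```python
from itertools import combinations
from statistics import mean

# Hypothetical five-value population, chosen only for illustration.
population = [3, 5, 8, 12, 22]
mu = mean(population)        # population mean (the parameter)
pop_max = max(population)    # population maximum (another parameter)

# Every possible sample of size 2, drawn without replacement.
samples = list(combinations(population, 2))

# Mean of the sampling distribution of each statistic.
mean_of_xbar = mean(mean(s) for s in samples)
mean_of_max = mean(max(s) for s in samples)

print(f"mu = {mu}, mean of all sample means = {mean_of_xbar}")
print(f"population max = {pop_max}, mean of all sample maxes = {mean_of_max}")
```

The sample means average out to exactly μ, so x̄ is unbiased here. The sample maxes average out below the population maximum, so the sample max is a biased estimator of it in this setup: it misses low, and it misses low on average no matter which samples you happen to draw.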
Confidence Interval Conditions (Unit 6)
When Topic 6.2 makes you verify that data came from a random sample before building a z-interval for a proportion, it's really making you certify the data isn't biased. A confidence interval's margin of error only accounts for random sampling variability. It says nothing about bias, so a biased sample produces an interval that confidently captures the wrong number.
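A quick simulation can show what "confidently captures the wrong number" looks like. This is a sketch under invented assumptions: the population split, the two group proportions, the sample size, and all function names are hypothetical. It builds the usual 95% one-proportion z-interval many times, once from a simple random sample and once from a sampling frame that can only reach part of the population (undercoverage), and counts how often each interval captures the true p.

```python
import math
import random

# Hypothetical population: 70% can be reached by the sampling frame.
REACHABLE_SHARE = 0.70
P_REACHABLE = 0.30      # proportion answering "yes" among reachable people
P_UNREACHABLE = 0.60    # proportion answering "yes" among everyone else
P_TRUE = REACHABLE_SHARE * P_REACHABLE + (1 - REACHABLE_SHARE) * P_UNREACHABLE

def draw_srs(rng):
    """One response from a simple random sample of the whole population."""
    reachable = rng.random() < REACHABLE_SHARE
    return rng.random() < (P_REACHABLE if reachable else P_UNREACHABLE)

def draw_undercovered(rng):
    """One response when only the reachable group can ever be selected."""
    return rng.random() < P_REACHABLE

def z_interval(successes, n, z_star=1.96):
    """95% one-proportion z-interval: p-hat plus or minus z* times its SE."""
    p_hat = successes / n
    margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

def coverage(draw, n=1000, reps=500, seed=2):
    """Fraction of simulated intervals that capture the true proportion."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        successes = sum(draw(rng) for _ in range(n))
        low, high = z_interval(successes, n)
        hits += low <= P_TRUE <= high
    return hits / reps

print("true p:", round(P_TRUE, 2))
print("coverage with SRS:", coverage(draw_srs))
print("coverage with undercoverage:", coverage(draw_undercovered))
```

Under these assumptions the random-sample intervals should capture p roughly 95% of the time, while the undercovered intervals almost never do, even though both use the same formula and the same n. The margin of error is about the same width in both cases; it is just centered in the wrong place when the sample is biased.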
Confounding Variable (Unit 3)
Confounding is the cause-and-effect cousin of bias. Where bias distorts who's in your sample or how they respond, a confounding variable distorts what appears to have caused the response in an observational study or poorly designed experiment. Random sampling fights bias; random assignment fights confounding. The exam loves checking whether you know which tool fixes which problem.
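Here is a rough sketch of why random assignment is the tool for confounding. Everything in it is invented for illustration: the "fitness" confounder, the score model, the probabilities, and the function names. The treatment is set to have no effect at all, so any difference between the treated and control groups comes from how subjects ended up in those groups.

```python
import random
import statistics

TRUE_EFFECT = 0.0  # hypothetical: the treatment does nothing

def response(treated, fit, rng):
    """Hypothetical model: being fit adds 10 points; treatment adds TRUE_EFFECT."""
    return 50 + 10 * fit + TRUE_EFFECT * treated + rng.gauss(0, 5)

def self_selected(fit, rng):
    """Subjects choose for themselves; fit people opt in far more often."""
    return rng.random() < (0.8 if fit else 0.2)

def randomized(fit, rng):
    """A coin flip decides, so fitness cannot influence group membership."""
    return rng.random() < 0.5

def difference_in_means(assign, n=4000, seed=3):
    """Mean response of treated subjects minus mean response of controls."""
    rng = random.Random(seed)
    treated_scores, control_scores = [], []
    for _ in range(n):
        fit = rng.random() < 0.5
        treated = assign(fit, rng)
        score = response(treated, fit, rng)
        (treated_scores if treated else control_scores).append(score)
    return statistics.mean(treated_scores) - statistics.mean(control_scores)

print("self-selected groups:", round(difference_in_means(self_selected), 2))
print("randomly assigned groups:", round(difference_in_means(randomized), 2))
```

With self-selection, the treated group is mostly fit people, so it scores noticeably higher even though the treatment does nothing. With random assignment, fitness is balanced across the two groups and the difference should land near zero.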
Bias is mostly tested as a "spot the flaw" skill. Multiple-choice questions describe a study and ask you to name the source of bias or pick the scenario where a standard confidence interval procedure would be least appropriate, like an interval built from a non-random or volunteer sample. On FRQs, bias appears inside study-design questions. The 2018 FRQ Q2 had a teacher estimating the proportion of students who recycle, where you had to evaluate the sampling method, and the 2021 FRQ Q2 hinged on what a random sample of 100 adults lets you generalize about walking and cholesterol. Two moves earn points. First, name the bias precisely (say "undercoverage bias," not just "the sample is bad"). Second, give the direction in context, explaining whether the estimate is likely too high or too low and why. "The survey is biased" with no direction or context scores nothing.
Bias and variability are the two separate ways an estimate can go wrong, and the exam tests whether you keep them apart. Bias means the estimates are systematically off-center, consistently missing the true value in one direction. Variability means the estimates are spread out around wherever they're centered. Picture darts. Bias is aiming at the wrong spot; variability is shaky hands. Increasing sample size reduces variability (the σ/√n effect from Topic 5.7) but does absolutely nothing to fix bias. A huge voluntary response sample is still biased, just very precisely biased.
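The dartboard picture can be checked with a short simulation. This is a sketch, not a real survey: the true proportion, the volunteer rates, and the function names are all made-up assumptions. It draws many samples two ways (simple random sample versus voluntary response), at a small n and a large n, and reports the center and spread of the resulting p̂ values.

```python
import random
import statistics

P_TRUE = 0.40          # hypothetical true population proportion
RESPOND_IF_YES = 0.60  # hypothetical chance a "yes" person volunteers
RESPOND_IF_NO = 0.20   # hypothetical chance a "no" person volunteers

def srs_phat(n, rng):
    """p-hat from a simple random sample of size n."""
    return sum(rng.random() < P_TRUE for _ in range(n)) / n

def voluntary_phat(n, rng):
    """p-hat from the first n people who choose to respond."""
    yes = 0
    responses = 0
    while responses < n:
        says_yes = rng.random() < P_TRUE
        rate = RESPOND_IF_YES if says_yes else RESPOND_IF_NO
        if rng.random() < rate:      # this person volunteers
            responses += 1
            yes += says_yes
    return yes / n

def summarize(sampler, n, reps=100, seed=1):
    """Center (mean) and spread (SD) of p-hat over many repeated samples."""
    rng = random.Random(seed)
    phats = [sampler(n, rng) for _ in range(reps)]
    return statistics.mean(phats), statistics.stdev(phats)

for n in (100, 2500):
    for name, sampler in (("SRS", srs_phat), ("voluntary", voluntary_phat)):
        center, spread = summarize(sampler, n)
        print(f"n={n:5d} {name:9s} center={center:.3f} spread={spread:.3f}")
```

Under these assumptions, the SRS results center near 0.40 at both sample sizes, while the voluntary response results center near 0.67 at both. Going from n = 100 to n = 2500 shrinks the spread for both methods but leaves the voluntary response center where it was: very precisely wrong.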
Bias is systematic error, meaning the results are consistently pushed in one direction rather than randomly scattered around the truth.
The four named sampling biases in the CED are voluntary response bias, undercoverage bias, nonresponse bias, and response bias, and you should be able to identify each from a study description.
Increasing the sample size reduces variability but never fixes bias, so a large biased sample just gives a precise wrong answer.
An estimator is unbiased when the mean of its sampling distribution equals the population parameter, which is why x̄ and p̂ are unbiased estimators of μ and p under random sampling.
On FRQs, full credit requires naming the specific type of bias and stating the likely direction of the error in the context of the problem.
The random sampling condition for confidence intervals and tests exists to rule out bias, because the margin of error only accounts for random variation, not systematic error.
Bias is systematic error in data collection or estimation that causes results to consistently favor certain outcomes, so your statistic tends to miss the true population value in the same direction every time. The CED defines it in Topic 3.4 as occurring when certain responses are systematically favored over others.
No. A larger sample size reduces variability (the standard deviation of the sampling distribution of x̄ is σ/√n, so it shrinks as n grows), but bias comes from how the sample was selected, not how big it is. A voluntary response sample of 10,000 people is just as biased as one of 100.
Bias is about who ends up in your sample and how they respond, so it threatens generalization. A confounding variable is a third variable tangled up with the treatment, so it threatens causal conclusions. Random sampling addresses bias; random assignment addresses confounding.
Topic 3.4 names voluntary response bias (volunteers self-select), undercoverage bias (part of the population has a reduced chance of selection), nonresponse bias (selected individuals can't be reached or refuse), and response bias (question wording or social pressure distorts answers).
An estimator is unbiased if, on average across all possible random samples, its value equals the population parameter (learning objective AP Stats 5.4.A). For example, the sampling distribution of p̂ has mean exactly equal to p, which makes p̂ an unbiased estimator of the population proportion.