Bias introduces systematic error into research, and biostatistics exams will test whether you can identify when, why, and how different biases distort study findings. You're not just being tested on definitions. You need to understand the mechanisms that cause the error and recognize whether a bias affects who gets into a study, how data is collected, or how results are interpreted and shared.
The biases you'll encounter fall into categories based on where in the research process they occur: participant selection, data collection, analysis, and dissemination. Knowing these categories helps you quickly diagnose problems in study design and propose fixes. Don't just memorize names. Know what stage of research each bias threatens and what strategies prevent it.
These biases occur before data collection even begins. When the sample doesn't accurately represent the target population, external validity collapses no matter how rigorous the rest of the study.
Selection bias happens when the people included in a study systematically differ from the population you're trying to learn about. This threatens generalizability because your results may only apply to the specific (non-representative) group you studied.
Sampling bias arises from how you recruit participants. If you use non-random selection or convenience sampling (say, recruiting only from one hospital), the people who enter your study won't reflect the broader population.
Attrition bias occurs when participants drop out of a study non-randomly, for reasons related to the exposure or outcome being studied. The key word is "non-random." If sicker patients quit a drug trial because of side effects, the remaining participants look healthier than they should.
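If you'd rather see the mechanism than memorize it, here's a minimal Python sketch (all numbers are invented for illustration) of non-random dropout: the sickest participants leave, and the complete-case average drifts away from the truth.

```python
import random
from statistics import mean

random.seed(42)

# One arm of a hypothetical drug trial: each participant has a follow-up
# symptom score (higher = sicker). The true population mean is 50.
n = 10_000
scores = [random.gauss(50, 10) for _ in range(n)]

def stays_in_study(score):
    # Non-random attrition: dropout probability rises with the symptom
    # score (e.g., sicker patients quit because of side effects).
    p_drop = min(0.9, max(0.0, (score - 40) / 50))
    return random.random() > p_drop

completers = [s for s in scores if stays_in_study(s)]

print(f"true mean score:          {mean(scores):.1f}")      # about 50
print(f"complete-case mean score: {mean(completers):.1f}")  # noticeably lower
# The completers look healthier than the cohort ever was.
```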
Compare: Selection bias vs. Sampling bias. Both affect who's in your study, but selection bias refers to systematic differences in how participants are chosen or enrolled, while sampling bias specifically involves flawed sampling techniques. If an exam question describes a convenience sample, think sampling bias first.
These biases corrupt the accuracy of measurements after participants are enrolled. Systematic errors in how exposure or outcome data are gathered lead to misclassification, either differential (varying by group) or non-differential (equal across groups).
Information bias is the broad category for systematic errors in measuring exposures or outcomes. The core mechanism is misclassification: putting someone in the wrong exposure or outcome category because of faulty instruments, inconsistent protocols, or inaccurate records.
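A quick simulation makes misclassification concrete. This sketch (the 25% error rate and the risks are made up) illustrates the standard result that non-differential misclassification of a binary exposure pulls the observed risk ratio toward the null value of 1.

```python
import random

random.seed(1)

n = 200_000
p_flip = 0.25  # chance an exposure is recorded wrongly, same in both outcome groups

exp_cases = exp_total = unexp_cases = unexp_total = 0
for _ in range(n):
    exposed = random.random() < 0.5
    sick = random.random() < (0.20 if exposed else 0.10)  # true risk ratio = 2.0
    measured = exposed ^ (random.random() < p_flip)       # non-differential error
    if measured:
        exp_total += 1
        exp_cases += sick
    else:
        unexp_total += 1
        unexp_cases += sick

rr = (exp_cases / exp_total) / (unexp_cases / unexp_total)
print(f"observed risk ratio: {rr:.2f}")  # about 1.4 instead of 2.0
```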
Recall bias occurs when participants inaccurately remember past exposures. It's especially problematic in retrospective studies where you're asking people to think back months or years.
Observer bias occurs when a researcher's expectations influence how they measure or interpret outcomes. If an investigator knows which patients received the treatment, they may unconsciously rate those patients as improved.
Compare: Recall bias vs. Observer bias. Both involve subjective distortion, but recall bias originates with participants misremembering, while observer bias originates with researchers misinterpreting. Blinding fixes observer bias; using objective records (rather than self-report) fixes recall bias.
These biases affect how relationships between variables are understood. Even with perfect selection and measurement, failing to account for extraneous variables or time-related artifacts can produce misleading conclusions.
Confounding occurs when a third variable is associated with both the exposure and the outcome, creating a spurious (fake) association between them. The confounding variable provides an alternative explanation for your results.
Classic example: Early studies found an association between coffee drinking and lung cancer. But coffee drinkers were more likely to smoke, and smoking causes lung cancer. Smoking was the confounder, and once you accounted for it, the coffee-cancer link disappeared.
Four key strategies to address confounding:

- Randomization: randomly assigning the exposure breaks its link to the confounder, and it handles measured and unmeasured confounders alike.
- Restriction: enrolling only one level of the confounder (e.g., non-smokers only), so it cannot vary.
- Matching: pairing exposed and unexposed participants on confounder values at the design stage.
- Statistical adjustment: stratifying or modeling (e.g., multivariable regression) during analysis, as in the sketch below.
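Here's a minimal simulation of the coffee-smoking example above, with invented probabilities. The crude analysis shows a strong coffee-cancer association, but stratifying on smoking (the fourth strategy) makes it disappear.

```python
import random

random.seed(7)

def risk_ratio(rows, exposure, outcome):
    """Risk ratio for `outcome`, comparing exposure=True vs. False."""
    def risk(flag):
        group = [r for r in rows if r[exposure] == flag]
        return sum(r[outcome] for r in group) / len(group)
    return risk(True) / risk(False)

people = []
for _ in range(100_000):
    smokes = random.random() < 0.4
    # Coffee drinking is more common among smokers (the confounding link)...
    coffee = random.random() < (0.8 if smokes else 0.3)
    # ...but cancer risk depends only on smoking, never on coffee.
    cancer = random.random() < (0.05 if smokes else 0.005)
    people.append({"smokes": smokes, "coffee": coffee, "cancer": cancer})

print(f"crude coffee RR: {risk_ratio(people, 'coffee', 'cancer'):.2f}")  # well above 1
for s in (True, False):
    stratum = [p for p in people if p["smokes"] == s]
    label = "smokers" if s else "non-smokers"
    print(f"coffee RR among {label}: {risk_ratio(stratum, 'coffee', 'cancer'):.2f}")  # near 1
```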
Lead-time bias is a screening artifact. It occurs when earlier detection through screening appears to extend survival without actually changing the course of the disease.
Here's why it happens: survival time is measured from the point of diagnosis. If screening detects a cancer at age 55 instead of symptoms appearing at age 60, and the patient dies at age 68 either way, the screened patient looks like they survived 13 years while the unscreened patient survived only 8. The screening didn't add life; it just added time spent knowing about the disease.
Mortality rates (deaths per population over a time period) provide a more accurate measure of whether a screening program actually saves lives, because they don't depend on when diagnosis occurs.
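The arithmetic from the paragraph above, made explicit in a few lines:

```python
# Lead-time bias in miniature: same biology, different "survival".
dx_screened   = 55  # age at diagnosis via screening
dx_unscreened = 60  # age at diagnosis via symptoms
death_age     = 68  # identical either way: screening changed nothing

print(f"screened 'survival':   {death_age - dx_screened} years")      # 13
print(f"unscreened 'survival': {death_age - dx_unscreened} years")    # 8
print(f"lead time:             {dx_unscreened - dx_screened} years")  # 5, pure artifact
# Age at death, and therefore the mortality rate, is unchanged. That is why
# mortality, not survival-from-diagnosis, is the honest screening endpoint.
```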
Compare: Confounding bias vs. Information bias. Confounding involves a real third variable distorting the exposure-outcome relationship, while information bias involves measurement error in the exposure or outcome itself. Confounding can often be addressed during analysis with statistical adjustment; information bias generally cannot be fixed after data collection.
These biases occur after studies are completed, affecting what evidence reaches the scientific community. When published literature doesn't reflect all conducted research, systematic reviews and meta-analyses inherit distorted effect estimates.
Reporting bias occurs within a single study when researchers emphasize certain results while downplaying or omitting others based on statistical significance or the direction of findings.
Publication bias operates across studies. Journals are more likely to publish studies with significant or favorable findings, while null or negative results often go unpublished.
Compare: Reporting bias vs. Publication bias. Reporting bias occurs within a study (selective presentation of outcomes) and is author-driven. Publication bias occurs across studies (selective publication of entire studies) and involves editorial and systemic factors. Both distort the evidence base.
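To see the across-studies mechanism, here's a toy simulation (every parameter invented): hundreds of honest studies estimate the same modest effect, but only the statistically significant ones get "published," so the published literature overstates the truth. This is also the pattern a funnel plot is designed to expose.

```python
import random
from statistics import mean

random.seed(3)

true_effect = 0.20  # the real (modest) benefit every study is estimating
studies = []
for _ in range(2_000):
    n = random.choice([25, 50, 100, 200, 400])  # per-arm sample size
    se = (2 / n) ** 0.5                         # std. error of a mean difference (sd = 1)
    estimate = random.gauss(true_effect, se)
    significant = abs(estimate) / se > 1.96     # "p < 0.05"
    studies.append((estimate, significant))

published = [est for est, sig in studies if sig]  # only significant results appear
print(f"true effect:          {true_effect:.2f}")
print(f"mean, ALL studies:    {mean(est for est, _ in studies):.2f}")  # about 0.20
print(f"mean, published only: {mean(published):.2f}")                  # inflated
# Plotting estimates against standard errors for the published subset gives a
# lopsided funnel: small, imprecise studies appear only when their results are extreme.
```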
| Concept | Relevant biases |
|---|---|
| Participant selection problems | Selection bias, Sampling bias, Attrition bias |
| Measurement/data collection errors | Information bias, Recall bias, Observer bias |
| Third-variable distortion | Confounding bias |
| Time-related artifacts | Lead-time bias |
| Dissemination distortion | Reporting bias, Publication bias |
| Mitigated by blinding | Observer bias, Information bias |
| Mitigated by randomization | Confounding bias, Selection bias |
| Threatens external validity | Sampling bias, Selection bias, Attrition bias |
A case-control study finds that mothers of children with autism report higher pesticide exposure than mothers of healthy children. Which two biases could explain this finding, and how would you distinguish between them?
Researchers notice that participants who drop out of a weight-loss trial had higher baseline BMIs than completers. What type of bias does this represent, and how might it affect the study's conclusions?
Compare and contrast confounding bias and information bias: at what stage of research does each occur, and which can be corrected during statistical analysis?
A new cancer screening test shows 5-year survival rates of 85% compared to 60% for unscreened patients. A biostatistician argues this doesn't prove the screening saves lives. What bias is she concerned about, and what alternative measure would provide stronger evidence?
A meta-analysis of antidepressant trials shows that published studies report larger effect sizes than unpublished FDA submissions. Which bias does this demonstrate, and what graphical tool could have detected it?