Biostatistics

Types of Bias in Research


Why This Matters

Bias is the silent saboteur of research validity, and biostatistics exams will test whether you can identify when, why, and how different biases distort study findings. You're not just being tested on definitions—you need to understand the underlying mechanisms that introduce systematic error into research. This means recognizing whether a bias affects who gets into a study, how data is collected, or how results are interpreted and shared.

The biases you'll encounter fall into distinct categories based on where in the research process they occur: participant selection, data collection, analysis, and dissemination. Mastering these categories helps you quickly diagnose problems in study design and propose solutions—exactly what FRQs demand. Don't just memorize names—know what stage of research each bias threatens and what strategies prevent it.


Biases in Participant Selection

These biases occur before data collection even begins. When the sample doesn't accurately represent the target population, external validity collapses—no matter how rigorous the rest of the study.

Selection Bias

  • Non-representative participants—occurs when those included in a study systematically differ from the general population, threatening generalizability
  • Case-control studies are particularly vulnerable when cases and controls differ in characteristics beyond the exposure of interest
  • Volunteer bias is a common subtype where self-selected participants have different risk profiles than non-volunteers

Sampling Bias

  • Flawed sampling methods—arises when non-random selection or convenience sampling skews who enters the study
  • Self-selection occurs when participants choose whether to join, often resulting in healthier or more motivated samples
  • External validity suffers most, meaning findings may not apply beyond the specific study population

Attrition Bias

  • Non-random dropout—occurs when participants leave a study for reasons related to the exposure or outcome being studied
  • Differential loss to follow-up between groups can create systematic differences that bias effect estimates
  • Intention-to-treat analysis helps mitigate this by analyzing participants in their original assigned groups regardless of completion (see the sketch below)
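
To see why completers-only analysis misleads here, consider a minimal simulation (hypothetical numbers, not from the source material): a randomized trial where the treatment truly works, but treated participants who respond poorly are more likely to drop out. The per-protocol estimate overstates the benefit; analyzing everyone as randomized recovers the true effect. The sketch assumes dropouts' outcomes are still observed, which real ITT analyses must handle with a missing-data strategy.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Randomized assignment: 1 = treatment, 0 = control
treat = rng.integers(0, 2, n)

# True effect: treatment lowers the outcome (e.g., weight change) by 2 units
outcome = -2.0 * treat + rng.normal(0, 5, n)

# Non-random dropout: treated participants with a poor response
# (outcome above 0) drop out half the time; controls rarely drop out
drop_prob = np.where(treat == 1, 0.5 * (outcome > 0), 0.05)
completed = rng.random(n) > drop_prob

def diff(exp_mask, out, keep):
    return out[keep & exp_mask].mean() - out[keep & ~exp_mask].mean()

# Per-protocol (completers only): biased, because dropout depends on outcome
pp = diff(treat == 1, outcome, completed)

# Intention-to-treat: everyone analyzed in the group they were randomized to
itt = diff(treat == 1, outcome, np.ones(n, dtype=bool))

print(f"true effect: -2.0   per-protocol: {pp:.2f}   ITT: {itt:.2f}")
```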

Compare: Selection bias vs. Sampling bias—both affect who's in your study, but selection bias refers to systematic differences in how participants are chosen or enrolled, while sampling bias specifically involves flawed sampling techniques. If an FRQ describes a convenience sample, think sampling bias first.


Biases in Data Collection

These biases corrupt the accuracy of measurements after participants are enrolled. Systematic errors in how exposure or outcome data are gathered lead to misclassification—either differential (varying by group) or non-differential (equal across groups).

Information Bias

  • Measurement inaccuracy—arises when data collected about exposures or outcomes contains systematic errors
  • Misclassification of exposure or outcome status is the core mechanism, whether from faulty instruments or inconsistent protocols
  • Non-differential misclassification typically biases results toward the null, while differential misclassification can bias in either direction (see the simulation below)
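
A quick way to build intuition for "toward the null" is to simulate it. In this hypothetical sketch, the true risk ratio is 2.0; randomly flipping 20% of exposure labels, independent of outcome status, pulls the estimate toward 1.0.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# True exposure doubles the risk of the outcome (20% vs 10%)
exposed = rng.random(n) < 0.5
outcome = rng.random(n) < np.where(exposed, 0.20, 0.10)

def risk_ratio(exp, out):
    return out[exp].mean() / out[~exp].mean()

# Non-differential misclassification: flip 20% of exposure labels
# at random, with no regard to outcome status
flip = rng.random(n) < 0.20
noisy = np.where(flip, ~exposed, exposed)

print(f"true RR:          {risk_ratio(exposed, outcome):.2f}")  # ~2.0
print(f"misclassified RR: {risk_ratio(noisy, outcome):.2f}")    # pulled toward 1.0
```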

Recall Bias

  • Memory-dependent errors—occurs when participants inaccurately remember past exposures, especially problematic in retrospective studies
  • Differential recall happens when cases search their memories more thoroughly than controls (e.g., mothers of children with birth defects recalling medication use)
  • Case-control studies relying on self-reported historical data are most vulnerable to this bias

Observer Bias

  • Researcher expectations influence assessment—occurs when investigators' preconceptions affect how they measure or interpret outcomes
  • Subjective endpoints like pain scales or disease severity ratings are particularly susceptible
  • Blinding observers to group assignments is the primary prevention strategy

Compare: Recall bias vs. Observer bias—both involve subjective distortion, but recall bias originates with participants misremembering, while observer bias originates with researchers misinterpreting. Blinding helps with observer bias; using objective records (rather than self-report) helps with recall bias.


Biases in Analysis and Interpretation

These biases affect how relationships between variables are understood. Even with perfect selection and measurement, failing to account for extraneous variables or time-related artifacts can produce misleading conclusions.

Confounding Bias

  • Third-variable problem—occurs when an extraneous variable is associated with both the exposure and outcome, creating a spurious association
  • Classic example: the coffee-lung cancer association is confounded by smoking, because coffee drinking is correlated with smoking and smoking causes lung cancer
  • Randomization, restriction, matching, and statistical adjustment (e.g., stratification or multivariable regression) are key mitigation strategies; the simulation below shows stratification at work
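
The coffee-smoking example is easy to reproduce with made-up numbers. In the sketch below, coffee has no effect on cancer at all, yet the crude risk ratio suggests otherwise; stratifying on smoking, one simple form of adjustment, recovers the null.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Confounder: smoking raises both coffee drinking and cancer risk.
# Coffee has NO direct effect on cancer in this simulation.
smoker = rng.random(n) < 0.3
coffee = rng.random(n) < np.where(smoker, 0.8, 0.3)
cancer = rng.random(n) < np.where(smoker, 0.15, 0.01)

def risk_ratio(exp, out):
    return out[exp].mean() / out[~exp].mean()

# Crude analysis: a spurious coffee-cancer association appears
print(f"crude RR: {risk_ratio(coffee, cancer):.2f}")  # well above 1

# Stratified analysis: condition on the confounder and the association vanishes
for value, label in [(True, "smokers"), (False, "non-smokers")]:
    m = smoker == value
    print(f"RR among {label}: {risk_ratio(coffee[m], cancer[m]):.2f}")  # ~1.0
```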

Lead-Time Bias

  • Screening artifact—occurs when earlier detection through screening appears to extend survival without actually changing disease course
  • Survival time is measured from diagnosis, so detecting disease earlier automatically adds time even if death occurs at the same age
  • Mortality rates (rather than survival time from diagnosis) provide a more accurate measure of screening program effectiveness, as the sketch below illustrates
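
The arithmetic behind lead-time bias can be shown with a toy natural history (all numbers hypothetical): death always occurs six years after disease onset, screening detects the disease at year 4, and symptoms appear at year 5. Screening adds a year of measured "survival" without postponing death by a single day.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Fixed natural history: death comes 6 years after onset, with or
# without screening (i.e., screening does not change the disease course)
onset_age = rng.normal(60, 5, n)
death_age = onset_age + 6

dx_screened = onset_age + 4  # detected earlier by screening
dx_clinical = onset_age + 5  # detected when symptoms appear

# Survival measured from diagnosis looks better with screening...
print(f"survival from dx, screened: {(death_age - dx_screened).mean():.1f} years")  # 2.0
print(f"survival from dx, clinical: {(death_age - dx_clinical).mean():.1f} years")  # 1.0

# ...but age at death is identical, so mortality is unchanged
print(f"mean age at death (either way): {death_age.mean():.1f} years")
```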

Compare: Confounding bias vs. Information bias—confounding involves a real third variable distorting the exposure-outcome relationship, while information bias involves measurement error in the exposure or outcome itself. Confounding can be addressed in analysis; information bias cannot be fixed after data collection.


Biases in Research Dissemination

These biases occur after studies are completed, affecting what evidence reaches the scientific community. When published literature doesn't reflect all conducted research, systematic reviews and meta-analyses inherit distorted effect estimates.

Reporting Bias

  • Selective outcome reporting—occurs when researchers emphasize certain results while downplaying or omitting others based on statistical significance or direction
  • Outcome switching happens when primary endpoints are changed after seeing results, inflating apparent treatment effects
  • Trial registration requirements (e.g., ClinicalTrials.gov) help prevent this by documenting planned outcomes prospectively

Publication Bias

  • Positive-results preference—refers to journals' tendency to publish significant or favorable findings over null or negative results
  • File-drawer problem describes unpublished negative studies that never enter the evidence base
  • Funnel plots in meta-analyses can detect this bias by showing asymmetry in effect sizes across studies of varying precision (simulated in the sketch below)
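
As a rough illustration (simulated studies, not a real meta-analysis), the sketch below draws effect estimates around a true effect, "publishes" only the statistically significant ones, and plots effect size against standard error. The surviving points form the asymmetric funnel that flags publication bias.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
true_effect, n_studies = 0.3, 300

# Studies of varying precision: small studies have large standard errors
se = rng.uniform(0.05, 0.5, n_studies)
effect = rng.normal(true_effect, se)

# Publication bias: only "significant" positive results get published,
# so small studies survive only when their effects are inflated
published = effect / se > 1.96

plt.scatter(effect[published], se[published], s=12, label="published")
plt.scatter(effect[~published], se[~published], s=12, alpha=0.3,
            label="unpublished (file drawer)")
plt.axvline(true_effect, linestyle="--", color="gray")
plt.gca().invert_yaxis()  # convention: precise studies at the top
plt.xlabel("Effect size")
plt.ylabel("Standard error")
plt.legend()
plt.title("Funnel plot asymmetry under publication bias")
plt.show()
```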

Compare: Reporting bias vs. Publication bias—reporting bias occurs within a study (selective presentation of outcomes), while publication bias occurs across studies (selective publication of entire studies). Both distort the evidence base, but reporting bias is author-driven while publication bias involves editorial and systemic factors.


Quick Reference Table

Concept | Best Examples
Participant selection problems | Selection bias, Sampling bias, Attrition bias
Measurement/data collection errors | Information bias, Recall bias, Observer bias
Third-variable distortion | Confounding bias
Time-related artifacts | Lead-time bias
Dissemination distortion | Reporting bias, Publication bias
Mitigated by blinding | Observer bias, Information bias
Mitigated by randomization | Confounding bias, Selection bias
Threatens external validity | Sampling bias, Selection bias, Attrition bias

Self-Check Questions

  1. A case-control study finds that mothers of children with autism report higher pesticide exposure than mothers of healthy children. Which two biases could explain this finding, and how would you distinguish between them?

  2. Researchers notice that participants who drop out of a weight-loss trial had higher baseline BMIs than completers. What type of bias does this represent, and how might it affect the study's conclusions?

  3. Compare and contrast confounding bias and information bias: at what stage of research does each occur, and which can be corrected during statistical analysis?

  4. A new cancer screening test shows 5-year survival rates of 85% compared to 60% for unscreened patients. A biostatistician argues this doesn't prove the screening saves lives. What bias is she concerned about, and what alternative measure would provide stronger evidence?

  5. A meta-analysis of antidepressant trials shows that published studies report larger effect sizes than unpublished FDA submissions. Which bias does this demonstrate, and what graphical tool could have detected it?