
AP Research

Validity Threats


Why This Matters

In AP Research, your ability to design a credible study—and critique the studies of others—hinges on understanding what can go wrong. Validity threats are the hidden landmines that can undermine even the most carefully planned research. When you evaluate sources for your literature review or defend your own methodology, you're being tested on whether you can identify internal validity issues (did the intervention actually cause the effect?) and external validity issues (can these findings apply beyond this specific study?). These concepts appear throughout the AP Research framework, from source evaluation to your own methodological transparency.

Here's the key insight: validity threats aren't just a checklist to memorize. They represent fundamental principles about why research can mislead us—whether through flawed participant selection, uncontrolled variables, or measurement inconsistencies. When you encounter these threats in your own work or in published studies, you need to recognize the underlying mechanism and explain how it compromises conclusions. Don't just memorize the terms—know what type of validity each threat attacks and how researchers attempt to control for it.


Participant-Based Threats

These threats stem from who participates in your study and how they change over time. The core principle: your participants must accurately represent the population you're studying, and any changes in them must be attributable to your intervention—not external factors.

Selection Bias

  • Non-representative sampling—occurs when participants don't reflect the larger population, often due to convenience sampling, volunteer bias, or self-selection
  • Limits generalizability of findings; results may only apply to the specific group studied rather than the broader population you're claiming to address
  • Random sampling and stratification are primary controls; acknowledge selection limitations explicitly in your methodology section
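The sampling fix can be made concrete with a short simulation. This is a toy sketch (invented numbers, Python standard library only, not from any real study): a volunteer-only convenience sample overstates the population mean, while a proportionate stratified random sample recovers it.

```python
import random

random.seed(0)

# Toy population: 30% "volunteer" types who score higher on the outcome,
# 70% "non-volunteer" types. A volunteer-only sample is therefore biased.
population = (
    [{"group": "volunteer", "score": 80} for _ in range(300)]
    + [{"group": "non_volunteer", "score": 60} for _ in range(700)]
)

def mean_score(sample):
    return sum(p["score"] for p in sample) / len(sample)

# Convenience / volunteer sample: only volunteers respond.
convenience = [p for p in population if p["group"] == "volunteer"][:100]

# Stratified random sample: draw from each stratum in proportion to its size.
volunteers = [p for p in population if p["group"] == "volunteer"]
others = [p for p in population if p["group"] == "non_volunteer"]
stratified = random.sample(volunteers, 30) + random.sample(others, 70)

true_mean = mean_score(population)          # 66.0
convenience_mean = mean_score(convenience)  # 80.0, biased upward
stratified_mean = mean_score(stratified)    # 66.0, matches the population
```

The convenience sample misses an entire stratum, so no amount of extra data would fix it; the stratified sample is small but representative.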

Maturation

  • Natural participant changes over time—physical, cognitive, or emotional development that occurs independently of any intervention
  • Particularly problematic in longitudinal designs where weeks or months pass between measurements; younger participants are especially susceptible
  • Control groups help isolate whether observed changes stem from your intervention or would have occurred anyway through normal development

Attrition

  • Participant dropout during the study—threatens validity when those who leave differ systematically from those who remain (differential attrition)
  • Reduces statistical power and can create a biased sample; if struggling participants drop out, your results may overstate intervention effectiveness
  • Track and report dropout rates and reasons; conduct intention-to-treat analysis when possible to account for missing data
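A quick simulation (hypothetical scores, no real data) shows how differential attrition inflates results even when the intervention does nothing: if struggling participants are more likely to drop out, the mean among completers exceeds the honest full-sample mean.

```python
import random

random.seed(7)

N = 1_000
# Post-test scores with NO real intervention effect: mean 60, SD 12.
scores = [random.gauss(60, 12) for _ in range(N)]

# Differential attrition: students scoring below 55 drop out 70% of the time;
# everyone else stays in the study.
completers = [s for s in scores if s >= 55 or random.random() < 0.3]

full_mean = sum(scores) / len(scores)               # the honest estimate, ~60
completer_mean = sum(completers) / len(completers)  # inflated by dropout
```

Reporting only `completer_mean` would make a do-nothing intervention look effective, which is why tracking dropout reasons and intention-to-treat analysis matter.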

Compare: Selection Bias vs. Attrition—both create non-representative samples, but selection bias occurs before data collection while attrition occurs during the study. If an FRQ asks about threats to generalizability, consider whether the problem started at recruitment or emerged later.


Time-Based and External Threats

These threats arise from events or changes that occur during your study period. The core principle: anything happening alongside your intervention could be the real cause of observed effects.

History

  • External events during the study that influence participants' responses—ranging from major events (elections, natural disasters) to local factors (school schedule changes, news coverage)
  • Confounds your intervention's effects because you cannot isolate whether changes resulted from your treatment or the external event
  • Control groups experiencing the same events help; also document the timeline of your study relative to significant occurrences

Testing Effects

  • Prior test exposure influences later performance—includes practice effects (improved scores from familiarity) and test-wiseness (learning how to take the test rather than mastering content)
  • Threatens pre-test/post-test designs where the same or similar instruments are used; participants may improve simply from repetition
  • Use alternate forms of assessments, extend time between measurements, or employ Solomon four-group designs to detect testing effects

Compare: History vs. Maturation—both involve changes over time, but history refers to external events while maturation refers to internal developmental changes in participants. When critiquing longitudinal research, ask: is this change coming from outside or inside the participant?


Measurement and Instrumentation Threats

These threats emerge from how you collect and measure data. The core principle: your measurement tools must remain consistent and accurate throughout the study, or variations in your data may reflect instrument problems rather than real effects.

Instrumentation

  • Changes in measurement tools or procedures—includes equipment drift, observer fatigue, revised scoring rubrics, or inconsistent interview protocols
  • Creates artificial variation in your data that has nothing to do with your actual research question; undermines reliability
  • Calibrate instruments regularly, train all data collectors to consistent standards, and use standardized protocols documented in your methodology

Regression to the Mean

  • Extreme scores naturally move toward average on subsequent measurements—a statistical phenomenon, not a real change
  • Misleads researchers into believing an intervention caused improvement when participants were simply selected for having unusually high or low initial scores
  • Avoid selecting participants based on extreme scores; use control groups and recognize this threat when studying "at-risk" or "high-performing" populations
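Because regression to the mean is purely statistical, it is easy to demonstrate by simulation. In this sketch (invented parameters), students selected for the bottom 10% of a noisy pre-test "improve" on the post-test with no intervention at all:

```python
import random

random.seed(42)

N = 10_000
# Each student has a stable true ability; each test adds independent noise.
true_ability = [random.gauss(70, 10) for _ in range(N)]
test1 = [a + random.gauss(0, 10) for a in true_ability]
test2 = [a + random.gauss(0, 10) for a in true_ability]  # no intervention occurs

# Select the bottom 10% of test-1 scorers, as an "at-risk" program might.
cutoff = sorted(test1)[N // 10]
selected = [i for i in range(N) if test1[i] < cutoff]

mean1 = sum(test1[i] for i in selected) / len(selected)
mean2 = sum(test2[i] for i in selected) / len(selected)

# mean2 is noticeably higher than mean1: the group was partly selected for
# bad luck on test 1, and that luck does not repeat on test 2.
```

A control group drawn from the same extreme scorers would show the same "gain," which is exactly how researchers detect this threat.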

Compare: Instrumentation vs. Testing Effects—both involve measurement issues, but instrumentation is about changes in your tools while testing effects are about changes in participants due to measurement exposure. One is a researcher problem; the other is a participant response.


Researcher and Participant Behavior Threats

These threats stem from the human element in research—how expectations and social dynamics distort authentic responses. The core principle: both researchers and participants can unconsciously (or consciously) behave in ways that skew results toward expected outcomes.

Experimenter Bias

  • Researcher expectations influence outcomes—can affect how questions are asked, how responses are recorded, how data is coded, and how ambiguous results are interpreted
  • Operates unconsciously in most cases; researchers genuinely believe they're being objective while subtly steering results
  • Blinding protocols (single-blind, double-blind) and standardized procedures minimize influence; having multiple independent coders check reliability also helps

Demand Characteristics

  • Participants detect study purpose and modify behavior accordingly—may try to "help" by confirming hypotheses, rebel against perceived expectations, or perform based on social desirability
  • Compromises ecological validity because participants aren't behaving naturally; particularly problematic in psychology and social science research
  • Deception (with ethical approval), cover stories, and unobtrusive measures reduce demand characteristics; post-study interviews can reveal whether participants guessed the hypothesis

Compare: Experimenter Bias vs. Demand Characteristics—both involve expectations distorting results, but experimenter bias comes from the researcher's behavior while demand characteristics come from participants' behavior. Double-blind designs address the former; careful study design addresses the latter.


Variable Control Threats

This category addresses the fundamental challenge of isolating cause and effect. The core principle: if other variables could explain your results, you cannot claim your independent variable caused the observed changes.

Confounding Variables

  • Third variables that correlate with both IV and DV—create alternative explanations for observed relationships and prevent causal conclusions
  • The central threat to internal validity; even strong correlations cannot support causal claims if confounds aren't addressed (classic example: ice cream sales and drowning deaths both rise in summer—temperature is the confound)
  • Control through randomization, matching, statistical controls (ANCOVA), or explicitly measuring and accounting for potential confounds in your analysis
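The ice cream example can be simulated to show what "statistically controlling" for a confound means. This sketch (made-up numbers) correlates two variables that never influence each other, then removes the shared temperature signal by correlating regression residuals—the idea behind partial correlation and ANCOVA:

```python
import random

random.seed(1)

def pearson(x, y):
    """Pearson correlation coefficient, computed from scratch."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx ** 0.5 * vy ** 0.5)

def residuals(y, x):
    """What's left of y after regressing out x (simple linear regression)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    beta = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
        (a - mx) ** 2 for a in x
    )
    return [b - (my + beta * (a - mx)) for a, b in zip(x, y)]

N = 5_000
# Temperature drives BOTH variables; they never influence each other.
temp = [random.gauss(20, 8) for _ in range(N)]
ice_cream = [t * 2.0 + random.gauss(0, 5) for t in temp]
drownings = [t * 0.5 + random.gauss(0, 3) for t in temp]

raw_r = pearson(ice_cream, drownings)  # strong, but entirely spurious
partial_r = pearson(residuals(ice_cream, temp), residuals(drownings, temp))
# partial_r is near zero: once temperature is controlled, no relationship remains
```

Randomization accomplishes the same goal by design rather than by after-the-fact statistics: it breaks the link between the confound and group assignment.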

Compare: Confounding Variables vs. History—both introduce alternative explanations, but confounds are variables that systematically vary with your IV while history refers to discrete external events. Confounds are ongoing; historical threats are time-bound occurrences.


Quick Reference Table

  • Internal validity (causation): Confounding variables, History, Maturation, Selection bias
  • External validity (generalizability): Selection bias, Attrition, Demand characteristics
  • Measurement reliability: Instrumentation, Testing effects, Regression to the mean
  • Researcher objectivity: Experimenter bias, Instrumentation (observer drift)
  • Participant authenticity: Demand characteristics, Testing effects, Attrition
  • Longitudinal design risks: Maturation, History, Attrition, Instrumentation
  • Pre-test/post-test design risks: Testing effects, Regression to the mean, Maturation
  • Statistical interpretation: Regression to the mean, Confounding variables

Self-Check Questions

  1. A researcher selects students who scored in the bottom 10% on a math assessment for an intervention, then celebrates when their post-test scores improve. Which two validity threats most likely explain this "improvement" without any real intervention effect?

  2. In a six-month study on a new teaching method, students in the treatment group show significant gains while control group students show modest gains. A major education policy change was announced midway through the study. How would you distinguish between history, maturation, and the actual intervention effect?

  3. Compare and contrast experimenter bias and demand characteristics: What do they have in common, and what strategies address each one differently?

  4. You're evaluating a published study for your literature review. The researchers used a convenience sample of college students and lost 40% of participants before the study concluded, with most dropouts being students who reported struggling with the material. Identify the validity threats and explain how they limit the study's conclusions.

  5. An FRQ asks you to design a study that minimizes threats to internal validity. Which three validity threats would you prioritize addressing, and what specific methodological choices would you make to control for each?