📊Causal Inference

Key Concepts of Quasi-Experimental Designs

Why This Matters

When randomized controlled trials aren't possible—due to ethical constraints, cost, or practical limitations—quasi-experimental designs become your primary toolkit for establishing causality. You're being tested on your ability to identify when each design is appropriate, what assumptions must hold, and how threats to validity differ across methods. These aren't just abstract techniques; they're the workhorses behind policy evaluation, program assessment, and empirical research in economics, public health, and social sciences.

Understanding these designs means recognizing that causal inference is fundamentally about ruling out alternative explanations. Each method addresses confounding in a different way—some exploit timing, others leverage cutoffs, and still others rely on external variation. Don't just memorize definitions—know what identifying assumption each design requires and what would cause it to fail.


Designs That Exploit Timing

These methods leverage the structure of when interventions occur to separate causal effects from pre-existing trends. The key insight: if you can observe the same units (or comparable units) before and after treatment, you can difference out confounding factors that don't change over time.

Difference-in-Differences (DiD)

  • Compares changes over time between treatment and control groups—not just levels, but the difference in differences
  • Parallel trends assumption—in the absence of treatment, both groups would have followed the same trajectory; this is the critical identifying assumption
  • Policy evaluation workhorse—ideal for assessing interventions like minimum wage changes or policy rollouts where randomization is impossible
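
To make the mechanics concrete, here is a minimal DiD sketch in Python on simulated two-group, two-period data (the variable names `treated` and `post` and the effect size are illustrative assumptions, not taken from any real study). The coefficient on the interaction term is the difference-in-differences estimate.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated two-group, two-period data (illustrative only)
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "treated": rng.integers(0, 2, n),   # 1 = treatment group
    "post":    rng.integers(0, 2, n),   # 1 = after the policy change
})
# Outcome with a group gap, a common time trend, and a true effect of 2.0
df["y"] = (1.0 * df["treated"] + 0.5 * df["post"]
           + 2.0 * df["treated"] * df["post"] + rng.normal(0, 1, n))

# The coefficient on treated:post is the difference-in-differences estimate
did = smf.ols("y ~ treated + post + treated:post", data=df).fit()
print(did.params["treated:post"])  # should be close to 2.0
```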

Interrupted Time Series (ITS)

  • Multiple observations before and after—analyzes trends across many time points to detect changes in level or slope following intervention
  • Single-group design—doesn't require a control group, making it useful when comparison groups don't exist
  • Threats from concurrent events—any other change occurring at the intervention time (history threat) can bias results
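
A minimal segmented-regression sketch for ITS, again on simulated data (the series length, intervention timing, and variable names are illustrative assumptions): `post` captures the level change at the intervention and `time_since` captures the change in slope.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated monthly series with an intervention at t = 24 (illustrative only)
rng = np.random.default_rng(1)
t = np.arange(48)
post = (t >= 24).astype(int)
y = 10 + 0.2 * t + 3.0 * post + 0.5 * post * (t - 24) + rng.normal(0, 1, 48)
df = pd.DataFrame({"y": y, "time": t, "post": post,
                   "time_since": post * (t - 24)})

# Segmented regression: 'post' is the level change at the intervention,
# 'time_since' is the change in slope afterward (real applications should
# use autocorrelation-robust standard errors)
its = smf.ols("y ~ time + post + time_since", data=df).fit()
print(its.params[["post", "time_since"]])
```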

Compare: DiD vs. ITS—both exploit timing, but DiD requires a control group to difference out time trends while ITS relies on extrapolating pre-intervention trends. If you have a strong comparison group, DiD is preferred; if you have rich time-series data but no control, ITS is your fallback.


Designs That Exploit Thresholds and Cutoffs

These methods identify causal effects by exploiting assignment rules or external sources of variation rather than timing. For RDD, the logic is that units just above and below an arbitrary threshold are essentially randomly assigned to treatment, creating local randomization; for IV, an outside variable shifts who gets treated without directly affecting the outcome.

Regression Discontinuity Design (RDD)

  • Exploits a cutoff rule—treatment is assigned based on whether a running variable (test score, age, income) falls above or below a threshold
  • Local average treatment effect (LATE)—estimates are valid only for units near the cutoff, limiting external validity
  • Manipulation threat—if units can precisely control their running variable to sort around the cutoff, the design fails; always test for bunching
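
A minimal local-linear RDD sketch on simulated data (the cutoff, bandwidth, and variable names are illustrative assumptions): restrict to observations near the cutoff, allow separate slopes on each side, and read the treatment effect off the jump at the threshold.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated running variable with a cutoff at 0; treatment if score >= 0
rng = np.random.default_rng(2)
score = rng.uniform(-1, 1, 2000)
treat = (score >= 0).astype(int)
y = 1.0 + 0.8 * score + 1.5 * treat + rng.normal(0, 1, 2000)  # true jump = 1.5
df = pd.DataFrame({"y": y, "score": score, "treat": treat})

# Local linear RDD: keep observations within a bandwidth of the cutoff and
# allow different slopes on each side; the coefficient on 'treat' is the LATE
h = 0.25  # bandwidth (would normally be chosen by a data-driven rule)
local = df[df["score"].abs() <= h]
rdd = smf.ols("y ~ treat + score + treat:score", data=local).fit()
print(rdd.params["treat"])  # jump at the cutoff, close to 1.5
```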

Instrumental Variables (IV)

  • Uses external variation—an instrument affects treatment assignment but has no direct effect on the outcome (exclusion restriction)
  • Two key conditions—the instrument must be (1) relevant (correlated with treatment) and (2) exogenous (uncorrelated with the error term)
  • Addresses endogeneity—solves problems of reverse causality and omitted variable bias when valid instruments exist
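
A minimal two-stage least squares sketch on simulated data (the instrument, coefficients, and names are illustrative assumptions). Doing 2SLS by hand shows the logic; in practice you would use a dedicated IV routine so the standard errors account for the estimated first stage.

```python
import numpy as np
import statsmodels.api as sm

# Simulated setting: endogenous treatment d, instrument z, confounder u
rng = np.random.default_rng(3)
n = 5000
z = rng.integers(0, 2, n)                       # instrument (e.g., encouragement)
u = rng.normal(0, 1, n)                         # unobserved confounder
d = (0.8 * z + 0.5 * u + rng.normal(0, 1, n) > 0.5).astype(float)
y = 2.0 * d + u + rng.normal(0, 1, n)           # true effect of d is 2.0

# Naive OLS is biased because u affects both d and y
naive = sm.OLS(y, sm.add_constant(d)).fit()

# Two-stage least squares by hand: regress d on z, then y on fitted d
first = sm.OLS(d, sm.add_constant(z)).fit()
d_hat = first.fittedvalues
second = sm.OLS(y, sm.add_constant(d_hat)).fit()
print(naive.params[1], second.params[1])        # biased vs. roughly 2.0
```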

Compare: RDD vs. IV—both provide causal estimates without randomization, but RDD exploits a known assignment rule while IV exploits external variation. RDD gives you a clear visual test (plot the discontinuity); IV validity is harder to verify since the exclusion restriction cannot be directly tested.


Designs That Construct Comparison Groups

When natural comparison groups don't exist, these methods create them statistically. The goal: balance observable characteristics between treated and control units to approximate what randomization would achieve.

Propensity Score Matching (PSM)

  • Matches on treatment probability—the propensity score (estimated likelihood of receiving treatment) summarizes all observed covariates into a single number
  • Selection on observables—assumes that after conditioning on the propensity score, treatment assignment is independent of potential outcomes
  • Common support requirement—matching only works where treated and control units have overlapping propensity scores; check for overlap before proceeding
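
A minimal PSM sketch on simulated data (the single covariate `x`, the logistic model, and 1:1 nearest-neighbor matching with replacement are illustrative choices): estimate the propensity score, match treated units to similar controls, then compare outcomes.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

# Simulated observational data where x drives both treatment and outcome
rng = np.random.default_rng(4)
n = 2000
x = rng.normal(0, 1, n)
treat = (rng.uniform(size=n) < 1 / (1 + np.exp(-x))).astype(int)
y = 1.0 * treat + 2.0 * x + rng.normal(0, 1, n)   # true effect = 1.0
df = pd.DataFrame({"x": x, "treat": treat, "y": y})

# 1. Estimate the propensity score from observed covariates
ps_model = LogisticRegression().fit(df[["x"]], df["treat"])
df["pscore"] = ps_model.predict_proba(df[["x"]])[:, 1]

# 2. Nearest-neighbor matching on the propensity score (1:1, with replacement)
treated = df[df["treat"] == 1]
control = df[df["treat"] == 0]
nn = NearestNeighbors(n_neighbors=1).fit(control[["pscore"]])
_, idx = nn.kneighbors(treated[["pscore"]])
matched_controls = control.iloc[idx.ravel()]

# 3. ATT estimate: treated outcomes minus their matched controls' outcomes
att = treated["y"].mean() - matched_controls["y"].mean()
print(att)  # roughly 1.0 if selection is only on x
```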

Matching Methods (General)

  • Pairs similar units—techniques include nearest neighbor, caliper matching, and exact matching on key covariates
  • Reduces selection bias—creates a balanced comparison group that mimics random assignment for observed characteristics
  • Cannot address unobservables—unlike IV or DiD, matching only controls for what you can measure; hidden confounders remain a threat
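
A minimal caliper-matching sketch on simulated data (the covariate `age`, the caliper width, and the selection mechanism are illustrative assumptions): each treated unit is paired with its closest control on the covariate, and treated units with no control within the caliper are dropped.

```python
import numpy as np
import pandas as pd

# Simulated data where older units are more likely to be treated (illustrative)
rng = np.random.default_rng(7)
n = 1000
age = rng.normal(40, 10, n)
treat = (rng.uniform(size=n) < 1 / (1 + np.exp(-(age - 40) / 10))).astype(int)
y = 0.05 * age + 1.0 * treat + rng.normal(0, 1, n)   # true effect = 1.0
df = pd.DataFrame({"age": age, "treat": treat, "y": y})

treated = df[df["treat"] == 1]
control = df[df["treat"] == 0]
naive = treated["y"].mean() - control["y"].mean()     # biased: treated are older

# Caliper matching: pair each treated unit with its nearest control on age,
# discarding treated units with no control within the caliper
caliper = 1.0
diffs = []
for _, row in treated.iterrows():
    dist = (control["age"] - row["age"]).abs()
    if dist.min() <= caliper:
        diffs.append(row["y"] - control.loc[dist.idxmin(), "y"])

print(naive, np.mean(diffs))  # matched estimate should be close to 1.0
```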

Synthetic Control Method

  • Constructs a weighted counterfactual—combines multiple control units to create a synthetic version of the treated unit that matches its pre-treatment trajectory
  • Ideal for single-unit studies—when only one state, country, or organization receives treatment, traditional methods fail; synthetic control fills this gap
  • Requires good donor pool—control units must be unaffected by treatment and similar enough to the treated unit to form a credible synthetic match
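
A minimal synthetic-control sketch on simulated data (the donor-pool size, pre-period length, and optimizer setup are illustrative assumptions): find non-negative donor weights summing to one that best reproduce the treated unit's pre-treatment path.

```python
import numpy as np
from scipy.optimize import minimize

# Illustrative setup: pre-treatment outcomes for 8 donor units (columns) over
# 20 periods (rows), and one treated unit built to resemble the first 3 donors
rng = np.random.default_rng(5)
T_pre, n_donors = 20, 8
donors_pre = rng.normal(0, 1, (T_pre, n_donors)).cumsum(axis=0)
treated_pre = donors_pre[:, :3].mean(axis=1) + rng.normal(0, 0.1, T_pre)

# Choose non-negative donor weights that sum to 1 and minimize the
# pre-treatment prediction error -- the core of the synthetic control idea
def loss(w):
    return np.sum((treated_pre - donors_pre @ w) ** 2)

res = minimize(loss, np.full(n_donors, 1 / n_donors),
               bounds=[(0, 1)] * n_donors,
               constraints={"type": "eq", "fun": lambda w: w.sum() - 1})
weights = res.x
print(np.round(weights, 2))  # most weight should land on the first three donors

# The estimated effect is then the post-period gap between the treated unit's
# actual outcomes and the weighted donor combination (not simulated here).
```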

Compare: PSM vs. Synthetic Control—both construct comparison groups, but PSM matches individual units while synthetic control creates a weighted composite. Use PSM when you have many treated units; use synthetic control when you're studying a single case (e.g., one state's policy change).


Designs That Exploit Natural Variation

These approaches leverage real-world events that create quasi-random variation in treatment exposure. Nature or policy inadvertently runs the experiment for you.

Natural Experiments

  • Exploits exogenous shocks—events like lottery assignments, weather disasters, or policy changes that affect some groups but not others create as-if random variation
  • Credibility depends on context—you must argue convincingly that the variation is unrelated to potential outcomes; this is often contested
  • Foundation for other methods—natural experiments often provide the instruments for IV or the treatment variation for DiD

Fixed Effects Models

  • Controls for time-invariant confounders—by examining within-unit variation over time, fixed effects eliminate all stable unobserved characteristics
  • Panel data requirement—needs repeated observations of the same units (individuals, firms, countries) across multiple periods
  • Cannot address time-varying confounders—factors that change over time and correlate with both treatment and outcome remain problematic
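
A minimal fixed-effects sketch on simulated panel data (the unit counts, the confounder, and variable names are illustrative assumptions): including unit dummies absorbs the stable unobserved trait, removing the bias that pooled OLS shows.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated panel: each unit has a stable unobserved trait (alpha) that raises
# both treatment take-up and the outcome -- a time-invariant confounder
rng = np.random.default_rng(6)
n_units, n_periods = 200, 5
unit = np.repeat(np.arange(n_units), n_periods)
alpha = np.repeat(rng.normal(0, 2, n_units), n_periods)
treat = (rng.uniform(size=n_units * n_periods)
         < 1 / (1 + np.exp(-alpha))).astype(int)
y = 1.5 * treat + alpha + rng.normal(0, 1, n_units * n_periods)  # true effect 1.5
df = pd.DataFrame({"unit": unit, "treat": treat, "y": y})

# Pooled OLS is biased upward because alpha is omitted
pooled = smf.ols("y ~ treat", data=df).fit()

# Unit fixed effects (here via unit dummies) absorb every stable unit trait;
# real applications would also cluster standard errors by unit
fe = smf.ols("y ~ treat + C(unit)", data=df).fit()
print(pooled.params["treat"], fe.params["treat"])  # biased vs. roughly 1.5
```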

Compare: Natural Experiments vs. Fixed Effects—natural experiments identify causal effects through external variation, while fixed effects control for stable confounders through within-unit comparisons. Natural experiments are about finding variation; fixed effects are about controlling for unobservables.


Qualitative and Mixed Approaches

Not all causal inference is quantitative. These methods provide depth and context that statistical approaches may miss.

Comparative Case Studies

  • In-depth analysis of mechanisms—examines how and why causal processes operate, not just whether effects exist
  • Process tracing—follows the causal chain step-by-step to identify mechanisms and rule out alternative explanations
  • Hypothesis generation—particularly valuable early in research when theory is underdeveloped; findings can motivate subsequent quantitative work

Compare: Comparative Case Studies vs. Quantitative Quasi-Experiments—case studies prioritize internal validity and mechanistic understanding within specific contexts, while quantitative methods prioritize estimating average effects across populations. Use case studies to understand why an effect occurs; use quantitative methods to estimate how large it is.


Quick Reference Table

Concept                          Best Examples
Exploits timing/trends           DiD, Interrupted Time Series
Exploits cutoffs/thresholds      RDD, IV
Constructs comparison groups     PSM, Matching Methods, Synthetic Control
Controls for unobservables       Fixed Effects, IV, DiD
Single-unit or small-N studies   Synthetic Control, Comparative Case Studies
Requires parallel trends         DiD
Requires valid instrument        IV
Selection on observables only    PSM, Matching Methods

Self-Check Questions

  1. Both DiD and ITS exploit timing to identify causal effects. What is the key difference in their data requirements, and when would you choose one over the other?

  2. A researcher wants to estimate the effect of a scholarship program that's awarded to students scoring above 80 on an entrance exam. Which design is most appropriate, and what threat to validity should they test for?

  3. Compare propensity score matching and instrumental variables: which assumption is stronger, and why might a researcher prefer IV despite its stricter requirements?

  4. You're asked to evaluate a smoking ban implemented in one state. You have data on multiple states over 10 years. Which two methods could you use, and what are the tradeoffs between them?

  5. FRQ-style: A policy analyst claims that fixed effects models "solve" the problem of omitted variable bias. Explain why this claim is only partially correct, identifying what types of confounders fixed effects can and cannot address.