📊 Causal Inference

Key Concepts of Regression Discontinuity Designs


Why This Matters

Regression Discontinuity Design (RDD) sits at the heart of modern causal inference because it solves a fundamental problem: how do we estimate causal effects when we can't randomly assign treatment? RDD exploits arbitrary cutoffs—test score thresholds, age limits, income eligibility lines—to create conditions that approximate a randomized experiment. You need to recognize when RDD is appropriate, understand what makes it credible, and identify threats to its validity.

These concepts connect directly to broader themes in causal inference: identification strategies, local versus average treatment effects, and the bias-variance tradeoff in estimation. Don't just memorize that "bandwidth matters" or "manipulation is bad." Understand why observations near a cutoff serve as valid counterfactuals and what assumptions must hold for that logic to work.


The Core Logic: Exploiting Discontinuities

RDD works because individuals just above and below a cutoff are essentially randomly assigned to treatment, at least locally. If the cutoff is arbitrary, people on either side should be comparable in all ways except their treatment status.

Definition and Basic Concept of RDD

RDD is a quasi-experimental design that exploits a cutoff in a continuous running variable (also called the forcing variable) to identify causal effects. Units on one side of the threshold receive treatment; units on the other side don't.

The design rests on a local randomization assumption: individuals just above and below the cutoff are statistically identical, differing only in whether they received treatment. This makes RDD one of the most credible observational methods available when its assumptions hold, because it identifies causal effects without actual randomization.

Sharp vs. Fuzzy RDD

Sharp RDD has deterministic assignment. Crossing the threshold guarantees treatment. For example, every student scoring ≥ 70 on an exam receives a scholarship, and every student scoring < 70 does not. The probability of treatment jumps from 0 to 1 at the cutoff.

Fuzzy RDD involves probabilistic assignment. The cutoff changes the probability of treatment but doesn't determine it perfectly. Think of it as an intention-to-treat scenario: a student scoring ≥ 70 becomes eligible for the scholarship but might not take it, or a student scoring < 70 might receive it through an appeal. The probability of treatment jumps at the cutoff, but not from 0 to 1.

Fuzzy designs require IV-style estimation, using the cutoff as an instrument for actual treatment receipt. This estimates a Local Average Treatment Effect (LATE) specifically for compliers—those whose treatment status actually changed because of the threshold.
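The IV logic above can be sketched with simulated data. Everything here is illustrative: the cutoff of 70, the take-up rates, and the true complier effect of 5 are invented for the example, and the discontinuities are estimated with simple local linear fits rather than a production RDD package.

```python
import numpy as np

rng = np.random.default_rng(0)
n, cutoff, h = 5000, 70.0, 5.0  # h is the bandwidth around the cutoff

# Simulated running variable (an exam score) with imperfect compliance:
x = rng.uniform(40, 100, n)
eligible = x >= cutoff
# Crossing the cutoff raises take-up from 10% to 80% -- a fuzzy design.
d = (rng.random(n) < np.where(eligible, 0.8, 0.1)).astype(float)
# Outcome: smooth in x, plus a true effect of 5 for those actually treated.
y = 0.2 * x + 5.0 * d + rng.normal(0.0, 1.0, n)

def jump(v, x, c, h):
    """Local linear estimate of the discontinuity in v at cutoff c."""
    lo = (x >= c - h) & (x < c)
    hi = (x >= c) & (x <= c + h)
    left = np.polyval(np.polyfit(x[lo], v[lo], 1), c)
    right = np.polyval(np.polyfit(x[hi], v[hi], 1), c)
    return right - left

# Wald / IV estimate: jump in the outcome over jump in treatment take-up.
late = jump(y, x, cutoff, h) / jump(d, x, cutoff, h)
print(round(late, 2))  # should land near the true complier effect of 5
```

Note that dividing by the first-stage jump is exactly what "using the cutoff as an instrument" means here: the reduced-form jump in the outcome is scaled up to reflect effects only on compliers.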

Compare: Sharp RDD vs. Fuzzy RDD—both exploit the same cutoff logic, but sharp designs estimate treatment effects directly while fuzzy designs estimate effects only for those whose treatment status changed because of the threshold. If a problem describes imperfect compliance with a cutoff rule, you're dealing with fuzzy RDD.


Validity Requirements: What Must Hold

The credibility of any RDD hinges on whether the discontinuity in treatment is the only thing changing at the cutoff. These assumptions determine whether your causal claims are defensible.

Assumptions and Requirements for Valid RDD

  • Continuity assumption: potential outcomes must be smooth (continuous) functions of the running variable at the cutoff. Put differently, no other discontinuities exist at the cutoff besides the treatment assignment itself. If some other policy also kicks in at the same threshold, you can't isolate the effect of treatment.
  • No manipulation: individuals cannot precisely control their running variable to sort themselves above or below the threshold. If they can, the people just above and just below the cutoff are no longer comparable.
  • Continuous running variable: the running variable must take on many values (not just a few discrete categories), and treatment assignment must depend solely on its relationship to the cutoff.

Testing for Manipulation of the Running Variable

Two main diagnostic tools help you assess whether the design is valid:

The McCrary density test examines whether observations "bunch" suspiciously at the cutoff. You plot the density of the running variable and check for a discontinuous jump right at the threshold. A spike on one side suggests people are gaming the system to land above (or below) the cutoff.

Covariate balance checks verify that predetermined characteristics—things determined before the running variable was measured—don't jump at the threshold. If baseline covariates like gender, income, or prior test scores show a discontinuity, something other than treatment is changing at the cutoff.

If either test reveals problems, the design's credibility is seriously undermined, because it means people on either side of the cutoff are systematically different in ways unrelated to treatment.
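The density side of this diagnostic can be illustrated with a toy simulation. This is not McCrary's actual test, which uses local-polynomial density estimation with formal standard errors; it simply counts observations on each side of the cutoff, which is the intuition that test formalizes. All numbers below are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
n, cutoff, h = 5000, 70.0, 2.0

# Honest scores: smooth density, no sorting around the cutoff.
honest = rng.uniform(40, 100, n)

# Manipulated scores: half of the near-misses nudge themselves above the line.
scores = honest.copy()
near_miss = (scores >= cutoff - 1.0) & (scores < cutoff)
bump = near_miss & (rng.random(n) < 0.5)
scores[bump] += 1.0

def density_ratio(x, c, h):
    """Observations just above vs. just below the cutoff.
    A ratio far from 1 is the red flag a McCrary test formalizes."""
    below = np.sum((x >= c - h) & (x < c))
    above = np.sum((x >= c) & (x < c + h))
    return above / below

print(round(density_ratio(honest, cutoff, h), 2))  # near 1: smooth density
print(round(density_ratio(scores, cutoff, h), 2))  # well above 1: bunching
```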

Compare: Manipulation testing vs. Covariate balance—both assess validity but target different threats. Density tests catch sorting into treatment; covariate checks catch discontinuities in baseline characteristics. Strong RDD papers report both.


Estimation Choices: The Bias-Variance Tradeoff

How you estimate the treatment effect matters enormously. The core tension: using more data reduces variance but risks bias from observations far from the cutoff.

Bandwidth Selection and Local Linear Regression

Bandwidth defines the analysis window around the cutoff. Only observations within this range contribute to estimation. Choosing the right bandwidth involves a direct tradeoff:

  • Narrower bandwidths reduce bias by focusing on observations most similar to each other, but they increase variance because fewer observations mean noisier estimates.
  • Wider bandwidths increase precision (lower variance) but risk bias if the relationship between the running variable and the outcome isn't linear far from the cutoff.

Optimal bandwidth selection methods try to balance this tradeoff systematically. The most commonly used are the Imbens-Kalyanaraman (IK) procedure and the Calonico-Cattaneo-Titiunik (CCT) method, which also provides bias-corrected confidence intervals.

Within the chosen bandwidth, local linear regression is the standard estimation approach. You fit separate linear regressions on each side of the cutoff, and the treatment effect estimate is the difference in predicted outcomes right at the threshold.
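A minimal sketch of this procedure on simulated data follows; the quadratic trend and the true jump of 2 are invented for illustration, and a real analysis would typically add kernel weights and a data-driven bandwidth.

```python
import numpy as np

rng = np.random.default_rng(2)
n, cutoff, h = 4000, 0.0, 0.5

x = rng.uniform(-1, 1, n)                  # running variable, cutoff at 0
treated = (x >= cutoff).astype(float)      # sharp assignment
# Smooth trend plus a true discontinuity of 2 at the cutoff.
y = 1.0 + 0.8 * x + 0.5 * x**2 + 2.0 * treated + rng.normal(0.0, 0.5, n)

# Local linear regression: a separate line on each side, fit only within
# the bandwidth; the effect is the gap between the two fits at the cutoff.
lo = (x >= cutoff - h) & (x < cutoff)
hi = (x >= cutoff) & (x <= cutoff + h)
left = np.polyval(np.polyfit(x[lo], y[lo], 1), cutoff)
right = np.polyval(np.polyfit(x[hi], y[hi], 1), cutoff)
tau_hat = right - left
print(round(tau_hat, 2))  # should land near the true jump of 2
```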

Graphical Analysis and Visualization in RDD

Visual evidence is central to RDD. Three elements matter:

  • Binned scatter plots show average outcomes at different values of the running variable. A visible "jump" at the cutoff signals a treatment effect. These plots also reveal the shape of the underlying relationship on each side.
  • Fitted regression lines on either side of the cutoff help visualize the estimated discontinuity and make the functional form assumptions transparent.
  • Graphs serve as transparency devices, allowing readers to assess whether the effect is real or an artifact of modeling choices. If the jump isn't visible in the raw data, a statistically significant estimate should be treated with skepticism.
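The binned means behind such a plot are easy to compute directly. The sketch below uses an invented data-generating process, and aligns the bin edges to the cutoff so that no bin straddles it; a straddling bin would smear the jump across both sides.

```python
import numpy as np

rng = np.random.default_rng(3)
n, cutoff = 6000, 0.0
x = rng.uniform(-1, 1, n)
y = 0.5 * x + 2.0 * (x >= cutoff) + rng.normal(0.0, 0.5, n)  # true jump: 2

# 20 bins of width 0.1; the cutoff 0.0 falls exactly on a bin edge.
edges = np.linspace(-1, 1, 21)

def binned_means(x, y, edges):
    """Average outcome in each bin of the running variable --
    the numbers a binned scatter plot displays."""
    idx = np.digitize(x, edges) - 1
    centers = (edges[:-1] + edges[1:]) / 2
    means = np.array([y[idx == b].mean() for b in range(len(centers))])
    return centers, means

centers, means = binned_means(x, y, edges)
# The discontinuity appears between the bins adjacent to the cutoff.
print(round(means[10] - means[9], 2))  # should sit near the true jump of 2
```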

Compare: Wide bandwidth vs. Narrow bandwidth—wider bandwidths give you more statistical power but assume the relationship between the running variable and outcome is correctly specified far from the cutoff. Narrow bandwidths are more credible but noisier. Always report results across multiple bandwidths.


Robustness and Limitations: Stress-Testing Your Results

No single RDD estimate should be taken at face value. Credible research demonstrates that findings survive alternative specifications and acknowledges inherent limitations.

Sensitivity Analysis and Robustness Checks

Three standard robustness checks strengthen an RDD analysis:

  1. Vary the bandwidth to show results aren't driven by a single arbitrary choice. Effects should be stable across a reasonable range of bandwidths (e.g., 50%, 75%, 100%, 125%, and 150% of the optimal bandwidth).
  2. Test different functional forms (linear, quadratic, higher-order polynomials) to ensure the discontinuity isn't an artifact of model specification. That said, Gelman and Imbens (2019) caution against high-order global polynomials because they can produce misleading results.
  3. Run placebo cutoffs at points where no treatment effect should exist. If you find "effects" at fake thresholds, it suggests the real estimate may also be spurious.
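Checks 1 and 3 above can be sketched together. Everything below is simulated (a true jump of 2 at cutoff 0), and the point is the pattern rather than the numbers: estimates should be stable across bandwidths at the real cutoff and near zero at placebo cutoffs.

```python
import numpy as np

rng = np.random.default_rng(4)
n, cutoff = 6000, 0.0
x = rng.uniform(-1, 1, n)
y = 0.5 * x + 2.0 * (x >= cutoff) + rng.normal(0.0, 0.5, n)  # true jump: 2

def rdd_estimate(x, y, c, h):
    """Sharp RDD jump at cutoff c from local linear fits within bandwidth h."""
    lo = (x >= c - h) & (x < c)
    hi = (x >= c) & (x <= c + h)
    left = np.polyval(np.polyfit(x[lo], y[lo], 1), c)
    right = np.polyval(np.polyfit(x[hi], y[hi], 1), c)
    return right - left

# Check 1 -- bandwidth sensitivity: estimates should be stable across h.
for h in (0.2, 0.3, 0.4, 0.5):
    print(f"h={h}: {rdd_estimate(x, y, cutoff, h):.2f}")

# Check 3 -- placebo cutoffs: "effects" at fake thresholds should be near 0.
for c in (-0.5, 0.5):
    print(f"placebo c={c}: {rdd_estimate(x, y, c, 0.3):.2f}")
```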

Limitations and External Validity of RDD

  • Local Average Treatment Effect (LATE): RDD only identifies effects at the cutoff, not for the broader population. A scholarship's effect on students scoring 70 may differ substantially from its effect on students scoring 90.
  • Limited generalizability: people near thresholds may differ systematically from those far away. Extrapolating RDD results to other parts of the running variable distribution requires strong additional assumptions.
  • Single-cutoff dependence: the entire design rests on one discontinuity. If that cutoff is problematic (e.g., it coincides with another policy change), the whole analysis fails.

Compare: RDD limitations vs. RCT limitations—RCTs offer strong internal validity but face external validity concerns about artificial settings. RDD has strong local validity but explicitly cannot speak to effects away from the cutoff. Know which limitation matters more for a given research question.


Applications and Comparisons: RDD in Context

Understanding where RDD fits in the causal inference toolkit helps you recognize when it's the right method and when alternatives might be stronger.

Applications and Examples of RDD in Various Fields

  • Education research uses test score cutoffs for scholarships, gifted programs, or remediation. Angrist and Lavy's (1999) study of class size effects in Israel, which exploited Maimonides' Rule (a maximum of 40 students per class), is a classic example.
  • Policy evaluation exploits eligibility thresholds for social programs. Age cutoffs for Medicare eligibility at 65, income limits for Medicaid or welfare benefits, and vote share thresholds for incumbency effects (Lee, 2008) are all common applications.
  • Health economics leverages age-based treatment rules or diagnostic thresholds (e.g., BMI cutoffs for obesity interventions, birth weight thresholds for neonatal care) to estimate intervention effects.

Comparison of RDD with Other Causal Inference Methods

  • Unlike RCTs, RDD doesn't require randomization but achieves credibility through local quasi-randomization at the cutoff. RCTs remain the benchmark for internal validity, but RDD can be applied in settings where randomization is infeasible or unethical.
  • Differs from matching methods because RDD exploits a known, rule-based assignment mechanism rather than trying to balance observed covariates after the fact. This makes RDD's identification strategy more transparent.
  • Related to instrumental variables: fuzzy RDD is an IV design where the cutoff instruments for treatment receipt. Both identify LATE for compliers, and both require a first-stage relationship between the instrument and treatment.

Compare: RDD vs. Difference-in-Differences—RDD exploits a cross-sectional discontinuity in a running variable, while DiD exploits a temporal discontinuity (before/after treatment). RDD requires continuity assumptions; DiD requires parallel trends. Choose based on your data structure and which assumptions are more plausible.


Quick Reference Table

| Concept | Key Details |
| --- | --- |
| Design types | Sharp RDD (deterministic), Fuzzy RDD (probabilistic, uses IV) |
| Core assumptions | Continuity of potential outcomes, no manipulation, continuous running variable |
| Validity tests | McCrary density test, covariate balance checks, placebo cutoffs |
| Estimation choices | Bandwidth selection (IK, CCT), local linear regression, polynomial specifications |
| Robustness strategies | Multiple bandwidths, alternative functional forms, placebo cutoffs |
| Key limitations | Local effects only (LATE), limited external validity, single-cutoff dependence |
| Common applications | Education (test scores), policy (eligibility thresholds), health (age/diagnostic cutoffs) |
| Related methods | Instrumental variables (fuzzy RDD), RCTs (benchmark), DiD (temporal alternative) |

Self-Check Questions

  1. What distinguishes sharp RDD from fuzzy RDD, and how does this distinction affect what parameter you're estimating?

  2. A researcher finds that predetermined covariates show a discontinuity at the cutoff. What does this suggest about the validity of the RDD, and what assumption is likely violated?

  3. Compare and contrast bandwidth selection in RDD with the bias-variance tradeoff: why might a researcher report results across multiple bandwidths rather than choosing a single "optimal" one?

  4. If you're asked to evaluate a study using RDD to estimate the effect of a scholarship on college completion, what three validity checks would you look for in the research design?

  5. Why does RDD estimate a Local Average Treatment Effect rather than an Average Treatment Effect, and what does this imply for generalizing findings to other populations?