📊 Causal Inference

Key Concepts of Regression Discontinuity Designs


Why This Matters

Regression Discontinuity Design (RDD) sits at the heart of modern causal inference because it solves a fundamental problem: how do we estimate causal effects when we can't randomly assign treatment? RDD exploits arbitrary cutoffs—test score thresholds, age limits, income eligibility lines—to create conditions that approximate a randomized experiment. You need to recognize when RDD is appropriate, understand what makes it credible, and identify threats to its validity.

These concepts connect directly to broader themes in causal inference: identification strategies, local versus average treatment effects, and the bias-variance tradeoff in estimation. Don't just memorize that "bandwidth matters" or "manipulation is bad." Understand why observations near a cutoff serve as valid counterfactuals and what assumptions must hold for that logic to work.


The Core Logic: Exploiting Discontinuities

RDD works because individuals just above and below a cutoff are essentially randomly assigned to treatment, at least locally. If the cutoff is arbitrary, people on either side should be comparable in all ways except their treatment status.

Definition and Basic Concept of RDD

RDD is a quasi-experimental design that exploits a cutoff in a continuous running variable (also called the forcing variable) to identify causal effects. Units on one side of the threshold receive treatment; units on the other side don't.

The design rests on a local randomization assumption: individuals just above and below the cutoff are statistically identical, differing only in whether they received treatment. This makes RDD one of the most credible observational methods available when its assumptions hold, because it identifies causal effects without actual randomization.

Sharp vs. Fuzzy RDD

Sharp RDD has deterministic assignment. Crossing the threshold guarantees treatment. For example, every student scoring ≥ 70 on an exam receives a scholarship, and every student scoring < 70 does not. The probability of treatment jumps from 0 to 1 at the cutoff.

Fuzzy RDD involves probabilistic assignment. The cutoff changes the probability of treatment but doesn't determine it perfectly. Think of it as an intention-to-treat scenario: a student scoring ≥ 70 becomes eligible for the scholarship but might not take it, or a student scoring < 70 might receive it through an appeal. The probability of treatment jumps at the cutoff, but not from 0 to 1.

Fuzzy designs require IV-style estimation, using the cutoff as an instrument for actual treatment receipt. This estimates a Local Average Treatment Effect (LATE) specifically for compliers—those whose treatment status actually changed because of the threshold.
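The IV logic above can be sketched with simulated data. Everything here is illustrative: the cutoff of 70, the take-up rates, and the true complier effect of 5 are invented for the example, and the discontinuities are estimated with simple local linear fits rather than a production RDD package.

```python
import numpy as np

rng = np.random.default_rng(0)
n, cutoff, h = 5000, 70.0, 5.0  # h is the bandwidth around the cutoff

# Simulated running variable (an exam score) with imperfect compliance:
x = rng.uniform(40, 100, n)
eligible = x >= cutoff
# Crossing the cutoff raises take-up from 10% to 80% -- a fuzzy design.
d = (rng.random(n) < np.where(eligible, 0.8, 0.1)).astype(float)
# Outcome: smooth in x, plus a true effect of 5 for those actually treated.
y = 0.2 * x + 5.0 * d + rng.normal(0.0, 1.0, n)

def jump(v, x, c, h):
    """Local linear estimate of the discontinuity in v at cutoff c."""
    lo = (x >= c - h) & (x < c)
    hi = (x >= c) & (x <= c + h)
    left = np.polyval(np.polyfit(x[lo], v[lo], 1), c)
    right = np.polyval(np.polyfit(x[hi], v[hi], 1), c)
    return right - left

# Wald / IV estimate: jump in the outcome over jump in treatment take-up.
late = jump(y, x, cutoff, h) / jump(d, x, cutoff, h)
print(round(late, 2))  # should land near the true complier effect of 5
```

Note that dividing by the first-stage jump is exactly what "using the cutoff as an instrument" means here: the reduced-form jump in the outcome is scaled up to reflect effects only on compliers.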

Compare: Sharp RDD vs. Fuzzy RDD—both exploit the same cutoff logic, but sharp designs estimate treatment effects directly while fuzzy designs estimate effects only for those whose treatment status changed because of the threshold. If a problem describes imperfect compliance with a cutoff rule, you're dealing with fuzzy RDD.


Validity Requirements: What Must Hold

The credibility of any RDD hinges on whether the discontinuity in treatment is the only thing changing at the cutoff. These assumptions determine whether your causal claims are defensible.

Assumptions and Requirements for Valid RDD

  • Continuity assumption: potential outcomes must be smooth (continuous) functions of the running variable at the cutoff. Put differently, no other discontinuities exist at the cutoff besides the treatment assignment itself. If some other policy also kicks in at the same threshold, you can't isolate the effect of treatment.
  • No manipulation: individuals cannot precisely control their running variable to sort themselves above or below the threshold. If they can, the people just above and just below the cutoff are no longer comparable.
  • Continuous running variable: the running variable must take on many values (not just a few discrete categories), and treatment assignment must depend solely on its relationship to the cutoff.

Testing for Manipulation of the Running Variable

Two main diagnostic tools help you assess whether the design is valid:

The McCrary density test examines whether observations "bunch" suspiciously at the cutoff. You plot the density of the running variable and check for a discontinuous jump right at the threshold. A spike on one side suggests people are gaming the system to land above (or below) the cutoff.

Covariate balance checks verify that predetermined characteristics—things determined before the running variable was measured—don't jump at the threshold. If baseline covariates like gender, income, or prior test scores show a discontinuity, something other than treatment is changing at the cutoff.

If either test reveals problems, the design's credibility is seriously undermined, because it means people on either side of the cutoff are systematically different in ways unrelated to treatment.
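The density side of this diagnostic can be illustrated with a toy simulation. This is not McCrary's actual test, which uses local-polynomial density estimation with formal standard errors; it simply counts observations on each side of the cutoff, which is the intuition that test formalizes. All numbers below are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
n, cutoff, h = 5000, 70.0, 2.0

# Honest scores: smooth density, no sorting around the cutoff.
honest = rng.uniform(40, 100, n)

# Manipulated scores: half of the near-misses nudge themselves above the line.
scores = honest.copy()
near_miss = (scores >= cutoff - 1.0) & (scores < cutoff)
bump = near_miss & (rng.random(n) < 0.5)
scores[bump] += 1.0

def density_ratio(x, c, h):
    """Observations just above vs. just below the cutoff.
    A ratio far from 1 is the red flag a McCrary test formalizes."""
    below = np.sum((x >= c - h) & (x < c))
    above = np.sum((x >= c) & (x < c + h))
    return above / below

print(round(density_ratio(honest, cutoff, h), 2))  # near 1: smooth density
print(round(density_ratio(scores, cutoff, h), 2))  # well above 1: bunching
```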

Compare: Manipulation testing vs. Covariate balance—both assess validity but target different threats. Density tests catch sorting into treatment; covariate checks catch discontinuities in baseline characteristics. Strong RDD papers report both.


Estimation Choices: The Bias-Variance Tradeoff

How you estimate the treatment effect matters enormously. The core tension: using more data reduces variance but risks bias from observations far from the cutoff.

Bandwidth Selection and Local Linear Regression

Bandwidth defines the analysis window around the cutoff. Only observations within this range contribute to estimation. Choosing the right bandwidth involves a direct tradeoff:

  • Narrower bandwidths reduce bias by focusing on observations most similar to each other, but they increase variance because fewer observations mean noisier estimates.
  • Wider bandwidths increase precision (lower variance) but risk bias if the relationship between the running variable and the outcome isn't linear far from the cutoff.

Optimal bandwidth selection methods try to balance this tradeoff systematically. The most commonly used are the Imbens-Kalyanaraman (IK) procedure and the Calonico-Cattaneo-Titiunik (CCT) method, which also provides bias-corrected confidence intervals.

Within the chosen bandwidth, local linear regression is the standard estimation approach. You fit separate linear regressions on each side of the cutoff, and the treatment effect estimate is the difference in predicted outcomes right at the threshold.
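A minimal sketch of this procedure on simulated data follows; the quadratic trend and the true jump of 2 are invented for illustration, and a real analysis would typically add kernel weights and a data-driven bandwidth.

```python
import numpy as np

rng = np.random.default_rng(2)
n, cutoff, h = 4000, 0.0, 0.5

x = rng.uniform(-1, 1, n)                  # running variable, cutoff at 0
treated = (x >= cutoff).astype(float)      # sharp assignment
# Smooth trend plus a true discontinuity of 2 at the cutoff.
y = 1.0 + 0.8 * x + 0.5 * x**2 + 2.0 * treated + rng.normal(0.0, 0.5, n)

# Local linear regression: a separate line on each side, fit only within
# the bandwidth; the effect is the gap between the two fits at the cutoff.
lo = (x >= cutoff - h) & (x < cutoff)
hi = (x >= cutoff) & (x <= cutoff + h)
left = np.polyval(np.polyfit(x[lo], y[lo], 1), cutoff)
right = np.polyval(np.polyfit(x[hi], y[hi], 1), cutoff)
tau_hat = right - left
print(round(tau_hat, 2))  # should land near the true jump of 2
```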

Graphical Analysis and Visualization in RDD

Visual evidence is central to RDD. Three elements matter:

  • Binned scatter plots show average outcomes at different values of the running variable. A visible "jump" at the cutoff signals a treatment effect. These plots also reveal the shape of the underlying relationship on each side.
  • Fitted regression lines on either side of the cutoff help visualize the estimated discontinuity and make the functional form assumptions transparent.
  • Graphs serve as transparency devices, allowing readers to assess whether the effect is real or an artifact of modeling choices. If the jump isn't visible in the raw data, a statistically significant estimate should be treated with skepticism.
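The binned means behind such a plot are easy to compute directly. The sketch below uses an invented data-generating process, and aligns the bin edges to the cutoff so that no bin straddles it; a straddling bin would smear the jump across both sides.

```python
import numpy as np

rng = np.random.default_rng(3)
n, cutoff = 6000, 0.0
x = rng.uniform(-1, 1, n)
y = 0.5 * x + 2.0 * (x >= cutoff) + rng.normal(0.0, 0.5, n)  # true jump: 2

# 20 bins of width 0.1; the cutoff 0.0 falls exactly on a bin edge.
edges = np.linspace(-1, 1, 21)

def binned_means(x, y, edges):
    """Average outcome in each bin of the running variable --
    the numbers a binned scatter plot displays."""
    idx = np.digitize(x, edges) - 1
    centers = (edges[:-1] + edges[1:]) / 2
    means = np.array([y[idx == b].mean() for b in range(len(centers))])
    return centers, means

centers, means = binned_means(x, y, edges)
# The discontinuity appears between the bins adjacent to the cutoff.
print(round(means[10] - means[9], 2))  # should sit near the true jump of 2
```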

Compare: Wide bandwidth vs. Narrow bandwidth—wider bandwidths give you more statistical power but assume the relationship between the running variable and outcome is correctly specified far from the cutoff. Narrow bandwidths are more credible but noisier. Always report results across multiple bandwidths.


Robustness and Limitations: Stress-Testing Your Results

No single RDD estimate should be taken at face value. Credible research demonstrates that findings survive alternative specifications and acknowledges inherent limitations.

Sensitivity Analysis and Robustness Checks

Three standard robustness checks strengthen an RDD analysis:

  1. Vary the bandwidth to show results aren't driven by a single arbitrary choice. Effects should be stable across a reasonable range of bandwidths (e.g., 50%, 75%, 100%, 125%, and 150% of the optimal bandwidth).
  2. Test different functional forms (linear, quadratic, higher-order polynomials) to ensure the discontinuity isn't an artifact of model specification. That said, Gelman and Imbens (2019) caution against high-order global polynomials because they can produce misleading results.
  3. Run placebo cutoffs at points where no treatment effect should exist. If you find "effects" at fake thresholds, it suggests the real estimate may also be spurious.
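Checks 1 and 3 above can be sketched together. Everything below is simulated (a true jump of 2 at cutoff 0), and the point is the pattern rather than the numbers: estimates should be stable across bandwidths at the real cutoff and near zero at placebo cutoffs.

```python
import numpy as np

rng = np.random.default_rng(4)
n, cutoff = 6000, 0.0
x = rng.uniform(-1, 1, n)
y = 0.5 * x + 2.0 * (x >= cutoff) + rng.normal(0.0, 0.5, n)  # true jump: 2

def rdd_estimate(x, y, c, h):
    """Sharp RDD jump at cutoff c from local linear fits within bandwidth h."""
    lo = (x >= c - h) & (x < c)
    hi = (x >= c) & (x <= c + h)
    left = np.polyval(np.polyfit(x[lo], y[lo], 1), c)
    right = np.polyval(np.polyfit(x[hi], y[hi], 1), c)
    return right - left

# Check 1 -- bandwidth sensitivity: estimates should be stable across h.
for h in (0.2, 0.3, 0.4, 0.5):
    print(f"h={h}: {rdd_estimate(x, y, cutoff, h):.2f}")

# Check 3 -- placebo cutoffs: "effects" at fake thresholds should be near 0.
for c in (-0.5, 0.5):
    print(f"placebo c={c}: {rdd_estimate(x, y, c, 0.3):.2f}")
```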

Limitations and External Validity of RDD

  • Local Average Treatment Effect (LATE): RDD only identifies effects at the cutoff, not for the broader population. A scholarship's effect on students scoring 70 may differ substantially from its effect on students scoring 90.
  • Limited generalizability: people near thresholds may differ systematically from those far away. Extrapolating RDD results to other parts of the running variable distribution requires strong additional assumptions.
  • Single-cutoff dependence: the entire design rests on one discontinuity. If that cutoff is problematic (e.g., it coincides with another policy change), the whole analysis fails.

Compare: RDD limitations vs. RCT limitations—RCTs offer strong internal validity but face external validity concerns about artificial settings. RDD has strong local validity but explicitly cannot speak to effects away from the cutoff. Know which limitation matters more for a given research question.


Applications and Comparisons: RDD in Context

Understanding where RDD fits in the causal inference toolkit helps you recognize when it's the right method and when alternatives might be stronger.

Applications and Examples of RDD in Various Fields

  • Education research uses test score cutoffs for scholarships, gifted programs, or remediation. Angrist and Lavy's (1999) study of class size effects in Israel, which exploited Maimonides' Rule (a maximum of 40 students per class), is a classic example.
  • Policy evaluation exploits eligibility thresholds for social programs. Age cutoffs for Medicare eligibility at 65, income limits for Medicaid or welfare benefits, and vote share thresholds for incumbency effects (Lee, 2008) are all common applications.
  • Health economics leverages age-based treatment rules or diagnostic thresholds (e.g., BMI cutoffs for obesity interventions, birth weight thresholds for neonatal care) to estimate intervention effects.

Comparison of RDD with Other Causal Inference Methods

  • Unlike RCTs, RDD doesn't require randomization but achieves credibility through local quasi-randomization at the cutoff. RCTs remain the benchmark for internal validity, but RDD can be applied in settings where randomization is infeasible or unethical.
  • Differs from matching methods because RDD exploits a known, rule-based assignment mechanism rather than trying to balance observed covariates after the fact. This makes RDD's identification strategy more transparent.
  • Related to instrumental variables: fuzzy RDD is an IV design where the cutoff instruments for treatment receipt. Both identify LATE for compliers, and both require a first-stage relationship between the instrument and treatment.

Compare: RDD vs. Difference-in-Differences—RDD exploits a cross-sectional discontinuity in a running variable, while DiD exploits a temporal discontinuity (before/after treatment). RDD requires continuity assumptions; DiD requires parallel trends. Choose based on your data structure and which assumptions are more plausible.


Quick Reference Table

| Concept | Key Details |
| --- | --- |
| Design types | Sharp RDD (deterministic), Fuzzy RDD (probabilistic, uses IV) |
| Core assumptions | Continuity of potential outcomes, no manipulation, continuous running variable |
| Validity tests | McCrary density test, covariate balance checks, placebo cutoffs |
| Estimation choices | Bandwidth selection (IK, CCT), local linear regression, polynomial specifications |
| Robustness strategies | Multiple bandwidths, alternative functional forms, placebo cutoffs |
| Key limitations | Local effects only (LATE), limited external validity, single-cutoff dependence |
| Common applications | Education (test scores), policy (eligibility thresholds), health (age/diagnostic cutoffs) |
| Related methods | Instrumental variables (fuzzy RDD), RCTs (benchmark), DiD (temporal alternative) |

Self-Check Questions

  1. What distinguishes sharp RDD from fuzzy RDD, and how does this distinction affect what parameter you're estimating?

  2. A researcher finds that predetermined covariates show a discontinuity at the cutoff. What does this suggest about the validity of the RDD, and what assumption is likely violated?

  3. Compare and contrast bandwidth selection in RDD with the bias-variance tradeoff: why might a researcher report results across multiple bandwidths rather than choosing a single "optimal" one?

  4. If you're asked to evaluate a study using RDD to estimate the effect of a scholarship on college completion, what three validity checks would you look for in the research design?

  5. Why does RDD estimate a Local Average Treatment Effect rather than an Average Treatment Effect, and what does this imply for generalizing findings to other populations?