Regression Discontinuity Design sits at the heart of modern causal inference because it solves a fundamental problem: how do we estimate causal effects when we can't randomly assign treatment? RDD exploits arbitrary cutoffs—test score thresholds, age limits, income eligibility lines—to create conditions that approximate a randomized experiment. You need to recognize when RDD is appropriate, understand what makes it credible, and identify threats to its validity.
These concepts connect directly to broader themes in causal inference: identification strategies, local versus average treatment effects, and the bias-variance tradeoff in estimation. Don't just memorize that "bandwidth matters" or "manipulation is bad." Understand why observations near a cutoff serve as valid counterfactuals and what assumptions must hold for that logic to work.
RDD works because individuals just above and below a cutoff are essentially randomly assigned to treatment, at least locally. If the cutoff is arbitrary, people on either side should be comparable in all ways except their treatment status.
RDD is a quasi-experimental design that exploits a cutoff in a continuous running variable (also called the forcing variable) to identify causal effects. Units on one side of the threshold receive treatment; units on the other side don't.
The design rests on a local randomization assumption: individuals just above and below the cutoff are statistically identical, differing only in whether they received treatment. This makes RDD one of the most credible observational methods available when its assumptions hold, because it identifies causal effects without actual randomization.
Sharp RDD has deterministic assignment. Crossing the threshold guarantees treatment. For example, every student scoring at or above the cutoff on an exam receives a scholarship, and every student scoring below it does not. The probability of treatment jumps from 0 to 1 at the cutoff.
Fuzzy RDD involves probabilistic assignment. The cutoff changes the probability of treatment but doesn't determine it perfectly. Think of it as an intention-to-treat scenario: a student scoring above the cutoff becomes eligible for the scholarship but might not take it up, or a student scoring below it might receive it through an appeal. The probability of treatment jumps at the cutoff, but not from 0 to 1.
Fuzzy designs require IV-style estimation, using the cutoff as an instrument for actual treatment receipt. This estimates a Local Average Treatment Effect (LATE) specifically for compliers—those whose treatment status actually changed because of the threshold.
Compare: Sharp RDD vs. Fuzzy RDD—both exploit the same cutoff logic, but sharp designs estimate treatment effects directly while fuzzy designs estimate effects only for those whose treatment status changed because of the threshold. If a problem describes imperfect compliance with a cutoff rule, you're dealing with fuzzy RDD.
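The sharp/fuzzy distinction can be made concrete with a small simulation. This is an illustrative sketch, not a real study: the cutoff of 50, the true effect of 2.0, and the treatment probabilities are all assumed values. The fuzzy estimate is the Wald/IV ratio: the outcome jump divided by the jump in treatment probability.

```python
# Sketch: sharp vs. fuzzy assignment at a cutoff (simulated data; all values assumed)
import numpy as np

rng = np.random.default_rng(0)
n, cutoff = 10_000, 50.0
score = rng.uniform(0, 100, n)            # running (forcing) variable

# Sharp: treatment is a deterministic function of the running variable
sharp_treat = (score >= cutoff).astype(float)

# Fuzzy: crossing the cutoff raises P(treatment) from 0.2 to 0.8, not 0 to 1
p = np.where(score >= cutoff, 0.8, 0.2)
fuzzy_treat = rng.binomial(1, p).astype(float)

# Outcomes with a true treatment effect of 2.0 (assumed)
y_sharp = 0.05 * score + 2.0 * sharp_treat + rng.normal(0, 1, n)
y_fuzzy = 0.05 * score + 2.0 * fuzzy_treat + rng.normal(0, 1, n)

h = 5.0                                    # analysis window (bandwidth)
near = np.abs(score - cutoff) < h
above, below = near & (score >= cutoff), near & (score < cutoff)

# Sharp RDD: mean outcome just above minus mean outcome just below
# (a plain difference in means absorbs some slope bias; see bandwidth section)
sharp_est = y_sharp[above].mean() - y_sharp[below].mean()

# Fuzzy RDD (Wald ratio): outcome jump / jump in treatment probability
outcome_jump = y_fuzzy[above].mean() - y_fuzzy[below].mean()
treat_jump = fuzzy_treat[above].mean() - fuzzy_treat[below].mean()
fuzzy_est = outcome_jump / treat_jump
```

The denominator `treat_jump` is what makes the fuzzy estimate a LATE: it rescales the outcome jump to the compliers whose treatment status the cutoff actually moved.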
The credibility of any RDD hinges on whether the discontinuity in treatment is the only thing changing at the cutoff. These assumptions determine whether your causal claims are defensible.
Two main diagnostic tools help you assess whether the design is valid:
The McCrary density test examines whether observations "bunch" suspiciously at the cutoff. You plot the density of the running variable and check for a discontinuous jump right at the threshold. A spike on one side suggests people are gaming the system to land above (or below) the cutoff.
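A back-of-the-envelope version of this check can be coded directly. The actual McCrary test fits local polynomials to the estimated density on each side; this sketch, with simulated manipulation, just compares raw counts in narrow bins at the cutoff:

```python
# Crude density check near the cutoff (simulated manipulation; a shortcut, not
# the full McCrary local-polynomial test)
import numpy as np

rng = np.random.default_rng(1)
score = rng.uniform(0, 100, 20_000)
cutoff = 50.0

# Simulate gaming: 30% of units just below the cutoff nudge themselves above it
just_below = (score >= cutoff - 1) & (score < cutoff)
bump = just_below & (rng.uniform(size=score.size) < 0.3)
score[bump] += 1.0                        # they now land just above the cutoff

w = 1.0                                   # comparison bin width
n_below = np.sum((score >= cutoff - w) & (score < cutoff))
n_above = np.sum((score >= cutoff) & (score < cutoff + w))
ratio = n_above / n_below                 # ratio well above 1 suggests bunching
```

With no manipulation the two counts should be roughly equal; here the simulated sorting produces a visible excess just above the threshold.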
Covariate balance checks verify that predetermined characteristics—things determined before the running variable was measured—don't jump at the threshold. If baseline covariates like gender, income, or prior test scores show a discontinuity, something other than treatment is changing at the cutoff.
If either test reveals problems, the design's credibility is seriously undermined, because it means people on either side of the cutoff are systematically different in ways unrelated to treatment.
Compare: Manipulation testing vs. Covariate balance—both assess validity but target different threats. Density tests catch sorting into treatment; covariate checks catch discontinuities in baseline characteristics. Strong RDD papers report both.
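The covariate balance check reuses the same jump logic, applied to a predetermined characteristic. In this simulated sketch (variable names and values are assumptions) the covariate is drawn independently of the cutoff, so the estimated jump should be near zero:

```python
# Sketch of a covariate balance check: estimate the "jump" at the cutoff for a
# predetermined covariate exactly as you would for the outcome (simulated data)
import numpy as np

rng = np.random.default_rng(2)
n, cutoff, h = 10_000, 50.0, 5.0
score = rng.uniform(0, 100, n)
prior_score = rng.normal(70, 10, n)       # determined before the running variable

near = np.abs(score - cutoff) < h
jump = (prior_score[near & (score >= cutoff)].mean()
        - prior_score[near & (score < cutoff)].mean())
# Because the covariate is independent of the cutoff in this simulation, the
# estimated jump is just sampling noise; a large jump would signal invalidity
```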
How you estimate the treatment effect matters enormously. The core tension: using more data reduces variance but risks bias from observations far from the cutoff.
Bandwidth defines the analysis window around the cutoff. Only observations within this range contribute to estimation. Choosing the right bandwidth involves a direct tradeoff: a wider window includes more observations (lower variance) but leans harder on observations far from the cutoff (higher bias), while a narrower window does the reverse.
Optimal bandwidth selection methods try to balance this tradeoff systematically. The most commonly used are the Imbens-Kalyanaraman (IK) procedure and the Calonico-Cattaneo-Titiunik (CCT) method, which also provides bias-corrected confidence intervals.
Within the chosen bandwidth, local linear regression is the standard estimation approach. You fit separate linear regressions on each side of the cutoff, and the treatment effect estimate is the difference in predicted outcomes right at the threshold.
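A minimal local linear estimator can be sketched in a few lines. In practice researchers use dedicated packages such as rdrobust, with kernel weights and bias-corrected inference; this uniform-window version on simulated data is only illustrative:

```python
# Minimal local linear RDD estimator (a sketch with a uniform window; real
# applications would use kernel weights and a package such as rdrobust)
import numpy as np

def rdd_local_linear(score, y, cutoff, h):
    """Fit separate lines on each side within bandwidth h; return the
    difference in predicted outcomes exactly at the cutoff."""
    x = score - cutoff                    # center so the intercept is the value at the cutoff

    def fit_at_cutoff(mask):
        X = np.column_stack([np.ones(mask.sum()), x[mask]])
        beta, *_ = np.linalg.lstsq(X, y[mask], rcond=None)
        return beta[0]                    # intercept = prediction at the cutoff

    left = (x < 0) & (x > -h)
    right = (x >= 0) & (x < h)
    return fit_at_cutoff(right) - fit_at_cutoff(left)

# Simulated example with a true jump of 2.0 at the cutoff (assumed values)
rng = np.random.default_rng(3)
score = rng.uniform(0, 100, 10_000)
y = 0.05 * score + 2.0 * (score >= 50) + rng.normal(0, 1, 10_000)
est = rdd_local_linear(score, y, cutoff=50.0, h=10.0)
```

Because each side gets its own slope, the linear trend in the running variable is absorbed by the fit, and the intercept difference isolates the jump at the threshold.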
Visual evidence is central to RDD. Three elements matter: a binned scatterplot of the outcome against the running variable, separate fitted lines (or local fits) on each side of the cutoff, and a clearly visible jump at the threshold itself.
Compare: Wide bandwidth vs. Narrow bandwidth—wider bandwidths give you more statistical power but assume the relationship between the running variable and outcome is correctly specified far from the cutoff. Narrow bandwidths are more credible but noisier. Always report results across multiple bandwidths.
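That reporting practice is easy to sketch: run the same estimator over a grid of bandwidths (the grid and simulated data here are assumptions). With a simple difference-in-means estimator, the slope in the running variable leaks into wider windows, which is exactly the bias the tradeoff describes:

```python
# Reporting the RDD estimate across several bandwidths (simulated data; the
# bandwidth grid is an arbitrary choice for illustration)
import numpy as np

rng = np.random.default_rng(4)
score = rng.uniform(0, 100, 20_000)
y = 0.05 * score + 2.0 * (score >= 50) + rng.normal(0, 1, 20_000)

def rdd_diff_means(score, y, cutoff, h):
    near = np.abs(score - cutoff) < h
    above = near & (score >= cutoff)
    return y[above].mean() - y[near & ~above].mean()

estimates = {h: round(rdd_diff_means(score, y, 50.0, h), 2)
             for h in (2.0, 5.0, 10.0)}
# The estimate drifts upward as h grows: the 0.05 slope in the running
# variable contaminates the raw difference in means more at wider windows
```

A credible RDD reports a table like `estimates` and shows the point estimate is stable (here, the drift with h is the bias a local linear fit would remove).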
No single RDD estimate should be taken at face value. Credible research demonstrates that findings survive alternative specifications and acknowledges inherent limitations.
Three standard robustness checks strengthen an RDD analysis: estimating the effect across multiple bandwidths, trying alternative functional forms for the regression, and testing placebo cutoffs where no effect should appear.
Compare: RDD limitations vs. RCT limitations—RCTs offer strong internal validity across the full experimental sample but face external validity concerns about artificial settings. RDD has strong local validity but explicitly cannot speak to effects away from the cutoff. Know which limitation matters more for a given research question.
Understanding where RDD fits in the causal inference toolkit helps you recognize when it's the right method and when alternatives might be stronger.
Compare: RDD vs. Difference-in-Differences—RDD exploits a cross-sectional discontinuity in a running variable, while DiD exploits a temporal discontinuity (before/after treatment). RDD requires continuity assumptions; DiD requires parallel trends. Choose based on your data structure and which assumptions are more plausible.
| Concept | Key Details |
|---|---|
| Design types | Sharp RDD (deterministic), Fuzzy RDD (probabilistic, uses IV) |
| Core assumptions | Continuity of potential outcomes, No manipulation, Continuous running variable |
| Validity tests | McCrary density test, Covariate balance checks, Placebo cutoffs |
| Estimation choices | Bandwidth selection (IK, CCT), Local linear regression, Polynomial specifications |
| Robustness strategies | Multiple bandwidths, Alternative functional forms, Placebo cutoffs |
| Key limitations | Local effects only (LATE), Limited external validity, Single-cutoff dependence |
| Common applications | Education (test scores), Policy (eligibility thresholds), Health (age/diagnostic cutoffs) |
| Related methods | Instrumental variables (fuzzy RDD), RCTs (benchmark), DiD (temporal alternative) |
What distinguishes sharp RDD from fuzzy RDD, and how does this distinction affect what parameter you're estimating?
A researcher finds that predetermined covariates show a discontinuity at the cutoff. What does this suggest about the validity of the RDD, and what assumption is likely violated?
Compare and contrast bandwidth selection in RDD with the bias-variance tradeoff: why might a researcher report results across multiple bandwidths rather than choosing a single "optimal" one?
If you're asked to evaluate a study using RDD to estimate the effect of a scholarship on college completion, what three validity checks would you look for in the research design?
Why does RDD estimate a Local Average Treatment Effect rather than an Average Treatment Effect, and what does this imply for generalizing findings to other populations?