Sensitivity Analysis Methods

Why This Matters

In causal inference, you're rarely working with perfect data. Unmeasured confounding lurks behind nearly every observational study, threatening to invalidate your conclusions. Sensitivity analysis methods let you ask a critical question: How wrong could I be? Rather than pretending confounders don't exist, these techniques quantify exactly how strong an unmeasured variable would need to be to overturn your findings. Knowing when your estimates are robust versus when they're hanging by a thread is central to defending causal claims under uncertainty.

These methods connect directly to core causal inference principles: the ignorability assumption, selection bias, instrumental variable validity, and mediation pathways. Each sensitivity analysis approach addresses a specific vulnerability in your causal argument. Don't just memorize what each method does. Understand which assumption it stress-tests and what kind of study design it applies to. When a question asks you to evaluate the credibility of a causal claim, your answer should include how you'd probe its weaknesses.


Quantifying Confounder Strength

These methods answer a direct question: How strong would an unmeasured confounder need to be to explain away my results? They translate abstract concerns about bias into concrete, interpretable quantities.

E-Value Method

The E-value quantifies the minimum strength of association that an unmeasured confounder would need to have with both the treatment and the outcome to reduce an observed effect to the null. It's calculated from the observed risk ratio (RR) on the risk ratio scale:

$E = RR + \sqrt{RR \times (RR - 1)}$

This formula applies when $RR \geq 1$. For a protective effect ($RR < 1$), you first take the reciprocal $1/RR$ so you're working with a value above 1.
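
A minimal sketch of the calculation in plain Python (the function name and example values are illustrative):

```python
import math

def e_value(rr: float) -> float:
    """Minimum confounder strength (RR scale) needed to explain away rr."""
    if rr < 1:
        rr = 1 / rr  # protective effects: work on the scale above 1
    return rr + math.sqrt(rr * (rr - 1))

print(e_value(2.0))  # ~3.41: a confounder would need RR >= 3.41 with both treatment and outcome
print(e_value(0.5))  # same value, by symmetry
```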

The E-value is particularly useful in policy discussions because it shifts the burden of proof: a critic must argue that a confounder with associations at least that strong plausibly exists. A large E-value means your finding is hard to explain away; a small one means even a weak confounder could do it.

Cornfield Conditions

Cornfield conditions establish a set of necessary inequalities that any confounder must satisfy to fully explain an observed association. Originally developed during the smoking-lung cancer debates of the 1950s, they focus on the confounder's prevalence and its strength of association with both treatment and outcome simultaneously.

Where the E-value gives you a single summary number, Cornfield conditions provide a logical framework. You can use them to systematically dismiss proposed confounders by showing they can't satisfy all the required inequalities at once. This makes them especially effective when a critic names a specific alternative explanation and you need to show it falls short.
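
As a sketch, here is the classic joint form of the conditions, under which both of the confounder's associations must be at least as strong as the observed risk ratio for the confounder to fully explain it (the function name and numbers are hypothetical):

```python
def cornfield_rules_out(rr_obs: float, rr_eu: float, rr_ud: float) -> bool:
    """True if a proposed confounder fails the Cornfield conditions and
    therefore cannot fully explain the observed risk ratio.

    rr_eu: confounder-treatment association (risk ratio scale)
    rr_ud: confounder-outcome association (risk ratio scale)
    """
    return rr_eu < rr_obs or rr_ud < rr_obs

# Smoking-lung cancer style argument: an observed RR of 9 cannot be
# explained by a confounder only 3 times more prevalent among the treated.
print(cornfield_rules_out(rr_obs=9.0, rr_eu=3.0, rr_ud=12.0))  # True: ruled out
```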

Ding and VanderWeele's Bounding Factor

This approach extends E-value logic by providing tighter bounds when you have partial information about potential confounders. If you know something about the confounder-treatment relationship (say, from external data or a validation substudy), you can incorporate that knowledge to narrow the range of possible bias.

The method produces a bounding factor that you divide into your observed estimate and its confidence limits, giving a concrete picture of how much of the observed effect unmeasured confounding could account for. It bridges the gap between knowing nothing about confounders (where you'd use the E-value) and knowing everything (where you wouldn't need sensitivity analysis at all).
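
A sketch of the arithmetic, using the published bounding-factor formula $B = RR_{EU} \times RR_{UD} / (RR_{EU} + RR_{UD} - 1)$; the parameter values below are hypothetical:

```python
def bounding_factor(rr_eu: float, rr_ud: float) -> float:
    """Ding-VanderWeele bounding factor B = RR_EU * RR_UD / (RR_EU + RR_UD - 1)."""
    return (rr_eu * rr_ud) / (rr_eu + rr_ud - 1)

rr_obs = 2.0                               # observed risk ratio (hypothetical)
b = bounding_factor(rr_eu=2.0, rr_ud=3.0)  # assumed confounder associations
print(rr_obs / b)                          # 1.33: worst-case lower bound on the true RR
```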

Compare: E-value vs. Cornfield conditions: both assess how strong a confounder would need to be, but E-values give a single interpretable number while Cornfield conditions provide a set of inequalities for logical argument. Use E-values for quick communication; use Cornfield logic when defending against specific proposed confounders.


Bounds and Thresholds for Matched Designs

When you've used matching or stratification to control observed confounders, these methods ask: What if I missed something? They're designed for studies where you've already balanced on observed covariates and need to assess vulnerability to unobserved ones.

Rosenbaum Bounds

Rosenbaum bounds parameterize hidden bias with a single quantity, $\Gamma$, which represents the odds ratio of differential treatment assignment between matched units due to unobserved factors. When $\Gamma = 1$, there's no hidden bias and treatment assignment within matched pairs is as-if random.

The procedure works by testing whether your finding remains statistically significant as you increase $\Gamma$ above 1:

  1. Start at $\Gamma = 1$ (no hidden bias) and confirm your result is significant.
  2. Gradually increase $\Gamma$, recalculating the p-value bounds at each step.
  3. Identify the breaking point: the value of $\Gamma$ where your result first becomes non-significant.

A study that remains significant up to $\Gamma = 3$, for instance, is robust to a scenario where, within a matched pair, one unit's odds of receiving treatment could be up to 3 times the other's due to an unobserved factor. This is the standard sensitivity tool for matched observational studies because it directly targets the residual selection bias that matching cannot eliminate.
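
A minimal sketch of the $\Gamma$ sweep for the sign-test version of Rosenbaum's procedure; the pair counts here are hypothetical:

```python
from scipy.stats import binom

def rosenbaum_upper_p(n_pairs: int, n_treated_higher: int, gamma: float) -> float:
    """Upper bound on the one-sided sign-test p-value under hidden bias gamma."""
    p_plus = gamma / (1 + gamma)  # worst-case within-pair treatment probability
    return binom.sf(n_treated_higher - 1, n_pairs, p_plus)  # P(X >= n_treated_higher)

# Sweep gamma upward to locate the breaking point
for gamma in (1.0, 1.5, 2.0, 2.5, 3.0):
    print(gamma, round(rosenbaum_upper_p(n_pairs=100, n_treated_higher=70, gamma=gamma), 4))
```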

Tipping Point Analysis

Tipping point analysis identifies the exact threshold where an unmeasured confounder would flip your result from significant to null (or reverse its direction). Results are typically presented as a two-dimensional plot showing combinations of confounder prevalence and confounder-outcome effect size that would overturn your findings.

This visualization is intuitive: any confounder whose true prevalence and effect size fall below the tipping curve can't explain away your result. If the combinations needed to overturn your finding seem implausible based on subject-matter knowledge, your result is robust. Tipping point analysis also guides future research by showing exactly what kind of confounder would matter most to measure.
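
One way to trace the tipping frontier is with Bross's external-adjustment formula for a binary confounder; a sketch with hypothetical values:

```python
import numpy as np

def adjusted_rr(rr_obs, p1, p0, rr_cd):
    """Bross's external adjustment for a binary unmeasured confounder.

    p1, p0: confounder prevalence among treated / untreated
    rr_cd: confounder-outcome risk ratio
    """
    return rr_obs * (p0 * (rr_cd - 1) + 1) / (p1 * (rr_cd - 1) + 1)

rr_obs, p0 = 1.8, 0.1  # hypothetical observed RR and baseline prevalence
for p1 in np.arange(0.2, 0.9, 0.1):
    for rr_cd in np.arange(1.5, 6.1, 0.5):
        if adjusted_rr(rr_obs, p1, p0, rr_cd) <= 1.0:
            print(f"tips at p1={p1:.1f}, rr_cd={rr_cd:.1f}")  # first combination that nullifies the effect
            break
```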

Compare: Rosenbaum bounds vs. Tipping point analysis: Rosenbaum bounds work within the matched-pairs framework using $\Gamma$ as a single sensitivity parameter, while tipping point analysis is more general and visualizes the full space of threatening confounders. For matched designs, lead with Rosenbaum bounds; for general robustness arguments, tipping point plots are more intuitive to a broad audience.


Design-Specific Sensitivity Methods

Different causal identification strategies have different vulnerabilities. These methods stress-test the specific assumptions that make each design work.

Instrumental Variable Sensitivity Analysis

IV estimation relies on the exclusion restriction: the instrument affects the outcome only through the treatment. IV sensitivity analysis probes what happens when this assumption is violated. Specifically, it quantifies how much direct effect the instrument would need to have on the outcome (bypassing treatment) to invalidate your IV estimate.

This is especially important for weak instruments, where even small violations of the exclusion restriction can produce large biases in the two-stage least squares estimator. If your instrument is only weakly correlated with treatment, a tiny direct path to the outcome gets amplified into a big problem.
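
In the simplest linear case, that amplification can be written down directly: with outcome $Y = \beta D + \delta Z + \varepsilon$ and first stage $D = \gamma Z + \nu$, the Wald estimator converges to $\beta + \delta/\gamma$, so a direct effect $\delta$ is scaled by the inverse of the first-stage slope. A toy illustration (the numbers are made up):

```python
def iv_bias(delta: float, gamma_fs: float) -> float:
    """Asymptotic bias of the Wald IV estimator: direct effect / first-stage slope."""
    return delta / gamma_fs

print(iv_bias(delta=0.05, gamma_fs=0.50))  # strong first stage: bias = 0.1
print(iv_bias(delta=0.05, gamma_fs=0.05))  # weak instrument: the same tiny delta gives bias = 1.0
```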

Regression Discontinuity Sensitivity Analysis

RD designs assume that units just above and just below the cutoff are comparable. Sensitivity analysis for RD targets three main threats:

  • Bandwidth sensitivity: Results shouldn't change dramatically when you use slightly wider or narrower windows around the cutoff. If they do, your estimate may depend more on functional form assumptions than on the discontinuity itself (see the sketch after this list).
  • Manipulation testing: Check for suspicious bunching of observations just above or below the threshold (e.g., using a McCrary density test). Bunching suggests people are gaming the cutoff, which violates the as-if-random assignment logic.
  • Covariate balance at the cutoff: Other covariates should not show jumps at the threshold. If pre-treatment characteristics are discontinuous at the cutoff, something other than the treatment is changing there too.
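
A minimal bandwidth-sensitivity sketch on simulated data, assuming simple local linear fits on each side of the cutoff (everything here is synthetic):

```python
import numpy as np

def rd_estimate(x, y, cutoff, h):
    """Local linear RD: separate line fits within bandwidth h on each side;
    the difference in intercepts at the cutoff is the estimated jump."""
    left = (x >= cutoff - h) & (x < cutoff)
    right = (x >= cutoff) & (x <= cutoff + h)
    slope_l, int_l = np.polyfit(x[left] - cutoff, y[left], 1)
    slope_r, int_r = np.polyfit(x[right] - cutoff, y[right], 1)
    return int_r - int_l

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 2000)
y = 0.5 * (x >= 0) + x + rng.normal(0, 0.3, x.size)  # true jump of 0.5 at the cutoff
for h in (0.1, 0.2, 0.4):
    print(h, round(rd_estimate(x, y, cutoff=0.0, h=h), 3))  # estimates should stay near 0.5 across h
```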

Mediation Sensitivity Analysis

Mediation analysis decomposes a total effect into direct and indirect pathways. The core vulnerability is unmeasured mediator-outcome confounding, and this is a problem that randomizing the treatment alone cannot solve. Even if treatment is randomly assigned, the mediator is not, so the relationship between the mediator and the outcome can still be confounded.

Mediation sensitivity analysis typically introduces a sensitivity parameter $\rho$ representing the correlation between the error terms in the mediator and outcome models. You then check whether your estimated direct and indirect effects remain significant across plausible values of $\rho$. If the indirect effect disappears at small values of $\rho$, the mediation claim is fragile.
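
A small simulation makes the underlying problem concrete: treatment is randomized, yet an unmeasured variable U driving both mediator and outcome inflates the naive product-of-coefficients estimate (all coefficients are made up):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
t = rng.binomial(1, 0.5, n)                        # randomized treatment
u = rng.normal(0, 1, n)                            # unmeasured mediator-outcome confounder
m = 0.5 * t + u + rng.normal(0, 1, n)              # mediator (true effect of t: 0.5)
y = 0.3 * m + 0.2 * t + u + rng.normal(0, 1, n)    # outcome (true mediator effect: 0.3)

a = np.polyfit(t, m, 1)[0]                         # treatment -> mediator slope (unbiased)
X = np.column_stack([m, t, np.ones(n)])
b = np.linalg.lstsq(X, y, rcond=None)[0][0]        # mediator -> outcome slope, confounded by u
print("naive indirect effect:", round(a * b, 3))   # ~0.40, versus the true 0.5 * 0.3 = 0.15
```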

Compare: IV sensitivity vs. RD sensitivity: IV analysis worries about the exclusion restriction (does the instrument have direct effects on the outcome?), while RD analysis worries about continuity and manipulation (is the cutoff truly as-if random?). Both ask "what if my identifying assumption is slightly wrong?" but they target completely different assumptions.


Comprehensive Bias Modeling

Sometimes you face multiple potential biases simultaneously: selection bias, measurement error, unmeasured confounding. These methods handle that messy reality.

Multiple Bias Modeling

This approach simultaneously models several bias sources within a single analytical framework. You might account for confounding, selection bias, and outcome misclassification all at once. The key insight is that biases can compound or partially cancel each other. Two moderate biases might produce a larger net distortion than one strong bias alone, or they might push in opposite directions and partially offset.

The method requires you to specify bias parameters for each source, which makes your assumptions transparent and open to critique. That transparency is a feature, not a bug: it forces you to be explicit about what you think the biases look like.
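
A heavily simplified sketch of sequential correction, assuming a multiplicative selection-bias factor followed by the same Bross confounding adjustment used earlier (both bias parameters are hypothetical, and a real analysis must justify the ordering and form of each correction):

```python
def correct_selection(rr: float, s: float) -> float:
    """Divide out an assumed selection-bias factor s (ratio of selection odds)."""
    return rr / s

def correct_confounding(rr: float, p1: float, p0: float, rr_cd: float) -> float:
    """Bross external adjustment for an unmeasured binary confounder."""
    return rr * (p0 * (rr_cd - 1) + 1) / (p1 * (rr_cd - 1) + 1)

rr_obs = 2.2                                                  # hypothetical observed RR
rr_1 = correct_selection(rr_obs, s=1.1)                       # first undo selection bias
rr_2 = correct_confounding(rr_1, p1=0.4, p0=0.2, rr_cd=2.0)   # then unmeasured confounding
print(round(rr_2, 3))                                         # net estimate after both corrections
```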

Probabilistic Bias Analysis

Traditional sensitivity analysis often uses fixed, single-point values for bias parameters (e.g., "assume the confounder prevalence is 30%"). Probabilistic bias analysis replaces these point estimates with probability distributions, acknowledging that you don't know the exact magnitude of bias.

You specify plausible ranges for each bias parameter, drawing on prior knowledge, expert elicitation, or external data. The analysis then samples from these distributions (often via Monte Carlo simulation) and produces a distribution of corrected effect estimates. Instead of a single "what if" scenario, you get the full range of conclusions consistent with your uncertainty. This is more honest than picking a single bias value and hoping it's right.
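
A minimal Monte Carlo sketch, assuming Beta and log-normal distributions for the bias parameters (purely illustrative choices) and reusing Bross's adjustment:

```python
import numpy as np

rng = np.random.default_rng(2)
rr_obs, n_draws = 1.8, 10_000                     # hypothetical observed RR

# Plausible distributions for the bias parameters (illustrative choices)
p1 = rng.beta(2, 5, n_draws)                      # confounder prevalence, treated
p0 = rng.beta(2, 8, n_draws)                      # confounder prevalence, untreated
rr_cd = rng.lognormal(np.log(2.0), 0.3, n_draws)  # confounder-outcome risk ratio

# Apply Bross's adjustment to every draw -> distribution of corrected estimates
rr_adj = rr_obs * (p0 * (rr_cd - 1) + 1) / (p1 * (rr_cd - 1) + 1)
print(np.percentile(rr_adj, [2.5, 50, 97.5]))     # spread of conclusions consistent with your uncertainty
```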

Compare: Multiple bias modeling vs. Probabilistic bias analysis: multiple bias modeling handles several bias types but typically uses fixed parameter values, while probabilistic bias analysis embraces uncertainty by using distributions. The most thorough approach combines both: model multiple biases, each with its own probability distribution.


Quick Reference Table

| Concept | Best Examples |
|---|---|
| Confounder strength quantification | E-value, Cornfield conditions, Bounding factor |
| Matched design sensitivity | Rosenbaum bounds, Tipping point analysis |
| Instrumental variable robustness | IV sensitivity analysis |
| Regression discontinuity validity | RD sensitivity analysis (bandwidth, manipulation) |
| Mediation pathway robustness | Mediation sensitivity analysis |
| Multiple simultaneous biases | Multiple bias modeling, Probabilistic bias analysis |
| Communicating robustness to non-experts | E-value, Tipping point analysis |
| Incorporating prior knowledge | Probabilistic bias analysis |

Self-Check Questions

  1. You've conducted a matched observational study and found a significant treatment effect. Which sensitivity analysis method would you use to determine how much hidden bias could exist before your result becomes non-significant, and what parameter would you report?

  2. Compare the E-value method and Rosenbaum bounds: What type of study is each best suited for, and what does each method's output tell you about unmeasured confounding?

  3. A researcher using instrumental variables is worried that their instrument might have a small direct effect on the outcome. Which sensitivity analysis approach addresses this concern, and what assumption is being tested?

  4. How does probabilistic bias analysis differ from traditional sensitivity analysis that uses fixed bias parameters? When would you prefer one approach over the other?

  5. FRQ-style: You're reviewing a mediation analysis claiming that a job training program improves earnings primarily through increased self-efficacy. What specific unmeasured confounding threat does mediation sensitivity analysis address, and why can't randomization of treatment alone solve this problem?