Applied Impact Evaluation Unit 2 – Causal Inference & Counterfactuals
Causal inference and counterfactuals are crucial concepts in impact evaluation. They help researchers determine cause-and-effect relationships between interventions and outcomes. By comparing potential outcomes under different treatment conditions, we can estimate the true impact of a program or policy.
Understanding these concepts is essential for designing robust studies and interpreting results accurately. Key challenges include addressing selection bias, confounding variables, and the fundamental problem of causal inference. Various research designs and statistical methods aim to overcome these challenges and establish causal relationships.
Causal inference involves determining the cause-and-effect relationship between variables, treatments, or interventions and outcomes
Counterfactuals are hypothetical scenarios that describe what would have happened in the absence of a treatment or intervention
Potential outcomes framework is a conceptual approach to causal inference that compares the potential outcomes of a unit under different treatment conditions
Treatment effect is the difference between the potential outcomes of a unit under treatment and control conditions
Selection bias occurs when the treatment and control groups differ systematically in ways that affect the outcome, leading to biased estimates of the treatment effect
Confounding variables are factors that influence both the treatment assignment and the outcome, potentially distorting the causal relationship
Internal validity refers to the extent to which a study can establish a causal relationship between the treatment and the outcome within the study sample
External validity concerns the generalizability of the study findings to other populations, settings, or contexts
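The first few definitions above can be written compactly in potential-outcomes notation. The symbols are assumptions introduced here for illustration, not notation taken from this guide: D_i is a treatment indicator for unit i (1 if treated, 0 if not), and Y_i(1) and Y_i(0) are that unit's potential outcomes with and without the treatment.

```latex
% Observed outcome: only the potential outcome matching the received condition is seen
Y_i = D_i \, Y_i(1) + (1 - D_i) \, Y_i(0)

% Unit-level treatment effect: never directly observable, since one term is always missing
\tau_i = Y_i(1) - Y_i(0)
```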
Causal Inference Basics
Causal inference aims to establish a cause-and-effect relationship between a treatment or intervention and an outcome of interest
Randomized controlled trials (RCTs) are considered the gold standard for causal inference because they ensure that the treatment assignment is independent of potential outcomes
Observational studies, in contrast, rely on non-experimental data and require careful design and analysis to address potential confounding factors
The fundamental problem of causal inference is that we can only observe one potential outcome for each unit, either under treatment or control, but never both simultaneously
Causal effects are defined as the difference between the potential outcomes of a unit under different treatment conditions
Average treatment effect (ATE) is the average difference in potential outcomes across the entire population
Average treatment effect on the treated (ATT) is the average difference in potential outcomes among those who actually received the treatment
Stable unit treatment value assumption (SUTVA) states that the potential outcomes of one unit should not be affected by the treatment assignment of other units and that there are no hidden variations of the treatment
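The ideas in this section can be illustrated with a small simulation. This is a minimal sketch under stated assumptions: the data are simulated, so both potential outcomes are known for every unit, which never happens with real data. The variable names, distributions, and the effect size of 3.0 are all hypothetical choices.

```python
import random

random.seed(0)
n = 10_000

# Hypothetical potential outcomes for every unit (known only because we simulate).
y0 = [random.gauss(10.0, 2.0) for _ in range(n)]   # outcome without treatment
y1 = [a + random.gauss(3.0, 1.0) for a in y0]      # outcome with treatment
effects = [b - a for a, b in zip(y0, y1)]          # individual treatment effects

true_ate = sum(effects) / n

# Random assignment: a fair coin flip, independent of the potential outcomes.
d = [random.randint(0, 1) for _ in range(n)]

# Fundamental problem: only one potential outcome per unit is observed.
y_obs = [b if t == 1 else a for a, b, t in zip(y0, y1, d)]

treated = [y for y, t in zip(y_obs, d) if t == 1]
control = [y for y, t in zip(y_obs, d) if t == 0]
diff_in_means = sum(treated) / len(treated) - sum(control) / len(control)

# ATT: average effect among units that actually received the treatment.
treated_effects = [e for e, t in zip(effects, d) if t == 1]
true_att = sum(treated_effects) / len(treated_effects)

print(f"true ATE:            {true_ate:.3f}")
print(f"true ATT:            {true_att:.3f}")
print(f"difference in means: {diff_in_means:.3f}")
```

Because assignment is random, the difference in observed group means should land close to the true ATE, and the ATT should be close to the ATE, even though no unit's individual effect is ever observed.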
Counterfactual Framework
The counterfactual framework is a conceptual approach to causal inference that relies on the notion of potential outcomes
Potential outcomes are the hypothetical outcomes that a unit would experience under different treatment conditions, regardless of the actual treatment received
The observed outcome for a unit is the potential outcome corresponding to the treatment condition the unit actually received
The unobserved counterfactual outcome is the potential outcome that would have been realized had the unit received a different treatment condition
The individual treatment effect is the difference between a unit's potential outcomes under treatment and control conditions
The average treatment effect (ATE) can be estimated by comparing the average outcomes of the treatment and control groups, provided that the treatment assignment is independent of potential outcomes
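The counterfactual framework allows causal effects to be defined without reference to any particular study design; estimating them when the treatment assignment is not randomized requires additional identifying assumptions
A key identifying assumption in non-randomized settings is unconfoundedness (also called selection on observables), which states that the treatment assignment is independent of potential outcomes, conditional on observed covariates; it is usually paired with an overlap condition, under which every unit has a positive probability of receiving either treatment condition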
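A short derivation shows why comparing group averages needs independence between assignment and potential outcomes. The notation is an assumption introduced for illustration: D_i is a treatment indicator, Y_i(1) and Y_i(0) are unit i's potential outcomes, Y_i is the observed outcome, and X_i denotes observed covariates.

```latex
\begin{aligned}
\underbrace{E[Y_i \mid D_i = 1] - E[Y_i \mid D_i = 0]}_{\text{observed difference in means}}
  &= E[Y_i(1) \mid D_i = 1] - E[Y_i(0) \mid D_i = 0] \\
  &= \underbrace{E[Y_i(1) - Y_i(0) \mid D_i = 1]}_{\text{ATT}}
   + \underbrace{E[Y_i(0) \mid D_i = 1] - E[Y_i(0) \mid D_i = 0]}_{\text{selection bias}}
\end{aligned}

% Unconfoundedness: assignment is independent of potential outcomes given covariates
\bigl( Y_i(0),\, Y_i(1) \bigr) \perp\!\!\!\perp D_i \mid X_i
```

The second line adds and subtracts E[Y_i(0) | D_i = 1]. When assignment is independent of potential outcomes, the treated and control groups have the same expected untreated outcome, so the selection bias term is zero and the ATT equals the ATE.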
Research Design Strategies
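Randomization is a powerful research design strategy that makes the treatment assignment independent of potential outcomes by design, thereby eliminating selection bias in expectation (chance imbalances can still occur in any single sample, especially a small one)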
Stratified randomization involves dividing the study population into subgroups (strata) based on important characteristics and then randomly assigning units within each stratum to treatment and control conditions
Cluster randomization assigns entire clusters (e.g., schools, communities) to treatment and control conditions, rather than individual units
Matched pair designs involve pairing units with similar characteristics and then randomly assigning one unit in each pair to the treatment group and the other to the control group
Regression discontinuity designs (RDDs) exploit a cutoff point in a continuous variable (e.g., test score) that determines treatment assignment, comparing units just above and below the cutoff
Difference-in-differences (DID) designs compare the change in outcomes over time between a treatment group and a control group, assuming that the two groups would have followed parallel trends in the absence of the treatment
Instrumental variables (IV) designs use an external variable (the instrument) that shifts treatment take-up (relevance) but affects the outcome only through the treatment (the exclusion restriction) to estimate the causal effect of the treatment on the outcome
Propensity score matching (PSM) involves estimating the probability of treatment assignment based on observed covariates and then matching treated and control units with similar propensity scores to balance the two groups
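As one concrete case from the designs above, the basic two-group, two-period difference-in-differences calculation can be sketched in a few lines. The function name and the outcome values are hypothetical, chosen only to make the arithmetic easy to follow; the estimate is only meaningful if the parallel trends assumption holds.

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def did_estimate(treat_pre, treat_post, control_pre, control_post):
    """Change in the treated group minus change in the control group."""
    treated_change = mean(treat_post) - mean(treat_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical outcome data (e.g., test scores) before and after a program.
treat_pre = [48.0, 52.0, 50.0]     # group mean 50
treat_post = [60.0, 64.0, 62.0]    # group mean 62
control_pre = [47.0, 49.0, 48.0]   # group mean 48
control_post = [52.0, 54.0, 53.0]  # group mean 53

effect = did_estimate(treat_pre, treat_post, control_pre, control_post)
print(effect)  # 7.0
```

The treated group improved by 12 points and the control group by 5, so the estimate is 7. A simple post-period comparison would give 9, which overstates the effect by the 2-point gap that already existed before the program.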
Statistical Methods
Statistical methods for causal inference aim to estimate the causal effect of a treatment on an outcome while addressing potential confounding factors
Regression analysis is a common statistical method that estimates the relationship between the treatment and the outcome, controlling for observed covariates
Propensity score methods, such as matching, stratification, and weighting, aim to balance the distribution of observed covariates between the treatment and control groups
Inverse probability weighting (IPW) weights each unit by the inverse of its estimated probability of receiving the treatment condition it actually received, so units with a lower probability of their actual assignment receive more weight and each weighted group resembles the full population
Doubly robust estimation combines regression adjustment and propensity score weighting to estimate causal effects, providing consistent estimates if either the outcome model or the propensity score model is correctly specified
Instrumental variables estimation uses two-stage least squares (2SLS) or other IV methods to estimate the causal effect of the treatment on the outcome, exploiting the variation in the treatment induced by the instrument
Mediation analysis aims to decompose the total effect of a treatment on an outcome into direct and indirect effects, with the indirect effect operating through a mediator variable
Sensitivity analysis assesses the robustness of the causal estimates to potential unobserved confounding factors by simulating the impact of hypothetical confounders on the results
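To make the weighting idea concrete, here is a minimal sketch of an IPW estimate of the ATE on simulated data. Everything in it is an assumption chosen for illustration: a single binary confounder x, treatment probabilities of 0.8 and 0.2, a true effect of 2.0, and a propensity score estimated as the treated share within each level of x. Real applications typically estimate the propensity score with a model such as logistic regression.

```python
import random

random.seed(1)
n = 20_000
true_effect = 2.0

# Simulate a confounder x that raises both treatment probability and the outcome.
x, d, y = [], [], []
for _ in range(n):
    xi = 1 if random.random() < 0.5 else 0
    p_treat = 0.8 if xi == 1 else 0.2
    di = 1 if random.random() < p_treat else 0
    yi = 1.0 + true_effect * di + 3.0 * xi + random.gauss(0.0, 1.0)
    x.append(xi)
    d.append(di)
    y.append(yi)

# Naive comparison of group means ignores the confounder.
y_treated = [yi for yi, di in zip(y, d) if di == 1]
y_control = [yi for yi, di in zip(y, d) if di == 0]
naive = sum(y_treated) / len(y_treated) - sum(y_control) / len(y_control)

# Estimated propensity score: share of treated units within each level of x.
share = {}
for level in (0, 1):
    d_level = [di for xi, di in zip(x, d) if xi == level]
    share[level] = sum(d_level) / len(d_level)
e_hat = [share[xi] for xi in x]

# Inverse probability weights: 1/e for treated units, 1/(1 - e) for controls.
w1 = [di / ei for di, ei in zip(d, e_hat)]
w0 = [(1 - di) / (1 - ei) for di, ei in zip(d, e_hat)]

# Normalized (Hajek) weighted means of the observed outcomes.
mu1 = sum(w * yi for w, yi in zip(w1, y)) / sum(w1)
mu0 = sum(w * yi for w, yi in zip(w0, y)) / sum(w0)
ipw_ate = mu1 - mu0

print(f"naive difference: {naive:.3f}")
print(f"IPW estimate:     {ipw_ate:.3f}")
```

In this setup the naive difference is pulled well above 2.0 because treated units are more likely to have x = 1, while the weighted estimate should sit close to the true effect. The correction works here only because the one confounder is observed; weighting does nothing about unobserved confounding.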
Challenges and Limitations
Unobserved confounding is a major challenge in causal inference, as it can lead to biased estimates of the treatment effect if not adequately addressed
Spillover effects occur when the treatment of one unit affects the outcomes of other units, violating the stable unit treatment value assumption (SUTVA)
Attrition bias arises when participants drop out of a study non-randomly, potentially leading to biased estimates if the attrition is related to both the treatment and the outcome
Measurement error in the treatment, outcome, or covariates can lead to biased estimates of the causal effect and reduced statistical power
External validity is a concern when generalizing the findings of a study to other populations, settings, or contexts, as the causal effect may vary across different subgroups or environments
Heterogeneous treatment effects occur when the causal effect of the treatment varies across different subgroups of the population, requiring careful analysis and interpretation
Multiple hypothesis testing can lead to an increased risk of Type I errors (false positives) when conducting many statistical tests simultaneously, requiring adjustments such as the Bonferroni or Holm corrections to keep the overall error rate at the desired level
Publication bias can distort the evidence base if studies with statistically significant or positive results are more likely to be published than those with null or negative findings
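The multiple testing point above can be illustrated with two common corrections for the family-wise error rate, Bonferroni and Holm. This is a minimal sketch: the function names and the p-values are hypothetical, and established statistical packages provide tested implementations that should be preferred in practice.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject a hypothesis only if its p-value is at most alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def holm(p_values, alpha=0.05):
    """Holm step-down: compare sorted p-values to alpha/m, alpha/(m-1), ..."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        if p_values[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break  # once one test fails, all larger p-values are retained
    return reject


# Hypothetical p-values from five outcome comparisons in one evaluation.
p_values = [0.001, 0.012, 0.02, 0.04, 0.30]

unadjusted = [p <= 0.05 for p in p_values]
print(unadjusted)            # [True, True, True, True, False]
print(bonferroni(p_values))  # [True, False, False, False, False]
print(holm(p_values))        # [True, True, False, False, False]
```

Without adjustment, four of the five tests look significant at the 5% level. Bonferroni keeps one and Holm keeps two, which shows that Holm is never more conservative than Bonferroni while controlling the same error rate.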
Real-World Applications
Impact evaluation assesses the causal effects of policies, programs, or interventions on various outcomes, such as health, education, or economic well-being
Randomized controlled trials have been used to evaluate the effectiveness of microfinance programs, conditional cash transfers, and health interventions in developing countries
Observational studies have been employed to estimate the causal effects of environmental exposures, such as air pollution or pesticides, on health outcomes
Regression discontinuity designs have been applied to evaluate the impact of educational policies, such as school entry age or merit-based scholarships, on student outcomes
Difference-in-differences methods have been used to assess the impact of policy changes, such as minimum wage increases or smoking bans, on labor market outcomes or public health
Instrumental variables approaches have been employed to estimate the causal effect of education on earnings, using compulsory schooling laws or distance to school as instruments
Propensity score methods have been applied to evaluate the effectiveness of job training programs, comparing the outcomes of participants and non-participants with similar propensity scores
Causal inference methods have been used in personalized medicine to estimate the individual treatment effects of different therapies based on patient characteristics
Further Reading and Resources
"Causal Inference in Statistics: A Primer" by Judea Pearl, Madelyn Glymour, and Nicholas P. Jewell provides an accessible introduction to the concepts and methods of causal inference
"Mostly Harmless Econometrics: An Empiricist's Companion" by Joshua D. Angrist and Jörn-Steffen Pischke covers a range of econometric methods for causal inference, with a focus on practical applications
"Counterfactuals and Causal Inference: Methods and Principles for Social Research" by Stephen L. Morgan and Christopher Winship presents a comprehensive treatment of the counterfactual framework and its applications in social science research
"The Effect: An Introduction to Research Design and Causality" by Nick Huntington-Klein offers a hands-on guide to designing and analyzing studies for causal inference, with examples in R, Stata, and Python
The Journal of Causal Inference publishes research articles, methodological advances, and review papers related to causal inference in various fields, including statistics, economics, epidemiology, and social sciences
The Causal Inference Bootcamp is an online learning resource that provides video lectures, coding examples, and exercises on causal inference methods, covering topics such as potential outcomes, directed acyclic graphs, and instrumental variables
The Causal Analysis in Theory and Practice (CATP) seminar series features presentations and discussions by leading researchers on topics related to causal inference, with recordings available online
The Causal Inference Reading Group is a community-driven resource that curates and discusses recent research papers on causal inference, with a focus on methodological innovations and applications in different domains