Defining interventions
An intervention means you physically set a variable to a particular value, rather than just watching what value it naturally takes. This is the core move that separates causal inference from standard statistics: instead of asking "What is P(Y | X = x), the distribution of Y when we observe X = x?" you ask "What would Y be if we forced X to equal x?"
That distinction matters because observational associations can be driven by confounders, selection effects, or reverse causation. Interventions cut through all of that by breaking the variable free from its usual causes.
Interventional distributions
An interventional distribution, written P(Y | do(X = x)), describes the probability distribution of Y in a world where X has been set to x by external action. Compare this to the observational conditional P(Y | X = x), which reflects the distribution of Y among units that happen to have X = x.
These two quantities can differ dramatically. For example, P(recovery | drug) in hospital records might be low because sicker patients receive the drug more often. But P(recovery | do(drug)) captures what would happen if you randomly assigned the drug, removing that selection effect.
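A small simulation makes the gap concrete. The model below is hypothetical (all variable names and probabilities are made up for illustration): sicker patients are more likely to receive a drug that genuinely helps, so the observational conditional understates the drug's benefit, while simulated random assignment recovers the interventional quantity.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical model: severity confounds treatment and recovery.
severity = rng.binomial(1, 0.5, n)
drug = rng.binomial(1, np.where(severity == 1, 0.8, 0.2))      # sicker -> more likely treated
recover = rng.binomial(1, 0.5 + 0.2 * drug - 0.4 * severity)   # drug truly adds +0.2

# Observational conditional P(recovery | drug = 1): distorted by selection.
p_obs = recover[drug == 1].mean()

# Interventional P(recovery | do(drug = 1)): simulate random assignment instead.
drug_rct = rng.binomial(1, 0.5, n)
recover_rct = rng.binomial(1, 0.5 + 0.2 * drug_rct - 0.4 * severity)
p_do = recover_rct[drug_rct == 1].mean()

print(p_obs, p_do)  # p_obs is lower: selection makes the drug look worse
```

Here p_obs comes out near 0.38 while p_do is near 0.5, even though the drug's true effect is the same in both worlds.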
Causal vs observational
- A causal relationship means changing X directly changes Y. An observational association means X and Y tend to co-occur, but manipulating X might not affect Y at all.
- Confounders are the classic reason these diverge: ice cream sales and drowning rates are correlated (both driven by hot weather), but banning ice cream won't prevent drownings.
- The whole point of the do-operator is to formalize this distinction so you can reason precisely about when observational data can tell you about causal effects, and when it can't.
Ideal randomized experiments
Randomized experiments are the gold standard for estimating causal effects because randomization breaks the link between the treatment and any confounders.
- Randomly assigning subjects to treatment vs. control ensures that, in expectation, the groups are identical on all pre-treatment characteristics.
- This means P(Y | do(X = x)) = P(Y | X = x) in the experimental data, so the observational conditional is the causal quantity.
- Classic examples: clinical trials for drug efficacy, A/B tests for website design changes.
The challenge is that randomized experiments aren't always feasible. That's where the rest of this unit comes in.
Representing interventions
To work with interventions formally, you need notation and graphical tools that make causal assumptions explicit and manipulable.
Intervention notation
The do-operator, written do(X = x), represents an external intervention that forces X to take value x. The key expressions to know:
- P(Y | do(X = x)) is the interventional distribution of Y when X is set to x.
- P(Y | X = x) is the ordinary observational conditional.
- The central question of causal identification: can we express P(Y | do(X = x)) purely in terms of observational distributions?
Graphical representation
Causal graphs (specifically DAGs) encode your causal assumptions visually:
- Nodes represent variables.
- Directed edges (arrows) represent direct causal effects.
- The graph tells you which variables cause which, and which paths connect treatment to outcome.
When you intervene on X, you represent this by deleting all incoming edges to X in the graph. This reflects the fact that the intervention overrides whatever would normally determine X.
Manipulated graphs
A manipulated graph (also called a mutilated graph) is the DAG you get after deleting incoming edges to the intervened variable. Formally, if you perform do(X = x), the manipulated graph is the original graph with all arrows pointing into X removed.
Why this matters: the manipulated graph encodes the independence structure of the post-intervention world. You use it to read off which variables are independent of which under the intervention, and that's what drives identification criteria like the back-door and front-door criteria.
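Graph mutilation is easy to sketch in code. Below, a DAG is represented as a dict mapping each node to its parent set; the graph and variable names are illustrative:

```python
# Minimal sketch: do(X) deletes all of X's incoming edges, i.e. empties
# its parent set, leaving every other node's parents unchanged.
def mutilate(parents, intervened):
    return {node: (set() if node == intervened else set(pa))
            for node, pa in parents.items()}

# Hypothetical confounded graph: U -> X, U -> Y, X -> Y
g = {"U": set(), "X": {"U"}, "Y": {"U", "X"}}

g_do_x = mutilate(g, "X")
print(g_do_x["X"])  # set(): X no longer listens to U
print(g_do_x["Y"])  # Y is still caused by U and X
```

Note that the edge X -> Y survives: the intervention breaks what causes X, not what X causes.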
Identifiability under interventions
Identifiability asks: given the causal graph and observational data, can you compute the interventional distribution P(Y | do(X = x))? If yes, the causal effect is identifiable. If not, you need additional data or stronger assumptions.
Back-door criterion
The back-door criterion gives you a sufficient condition for identifying a causal effect by adjusting for confounders.
A set of variables Z satisfies the back-door criterion relative to (X, Y) if:
- No variable in Z is a descendant of X.
- Z blocks every path between X and Y that has an arrow into X (these are the "back-door paths").
When Z satisfies the criterion, the causal effect is given by the back-door adjustment formula:
P(y | do(x)) = Σ_z P(y | x, z) P(z)
Example: to estimate the causal effect of smoking on cancer, you might adjust for age and socioeconomic status if those are confounders and not descendants of smoking.
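The adjustment formula can be applied directly by stratifying on the confounder. A sketch on simulated data (binary Z confounding binary X and Y; all coefficients hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500_000
z = rng.binomial(1, 0.4, n)
x = rng.binomial(1, np.where(z == 1, 0.7, 0.3))  # confounded assignment
y = rng.binomial(1, 0.2 + 0.3 * x + 0.3 * z)     # true effect of x is +0.3

def backdoor(x_val):
    # P(Y=1 | do(X=x_val)) = sum_z P(Y=1 | X=x_val, Z=z) * P(Z=z)
    total = 0.0
    for z_val in (0, 1):
        p_y = y[(x == x_val) & (z == z_val)].mean()
        total += p_y * (z == z_val).mean()
    return total

ate = backdoor(1) - backdoor(0)
print(ate)  # close to the true +0.3, despite confounded assignment
```

The naive difference y[x == 1].mean() - y[x == 0].mean() would overstate the effect here, because z pushes x and y in the same direction.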

Front-door criterion
The front-door criterion handles cases where there are unmeasured confounders between X and Y, but there exists a mediator M such that:
- M intercepts the entire causal effect of X on Y (all directed paths from X to Y go through M).
- There are no unblocked back-door paths from X to M.
- All back-door paths from M to Y are blocked by X.
When these conditions hold, the causal effect is:
P(y | do(x)) = Σ_m P(m | x) Σ_x' P(y | m, x') P(x')
This is powerful because it lets you identify causal effects even when you can't measure the confounder directly.
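A sketch of the front-door formula on simulated data from the classic X → M → Y model with an unmeasured confounder U of X and Y (all coefficients hypothetical). The estimator never touches u, yet recovers the true effect:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1_000_000
u = rng.binomial(1, 0.5, n)                      # unmeasured confounder
x = rng.binomial(1, np.where(u == 1, 0.8, 0.2))
m = rng.binomial(1, 0.1 + 0.7 * x)               # X -> M only
y = rng.binomial(1, 0.1 + 0.4 * m + 0.4 * u)     # M and U -> Y; no direct X -> Y

def front_door(x_val):
    # P(y | do(x)) = sum_m P(m | x) * sum_x' P(y | m, x') P(x')
    total = 0.0
    p_m1 = m[x == x_val].mean()
    for m_val, p_m in ((1, p_m1), (0, 1 - p_m1)):
        inner = 0.0
        for xp in (0, 1):
            inner += y[(m == m_val) & (x == xp)].mean() * (x == xp).mean()
        total += p_m * inner
    return total

print(front_door(1) - front_door(0))  # close to the true effect of +0.28
```

In this model the true effect is 0.7 (shift in M) times 0.4 (effect of M on Y) = 0.28, and the front-door estimate matches it.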
Instrumental variables
An instrumental variable (IV) Z affects the treatment X but has no direct effect on the outcome Y, and is independent of the unmeasured confounders between X and Y.
- The classic example: using proximity to a college as an instrument for years of education when estimating education's effect on income. Living near a college nudges people toward more education, but doesn't directly affect income through other channels (that's the assumption, at least).
- IVs allow causal effect estimation under unmeasured confounding, but they typically identify a local average treatment effect (LATE) rather than the average treatment effect for the whole population.
- The key assumptions (relevance, exclusion restriction, independence) must hold for IV estimates to be valid, and they're not testable from data alone.
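Under these assumptions, the simplest IV estimator for a binary instrument is the Wald ratio: the effect of the instrument on the outcome divided by its effect on the treatment. A minimal sketch (the linear model and coefficients are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200_000
z = rng.binomial(1, 0.5, n)                   # instrument (e.g., proximity to college)
u = rng.normal(size=n)                        # unmeasured confounder
x = 1.0 * z + u + rng.normal(size=n)          # z shifts treatment; u confounds
y = 2.0 * x + 3.0 * u + rng.normal(size=n)    # true effect of x on y is 2.0

naive = np.cov(x, y)[0, 1] / np.var(x)        # OLS slope, biased upward by u
wald = (y[z == 1].mean() - y[z == 0].mean()) / (x[z == 1].mean() - x[z == 0].mean())
print(naive, wald)  # naive is biased (~3.3 here); wald recovers ~2.0
```

The naive regression absorbs the confounding path through u; the Wald ratio uses only variation in x induced by z, which is clean by assumption.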
Intro to do-calculus
Do-calculus, developed by Judea Pearl, is a complete set of rules for transforming expressions involving the do-operator. Its purpose is to convert interventional quantities into expressions that involve only observational distributions, which you can estimate from data.
Formal definition
Do-calculus consists of three inference rules that let you add, remove, or exchange observations and interventions within probability expressions. Each rule has a graphical condition that must be satisfied in a specific manipulated version of the DAG. Together, these rules are complete: if a causal effect is identifiable from the graph, some sequence of these three rules will derive the identifying formula.
Differences from standard probability
Standard probability gives you tools like Bayes' rule and the chain rule for manipulating conditional distributions. But it has no way to distinguish P(Y | X = x) from P(Y | do(X = x)), because standard probability doesn't encode causal direction.
Do-calculus adds this layer. It operates on expressions that mix observational conditioning and do-operators, and its rules are justified by the causal structure (the DAG), not just by probability axioms. This is what makes it possible to answer causal questions from observational data.
Three rules of do-calculus
Each rule applies when a specific conditional independence holds in a specific manipulated graph. Let X, Y, Z, and W be disjoint sets of variables in a DAG G. Write G_X̄ for G with all incoming edges to X removed, and G_Z̲ for G with all outgoing edges from Z removed:
Rule 1 (Insertion/deletion of observations):
P(y | do(x), z, w) = P(y | do(x), w) if (Y ⊥ Z | X, W) in the graph G_X̄ (the graph with incoming edges to X removed).
This says you can ignore an observed variable Z if it's independent of Y given X and W in the manipulated graph.
Rule 2 (Action/observation exchange):
P(y | do(x), do(z), w) = P(y | do(x), z, w) if (Y ⊥ Z | X, W) in the graph G_X̄Z̲ (incoming edges to X removed, outgoing edges from Z removed).
This lets you replace an intervention on Z with an observation of Z under the right conditions.
Rule 3 (Insertion/deletion of actions):
P(y | do(x), do(z), w) = P(y | do(x), w) if (Y ⊥ Z | X, W) in the graph G_X̄Z̄(W), where Z(W) is the set of nodes in Z that are not ancestors of any node in W in G_X̄.
This lets you drop an intervention on Z entirely when the graphical condition is met.
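The graphical conditions in all three rules are d-separation statements checked in a manipulated graph. A minimal d-separation check via the moralization method (ancestral subgraph, marry co-parents, drop directions, delete conditioned nodes, test connectivity); the example graph is the usual confounded triangle U → X → Y ← U, and all names are illustrative:

```python
from collections import deque

def ancestors(parents, nodes):
    seen, stack = set(nodes), list(nodes)
    while stack:
        for p in parents[stack.pop()]:
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen

def d_separated(parents, xs, ys, zs):
    keep = ancestors(parents, set(xs) | set(ys) | set(zs))
    adj = {v: set() for v in keep}
    for v in keep:  # moralize: undirect edges and marry co-parents
        pas = [p for p in parents[v] if p in keep]
        for p in pas:
            adj[v].add(p)
            adj[p].add(v)
        for i, p in enumerate(pas):
            for q in pas[i + 1:]:
                adj[p].add(q)
                adj[q].add(p)
    blocked = set(zs)  # delete conditioned nodes, then test reachability
    frontier = deque(x for x in xs if x not in blocked)
    reached = set(frontier)
    while frontier:
        for nb in adj[frontier.popleft()]:
            if nb not in reached and nb not in blocked:
                reached.add(nb)
                frontier.append(nb)
    return not (reached & set(ys))

# Rule 2 for P(y | do(x)) = P(y | x) asks whether Y ⊥ X in the graph with
# X's outgoing edges removed. For U -> X -> Y <- U that graph is:
g_no_out_x = {"U": set(), "X": {"U"}, "Y": {"U"}}
print(d_separated(g_no_out_x, {"X"}, {"Y"}, set()))   # False: back-door via U is open
print(d_separated(g_no_out_x, {"X"}, {"Y"}, {"U"}))   # True once U is conditioned on
```

So Rule 2 licenses P(y | do(x), u) = P(y | x, u) but not P(y | do(x)) = P(y | x), which is exactly the back-door story for this graph.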
Applying do-calculus
The practical use of do-calculus is to take a target causal quantity like P(y | do(x)) and, through repeated application of the three rules, reduce it to an expression involving only observational probabilities.
Identifying causal effects
The typical workflow:
- Write down the target interventional distribution, e.g., P(y | do(x)).
- Examine the DAG to determine which rules apply.
- Apply rules to progressively eliminate do-operators.
- If you can reduce the expression to one with no do-operators, the effect is identified, and you have a formula you can estimate from data.
Both the back-door and front-door adjustment formulas can be derived as special cases of do-calculus. The power of do-calculus is that it also handles cases where neither criterion directly applies.
Proving non-identifiability
Sometimes no sequence of do-calculus rules can eliminate all do-operators. In that case, the causal effect is not identifiable from observational data alone given the assumed graph.
- This is a genuine impossibility result, not just a failure to find the right trick. Completeness of do-calculus guarantees that if the rules can't get you there, no other method based on the same assumptions can either.
- Non-identifiability tells you that you need either additional data (e.g., from experiments or new measured variables) or stronger assumptions to estimate the effect.

Worked examples
A concrete example helps clarify the process. Consider the classic front-door model:
- U is an unmeasured confounder affecting both X and Y.
- X → M → Y, with U → X and U → Y.
To identify P(y | do(x)):
- Apply Rule 2 to get P(m | do(x)) = P(m | x), since there is no back-door path from X to M (U does not affect M).
- Apply Rules 2 and 3 to get P(y | do(m)) = Σ_x' P(y | m, x') P(x'), adjusting for X to block the back-door path M ← X ← U → Y.
- Combine the pieces to arrive at the front-door formula: P(y | do(x)) = Σ_m P(m | x) Σ_x' P(y | m, x') P(x').
Working through examples like this is the best way to build fluency with do-calculus. Try deriving the back-door adjustment formula from the three rules as practice.
Interventions with SCMs
Structural Causal Models (SCMs) give you a complete formal package: a DAG for the qualitative causal structure, plus a set of structural equations that specify the quantitative relationships.
Modeling interventions
In an SCM, each variable X_i has a structural equation:
X_i := f_i(PA_i, U_i)
where PA_i are the parents of X_i in the DAG and U_i is an exogenous noise term.
To model an intervention do(X_j = x), you replace the equation for X_j with the constant assignment X_j := x and leave all other equations unchanged. This is the formal counterpart of deleting incoming edges in the graph.
Interventions can also be stochastic: instead of setting X_j to a fixed value, you draw it from some specified distribution, replacing the original equation with a new random mechanism.
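A minimal sketch of an SCM as a dict of structural equations evaluated in topological order; do(X = 1) simply swaps in a constant equation. The linear model and its coefficients are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(4)

def sample(equations, n):
    # Evaluate equations in dict order (assumed topological; dicts preserve order).
    data = {}
    for var, f in equations.items():
        data[var] = f(data, rng.normal(size=n))
    return data

scm = {
    "U": lambda d, noise: noise,
    "X": lambda d, noise: d["U"] + noise,
    "Y": lambda d, noise: 2.0 * d["X"] + 3.0 * d["U"] + noise,
}
# do(X = 1): replace X's equation with a constant, keep everything else.
do_x1 = dict(scm, X=lambda d, noise: np.full_like(noise, 1.0))

obs = sample(scm, 100_000)
intv = sample(do_x1, 100_000)
print(intv["Y"].mean())  # close to E[Y | do(X=1)] = 2.0 in this model
```

Under the intervention, Y's mean is 2.0 (the direct effect only), whereas observationally X and Y are also linked through U.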
Truncated factorization
In an SCM without interventions, the joint distribution factors as:
P(x_1, …, x_n) = Π_i P(x_i | pa_i)
Under an intervention do(X_j = x), the factor P(x_j | pa_j) is removed (since X_j is now fixed at x), giving the truncated factorization:
P(x_1, …, x_n | do(X_j = x)) = Π_{i ≠ j} P(x_i | pa_i) for assignments consistent with X_j = x, and 0 otherwise.
This formula is the algebraic expression of graph mutilation. It directly connects the graphical operation (delete incoming edges) to a probability computation.
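The truncated factorization can be computed directly for a small discrete chain Z → X → Y, where the joint is P(z) P(x | z) P(y | x). All probability tables below are hypothetical:

```python
# Conditional probability tables for the chain Z -> X -> Y (hypothetical numbers).
p_z = {0: 0.6, 1: 0.4}
p_x_given_z = {(0, 0): 0.8, (1, 0): 0.2, (0, 1): 0.3, (1, 1): 0.7}  # key: (x, z)
p_y_given_x = {(0, 0): 0.9, (1, 0): 0.1, (0, 1): 0.4, (1, 1): 0.6}  # key: (y, x)

def joint(z, x, y):
    return p_z[z] * p_x_given_z[(x, z)] * p_y_given_x[(y, x)]

def truncated(z, x, y, do_x):
    # Drop the P(x | z) factor; all mass sits on assignments with x == do_x.
    return p_z[z] * p_y_given_x[(y, x)] if x == do_x else 0.0

# P(Y = 1 | do(X = 1)): sum the truncated factorization over z.
p_y1_do_x1 = sum(truncated(z, 1, 1, do_x=1) for z in (0, 1))
print(p_y1_do_x1)  # 0.6 = P(y=1 | x=1): in a chain, X already screens off Z
```

In this chain there is no confounding, so the interventional and observational quantities coincide; rerunning the same computation on a graph with an edge Z → Y would make them differ.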
Causal effect estimation
With the truncated factorization in hand, you can compute any interventional quantity by summing or integrating over the non-intervened variables.
Common estimands:
- Average Treatment Effect (ATE): E[Y | do(X = 1)] − E[Y | do(X = 0)].
- Effect of Treatment on the Treated (ETT): E[Y_1 − Y_0 | X = 1], the average effect among units that actually received treatment.
SCMs also support counterfactual reasoning (e.g., "What would have happened to this specific individual under a different treatment?"), which goes beyond what do-calculus alone can handle. Counterfactuals require the full SCM, including the noise terms U_i, not just the graph.
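Counterfactual computation follows the abduction-action-prediction recipe: recover the individual's noise terms from the observation, apply the intervention, and recompute. A minimal sketch in a two-variable linear SCM (the model X := U_x, Y := 2X + U_y is hypothetical):

```python
# One observed individual from the hypothetical SCM  X := U_x,  Y := 2 X + U_y.
x_obs, y_obs = 1.0, 5.0

# Abduction: invert the structural equations to recover this unit's noise.
u_x = x_obs                 # from X := U_x
u_y = y_obs - 2.0 * x_obs   # from Y := 2 X + U_y  ->  U_y = 3.0

# Action + prediction: set do(X = 0) and recompute Y with the SAME noise.
y_cf = 2.0 * 0.0 + u_y
print(y_cf)  # 3.0: what Y would have been for this individual under do(X = 0)
```

Note the contrast with P(Y | do(X = 0)), which averages over the population's noise; the counterfactual keeps this individual's u_y fixed.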
Practical considerations
The formal machinery of do-calculus and SCMs is powerful, but applying it in practice requires careful attention to real-world constraints.
Feasibility of interventions
Many interventions that are well-defined mathematically are impossible or impractical to carry out:
- You can write do(gender = female), but you can't actually randomize gender.
- Cost, logistics, and time constraints may rule out large-scale experiments.
- When direct intervention isn't feasible, researchers turn to natural experiments (where some external event approximates random assignment) or quasi-experimental designs (difference-in-differences, regression discontinuity).
Ethical implications
Even when an intervention is feasible, it may not be ethical:
- Withholding a potentially beneficial treatment from a control group raises concerns.
- Interventions on vulnerable populations require extra scrutiny.
- Institutional review boards (IRBs) enforce standards around informed consent, minimizing harm, and equitable selection of subjects.
These constraints are a major reason why observational causal inference methods exist in the first place.
Limitations of do-calculus
Do-calculus is complete for the class of problems it addresses, but it rests on assumptions that may not hold:
- Causal Markov condition: Each variable is independent of its non-descendants given its parents. Violations can occur with feedback loops or aggregated variables.
- Faithfulness: All observed independencies are consequences of the graph structure, not coincidental cancellations. This can fail in finely tuned systems.
- Correct graph specification: Do-calculus takes the DAG as given. If your graph is wrong (missing edges, wrong directions), your conclusions will be wrong too.
When you're uncertain about these assumptions, complement do-calculus with sensitivity analysis (how much would an unmeasured confounder need to shift results?) and partial identification (deriving bounds on the causal effect rather than a point estimate).