🤒Intro to Epidemiology Unit 5 Review

5.2 Causal inference and Hill's criteria

Written by the Fiveable Content Team • Last updated August 2025

Causality in Epidemiology

Figuring out whether an exposure actually causes a disease is one of the hardest problems in epidemiology. It's also one of the most important, because public health interventions only work if you're targeting a true cause. This section covers how epidemiologists think about causation, the criteria they use to evaluate it, and why it's so difficult to prove with observational data alone.

Concept of Causality in Epidemiology

A causal relationship exists when one factor directly influences the occurrence of another. Establishing causality is the foundation for disease prevention: if smoking causes lung cancer, then reducing smoking rates should reduce lung cancer rates. Without causal evidence, public health programs are just guessing at which risk factors to target.

Three core components underpin causal thinking:

  • Temporal sequence: The exposure must come before the disease. This sounds obvious, but it's surprisingly hard to confirm in certain study designs.
  • Strength of association: Measured by relative risk or odds ratio. A stronger association (e.g., RR of 10 for smoking and lung cancer) makes a causal explanation more plausible than a weak one.
  • Dose-response relationship: As exposure increases, risk increases. For example, heavier smokers have higher lung cancer rates than lighter smokers.
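The association measures above come straight from a 2×2 table. A minimal sketch in Python, using hypothetical counts (the function names and numbers are illustrative, not from any real study):

```python
def relative_risk(a, b, c, d):
    """Relative risk from a 2x2 table:
    a = exposed cases,   b = exposed non-cases,
    c = unexposed cases, d = unexposed non-cases."""
    return (a / (a + b)) / (c / (c + d))

def odds_ratio(a, b, c, d):
    """Odds ratio from the same table (cross-product ratio)."""
    return (a * d) / (b * c)

# Hypothetical cohort: 1,000 exposed (90 cases), 1,000 unexposed (10 cases)
rr = relative_risk(90, 910, 10, 990)   # 9% vs 1% risk, so RR is about 9
or_ = odds_ratio(90, 910, 10, 990)     # somewhat larger than the RR here
```

Which measure you can compute depends on the design: the RR needs incidence, so it comes from cohort studies, while case-control studies yield only the OR (which approximates the RR when the disease is rare).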

Establishing causality gets complicated for several reasons. Most diseases are multifactorial, meaning many exposures contribute (cardiovascular disease involves diet, genetics, stress, smoking, and more). Confounding factors like socioeconomic status can create the illusion of a causal link where none exists. And sometimes there's a long time lag between exposure and outcome, as with asbestos exposure and mesothelioma, which can take 20–50 years to develop.

Hill's Criteria for Causal Relationships

In 1965, Sir Austin Bradford Hill proposed nine criteria to help evaluate whether an observed association is likely causal. These aren't a rigid checklist where every box must be checked. Instead, they're a framework: the more criteria that are satisfied, the stronger the case for causality.

  1. Strength of association — A large relative risk or odds ratio makes confounding a less likely explanation. The association between smoking and lung cancer (RR ≈ 10–20 in heavy smokers) is a classic example. Generally, an RR greater than 2 is considered strong, though even smaller values can reflect real causal effects.

  2. Consistency — The same finding shows up across different populations, time periods, and study designs. The association between smoking and lung cancer has been replicated in dozens of studies worldwide, which makes a causal interpretation much more convincing.

  3. Specificity — One cause leads to one specific effect. This is actually the weakest of Hill's criteria, because most exposures cause multiple outcomes and most diseases have multiple causes. The link between HPV and cervical cancer is one of the closer examples, but even HPV causes other cancers too.

  4. Temporality — The exposure must precede the outcome. This is the only criterion considered absolutely necessary. If you can't show that lead exposure came before cognitive deficits, you can't argue lead caused them.

  5. Biological gradient (dose-response) — More exposure leads to more disease. Higher alcohol consumption is associated with greater risk of liver cirrhosis. A clear dose-response curve strengthens the causal argument considerably.

  6. Plausibility — There's a biological mechanism that explains how the exposure could cause the disease. UV radiation damaging DNA in skin cells, leading to skin cancer, is biologically plausible. However, plausibility depends on current scientific knowledge, which changes over time.

  7. Coherence — The causal interpretation doesn't conflict with what's already known about the disease's natural history and biology. The link between air pollution and respiratory diseases is coherent with what we know about how particulate matter damages lung tissue.

  8. Experiment — Experimental evidence (like an intervention that removes the exposure and reduces disease) provides strong support. Dietary salt reduction trials that lower blood pressure are an example.

  9. Analogy — If a similar exposure is known to cause a similar effect, that supports the new causal claim. When thalidomide was found to cause birth defects, it became easier to suspect other drugs might do the same.

The key takeaway: no single criterion proves causation, and failing to meet one criterion doesn't disprove it. Temporality is the only truly essential requirement. The rest add weight to the argument.
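Of the nine criteria, the biological gradient lends itself to a quick numeric check: compute the risk at each exposure level relative to the unexposed and see whether it rises monotonically. A sketch with made-up stratified rates (all counts and labels hypothetical):

```python
# Hypothetical (cases, person-years) at three smoking levels
strata = {"never": (10, 10_000), "light": (30, 10_000), "heavy": (80, 10_000)}

baseline = strata["never"][0] / strata["never"][1]
rr_by_level = {level: (cases / py) / baseline
               for level, (cases, py) in strata.items()}

levels = ["never", "light", "heavy"]
trend = [rr_by_level[k] for k in levels]
# A monotone rise in RR across exposure levels (1 -> 3 -> 8 here)
# is the dose-response pattern Hill's fifth criterion looks for
```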

Limitations of Observational Studies vs. RCTs

Most epidemiologic evidence comes from observational studies, which have inherent limitations when it comes to proving causation.

Observational study limitations:

  1. Confounding — Unmeasured or uncontrolled factors (age, sex, socioeconomic status) can distort the apparent relationship between exposure and outcome.
  2. Selection bias — The study sample may not represent the target population. Volunteer bias is a common example: people who enroll in studies tend to be healthier than those who don't.
  3. Information bias — Data may be inaccurate. In case-control studies, recall bias occurs when people with disease remember exposures differently than healthy controls.
  4. Unknown confounders — You can only adjust for factors you know about and have measured. There may be important variables you haven't considered.
  5. Temporal ambiguity — Cross-sectional studies measure exposure and outcome at the same time, making it impossible to determine which came first.
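Confounding (point 1 above) is easiest to see in a toy stratified analysis. In the hypothetical numbers below, exposure has no effect within either age stratum, yet pooling the strata produces an apparent RR well above 1 because the older group is both more exposed and at higher baseline risk:

```python
def rr(a, n1, c, n0):
    """Risk ratio from cases/denominators in exposed and unexposed."""
    return (a / n1) / (c / n0)

# Hypothetical age strata: (exposed cases, exposed n, unexposed cases, unexposed n)
young = (10, 1000, 40, 4000)    # 1% risk in both exposure groups
old   = (200, 4000, 50, 1000)   # 5% risk in both exposure groups

stratum_rrs = [rr(*young), rr(*old)]   # both 1.0: no effect within strata

# Pooling ignores age, so the crude RR is spuriously elevated
a, n1, c, n0 = (y + o for y, o in zip(young, old))
crude_rr = rr(a, n1, c, n0)            # about 2.3 despite a null true effect
```

Stratified (or regression-adjusted) estimates recover the null; the crude estimate alone would wrongly suggest a causal effect.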

RCT advantages:

  1. Random allocation balances both known and unknown confounders across groups, which is something no observational study can fully achieve.
  2. Controlled exposure allows researchers to directly assess the causal effect of a specific intervention.
  3. Prospective design guarantees that exposure assignment precedes the outcome, satisfying temporality.
  4. Blinding and standardized protocols minimize selection and information bias.
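Point 1 above, balance of unknown confounders, can be demonstrated with a small simulation. The confounder prevalence, sample size, and seed below are arbitrary choices for illustration:

```python
import random

random.seed(42)  # fixed seed so the sketch is reproducible

n = 10_000
# An unmeasured binary confounder (say, a risk allele with 30% prevalence)
confounder = [random.random() < 0.3 for _ in range(n)]
# Coin-flip allocation to treatment vs control, ignoring the confounder
treated = [random.random() < 0.5 for _ in range(n)]

prev_t = sum(c for c, t in zip(confounder, treated) if t) / sum(treated)
prev_c = sum(c for c, t in zip(confounder, treated) if not t) / (n - sum(treated))
# Randomization balances even the confounder nobody measured:
# prev_t and prev_c both land near 0.30
```

No observational design can offer this guarantee, because matching and adjustment only work for confounders you have identified and measured.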

That said, RCTs aren't always possible. You can't randomly assign people to smoke for 30 years or drink contaminated water. When the exposure is potentially harmful, ethical constraints make observational studies the only option, and Hill's criteria become especially important for evaluating the evidence.

Counterfactuals in Causal Inference

The counterfactual framework is the theoretical foundation for modern causal inference. The core idea: to know whether an exposure caused an outcome in a person, you'd need to compare what actually happened to what would have happened if that person hadn't been exposed. That second scenario is the counterfactual.

For example, if a patient took a drug and recovered, the counterfactual question is: would they have recovered without the drug? The causal effect is the difference between the factual outcome and the counterfactual outcome.

The fundamental problem is that you can never directly observe the counterfactual. A person either received the treatment or didn't. You can't rewind time and try the other option. This is why epidemiologists rely on group comparisons and statistical methods to estimate what the counterfactual outcome would have been.
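One way to internalize the fundamental problem is a simulation in which, unlike reality, both potential outcomes exist for every person. All probabilities below are invented for illustration:

```python
import random

random.seed(0)  # reproducible illustration

# Simulate BOTH potential outcomes per person -- possible only in a simulation
people = []
for _ in range(1000):
    y_untreated = random.random() < 0.30   # outcome had they NOT taken the drug
    y_treated   = random.random() < 0.15   # outcome had they taken the drug
    took_drug   = random.random() < 0.5    # what actually happened
    people.append((y_untreated, y_treated, took_drug))

n = len(people)
# True average causal effect uses both columns (unobservable in real data)
true_effect = (sum(y1 for y0, y1, t in people) / n
               - sum(y0 for y0, y1, t in people) / n)

# Real data reveal only one outcome per person; with random treatment,
# the group contrast approximates the missing counterfactual
risk_t = sum(y1 for y0, y1, t in people if t) / sum(t for *_, t in people)
risk_c = sum(y0 for y0, y1, t in people if not t) / sum(not t for *_, t in people)
observed_diff = risk_t - risk_c
```

Because treatment was assigned at random, `observed_diff` lands close to `true_effect`; with non-random assignment, the two can diverge badly, which is exactly the confounding problem.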

Several methods attempt to approximate counterfactual comparisons:

  • Propensity score matching — Pairs exposed and unexposed individuals who are similar on measured characteristics, creating groups that mimic what randomization would produce.
  • Instrumental variable analysis — Uses an external factor (the "instrument") that affects exposure but doesn't directly affect the outcome, effectively mimicking random assignment.
  • Difference-in-differences — Compares the change in outcomes over time between a group that became exposed and a group that didn't, controlling for pre-existing trends.
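Of the three, difference-in-differences is the simplest to show numerically: subtract the control group's change from the treated group's change, so any shared background trend cancels. The outcome values below are invented for illustration:

```python
# Hypothetical mean disease rate (per 1,000) before and after a policy
treated_before, treated_after = 50.0, 35.0   # group exposed to the policy
control_before, control_after = 48.0, 44.0   # comparison group, never exposed

did = (treated_after - treated_before) - (control_after - control_before)
# (-15) - (-4) = -11: the estimated policy effect, net of the shared trend
```

The key identifying assumption, which cannot be verified directly, is "parallel trends": absent the policy, the treated group's outcome would have moved like the control group's.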

Each of these methods relies on assumptions that can't be fully verified, which is why understanding their limitations matters as much as understanding how they work.