Causation

In AP Statistics, causation is a relationship where changing one variable directly produces a change in another. A strong correlation (even r near 1 or -1) never proves causation on its own; only a well-designed randomized experiment lets you make a cause-and-effect claim.

Verified for the 2027 AP Statistics exam. Last updated June 2026.

What is Causation?

Causation means one variable actually makes another variable change. If you increase the dose of a drug and blood pressure drops because of the drug, that's causation. Compare that to correlation, which only says two variables move together. Tall parents tend to have tall kids, ice cream sales rise with drowning deaths, and study time tracks with anxiety levels. Some of those pairs involve a real causal link and some don't, and the correlation coefficient cannot tell you which is which.

The CED makes this explicit in Topic 2.5 (learning objective AP Stats 2.5.B): a perceived or real relationship between two variables does not mean that changes in one variable cause changes in the other. The usual culprit when correlation looks like causation is a confounding variable, a third variable lurking in the background that drives both measured variables. Hot weather, for example, drives up both ice cream sales and drownings. The only way to rule out confounders is random assignment in an experiment, which is why the phrase "correlation does not imply causation" is one of the most-tested ideas in the whole course.
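The ice cream example can be sketched as a small simulation. This is an illustration under made-up assumptions, not real data: the temperature range, the coefficients, and the noise levels are all hypothetical. Temperature feeds into both outcomes, and neither outcome appears in the other's formula, yet r between them comes out strongly positive. Restricting to days with nearly the same temperature holds the confounder roughly fixed, and most of the correlation disappears.

```python
import math
import random

def pearson_r(xs, ys):
    """Sample correlation coefficient r for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def simulate_days(days=2000, seed=1):
    """Hypothetical data: temperature drives both outcomes independently."""
    rng = random.Random(seed)
    temps, ice_cream, drownings = [], [], []
    for _ in range(days):
        temp = rng.uniform(0, 35)  # made-up daily high, degrees C
        temps.append(temp)
        # Each outcome = effect of temperature + its own independent noise.
        # Ice cream sales never enter the drownings formula, or vice versa.
        ice_cream.append(50 + 10 * temp + rng.gauss(0, 40))
        drownings.append(1 + 0.2 * temp + rng.gauss(0, 1))
    return temps, ice_cream, drownings

temps, ice_cream, drownings = simulate_days()
r_all = pearson_r(ice_cream, drownings)

# Hold the confounder roughly fixed: keep only days between 20 and 22 degrees.
narrow = [(i, d) for t, i, d in zip(temps, ice_cream, drownings) if 20 <= t <= 22]
r_narrow = pearson_r([i for i, _ in narrow], [d for _, d in narrow])

print(f"r across all days:          {r_all:.2f}")
print(f"r on 20-22 degree days only: {r_narrow:.2f}")
```

In a real observational study you usually cannot measure every possible confounder, so this "hold it fixed" trick is not a general substitute for random assignment.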

Why Causation matters in AP Statistics

Causation lives at the intersection of Unit 2 (Exploring Two-Variable Data) and the design ideas that run through the rest of the course. In Topic 2.5, learning objectives AP Stats 2.5.A and AP Stats 2.5.B have you calculate and interpret r, and the essential knowledge directly states that correlation does not necessarily imply causation. In Topic 2.2 (AP Stats 2.2.A), two-way tables and segmented bar graphs can show that two categorical variables are associated, but association is the categorical-variable cousin of correlation, and it carries the same warning label. By Unit 9 (Topic 9.1, AP Stats 9.1.A), you're doing formal inference on the slope of a regression line, and even a statistically significant slope only gives convincing evidence of a linear association, not evidence that x causes y. Causation is the boundary line for every conclusion you write on this exam. Cross it without a randomized experiment and you lose points.

How Causation connects across the course

Correlation (Unit 2)

Correlation is the thing that gets mistaken for causation. The coefficient r measures direction and strength of a linear association, and that's all. An r of -0.85 between pirate counts and global temperatures is a perfect example of a strong correlation with zero causal meaning.

Confounding Variable (Units 2-3)

A confounding variable is the usual reason correlation isn't causation. It's a hidden third variable that influences both of the variables you measured, creating a relationship that looks causal but isn't. Whenever an AP question asks why you can't conclude causation, "a confounding variable could explain the association" is almost always the answer.

Experimental Design (Unit 3)

Random assignment is the only ticket to a causal claim. Randomly assigning treatments tends to balance potential confounders across groups, so a difference in outcomes larger than chance alone would explain can be attributed to the treatment itself. Observational studies, no matter how large, only support claims of association.
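A minimal sketch of why the design matters, using invented numbers. Here a hypothetical confounder called `motivation` raises the outcome, and the treatment itself does nothing at all. When subjects choose their own group (observational), motivated subjects cluster in the treatment group and a sizable gap in outcomes appears. When a coin flip decides (experiment), motivation is roughly balanced and the gap shrinks toward zero. The function name, sample size, and coefficients are assumptions made for illustration.

```python
import random

def mean(values):
    return sum(values) / len(values)

def run_study(randomize, n=4000, seed=7):
    """Simulate a study in which the treatment has NO real effect.

    Returns (outcome_gap, motivation_gap): treatment-group mean minus
    control-group mean for the outcome and for the confounder.
    """
    rng = random.Random(seed)
    outcomes = {True: [], False: []}
    motivations = {True: [], False: []}
    for _ in range(n):
        motivation = rng.gauss(0, 1)  # hypothetical confounder
        if randomize:
            # Experiment: a coin flip decides who gets the treatment.
            treated = rng.random() < 0.5
        else:
            # Observational study: motivated subjects tend to opt in.
            treated = motivation + rng.gauss(0, 1) > 0
        # The outcome depends on motivation only, never on `treated`.
        outcome = 70 + 5 * motivation + rng.gauss(0, 5)
        outcomes[treated].append(outcome)
        motivations[treated].append(motivation)
    outcome_gap = mean(outcomes[True]) - mean(outcomes[False])
    motivation_gap = mean(motivations[True]) - mean(motivations[False])
    return outcome_gap, motivation_gap

obs_outcome_gap, obs_motivation_gap = run_study(randomize=False)
exp_outcome_gap, exp_motivation_gap = run_study(randomize=True)

print(f"Observational: outcome gap {obs_outcome_gap:.2f}, "
      f"motivation gap {obs_motivation_gap:.2f}")
print(f"Randomized:    outcome gap {exp_outcome_gap:.2f}, "
      f"motivation gap {exp_motivation_gap:.2f}")
```

The observational gap is entirely the confounder's doing, which is exactly the situation in which writing "the treatment causes higher outcomes" would be wrong.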

Inference for Slopes (Unit 9)

In Topic 9.1 you test whether the population slope of a regression line is nonzero, or whether the slope you saw in the sample could plausibly be chance variation. Rejecting the null gives convincing evidence that the linear association is real, not a fluke. It still doesn't tell you x causes y unless the data came from a randomized experiment.
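As a rough sketch of that point, the code below computes the usual slope test statistic, t = b / SE_b with n - 2 degrees of freedom, on simulated observational data. The data are invented: a hypothetical hidden variable `z` feeds both x and y, and y never depends on x. The t statistic still lands far beyond the approximate two-sided 5% critical value of about 2.01 for 48 degrees of freedom, so the test flags a real association even though there is no causal link from x to y.

```python
import math
import random

def slope_test(xs, ys):
    """Return (slope b, standard error of b, t statistic, correlation r)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                 # least-squares slope
    a = mean_y - b * mean_x       # least-squares intercept
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))  # standard deviation of the residuals
    se_b = s / math.sqrt(sxx)     # standard error of the slope
    r = sxy / math.sqrt(sxx * syy)
    return b, se_b, b / se_b, r

# Hypothetical observational data: z is a hidden confounder.
rng = random.Random(3)
n = 50
z = [rng.gauss(0, 1) for _ in range(n)]
xs = [zi + rng.gauss(0, 0.5) for zi in z]  # x depends on z only
ys = [zi + rng.gauss(0, 0.5) for zi in z]  # y depends on z only, not on x

b, se_b, t, r = slope_test(xs, ys)
print(f"slope b = {b:.3f}, SE = {se_b:.3f}, t = {t:.2f} (df = {n - 2})")
print(f"r = {r:.3f}")
```

The same t value can also be obtained from r alone as r * sqrt(n - 2) / sqrt(1 - r^2), which is another reminder that the slope test measures the strength of the pattern and knows nothing about how the data were collected.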

Is Causation on the AP Statistics exam?

Causation shows up most often as the wrong answer. A classic multiple-choice stem gives you a strong correlation, like r = -0.85 between study time and exam anxiety, then offers an interpretation claiming one variable causes the other. The correct choice is always the one that sticks to association and refuses the causal leap. Some questions flip it around and ask which statistical principle a causal claim violates (answer: correlation does not imply causation). On FRQs, the danger is in your own writing. If you describe an observational study's results and write "x causes y" or "x increases y," you can lose credit even if everything else is right. Safe phrasing sounds like "there is a strong negative linear association between x and y." Only say "cause" when the problem explicitly describes random assignment to treatments.

Causation vs Correlation

Correlation says two quantitative variables move together in a linear pattern; causation says one variable actually produces the change in the other. Correlation is a number you compute from data (r, between -1 and 1). Causation is a conclusion you can only earn through study design, specifically random assignment in an experiment. Even r = -1, a perfect linear association, does not by itself prove cause and effect. The shortcut to remember: r measures the pattern, the design earns the causal claim.
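For reference, the standard formula for the sample correlation coefficient is shown below, where x̄ and ȳ are the sample means and s_x and s_y are the sample standard deviations. Every term is computed from the observed data values. Nothing in the formula refers to how the data were collected, which is why r alone cannot distinguish an experiment from an observational study.

```latex
r = \frac{1}{n-1} \sum_{i=1}^{n}
    \left( \frac{x_i - \bar{x}}{s_x} \right)
    \left( \frac{y_i - \bar{y}}{s_y} \right)
```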

Key things to remember about Causation

  • Causation means one variable directly produces a change in another, while correlation only means two variables move together.

  • The CED states it directly in Topic 2.5: a perceived or real relationship between two variables does not mean changes in one cause changes in the other.

  • Even a correlation of exactly 1 or -1 shows only a perfect linear association, not a cause-and-effect relationship.

  • Confounding variables are the typical reason a correlation exists without causation, because a hidden third variable can drive both measured variables.

  • You can only claim causation when treatments were randomly assigned in an experiment, never from an observational study.

  • On FRQs, write "association" instead of "causes" unless the study described uses random assignment, or you risk losing points.

Frequently asked questions about Causation

What is causation in AP Stats?

Causation is a relationship where changing one variable directly produces a change in another. In AP Stats it appears mainly in Topic 2.5, where the CED states that correlation does not necessarily imply causation.

Does a strong correlation like r = 0.9 prove causation?

No. The correlation coefficient r only measures the direction and strength of a linear association between two quantitative variables. Even r = 1 or r = -1 doesn't prove cause and effect, because a confounding variable could be driving both.

What's the difference between correlation and causation?

Correlation is a computed number (always between -1 and 1) describing how two variables move together linearly. Causation is a conclusion that one variable produces change in the other, and it can only be justified by a randomized experiment, not by the strength of r.

When can you actually conclude causation in statistics?

Only when the data come from a well-designed experiment with random assignment of treatments. Random assignment tends to balance potential confounding variables across groups, so a difference in outcomes can be attributed to the treatment. Observational studies only support association.

How do I avoid losing points on FRQs about causation?

Check the study design first. If it's observational, describe a "strong negative linear association" or similar, and never write that one variable causes, increases, or reduces the other. Causal language is only safe when the problem explicitly says subjects were randomly assigned to treatments.