In AP Statistics, causation is a relationship where changing one variable directly produces a change in another. A strong correlation (even r near 1 or -1) never proves causation on its own; only a well-designed randomized experiment lets you make a cause-and-effect claim.
Causation means one variable actually makes another variable change. If you increase the dose of a drug and blood pressure drops because of the drug, that's causation. Compare that to correlation, which just says two variables move together. Tall parents tend to have tall kids, ice cream sales rise with drowning deaths, and study time tracks with anxiety levels. Some of those pairs involve a real causal link and some don't, and the correlation coefficient cannot tell you which is which.
The CED makes this explicit in Topic 2.5 (learning objective AP Stats 2.5.B): a perceived or real relationship between two variables does not mean that changes in one variable cause changes in the other. The usual culprit when correlation looks like causation is a confounding variable, a third variable lurking in the background that drives both. Hot weather, for example, drives up both ice cream sales and drownings. Random assignment in an experiment is the only reliable way to rule out confounders, which is why "correlation does not imply causation" is one of the most-tested ideas in the whole course.
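To see how a confounder can manufacture a correlation, here is a minimal simulation sketch. Everything in it is an assumption made up for illustration: the temperature range, the coefficients, the noise levels, and the names pearson_r, r_all, and r_band. Temperature drives both variables, and neither one affects the other.

```python
import random


def pearson_r(xs, ys):
    """Sample correlation: sum of z-score products divided by n - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sd_x = (sum((x - mean_x) ** 2 for x in xs) / (n - 1)) ** 0.5
    sd_y = (sum((y - mean_y) ** 2 for y in ys) / (n - 1)) ** 0.5
    return sum(
        (x - mean_x) / sd_x * (y - mean_y) / sd_y for x, y in zip(xs, ys)
    ) / (n - 1)


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical model: temperature is the confounder behind both variables.
temps = [rng.uniform(10, 35) for _ in range(5000)]
ice_cream = [20 * t + rng.gauss(0, 50) for t in temps]  # sales depend on temp only
drownings = [0.2 * t + rng.gauss(0, 1) for t in temps]  # drownings depend on temp only

r_all = pearson_r(ice_cream, drownings)

# Hold the confounder roughly constant: keep only days between 24 and 26 degrees.
band = [(i, d) for t, i, d in zip(temps, ice_cream, drownings) if 24 <= t <= 26]
r_band = pearson_r([i for i, _ in band], [d for _, d in band])

print(f"r across all days: {r_all:.2f}")
print(f"r on days with similar temperature: {r_band:.2f}")
```

Across all days the correlation should come out strongly positive, even though the simulation contains no link from ice cream to drownings. Once temperature is held nearly constant, the correlation should shrink toward zero, which is the signature of a confounded relationship.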
Causation lives at the intersection of Unit 2 (Exploring Two-Variable Data) and the design ideas that run through the rest of the course. In Topic 2.5, learning objectives AP Stats 2.5.A and AP Stats 2.5.B have you calculate and interpret r, and the essential knowledge directly states that correlation does not necessarily imply causation. In Topic 2.2 (AP Stats 2.2.A), two-way tables and segmented bar graphs can show that two categorical variables are associated, but association is the categorical-variable cousin of correlation, and it carries the same warning label. By Unit 9 (Topic 9.1, AP Stats 9.1.A), you're doing formal inference on the slope of a regression line, and even a statistically significant slope only gives convincing evidence that a linear association exists, not that x causes y. Causation is the boundary line for every conclusion you write on this exam. Cross it without a randomized experiment and you lose points.
Correlation (Unit 2)
Correlation is the thing that gets mistaken for causation. The coefficient r measures the direction and strength of a linear association, and that's all. An r of -0.85 between pirate counts and global temperatures is a classic example of a strong correlation with zero causal meaning.
Confounding Variable (Units 2-3)
A confounding variable is the usual reason correlation isn't causation. It's a hidden third variable that influences both of the variables you measured, creating a relationship that looks causal but isn't. Whenever an AP question asks why you can't conclude causation, "a confounding variable could explain the association" is almost always the answer.
Experimental Design (Unit 3)
Random assignment is the only ticket to a causal claim. Randomly assigning treatments balances out potential confounders across groups, so a difference in outcomes can be attributed to the treatment itself. Observational studies, no matter how large, only support claims of association.
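The contrast between the two study types can be sketched with a small simulation. This is an illustration under invented assumptions, not a real study: the hidden trait (called fitness here), the effect sizes, and the names obs_diff and exp_diff are all hypothetical. The treatment is built to have no effect at all, so any difference between groups comes from who ended up in them.

```python
import random

rng = random.Random(1)  # fixed seed so the run is repeatable
n = 4000

# Hypothetical hidden trait (say, baseline fitness) that drives the outcome.
fitness = [rng.gauss(0, 1) for _ in range(n)]
# The treatment has NO effect: the outcome depends only on fitness plus noise.
outcome = [5 * f + rng.gauss(0, 1) for f in fitness]


def mean(values):
    return sum(values) / len(values)


def diff_in_means(treated_flags):
    """Mean outcome of the treated group minus mean outcome of the control group."""
    treated = [y for y, flag in zip(outcome, treated_flags) if flag]
    control = [y for y, flag in zip(outcome, treated_flags) if not flag]
    return mean(treated) - mean(control)


# Observational study: fitter people are more likely to choose the treatment.
self_selected = [f + rng.gauss(0, 1) > 0 for f in fitness]

# Experiment: shuffle the subjects and treat the first half.
order = list(range(n))
rng.shuffle(order)
treated_ids = set(order[: n // 2])
randomized = [i in treated_ids for i in range(n)]

obs_diff = diff_in_means(self_selected)
exp_diff = diff_in_means(randomized)

print(f"Observational difference in means: {obs_diff:.2f}")
print(f"Randomized difference in means:    {exp_diff:.2f}")
```

In the self-selected groups the useless treatment should look strongly beneficial, because fitness is confounded with the choice to take it. Under random assignment fitness is spread about evenly across both groups, so the difference should land near zero, which matches the true effect.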
Inference for Slopes (Unit 9)
In Topic 9.1 you test whether the slope of the population regression line is nonzero or whether the slope you see in the sample could plausibly be due to chance. Rejecting the null gives you convincing evidence that the linear association is real, not a fluke. It still doesn't tell you x causes y unless the data came from a randomized experiment.
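For reference, the test statistic behind that slope test can be sketched in a few lines. The five data points below are made up purely for illustration, and the variable names are this example's own. Nothing in the calculation looks at how the data were collected, which is why a large t value cannot support a causal claim by itself.

```python
# Made-up data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n
sxx = sum((xi - mean_x) ** 2 for xi in x)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

# Least-squares slope and intercept.
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Standard deviation of the residuals, using n - 2 degrees of freedom.
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
s = (sum(e ** 2 for e in residuals) / (n - 2)) ** 0.5

# Standard error of the slope, then the t statistic for H0: true slope = 0.
se_slope = s / sxx ** 0.5
t_stat = slope / se_slope

print(f"slope = {slope:.2f}, SE = {se_slope:.4f}, t = {t_stat:.1f}, df = {n - 2}")
```

A t statistic this far from zero is strong evidence of a linear association in these invented data. Whether x causes y still depends entirely on whether the x values were randomly assigned.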
Causation shows up most often as the wrong answer. A classic multiple-choice stem gives you a strong correlation, like r = -0.85 between study time and exam anxiety, then offers an interpretation claiming one variable causes the other. The correct choice is always the one that sticks to association and refuses the causal leap. Some questions flip it around and ask which statistical principle a causal claim violates (answer: correlation does not imply causation). On FRQs, the danger is in your own writing. If you describe an observational study's results and write "x causes y" or "x increases y," you can lose credit even if everything else is right. Safe phrasing sounds like "there is a strong negative linear association between x and y." Only say "cause" when the problem explicitly describes random assignment to treatments.
Correlation says two quantitative variables move together in a linear pattern; causation says one variable actually produces the change in the other. Correlation is a number you compute from data (r, between -1 and 1). Causation is a conclusion you can only earn through study design, specifically random assignment in an experiment. Even r = -1, a perfect linear association, does not by itself prove cause and effect. The shortcut to remember: r measures the pattern, the design earns the causal claim.
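As a quick illustration of "a number you compute from data," here is a minimal sketch of the correlation calculation using the average product of z-scores (dividing by n - 1). The function name pearson_r and the data are invented for this example; the points were chosen to fall exactly on a downward-sloping line.

```python
def pearson_r(xs, ys):
    """Sample correlation: sum of z-score products divided by n - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sd_x = (sum((x - mean_x) ** 2 for x in xs) / (n - 1)) ** 0.5
    sd_y = (sum((y - mean_y) ** 2 for y in ys) / (n - 1)) ** 0.5
    return sum(
        (x - mean_x) / sd_x * (y - mean_y) / sd_y for x, y in zip(xs, ys)
    ) / (n - 1)


# Made-up points lying exactly on the line y = 12 - 2x.
x = [1, 2, 3, 4, 5]
y = [10, 8, 6, 4, 2]

r = pearson_r(x, y)
print(f"r = {r:.2f}")
```

The result is r = -1 (up to floating-point rounding), a perfect negative linear association. The function only ever sees the two lists of numbers, so it has no way of knowing whether x was assigned at random or merely observed.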
Causation means one variable directly produces a change in another, while correlation only means two variables move together.
The CED states it directly in Topic 2.5: a perceived or real relationship between two variables does not mean changes in one cause changes in the other.
Even a correlation of exactly 1 or -1 indicates only a perfect linear association, not a cause-and-effect relationship.
Confounding variables are the typical reason a correlation exists without causation, because a hidden third variable can drive both measured variables.
You can only claim causation when treatments were randomly assigned in an experiment, never from an observational study.
On FRQs, write "association" instead of "causes" unless the study described uses random assignment, or you risk losing points.
Causation is a relationship where changing one variable directly produces a change in another. In AP Stats it appears mainly in Topic 2.5, where the CED states that correlation does not necessarily imply causation.
No, a strong correlation does not prove causation. The correlation coefficient r only measures the direction and strength of a linear association between two quantitative variables. Even r = 1 or r = -1 doesn't prove cause and effect, because a confounding variable could be driving both.
Correlation is a computed number (always between -1 and 1) describing how two variables move together linearly. Causation is a conclusion that one variable produces change in the other, and it can only be justified by a randomized experiment, not by the strength of r.
You can claim causation only when the data come from a well-designed experiment with random assignment of treatments. Random assignment balances confounding variables across groups, so a difference in outcomes can be attributed to the treatment. Observational studies only support association.
Check the study design first. If it's observational, describe a "strong negative linear association" or similar, and never write that one variable causes, increases, or reduces the other. Causal language is only safe when the problem explicitly says subjects were randomly assigned to treatments.