Correlation does not imply causation is the AP Statistics principle that an observed association between two variables does not prove one variable causes the other, because confounding variables or coincidence may explain the relationship. Only a well-designed experiment with random assignment can establish causation.
Correlation does not imply causation is the warning label of statistics. When two variables move together (more of one tends to come with more of the other), that's an association. It does NOT mean one variable is making the other happen. Ice cream sales and drowning deaths rise together every summer, but ice cream isn't pulling anyone underwater. A third variable, hot weather, drives both.
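You can watch a confounder manufacture a correlation in a quick simulation. The sketch below is only an illustration: every number in it (temperature range, coefficients, noise levels) is invented, and the names `pearson_r` and `simulate_summer` are hypothetical helpers, not anything from the AP curriculum. Ice cream sales and drownings each depend on temperature plus their own random noise, and neither one depends on the other.

```python
import random

def pearson_r(xs, ys):
    """Sample correlation coefficient r for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def simulate_summer(days=2000, seed=1):
    """Made-up data: temperature drives both variables independently."""
    rng = random.Random(seed)
    temps, sales, drownings = [], [], []
    for _ in range(days):
        temp = rng.uniform(10, 35)  # invented daily high, degrees Celsius
        temps.append(temp)
        # Each outcome = temperature effect + its own noise; no link between them
        sales.append(20 * temp + rng.gauss(0, 60))
        drownings.append(0.3 * temp + rng.gauss(0, 1.5))
    return temps, sales, drownings

temps, sales, drownings = simulate_summer()

# Correlation across all days, with the confounder free to vary
r_all = pearson_r(sales, drownings)

# Hold the confounder roughly fixed: keep only days in a narrow temperature band
band = [(s, d) for t, s, d in zip(temps, sales, drownings) if 24 <= t < 26]
r_band = pearson_r([s for s, _ in band], [d for _, d in band])

print("r across all days:", round(r_all, 2))
print("r within 24-26 degree days:", round(r_band, 2))
```

Across all days the two variables are clearly correlated, but once temperature is held nearly constant the correlation mostly disappears. That is what it looks like when a third variable drives both.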
In Topic 2.2, you build two-way tables, side-by-side bar graphs, segmented bar graphs, and mosaic plots to decide whether two categorical variables are associated (UNC-1.P.2). Finding an association is real and useful, but it's only step one. The data alone can't tell you why the variables are linked. The link could be causal, it could run through a confounding variable, or it could be a coincidence in the sample. That's why AP graders expect careful language. You can say "there appears to be an association," but you cannot say "X causes Y" from observational data.
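Checking for an association in a two-way table comes down to comparing conditional proportions, which is also what a segmented bar graph displays. Here is a minimal sketch with hypothetical counts (the sport and grade categories and all four numbers are invented for illustration, and `conditional_proportions` is a made-up helper name).

```python
# Hypothetical counts, invented for illustration only
table = {
    "plays a sport": {"A or B average": 60, "C or below": 40},
    "no sport": {"A or B average": 45, "C or below": 55},
}

def conditional_proportions(two_way):
    """Return the distribution of the column variable within each row."""
    result = {}
    for row, counts in two_way.items():
        total = sum(counts.values())
        result[row] = {col: count / total for col, count in counts.items()}
    return result

props = conditional_proportions(table)
for row, dist in props.items():
    print(row, {col: round(p, 2) for col, p in dist.items()})
```

The proportions differ between the rows (0.60 versus 0.45 with an A or B average), so in this made-up table there appears to be an association. Nothing in the table says why, and a variable such as free time or family resources could plausibly sit behind both.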
This principle lives in Unit 2: Exploring Two-Variable Data, especially Topic 2.2, where learning objective 2.2.A has you compare numerical and graphical representations for two categorical variables. Spotting an association in a segmented bar graph or two-way table is a core skill, but interpreting it correctly is what separates a 3 from a 5. The exam loves answer choices and FRQ prompts that bait you into causal language when the data only supports association. The same logic carries through the whole course. It explains why correlation coefficients later in Unit 2 describe strength of association (not cause), and it sets up Unit 3, where an experiment with random assignment is the only design that lets you conclude causation.
Confounding Variable (Unit 3)
A confounding variable is usually the reason correlation isn't causation. It's a third variable tangled up with both the explanatory and response variables, like hot weather driving both ice cream sales and drownings. If you can name a plausible confounder, you've explained why the causal claim fails.
Causal Relationship (Unit 3)
Causation isn't impossible to establish; it just requires the right study design. Random assignment in an experiment balances confounding variables across groups on average, so a difference in outcomes can be attributed to the treatment. Observational data, no matter how strong the association, can't do that.
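To see why random assignment matters, compare two ways of forming groups when the treatment does nothing at all. The simulation below is a sketch under invented assumptions: a hypothetical confounder called `motivation` raises test scores, the "treatment" (say, a study app) has zero true effect, and all of the numbers and function names are made up for illustration.

```python
import random

def mean(values):
    return sum(values) / len(values)

def group_difference(assign, n=5000, seed=7):
    """Mean score of the treated group minus mean score of the control group."""
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        motivation = rng.gauss(0, 1)  # the confounder
        # Score depends on motivation and noise only: the treatment has NO effect
        score = 70 + 8 * motivation + rng.gauss(0, 5)
        if assign(motivation, rng):
            treated.append(score)
        else:
            control.append(score)
    return mean(treated) - mean(control)

def self_selected(motivation, rng):
    # Observational study: motivated students are more likely to opt in
    return motivation + rng.gauss(0, 1) > 0

def randomly_assigned(motivation, rng):
    # Experiment: a coin flip decides, ignoring motivation entirely
    return rng.random() < 0.5

obs_gap = group_difference(self_selected)
exp_gap = group_difference(randomly_assigned)

print("observational gap:", round(obs_gap, 2))
print("randomized gap:", round(exp_gap, 2))
```

When students pick their own group, the treated group scores noticeably higher even though the treatment is useless, because motivation is unevenly spread across the groups. With coin-flip assignment, motivation is balanced on average and the gap shrinks to roughly zero, which matches the truth.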
Correlation Coefficient (Unit 2)
Later in Unit 2, r measures the strength and direction of a linear association between two quantitative variables. An r of 0.95 is a very strong correlation and still says nothing about causation. Strength of association and evidence of causation are completely separate questions.
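For reference, here is the standard formula for the sample correlation coefficient, where s_x and s_y are the sample standard deviations of the two variables:

```latex
r = \frac{1}{n-1} \sum_{i=1}^{n} \left( \frac{x_i - \bar{x}}{s_x} \right) \left( \frac{y_i - \bar{y}}{s_y} \right)
```

Every piece of the formula comes from the observed x and y values. Nothing in it knows how the data were collected or whether a third variable is involved, which is why r on its own can't speak to causation.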
Spurious Correlation (Unit 2)
A spurious correlation is this principle in action. Two variables look related, but the link is driven by a lurking variable or pure coincidence. The internet's favorite examples, like per-capita cheese consumption tracking some completely unrelated statistic, are spurious correlations.
Multiple-choice questions test this by giving you an observational study with a clear association, then offering a causal conclusion as a tempting wrong answer. The correct choice usually says the data shows an association but a confounding variable could explain it. On FRQs, this shows up as a "can we conclude..." question after you've analyzed a two-way table or scatterplot. The full-credit answer states (1) no, causation cannot be concluded, (2) why (it was observational, no random assignment), and often (3) a plausible confounding variable. Practice questions ask exactly this: why is it important to remember correlation does not imply causation? Your answer should hit confounding variables and the limits of observational data.
Correlation (or association) means two variables are statistically linked in the data you observed. A causal relationship means changing one variable actually produces a change in the other. The first is a pattern; the second is a mechanism. On the AP exam, you can only claim a causal relationship when the data comes from a randomized experiment, because random assignment is what rules out confounding variables. Observational data, even with a near-perfect correlation, only ever earns you the word "association."
An association between two variables, whether in a two-way table, segmented bar graph, or scatterplot, never proves that one variable causes the other.
Confounding variables are the usual culprit, since a third variable can drive both variables and create the appearance of a direct relationship.
Only an experiment with random assignment of treatments can justify a causal conclusion on the AP exam.
A strong correlation coefficient (like r = 0.95) measures the strength of a linear association, not evidence of causation.
On FRQs, use the phrase "there is an association" rather than "X causes Y" when interpreting observational data, and name a plausible confounding variable when asked why causation can't be concluded.
"Correlation does not imply causation" means that finding a relationship between two variables doesn't prove one causes the other. A confounding variable could be driving both, or the link could be coincidence. In AP Stats, you can only claim causation from a randomized experiment.
No, a strong correlation does not prove causation. Even a correlation coefficient near 1 or -1 only measures how tightly two variables move together. The classic example is ice cream sales and drownings, which are strongly correlated because hot weather drives both. Strength of association and proof of causation are separate questions.
An association is a statistical pattern, where knowing one variable's value tells you something about the other. A causal relationship means changing one variable actually changes the other. Observational studies can show associations; only experiments with random assignment can demonstrate causation.
Random assignment spreads confounding variables roughly evenly across treatment groups, so any difference in outcomes can be attributed to the treatment itself. In observational studies, nobody controls who ends up in which group, so confounders are always a possible explanation.
Use association language, not causal language, when the data is observational. If asked whether causation can be concluded, say no, explain that the study was observational without random assignment, and identify a plausible confounding variable.