In AP Statistics, an explanatory variable is the variable whose values are used to explain or predict corresponding values of the response variable. In bivariate quantitative data, it goes on the x-axis of a scatterplot and serves as the input (x) in a regression model.
An explanatory variable is the "input" half of a two-variable relationship. Per the essential knowledge for Topic 2.4 in the AP Statistics Course and Exam Description (CED), it's a variable whose values are used to explain or predict corresponding values for the response variable. Think of it as the variable you'd plug in to make a prediction. If you're studying engine size and fuel efficiency, engine size is the explanatory variable because you're using it to predict mpg, not the other way around.
On a scatterplot, the explanatory variable always goes on the x-axis. That placement isn't just convention. Everything downstream depends on it. The least-squares regression line predicts y from x, so swapping which variable is explanatory changes the equation, the predictions, and the interpretation of slope. One thing to keep straight, though: calling something "explanatory" does NOT mean it causes the response. In observational data, it just means it's the predictor you chose. Causation requires a well-designed experiment, which is a Unit 3 problem.
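To see why the choice of explanatory variable matters, here is a minimal Python sketch that fits the least-squares line both ways. The engine sizes and mpg values are invented for illustration, and `least_squares` is a hypothetical helper name used only in this example.

```python
def least_squares(x, y):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Slope: sum of cross-deviations divided by sum of squared x-deviations
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sxy / sxx
    a = y_bar - b * x_bar  # the line passes through (x_bar, y_bar)
    return a, b


# Hypothetical data, invented for illustration only
engine_liters = [1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
mpg = [33, 27, 30, 22, 26, 18]

# Engine size explanatory, mpg response
a_yx, b_yx = least_squares(engine_liters, mpg)
# Roles swapped: mpg explanatory, engine size response
a_xy, b_xy = least_squares(mpg, engine_liters)

print(f"predicted mpg    = {a_yx:.2f} + ({b_yx:.2f}) * liters")
print(f"predicted liters = {a_xy:.2f} + ({b_xy:.4f}) * mpg")

# If swapping merely rearranged the first line, the new slope would be 1 / b_yx
print(f"1 / b_yx = {1 / b_yx:.4f}  vs  b_xy = {b_xy:.4f}")
```

The last line is the point: the swapped regression is not just the first equation solved for x. Each line minimizes squared prediction errors in its own response variable, so the two fits generally disagree.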
This term lives in Unit 2 (Exploring Two-Variable Data), specifically Topic 2.4. It directly supports learning objective 2.4.A, which asks you to represent bivariate quantitative data using scatterplots, and it sets up 2.4.B, where you describe a scatterplot's form, direction, strength, and unusual features. You literally cannot build a correct scatterplot without first deciding which variable is explanatory, because that decision determines the axes. The same decision carries through the rest of Unit 2: correlation describes the strength and direction of the relationship, and regression uses the explanatory variable as x in ŷ = a + bx. Get the explanatory/response assignment wrong and every interpretation that follows (slope, predictions, residuals) is backwards.
Keep studying AP Statistics Unit 2
Response Variable (Unit 2)
These two are a matched pair. The explanatory variable does the predicting; the response variable gets predicted. Every scatterplot, correlation, and regression problem starts by sorting out which is which, usually from the wording of the research question.
Regression Analysis (Unit 2)
The explanatory variable is the x in the least-squares regression line ŷ = a + bx. When you interpret slope on the exam, you say it in terms of the explanatory variable, as in "for each additional liter of engine size, predicted mpg changes by b" (a decrease of |b| mpg when b is negative).
Correlation Coefficient (Unit 2)
Here's a fun wrinkle. The correlation r doesn't care which variable is explanatory; r is the same either way. The regression line, on the other hand, changes if you swap x and y: you get a different slope, a different intercept, and different predictions.
Experimental Design (Unit 3)
In an experiment, the explanatory variable becomes the factor the researchers actively manipulate. The 2019 FRQ about fungus concentrations and tree-destroying insects is exactly this. The concentration is the explanatory variable, and only in an experiment can you claim it causes changes in the response.
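To make "actively manipulate" concrete, here is a minimal sketch of randomly assigning experimental units to levels of an explanatory variable. The concentration levels, the 20 numbered containers, and the seed are all hypothetical values chosen for illustration; they are not taken from the exam question.

```python
import random

# Hypothetical treatment levels of the explanatory variable (mL/L)
concentrations = [0.0, 1.25, 2.5, 3.75]
# Hypothetical experimental units, labeled 1 through 20
containers = list(range(1, 21))

rng = random.Random(2019)  # fixed seed so the sketch is reproducible
shuffled = containers[:]   # copy so the original labels stay in order
rng.shuffle(shuffled)

# Deal the shuffled units into equal-sized groups, one group per level
group_size = len(containers) // len(concentrations)
assignment = {}
for i, level in enumerate(concentrations):
    start = i * group_size
    for unit in shuffled[start:start + group_size]:
        assignment[unit] = level

for unit in sorted(assignment):
    print(f"container {unit:2d} -> {assignment[unit]} mL/L")
```

Because chance alone decides which units get which level, the researchers set the explanatory variable themselves, and that is what supports a cause-and-effect conclusion.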
Multiple-choice questions test this term in two main ways. First, axis assignment: given a scenario like 50 cars with engine size and fuel efficiency, you pick which variable belongs on the x-axis (the explanatory one, based on which variable is being used to predict the other). Second, identification: given a context like CO₂ levels and global temperature, or temperature and ice cream sales, you identify which variable is explanatory and which is the response. Some questions layer correlation on top, like interpreting what r = -0.78 says about the relationship between exercise duration (explanatory) and recovery time (response). On FRQs, the term shows up in regression interpretation (your slope sentence must reference the explanatory variable in context) and in experimental design questions like 2019 FRQ Q2, where the variable being manipulated (fungus concentration) is the explanatory variable. The most common point-loser is sloppy causal language, so say "predicts" or "is associated with" unless the data come from a randomized experiment.
The explanatory variable explains or predicts; the response variable is what gets explained or predicted. Explanatory goes on the x-axis, response goes on the y-axis. A quick test: ask which variable you'd plug in to make a prediction about the other. If you're predicting ice cream sales from temperature, temperature is explanatory and sales is the response. Note that neither label implies causation. In an observational study, "explanatory" just means "the predictor we picked."
An explanatory variable is a variable whose values are used to explain or predict corresponding values for the response variable.
On a scatterplot, the explanatory variable always goes on the x-axis and the response variable goes on the y-axis.
In the regression equation ŷ = a + bx, x is the explanatory variable, so swapping explanatory and response gives you a completely different line.
The correlation coefficient r is the same no matter which variable you call explanatory, but the regression line is not.
Calling a variable explanatory does not mean it causes the response; causation can only be claimed from a well-designed randomized experiment.
In an experiment, the explanatory variable is the factor researchers deliberately manipulate, like the fungus concentrations in the 2019 FRQ.
An explanatory variable is a variable whose values are used to explain or predict corresponding values for the response variable. In bivariate quantitative data, it goes on the x-axis of a scatterplot and serves as x in the regression equation ŷ = a + bx.
An explanatory variable does not necessarily cause the response. "Explanatory" only means the variable is being used to predict the other one. You can only claim causation when the explanatory variable was randomly assigned in an experiment, which is why AP graders want language like "associated with" or "predicts" for observational data.
The explanatory variable is the predictor (x-axis) and the response variable is the outcome being predicted (y-axis). For example, when predicting fuel efficiency from engine size, engine size is explanatory and mpg is the response.
The explanatory variable goes on the x-axis. When one variable is identified as explanatory, it goes on the x-axis and the response goes on the y-axis. This matters because the least-squares regression line predicts y from x, so the axis assignment determines the whole model.
For correlation, it does not matter which variable you treat as explanatory: the value of r is identical either way. It matters a lot for regression, though, because swapping x and y produces a different line with different predictions.