In AP Stats, r² (the coefficient of determination) is the square of the correlation r, and it measures the proportion of variation in the response variable (y) that is explained by the linear relationship with the explanatory variable (x).
In simple linear regression, r², also called the coefficient of determination, is exactly what the notation suggests: take the correlation coefficient r and square it. The result tells you what fraction of the variation in the response variable is explained by the least-squares regression line using the explanatory variable. If r² = 0.81, then 81% of the variation in y is explained by the linear relationship with x. The other 19% is variation the line doesn't account for.
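If you want to see that "square the correlation" and "fraction of variation explained" really are the same number, here is a minimal sketch in plain Python. The height and weight values are made up for illustration, and the helper names (`correlation`, `explained_fraction`) are hypothetical, not from any AP resource. It computes r² two ways: by squaring r, and as 1 - SSE/SST for the least-squares line.

```python
from math import sqrt

def mean(values):
    return sum(values) / len(values)

def correlation(x, y):
    """Pearson correlation r, computed from deviations about the means."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)

def explained_fraction(x, y):
    """1 - SSE/SST for the least-squares line fit to (x, y)."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # SSE: variation left over after using the line
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    # SST: total variation of y about its mean
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - sse / sst

# Hypothetical data: heights (in) and weights (lb), invented for this sketch
heights = [60, 62, 65, 68, 70, 72]
weights = [115, 120, 140, 155, 160, 180]

r = correlation(heights, weights)
r_squared = r ** 2
print(f"r = {r:.3f}, r^2 = {r_squared:.3f}")
print(f"1 - SSE/SST = {explained_fraction(heights, weights):.3f}")
```

The two printed r² values should agree up to rounding, which is the whole point: in simple linear regression, squaring r and measuring explained variation are the same calculation.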
Here's the intuition. The y-values in your data bounce around their mean. Some of that bouncing is predictable (taller people tend to weigh more), and some is just scatter. r² tells you how much of the total bouncing your line actually captures. An r² close to 1 means the line explains most of the variation; an r² close to 0 means knowing x barely helps you predict y. What r² does NOT tell you is how many points sit exactly on the line. That misreading is one of the most common wrong answers on AP multiple choice.
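To make the "points on the line" trap concrete, here is a small sketch with invented data: ten points that zigzag one unit above and below y = 2x. The data and the `fit_line` helper are assumptions for illustration only. The code fits the least-squares line, computes r², and counts how many points have a residual of zero.

```python
def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Hypothetical data: points alternate 1 above and 1 below y = 2x
x = list(range(1, 11))
y = [2 * xi + (1 if xi % 2 == 1 else -1) for xi in x]

intercept, slope = fit_line(x, y)
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

y_bar = sum(y) / len(y)
sse = sum(e ** 2 for e in residuals)
sst = sum((yi - y_bar) ** 2 for yi in y)
r_squared = 1 - sse / sst

# Count points whose residual is (numerically) zero
points_on_line = sum(1 for e in residuals if abs(e) < 1e-9)

print(f"r^2 = {r_squared:.3f}")
print(f"points exactly on the line: {points_on_line} of {len(x)}")
```

In this made-up data set r² is high even though no point sits on the fitted line, so "r² percent of the points are on the line" can't be the right reading.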
r² lives in Unit 2: Exploring Two-Variable Data, specifically Topic 2.8 (Least Squares Regression) and Topic 2.9 (Analyzing Departures from Linearity). It supports learning objective AP Stats 2.8.A, where the essential knowledge states directly that r² is the square of the correlation r and is called the coefficient of determination. It also shows up under AP Stats 2.9.B, where r² moving closer to 1 after transforming data is evidence that the transformed model fits better. The exam loves r² because interpreting it correctly requires precise statistical language. One sloppy word ('36% of the points' instead of '36% of the variation') turns a right answer into a wrong one. Mastering the interpretation sentence here pays off again when you assess model fit throughout regression problems.
Keep studying AP® Statistics Unit 2
Correlation Coefficient (Unit 2)
r² is literally r squared, but squaring loses information. The sign disappears, so r² = 0.36 could come from r = 0.6 or r = -0.6. You need the scatterplot or slope to recover the direction of the relationship.
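Recovering the direction is mechanical once you have the slope, because r and the least-squares slope always share a sign (the slope equals r times the ratio of the standard deviations, which is positive). A minimal sketch, with a hypothetical helper name `r_from_r_squared`:

```python
from math import sqrt

def r_from_r_squared(r_squared, slope):
    """Recover r from r^2 using the sign of the regression slope.

    Hypothetical helper for illustration: r^2 gives the magnitude of r,
    and the slope of the least-squares line gives the sign.
    """
    if not 0 <= r_squared <= 1:
        raise ValueError("r^2 must be between 0 and 1")
    magnitude = sqrt(r_squared)
    if slope > 0:
        return magnitude
    if slope < 0:
        return -magnitude
    return 0.0  # a flat line means no linear relationship

# Same r^2, opposite directions
print(round(r_from_r_squared(0.36, slope=3.8), 2))
print(round(r_from_r_squared(0.36, slope=-3.8), 2))
```

On the exam you do this by eye: take the square root of r² from the output, then attach the sign of the slope coefficient.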
Transformations and Departures from Linearity (Unit 2)
In Topic 2.9, r² becomes a model-comparison tool. If you log-transform y and r² jumps closer to 1 (along with a more random residual plot), that's evidence the transformed model predicts better than the original linear one. The 2022 FRQ on bullfrog length and mass tested exactly this kind of model choice.
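Here is a rough sketch of that comparison using invented data that grows roughly exponentially (these are not the values from any FRQ). It computes r² for y versus x and again for ln(y) versus x. The `r_squared` helper is an assumption for illustration, and r² is only half the argument: you would still look at the residual plots.

```python
from math import log, sqrt

def r_squared(x, y):
    """Square of the Pearson correlation between x and y."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return (sxy / sqrt(sxx * syy)) ** 2

# Hypothetical data with roughly exponential growth
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [3.1, 4.4, 6.9, 10.0, 15.4, 22.5, 34.5, 51.0]

log_y = [log(v) for v in y]  # natural-log transform of the response

r2_linear = r_squared(x, y)
r2_log = r_squared(x, log_y)

print(f"r^2, y vs x:     {r2_linear:.3f}")
print(f"r^2, ln(y) vs x: {r2_log:.3f}")
```

For data shaped like this, the transformed model's r² lands closer to 1, which is the kind of evidence Topic 2.9 asks you to cite alongside the residual plot.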
Influential Point (Unit 2)
A single influential point can drag r, and therefore r², up or down dramatically. A high r² doesn't guarantee a good model if one high-leverage point is doing all the work, which is why you always check the scatterplot and residual plot too.
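You can watch this happen with a tiny made-up data set: five points with almost no linear pattern, plus one point far off to the upper right. The numbers and the `r_squared` helper below are assumptions chosen only to illustrate the effect.

```python
from math import sqrt

def r_squared(x, y):
    """Square of the Pearson correlation between x and y."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return (sxy / sqrt(sxx * syy)) ** 2

# Hypothetical cluster with almost no linear pattern
x = [1, 2, 3, 4, 5]
y = [3, 1, 4, 2, 3]
r2_without = r_squared(x, y)

# Add one high-leverage point far from the cluster
r2_with = r_squared(x + [20], y + [20])

print(f"r^2 without the extra point: {r2_without:.3f}")
print(f"r^2 with the extra point:    {r2_with:.3f}")
```

One point takes r² from near 0 to near 1 here, and nothing in the r² value itself tells you that happened. The scatterplot does.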
Response Variable (Unit 2)
Every correct r² interpretation names the response variable in context. It's the variation in y, the thing you're predicting, that gets explained. 'Explains 81% of the variation in weight' scores; 'explains 81% of the data' doesn't.
On multiple choice, r² questions are interpretation traps. A typical stem gives you something like ŷ = 62 + 3.8x with r² = 0.36 and asks which interpretation is correct. The wrong answers say things like '36% of the points fall on the line' or '36% of test scores are predicted by study time.' The correct answer always follows the template: '36% of the variation in [response variable, in context] is explained by the linear relationship with [explanatory variable].'
On FRQs, r² appears in regression output you have to read and use. The 2018 FRQ gave computer output for checkout time versus number of customers, and the 2022 FRQ used r² as evidence when comparing a linear model to a transformed model for bullfrog data. Know the template sentence cold, always include context, and remember that comparing r² values between models (especially after a transformation) is a legitimate way to argue one model fits better.
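If it helps to drill the template sentence, here is a small sketch that fills it in for you. The function name `interpret_r_squared` and the example variable names are hypothetical. The point is that the sentence has three slots and none of them is "the points" or "the data."

```python
def interpret_r_squared(r_squared, response, explanatory):
    """Fill in the template interpretation sentence for r^2.

    Hypothetical practice helper: 'response' and 'explanatory' should be
    the variables named in the context of the problem.
    """
    if not 0 <= r_squared <= 1:
        raise ValueError("r^2 must be between 0 and 1")
    percent = round(r_squared * 100, 1)
    return (
        f"{percent:g}% of the variation in {response} is explained by "
        f"the linear relationship with {explanatory}."
    )

print(interpret_r_squared(0.36, "test score", "study time"))
print(interpret_r_squared(0.81, "weight", "height"))
```

Try writing the sentence by hand first, then compare it with the output to check that you named the variation, the response variable, and the explanatory variable.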
r measures the direction and strength of a linear relationship and runs from -1 to 1. r² measures the proportion of variation in y explained by the model and runs from 0 to 1. Squaring kills the sign, so r² alone can't tell you whether the relationship is positive or negative. Also watch the interpretations: r is about strength and direction, while r² is about explained variation. Don't say r² is 'strong' and don't say r 'explains' anything.
r², the coefficient of determination, is the square of the correlation r and measures the proportion of variation in the response variable explained by the linear relationship with the explanatory variable.
The full-credit interpretation sentence is: '[r² as a percent] of the variation in [response variable, in context] is explained by the linear relationship with [explanatory variable].'
r² does NOT mean that percentage of points fall on the regression line, and it does not measure how many predictions are correct.
Because squaring erases the sign, r² = 0.36 could come from r = 0.6 or r = -0.6, so you can't determine direction from r² alone.
In Topic 2.9, an r² closer to 1 after transforming data (paired with a more random residual plot) is evidence the transformed model is a better fit.
A high r² can be inflated by a single influential point, so always check the scatterplot and residual plot before trusting the model.
r² is the coefficient of determination, the square of the correlation coefficient r. It gives the proportion of variation in the response variable that is explained by the linear relationship with the explanatory variable, per AP Stats 2.8.A.
No, r² is not the percentage of points that fall on the regression line, and this is the classic trap answer. r² = 0.36 means 36% of the variation in the response variable is explained by the linear relationship with the explanatory variable. It says nothing about how many points sit on the line.
r measures direction and strength of a linear relationship (from -1 to 1), while r² measures the proportion of variation in y explained by the model (from 0 to 1). Squaring removes the sign, so r² = 0.81 could come from r = 0.9 or r = -0.9.
Use the template: '[percent] of the variation in [response variable, in context] is explained by the linear relationship with [explanatory variable].' For r² = 0.81 relating height to weight, say 81% of the variation in weight is explained by the linear relationship with height.
A high r² does not, by itself, show that a linear model is appropriate. A nonlinear pattern or a single influential point can still produce a high r², so you also need a residual plot with no pattern. In Topic 2.9, you compare r² values alongside residual plots to decide whether a transformed model fits better, as the 2022 FRQ on bullfrog data required.