Residual Plot

A residual plot graphs residuals (y - ŷ) on the vertical axis against the explanatory variable or predicted values on the horizontal axis. Random scatter with no pattern is evidence that a linear model is appropriate; a curve signals that the linear form is wrong, and fanning signals that the spread of y changes with x.

Verified for the 2027 AP Statistics exam. Last updated June 2026.

What is the Residual Plot?

A residual plot is the diagnostic check for your regression line. Every residual is the gap between what actually happened and what the line predicted (residual = y - ŷ). Plot those gaps against the explanatory variable values or the predicted values, and you can see whether your line is systematically missing in some region of the data.
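To make the definition concrete, here is a minimal Python sketch that fits a least-squares line by hand and computes every residual. The data (hours studied and quiz scores) and the helper names fit_line and residuals are made up for illustration; on the exam, your calculator does this step for you.

```python
def fit_line(xs, ys):
    """Return the least-squares slope and intercept for paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def residuals(xs, ys):
    """Return residual = y - y_hat for every point."""
    slope, intercept = fit_line(xs, ys)
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


# Hypothetical data: hours studied (x) and quiz score (y).
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 77]

res = residuals(hours, scores)       # residuals come out to 0, 2, -3, 0, 1
points = list(zip(hours, res))       # (x, residual) pairs
print(points)
```

The pairs in points are what a residual plot draws: x on the horizontal axis, the residual on the vertical axis, with a reference line at 0.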

Here's the logic. If a linear model truly fits, the line's misses should be random noise: some above zero, some below, with no structure. So apparent randomness in the residual plot is evidence the association is linear (essential knowledge under LO 2.7.B). A pattern is the model confessing. A U-shape means the relationship is actually curved, so a linear model is the wrong form. A fan shape (residuals spreading out as predicted values grow) means the variability in y isn't constant across x. A residual plot can't prove the model is good; it can only show you when the model is bad.

Why the Residual Plot matters in AP Statistics

Residual plots live in Topic 2.7 (Unit 2: Exploring Two-Variable Data), directly supporting learning objectives 2.7.A (represent differences between measured and predicted responses using residual plots) and 2.7.B (describe the form of association using residual plots). But they come back with higher stakes in Topic 9.4. Before you run a t-test for the slope of a regression model, LO 9.4.C requires you to verify conditions, and the residual plot is your tool for two of them. Random scatter checks that the true relationship between x and y is linear, and roughly even vertical spread checks that the standard deviation of y doesn't vary with x. One graph, learned in Unit 2, becomes the gatekeeper for inference in Unit 9.

How the Residual Plot connects across the course

Residuals (Unit 2)

The residual plot is just every residual (y - ŷ) graphed at once. A single residual tells you about one point; the plot tells you whether the line is missing in a systematic way across the whole dataset.

Homoscedasticity (Units 2 & 9)

Equal spread of residuals across all x-values is the equal-SD condition for slope inference. A fan shape in the residual plot is exactly how you spot a violation, which is why one practice question hands you residuals that 'increase in spread as predicted values increase.'

Setting Up a Test for the Slope (Unit 9)

The t-test for a slope (LO 9.4.A) is only valid if the conditions in 9.4.C hold. Analysis of residuals verifies both linearity and constant standard deviation, so a quick residual plot is part of the 'check conditions' step on inference FRQs.

Regression Model (Unit 2)

Choosing a model and checking a model are different jobs. The least-squares line gets fit no matter what; the residual plot is what tells you whether that line was the right choice or whether the data wanted a curve.

Is the Residual Plot on the AP Statistics exam?

Residual plots show up mostly as pattern-reading questions. You're shown (or told about) a plot and asked what it implies about the model. Random scatter means the linear model is appropriate, a U-shape or sinusoidal pattern means the relationship is nonlinear and the linear model is a poor fit, and increasing spread means the standard deviation of y changes with x. Practice questions hit all of these, including a trick case where every residual is exactly zero, meaning the line passes through every point perfectly.

On FRQs, residual plots earn points two ways. In Unit 2 territory, you interpret a plot to justify whether a linear model is appropriate. In Unit 9 inference problems, you cite the residual plot when verifying the linearity and equal-SD conditions before a t-test for slope. Either way, the move is the same: describe the pattern (or lack of one), then state what it implies about the model, in context.

The Residual Plot vs. a scatterplot of the original data

A scatterplot shows the raw (x, y) data and lets you eyeball the relationship. A residual plot shows what's left over after you subtract the line's predictions, which magnifies any pattern the line failed to capture. A curve that looks 'pretty linear' in a scatterplot can show an obvious U-shape in the residual plot. That's the whole point of making one. Also note the axes differ. A residual plot puts residuals on the vertical axis, not y.

Key things to remember about the Residual Plot

  • A residual plot graphs residuals (y - ŷ) against the explanatory variable values or the predicted values.

  • Apparent randomness in the residual plot is evidence that the association between the variables is linear and a linear model is appropriate.

  • A curved pattern, like a U-shape, means the true relationship is nonlinear and the linear model is a poor fit.

  • Residuals that fan out (spread increasing with predicted values) mean the standard deviation of y is not constant across x.

  • In Unit 9, the residual plot is how you verify the linearity and equal-standard-deviation conditions before running a t-test for the slope.

  • A 'good' residual plot is boring. Structure of any kind is the model telling you something is wrong.

Frequently asked questions about the Residual Plot

What is a residual plot in AP Stats?

It's a graph of residuals (actual y minus predicted ŷ) plotted against the explanatory variable or the predicted values. You use it to judge whether a linear regression model actually fits the data, which is LO 2.7.A and 2.7.B in the CED.

Does a random residual plot prove the linear model is correct?

Not quite. Random scatter is evidence that a linear model is appropriate, and the CED words it exactly that way. Patterns can rule a model out, but randomness can't prove a model is the one true relationship.

What does a U-shaped residual plot mean?

It means the relationship between x and y is curved, not linear, so the line systematically underpredicts in some regions and overpredicts in others. On the exam, that's your cue to say the linear model is not appropriate.

How is a residual plot different from a scatterplot?

A scatterplot shows the original (x, y) data; a residual plot shows the leftover errors after fitting the line, with residuals on the vertical axis. Residual plots amplify problems, so a subtle curve invisible in the scatterplot often jumps out in the residual plot.

What does it mean if residuals fan out as predicted values increase?

The variability of y is changing across the data instead of staying constant, so the standard deviation of y is not the same at every x. That violates the equal-SD condition (homoscedasticity) you have to verify for a t-test for slope under LO 9.4.C.