Residual analysis in AP Statistics

Residual analysis is the process of examining residuals (observed y minus predicted ŷ) with plots and summaries to check the conditions for regression inference, especially that the true relationship is linear and that the standard deviation of y stays roughly constant across all values of x.

Verified for the 2027 AP Statistics exam • Last updated June 2026

What is residual analysis?

A residual is the leftover error for one data point. It's the observed y-value minus the value your regression line predicted, written yi − ŷi. Residual analysis means looking at all those leftovers together, usually in a residual plot, to judge whether a linear model is actually trustworthy.
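To make the definition concrete, here is a minimal Python sketch that fits a least-squares line by hand and lists each residual. The five data points and the function names are made up for illustration; on the exam, a calculator or provided computer output normally handles this step.

```python
def least_squares_line(x, y):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx              # slope
    a = y_bar - b * x_bar      # intercept
    return a, b


def residuals(x, y):
    """Return observed minus predicted (y_i - y-hat_i) for every point."""
    a, b = least_squares_line(x, y)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


# Hypothetical data, invented for this example
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

res = residuals(x, y)
print([round(r, 2) for r in res])   # [0.06, -0.13, 0.18, -0.21, 0.1]
```

Residuals from a least-squares line always sum to zero (up to rounding), which is why you study their pattern and spread rather than their total.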

In Unit 5, residual analysis gets a specific job. Before you build a confidence interval for the slope of a regression line, the CED says you have to verify conditions, and residuals are your evidence for two of them. First, the residual plot should show no curved pattern, which supports the condition that the true relationship between x and y is linear. Second, the vertical spread of the residuals should look roughly the same from left to right, which supports the condition that the standard deviation of y does not vary with x (constant variance). If the residual plot fans out, funnels in, or bends, the t-interval for the slope isn't valid.
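As a rough illustration of what a "bend" looks like in numbers, the sketch below fits a line to hypothetical data that actually follow y = x² and prints the residuals. The data and the helper name are invented for this example, and the sign pattern is only a stand-in for reading the plot by eye.

```python
def line_residuals(x, y):
    """Fit the least-squares line to (x, y) and return the residuals y - y-hat."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    b = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
    a = y_bar - b * x_bar
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


# Hypothetical data with a truly curved relationship: 1, 4, 9, 16, 25
x = [1, 2, 3, 4, 5]
y = [xi ** 2 for xi in x]

curved = line_residuals(x, y)
print(curved)   # [2.0, -1.0, -2.0, -1.0, 2.0]
```

The residuals run positive, then negative, then positive again. That U shape is the curved pattern that tells you the linearity condition is not supported.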

Why residual analysis matters in AP® Statistics

Residual analysis belongs to the Unit 5 topic on confidence intervals for the slope of a regression model, and it directly supports the learning objective on verifying the conditions for calculating a confidence interval for the slope. The essential knowledge spells it out twice: analysis of residuals may be used to verify linearity, and analysis of residuals may be used to check for approximately equal standard deviations for all x. In other words, when an inference problem hands you a residual plot, it's not decoration. It's the tool you cite to justify that the interval b ± t*(SEb) is legitimate. The same logic carries into significance tests for the slope later in the unit, so this one skill pays off across all of Unit 5.
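A minimal sketch of the arithmetic behind b ± t*(SEb), using made-up numbers. The slope, the standard error, and the function name are hypothetical; t* = 3.182 is the table value for 95% confidence with 3 degrees of freedom (n − 2 for five data points).

```python
def slope_interval(b, se_b, t_star):
    """Return (lower, upper) for the confidence interval b ± t* · SE_b."""
    margin = t_star * se_b
    return b - margin, b + margin


# Hypothetical inputs: b and SE_b would come from regression output,
# t* from a t-table with n - 2 degrees of freedom.
lower, upper = slope_interval(b=1.99, se_b=0.0597, t_star=3.182)
print(round(lower, 3), round(upper, 3))   # 1.8 2.18
```

The arithmetic works for any inputs, which is exactly the danger: the numbers only mean something if the residual plot supported the conditions first.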


How residual analysis connects across the course

Least-Squares Regression and Residuals (Unit 2)

You first met residuals in Unit 2, where a patternless residual plot told you a line was an appropriate model. Unit 5 recycles that exact skill, but now the stakes are higher. A bad residual plot doesn't just mean a poor fit. It means your confidence interval for the slope is invalid.

Constant Variance (Unit 5)

Equal spread is the condition residual analysis is built to catch. If the residual plot fans out as x increases, the standard deviation of y is changing with x, and the constant variance condition fails. Look for a band of points with roughly even vertical width across the whole plot.
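One informal way to put a number on "even vertical width" is to compare the standard deviation of the residuals for small x against large x. The sketch below does that with hypothetical residuals; the split-in-half approach and the function name are assumptions for illustration, not an AP procedure, and the exam expects you to describe the plot itself.

```python
from statistics import stdev


def spread_by_half(x, res):
    """Return (left_sd, right_sd): residual SD for the lower-x and upper-x halves."""
    pairs = sorted(zip(x, res))          # order the residuals by their x-value
    mid = len(pairs) // 2
    left = [r for _, r in pairs[:mid]]
    right = [r for _, r in pairs[mid:]]
    return stdev(left), stdev(right)


# Hypothetical residuals that fan out as x increases
x = [1, 2, 3, 4, 5, 6, 7, 8]
res = [0.1, -0.1, 0.2, -0.2, 1.0, -1.2, 1.5, -1.3]

left_sd, right_sd = spread_by_half(x, res)
print(round(left_sd, 2), round(right_sd, 2))   # 0.18 1.46
```

Here the spread on the right is roughly eight times the spread on the left, the numerical signature of a fan shape. With an even band, the two values would be close.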

SEb, the Standard Error of the Slope (Unit 5)

SEb is computed using s, an estimate of the common standard deviation of y around the line. That formula only makes sense if there IS one common standard deviation, which is exactly what residual analysis verifies. The conditions and the calculation are two halves of the same procedure.
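The link between s and SEb can be sketched in a few lines. This assumes the standard formulas s = sqrt(Σ residual² / (n − 2)) and SEb = s / (s_x · sqrt(n − 1)), written below in the equivalent form s / sqrt(Σ(x − x̄)²). The data and the function name are hypothetical.

```python
from math import sqrt


def slope_standard_error(x, y):
    """Return (b, s, se_b) for the least-squares line fit to (x, y)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    a = y_bar - b * x_bar
    # Sum of squared residuals around the fitted line
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    s = sqrt(sse / (n - 2))    # one pooled estimate of the SD of y about the line
    se_b = s / sqrt(sxx)       # same as s / (s_x * sqrt(n - 1))
    return b, s, se_b


# Hypothetical data, invented for this example
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b, s, se_b = slope_standard_error(x, y)
print(round(b, 2), round(s, 4), round(se_b, 4))   # 1.99 0.1889 0.0597
```

Notice that s pools every residual into a single number. If the spread really changes with x, that single number describes no part of the data well, and SEb inherits the problem.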

Random Sample (Unit 3)

Residual plots can't check everything. Independence comes from how the data were collected (a random sample or randomized experiment, plus the 10% condition when sampling without replacement). On the exam, check residuals for linearity and equal spread, but check the study design for independence.

Is residual analysis on the AP® Statistics exam?

Multiple-choice questions often show you a residual plot and ask which condition it supports or violates, or describe a study setup and ask you to assess conditions, like a researcher sampling 500 customers from 2,000 without replacement (that one fails the n ≤ 10% of N check, since 500 is 25% of the population). On FRQs, regression inference questions typically award credit for naming the conditions and pointing to specific evidence, so write things like "the residual plot shows no curved pattern, so the linearity condition is met" rather than just "conditions are satisfied." Vague condition-checking is one of the most common ways students lose FRQ points in Unit 5.
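The sampling check in the multiple-choice example above is simple arithmetic. A minimal sketch, with a hypothetical function name:

```python
def ten_percent_check(n, N):
    """Return (fraction_sampled, ok), where ok is True when n is at most 10% of N."""
    return n / N, 10 * n <= N


fraction, ok = ten_percent_check(500, 2000)
print(fraction, ok)   # 0.25 False
```

A sample of 200 or fewer from the same population of 2,000 would pass the check.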

Residual analysis vs. residual plots for model fit (Unit 2)

Same plot, different purpose. In Unit 2, you read a residual plot to decide whether a linear model fits the sample data well. In Unit 5, residual analysis is a formal condition check for inference. You're verifying that the true relationship is linear AND that the spread of residuals is roughly constant across x, so the t-procedures for the slope are valid. Unit 2 asks "is a line a good description?" Unit 5 asks "can I trust this interval?"

Key things to remember about residual analysis

A residual is observed minus predicted, yi − ŷi, and residual analysis means examining all the residuals to check whether regression inference is justified.
A residual plot with no curved pattern supports the linearity condition for a confidence interval for the slope.
Roughly equal vertical spread of residuals across all x-values supports the constant variance condition, that the standard deviation of y does not vary with x.
Residual analysis cannot verify independence. That condition is checked through random sampling or random assignment plus the 10% condition.
On FRQs, cite specific features of the residual plot when checking conditions, because writing "conditions are met" with no evidence typically earns no credit.
A fan or funnel shape in the residual plot signals changing variability, which invalidates the b ± t*(SEb) interval.

Frequently asked questions about residual analysis

What is residual analysis in AP Stats?

It's examining the residuals (observed y minus predicted ŷ) from a regression line, usually with a residual plot, to verify the conditions for inference about the slope. A patternless plot with even spread supports the linearity and constant variance conditions.

Does a good residual plot prove the conditions for regression inference are met?

No. A patternless, evenly spread residual plot supports linearity and constant variance, but independence depends on the data collection (random sample or randomized experiment, and n ≤ 10% of N when sampling without replacement). You can't see independence in a residual plot.

How is residual analysis different from just making a residual plot in Unit 2?

The plot is the same, but in Unit 2 you used it to decide whether a line fits the data well. In Unit 5, residual analysis is a required condition check that determines whether your confidence interval b ± t*(SEb) is even valid.

What does a fan shape in a residual plot mean?

It means the spread of the residuals changes as x changes, so the standard deviation of y varies with x. That violates the constant variance condition, and a t-interval for the slope would not be appropriate.

Which conditions can residual analysis actually check?

Two of them directly from the residual plot. A lack of curved pattern checks linearity, and roughly equal vertical spread checks constant variance. Independence comes from study design. Approximate normality of the responses at each x is a separate condition, usually assessed from the distribution of the residuals or assumed from the problem setup.
