Residual analysis is the process of examining residuals (observed y minus predicted ŷ) with plots and summaries to check the conditions for regression inference, especially that the true relationship is linear and that the standard deviation of y stays roughly constant across all values of x.
A residual is the leftover error for one data point. It's the observed y-value minus the value your regression line predicted, written yi − ŷi. Residual analysis means looking at all those leftovers together, usually in a residual plot, to judge whether a linear model is actually trustworthy.
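To make "observed minus predicted" concrete, here is a minimal sketch that fits a least-squares line and lists the residuals. The data values and the function names `least_squares` and `residuals` are made up for illustration; they are not from the course materials.

```python
from statistics import mean

def least_squares(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx              # slope
    a = y_bar - b * x_bar      # intercept
    return a, b

def residuals(xs, ys):
    """Return each residual: observed y minus predicted y-hat."""
    a, b = least_squares(xs, ys)
    return [y - (a + b * x) for x, y in zip(xs, ys)]

# Hypothetical data, for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
a, b = least_squares(xs, ys)
res = residuals(xs, ys)
for x, r in zip(xs, res):
    print(f"x = {x}: residual = {r:+.2f}")
```

Plotting these residuals against x gives the residual plot. For a least-squares line the residuals always sum to zero (up to rounding), so the plot is centered on the horizontal line at 0.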
In Unit 9, residual analysis gets a specific job. Before you build a confidence interval for the slope of a regression line, the Course and Exam Description (CED) says you have to verify conditions, and residuals are your evidence for two of them. First, the residual plot should show no curved pattern, which supports the condition that the true relationship between x and y is linear. Second, the vertical spread of the residuals should look roughly the same from left to right, which supports the condition that the standard deviation of y does not vary with x (constant variance). If the residual plot bends, the linearity condition fails. If it fans out or funnels in, the constant variance condition fails. Either way, the t-interval for the slope isn't justified.
Residual analysis lives in Topic 9.2 (Confidence Intervals for the Slope of a Regression Model) and directly supports learning objective AP Stats 9.2.B, verifying the conditions to calculate a confidence interval for the slope. The essential knowledge spells it out twice: analysis of residuals may be used to verify linearity, and analysis of residuals may be used to check for approximately equal standard deviations for all x. In other words, when an inference problem hands you a residual plot, it's not decoration. It's the tool you cite to justify that the interval b ± t*(SEb) from AP Stats 9.2.D is legitimate. The same logic carries into significance tests for slope in Topics 9.3-9.6, so this one skill pays off across all of Unit 9.
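As a rough sketch of the interval b ± t*(SEb), the function below computes the slope, its standard error, and the interval endpoints. It assumes the conditions have already been checked. The name `slope_interval` and the data are hypothetical, and `t_star` is something you supply yourself from a t-table with df = n − 2.

```python
from math import sqrt
from statistics import mean

def slope_interval(xs, ys, t_star):
    """Return (low, high) for b ± t* · SE_b. Assumes conditions are met."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    a = y_bar - b * x_bar
    # s estimates the common standard deviation of y around the line
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    s = sqrt(sse / (n - 2))
    se_b = s / sqrt(sxx)       # standard error of the slope
    margin = t_star * se_b
    return b - margin, b + margin

# Hypothetical data, for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
# 3.182 is the t-table value for 95% confidence with df = 5 - 2 = 3
low, high = slope_interval(xs, ys, t_star=3.182)
print(f"95% CI for slope: ({low:.2f}, {high:.2f})")
```

Because `s` pools every residual into one number, the interval only means something if the residual plot supports a single common spread.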
Least-Squares Regression and Residuals (Unit 2)
You first met residuals in Unit 2, where a patternless residual plot told you a line was an appropriate model. Unit 9 recycles that exact skill, but now the stakes are higher. A bad residual plot doesn't just mean a poor fit. It means your confidence interval for the slope is invalid.
Constant Variance (Unit 9)
Unequal spread is the violation residual analysis is built to catch. If the residual plot fans out as x increases, the standard deviation of y is changing with x, and the constant variance condition fails. Look for a band of points with roughly even vertical width across the whole plot.
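On the exam you judge spread by eye, but a crude numeric companion can help while you practice. The sketch below compares the standard deviation of the residuals for smaller x-values against larger x-values. This is an informal illustration, not an AP procedure, and the helper name `spread_by_half` and the residual values are invented.

```python
from statistics import stdev

def spread_by_half(xs, res):
    """Return (sd of residuals at smaller x-values, sd at larger x-values)."""
    pairs = sorted(zip(xs, res))        # order the residuals by x
    mid = len(pairs) // 2
    left = [r for _, r in pairs[:mid]]
    right = [r for _, r in pairs[mid:]]
    return stdev(left), stdev(right)

# Invented residuals that fan out as x increases
xs = [1, 2, 3, 4, 5, 6, 7, 8]
res = [0.1, -0.1, 0.2, -0.2, 1.0, -1.2, 1.5, -1.3]
left_sd, right_sd = spread_by_half(xs, res)
print(f"spread on left: {left_sd:.2f}, spread on right: {right_sd:.2f}")
```

A much larger spread on one side matches what a fan shape looks like in the plot. There is no official cutoff, so treat the numbers as a way to train your eye rather than as a test.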
SEb, the Standard Error of the Slope (Unit 9)
SEb is computed using s, an estimate of the common standard deviation of y around the line. That formula only makes sense if there IS one common standard deviation, which is exactly what residual analysis verifies. The conditions and the calculation are two halves of the same procedure.
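Written out, the standard formulas look like this. The notation is an assumption here, since this page does not print the formulas itself, but they match the usual textbook forms, where s_x is the sample standard deviation of the x-values.

```latex
s = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n - 2}},
\qquad
SE_b = \frac{s}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}} = \frac{s}{s_x \sqrt{n - 1}}
```

Every residual feeds into the single value s, which is why a residual plot with uneven spread undermines SEb.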
Random Sample (Unit 3)
Residual plots can't check everything. Independence comes from how the data were collected (a random sample or randomized experiment, plus the 10% condition when sampling without replacement). On the exam, check residuals for linearity and equal spread, but check the study design for independence.
Multiple-choice questions often show you a residual plot and ask which condition it supports or violates, or describe a study setup and ask you to assess conditions, like a researcher sampling 500 customers from 2,000 without replacement (that one fails the n ≤ 10% of N check, since 500 is 25% of the population). On FRQs, regression inference questions typically award credit for naming the conditions and pointing to specific evidence, so write things like "the residual plot shows no curved pattern, so the linearity condition is met" rather than just "conditions are satisfied." Vague condition-checking is one of the most common ways students lose FRQ points in Unit 9.
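The 10% check in that customer example is simple arithmetic, sketched here with a hypothetical helper named `meets_ten_percent`:

```python
def meets_ten_percent(n, population_size):
    """True when the sample is at most 10% of the population (n <= 0.10 * N)."""
    return 10 * n <= population_size    # integer form avoids float rounding

# The customer example: 500 sampled from 2,000 without replacement
print(meets_ten_percent(500, 2000))     # 500 / 2000 = 25%, so the check fails
print(meets_ten_percent(200, 2000))     # exactly 10%, so the check passes
```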
Same plot, different purpose. In Unit 2, you read a residual plot to decide whether a linear model fits the sample data well. In Unit 9, residual analysis is a formal condition check for inference. You're verifying that the true relationship is linear AND that the spread of residuals is roughly constant across x, so the t-procedures for the slope are valid. Unit 2 asks "is a line a good description?" Unit 9 asks "can I trust this interval?"
A residual is observed minus predicted, yi − ŷi, and residual analysis means examining all the residuals to check whether regression inference is justified.
A residual plot with no curved pattern supports the linearity condition for a confidence interval for the slope.
Roughly equal vertical spread of residuals across all x-values supports the constant variance condition, that the standard deviation of y does not vary with x.
Residual analysis cannot verify independence. That condition is checked through random sampling or random assignment plus the 10% condition.
On FRQs, cite specific features of the residual plot when checking conditions, because writing 'conditions are met' with no evidence typically earns no credit.
A fan or funnel shape in the residual plot signals changing variability, which invalidates the b ± t*(SEb) interval.
Residual analysis is examining the residuals (observed y minus predicted ŷ) from a regression line, usually with a residual plot, to verify the conditions for inference about the slope. A patternless plot with even spread supports the linearity and constant variance conditions in Topic 9.2.
A residual plot cannot verify every condition. A patternless, evenly spread residual plot supports linearity and constant variance, but independence depends on the data collection (random sample or randomized experiment, and n ≤ 10% of N when sampling without replacement). You can't see independence in a residual plot.
The residual plot in Unit 9 is the same one you used in Unit 2, but in Unit 2 you used it to decide whether a line fits the data well. In Unit 9, residual analysis is a required condition check that determines whether your confidence interval b ± t*(SEb) is even valid.
A fan shape in a residual plot means the spread of the residuals changes as x changes, so the standard deviation of y varies with x. That violates the constant variance condition, and a t-interval for the slope would not be appropriate.
A residual plot checks two of the conditions for inference about the slope. A lack of curved pattern checks linearity, and roughly equal vertical spread checks constant variance. Independence comes from study design, and approximate normality of responses at each x is a separate condition you often assess from the residuals' distribution or assume from the problem setup.