Residual Plots for Model Assumptions
Residual analysis is how you check whether your multiple regression model actually meets the assumptions it depends on. If those assumptions are violated, your coefficient estimates, p-values, and confidence intervals can all become unreliable. Residual plots are the primary diagnostic tool for catching these problems.
Graphical Representation and Purpose
A residual is the difference between an observed value and the value your model predicted: eᵢ = yᵢ − ŷᵢ. Residual plots graph these differences so you can visually inspect whether the model's assumptions hold.
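Computing residuals is nothing more than that subtraction; a minimal NumPy sketch (the observed and predicted values here are made up for illustration):

```python
import numpy as np

# Hypothetical observed values and model predictions
y = np.array([3.1, 4.0, 5.2, 6.1, 7.3])
y_hat = np.array([3.0, 4.2, 5.0, 6.3, 7.1])

# Residual: observed minus predicted, e_i = y_i - y_hat_i
residuals = y - y_hat
print(residuals)
```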
The four main assumptions you're checking:
- Linearity — the relationship between predictors and the response is linear
- Homoscedasticity — the variance of residuals stays constant across all levels of the predicted values
- Independence of errors — residuals aren't correlated with each other
- Normality — residuals follow a normal distribution centered at zero
Creating and Interpreting Residual Plots
You can create residual plots by plotting residuals on the y-axis against several different quantities on the x-axis:
- Predicted values (ŷ) — the most common choice; good for checking linearity and constant variance overall
- Each independent variable — helps you spot problems tied to a specific predictor
- Order of data collection — useful for detecting autocorrelation in time-ordered data
What you want to see is a random scatter of points centered around zero with no visible pattern. That's a healthy residual plot. What you don't want to see is any systematic shape: curves, fans, clusters, or trends. Those signal assumption violations, which the next sections cover in detail.
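A residuals-versus-fitted plot takes only a few lines once the model is fit. A sketch on simulated data (variable names and the output filename are arbitrary), using plain least squares so it runs without a regression library; with well-specified linear data like this, the scatter should look patternless:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend for saving to file
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 100)
y = 2.0 + 1.5 * x + rng.normal(0, 1, 100)  # linear truth plus noise

# Fit simple OLS via least squares (design matrix: intercept + x)
X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ beta
residuals = y - fitted

# Residuals vs fitted: a healthy plot is random scatter around zero
fig, ax = plt.subplots()
ax.scatter(fitted, residuals)
ax.axhline(0, linestyle="--")
ax.set_xlabel("Fitted values")
ax.set_ylabel("Residuals")
fig.savefig("resid_vs_fitted.png")
```

With an intercept in the model, OLS residuals average exactly zero and are uncorrelated with the fitted values, so any visible trend in this plot points at a real assumption problem rather than an artifact of the fitting.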
Patterns in Residual Plots

Non-Random Patterns and Their Implications
When residuals form a recognizable pattern instead of random scatter, something in your model needs attention.
- Curved pattern (U-shape, S-shape, or other nonlinear trend): The linearity assumption is violated. The relationship between your predictors and the response isn't purely linear. You may need to add a higher-order term (like x² for a quadratic relationship) or switch to a different functional form (logarithmic, exponential, etc.).
- Funnel or cone shape (residuals spread out or narrow as predicted values increase): This is heteroscedasticity, meaning the variance of errors isn't constant. More on this below.
- Cyclical or wave-like pattern: Often shows up when residuals are plotted against observation order. This suggests the errors aren't independent, which is common in time-series data.
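The curved-pattern fix can be sketched directly: fit a linear-only model to data with genuine curvature, observe that the residuals carry a U-shape (they correlate strongly with x²), then add the squared term and watch the pattern vanish. Simulated data, plain NumPy least squares:

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(-3, 3, 200)
y = 1.0 + 0.5 * x + 2.0 * x**2 + rng.normal(0, 0.5, 200)  # truly quadratic

def ols_residuals(X, y):
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

# Linear-only fit: residuals trace a U-shape (strongly correlated with x**2)
lin_res = ols_residuals(np.column_stack([np.ones_like(x), x]), y)
# Adding the squared term absorbs the curvature
quad_res = ols_residuals(np.column_stack([np.ones_like(x), x, x**2]), y)

# Correlation of residuals with x**2 quantifies the leftover curvature
print(np.corrcoef(lin_res, x**2)[0, 1])   # large for the linear-only fit
print(np.corrcoef(quad_res, x**2)[0, 1])  # essentially zero after the fix
```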
Outliers and Variable-Specific Patterns
Outliers are residuals that fall far from the rest of the data. A single extreme point can pull the regression line toward it, distorting your coefficient estimates. When you spot an outlier, investigate it: Is it a data entry error? A genuinely unusual observation? Depending on the answer, you might correct it, remove it, or flag it and report results with and without it.
Variable-specific patterns are also informative. If the residuals show a distinct trend when plotted against one particular predictor, that predictor's effect may not be captured well by the current model. Common fixes include:
- Adding interaction terms (e.g., x₁·x₂) if the effect of one predictor depends on the level of another
- Applying transformations to the predictor (log, square root) to linearize the relationship
For example, if residuals show a clear split when plotted against a categorical variable like treatment group, the model may need an interaction between that variable and another predictor to properly capture the effect.
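A sketch of the interaction fix, on simulated data where the slope of one predictor genuinely differs by group (all names and coefficients here are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 300
x1 = rng.normal(size=n)
group = rng.integers(0, 2, n)  # binary predictor, e.g. treatment group
# The effect of x1 differs by group: slope 2 in group 0, slope 5 in group 1
y = 1.0 + 2.0 * x1 + 3.0 * x1 * group + rng.normal(0, 0.5, n)

def ols_residuals(X, y):
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

# Main effects only: residuals split by group when plotted against x1
res_main = ols_residuals(np.column_stack([np.ones(n), x1, group]), y)
# Adding the x1*group interaction captures the group-specific slope
res_int = ols_residuals(np.column_stack([np.ones(n), x1, group, x1 * group]), y)

print(res_main.std(), res_int.std())  # residual spread shrinks after the fix
```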
Normality of Residuals

Assessing Normality Visually
The normality assumption states that residuals are normally distributed with a mean of zero. Two visual tools help you check this:
- Histogram or density plot of residuals — You're looking for a roughly symmetric, bell-shaped distribution. Obvious skewness or heavy tails are red flags.
- Normal probability plot (Q-Q plot) — This plots the quantiles of your residuals against the quantiles of a theoretical normal distribution. If residuals are normal, the points will fall close to a straight diagonal line. Systematic departures from that line (S-curves, bowing) indicate non-normality.
The Q-Q plot is generally more informative than the histogram, especially with smaller samples where histograms can look choppy.
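SciPy's `probplot` produces the Q-Q comparison; besides drawing the plot (pass `plot=ax` with a Matplotlib axis), it returns the slope, intercept, and correlation r of the least-squares line through the quantile pairs. A sketch contrasting simulated normal and skewed residuals:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
normal_res = rng.normal(0, 1, 200)     # well-behaved residuals
skewed_res = rng.exponential(1.0, 200) # clearly non-normal residuals

# probplot returns ordered quantile pairs plus (slope, intercept, r) of the
# fitted line; r near 1 means the points hug the diagonal
_, (slope_n, icept_n, r_n) = stats.probplot(normal_res, dist="norm")
_, (slope_s, icept_s, r_s) = stats.probplot(skewed_res, dist="norm")

print(r_n, r_s)  # the skewed sample shows a visibly lower r
```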
Formal Tests and Implications of Violations
Two common formal tests for normality:
- Shapiro-Wilk test — generally preferred for small to moderate sample sizes; more powerful in most situations
- Kolmogorov-Smirnov test — a more general test, but less powerful for detecting departures from normality
For both tests, a p-value above your significance level (typically 0.05) means you fail to reject the null hypothesis that residuals are normal.
One important nuance: normality violations matter less as your sample size grows. The Central Limit Theorem ensures that the sampling distributions of your regression coefficients become approximately normal in large samples, even if the residuals themselves aren't perfectly normal. So with a large dataset, mild non-normality is usually not a serious concern.
If non-normality is severe, transforming the dependent variable (e.g., taking log(y) or √y) often helps pull the residuals closer to a normal distribution.
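Both formal tests are available in SciPy; a minimal sketch on simulated residuals. One caveat worth noting: when the normal's mean and standard deviation are estimated from the same data, the plain K-S p-value is only approximate (the Lilliefors correction addresses this):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
residuals = rng.normal(0, 1, 150)  # simulated, genuinely normal residuals

# Shapiro-Wilk: preferred for small to moderate samples
sw_stat, sw_p = stats.shapiro(residuals)

# Kolmogorov-Smirnov against a normal with parameters estimated from the data
ks_stat, ks_p = stats.kstest(
    residuals, "norm", args=(residuals.mean(), residuals.std(ddof=1))
)

# p-values above 0.05 -> fail to reject normality
print(sw_p, ks_p)
```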
Homoscedasticity in Regression Models
Definition and Consequences of Heteroscedasticity
Homoscedasticity means the variance of the residuals is the same regardless of the predicted value or the level of any predictor. When this assumption fails, you have heteroscedasticity.
Heteroscedasticity doesn't bias your coefficient estimates themselves, but it does bias the standard errors of those estimates. That's a problem because standard errors feed directly into t-tests, p-values, and confidence intervals. With biased standard errors, you might conclude a predictor is significant when it isn't (or vice versa).
Detecting and Addressing Heteroscedasticity
Visual detection: Plot residuals against predicted values. A fan or cone shape where the spread of residuals widens (or narrows) as ŷ increases is the classic sign.
Formal tests:
- Breusch-Pagan test — regresses the squared residuals on the predictors and tests whether the predictors explain significant variation in the residual variance. A small p-value indicates heteroscedasticity.
- White test — a more general version that also includes cross-products and squared terms of the predictors, so it can detect more complex forms of heteroscedasticity.
Remedies when heteroscedasticity is present:
- Weighted Least Squares (WLS) — gives less weight to observations with higher variance, producing more efficient estimates
- Robust (heteroscedasticity-consistent) standard errors — keeps OLS estimates but corrects the standard errors so that inference is valid; often the simplest fix
- Variable transformations — applying log(y) or √y to the dependent variable can stabilize the variance across the range of predicted values
The choice among these depends on the severity of the problem and your goals. Robust standard errors are a common default because they don't require you to specify the exact form of the heteroscedasticity.