Normality and Homoscedasticity Assumptions
Understanding the Assumptions
The normality assumption states that the residuals (errors) of a linear regression model follow a normal distribution with mean zero. This doesn't mean your raw data needs to be normal; it's specifically about the distribution of residuals.
The homoscedasticity assumption requires that the variance of residuals stays constant across all levels of the independent variable(s). In other words, the vertical "spread" of residuals should look roughly the same whether you're looking at low, medium, or high fitted values. No funneling, no fanning out.
Violations of either assumption don't bias the coefficient estimates themselves, but they can make those estimates inefficient and, more importantly, bias the standard errors, which undermines the trustworthiness of your model's hypothesis tests and confidence intervals.
Implications of Violated Assumptions
Non-normality primarily threatens the validity of hypothesis tests and confidence intervals, since t-tests and F-tests assume normally distributed errors. If residuals are heavily skewed or have extreme outliers, the p-values and confidence intervals for your regression coefficients may be unreliable. That said, with large samples, the Central Limit Theorem provides some protection: OLS estimates are still approximately normal even when residuals aren't perfectly so.
Heteroscedasticity (non-constant variance) distorts standard errors. If the variance of residuals grows with higher values of the predictor, standard errors tend to be underestimated. This produces p-values that look more significant than they should be, potentially leading you to declare a relationship significant when it isn't.
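This distortion is easy to demonstrate by simulation. The sketch below (hypothetical setup, numpy only) generates data whose error variance grows with the predictor, then compares the average classical OLS standard error of the slope against the empirical spread of the slope estimates across replications:

```python
import numpy as np

# Simulate a regression whose error sd grows with x, then compare the
# classical OLS standard error of the slope against the actual sampling
# variability of the slope across many replications.
rng = np.random.default_rng(0)
n, reps = 100, 2000
x = np.linspace(0.1, 2.0, n)
X = np.column_stack([np.ones(n), x])          # design matrix with intercept
XtX_inv = np.linalg.inv(X.T @ X)

slopes, classical_ses = [], []
for _ in range(reps):
    eps = rng.normal(0.0, 0.5 * x**2)         # sd grows with x: heteroscedastic
    y = 1.0 + 2.0 * x + eps
    beta = XtX_inv @ X.T @ y                  # OLS estimate
    resid = y - X @ beta
    s2 = resid @ resid / (n - 2)              # classical residual variance
    se_slope = np.sqrt(s2 * XtX_inv[1, 1])    # classical SE of the slope
    slopes.append(beta[1])
    classical_ses.append(se_slope)

empirical_sd = np.std(slopes)                 # the "true" sampling variability
mean_classical_se = np.mean(classical_ses)
print(f"empirical SD of slope: {empirical_sd:.4f}")
print(f"mean classical SE:     {mean_classical_se:.4f}")
```

The classical SE comes out noticeably smaller than the empirical sampling variability, which is exactly the underestimation described above.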
Assessing Residual Normality

Graphical Methods
Visual inspection is usually the first step. Two plots are standard:
- Histogram of residuals: Should look roughly bell-shaped and symmetric around zero. Watch for heavy skew or multiple peaks.
- Normal probability plot (Q-Q plot): Plots the quantiles of your residuals against the quantiles of a theoretical normal distribution. If residuals are normally distributed, the points fall close to a straight diagonal line. An S-shaped curve suggests tails that are heavier or lighter than normal, a one-sided banana-shaped curve suggests skewness, and a few points peeling away at the extremes indicate outliers.
Q-Q plots are generally more informative than histograms because histograms are sensitive to bin width, especially with smaller samples.
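The Q-Q idea can be sketched numerically as well as visually. In this hypothetical example, `scipy.stats.probplot` produces the plot's points plus the correlation of its best-fit line; a correlation near 1.0 means the points hug the diagonal, i.e. the residuals look normal:

```python
import numpy as np
from scipy import stats

# Fit a simple regression with normal errors, then quantify how straight
# the Q-Q plot of its residuals would be.
rng = np.random.default_rng(42)
x = rng.uniform(0, 10, 200)
y = 2.0 + 0.5 * x + rng.normal(0, 1, 200)     # normal, homoscedastic errors

slope, intercept = np.polyfit(x, y, 1)
resid = y - (slope * x + intercept)

# probplot returns the ordered theoretical vs. sample quantiles and the
# least-squares line through them (slope, intercept, correlation r).
(osm, osr), (qq_slope, qq_intercept, qq_r) = stats.probplot(resid, dist="norm")
print(f"Q-Q correlation: {qq_r:.3f}")          # near 1.0 for normal residuals
# To draw it: plt.plot(osm, osr, '.'); plt.plot(osm, qq_slope*osm + qq_intercept)
```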
Statistical Tests
- Shapiro-Wilk test: The most commonly used formal test. The null hypothesis is that residuals are normally distributed. A p-value below 0.05 leads you to reject normality. This test works well for small to moderate sample sizes but can flag trivial departures from normality in very large samples.
- Kolmogorov-Smirnov test: Compares the empirical cumulative distribution function (CDF) of residuals to the theoretical normal CDF. It is less powerful than Shapiro-Wilk for detecting normality violations in practice, and if the normal's mean and variance are estimated from the same residuals, the standard critical values are too conservative (the Lilliefors correction addresses this).
- Skewness and kurtosis: Skewness near zero indicates symmetry; positive values mean a right tail, negative values a left tail. Kurtosis near 3 (or "excess kurtosis" near 0) matches a normal distribution. Values well above 3 indicate heavier tails than normal.
A practical note: always look at the graphical evidence alongside the test results. With large samples, formal tests can reject normality for deviations so small they don't meaningfully affect your inferences.
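The tests above can be run in a few lines. This sketch uses two hypothetical residual vectors, one genuinely normal and one right-skewed, to show how the numbers separate:

```python
import numpy as np
from scipy import stats

# Two hypothetical residual vectors: one normal, one right-skewed
# (exponential shifted to mean ~0), run through the formal checks.
rng = np.random.default_rng(1)
normal_resid = rng.normal(0, 1, 150)
skewed_resid = rng.exponential(1, 150) - 1.0

for name, r in [("normal", normal_resid), ("skewed", skewed_resid)]:
    w, p_shapiro = stats.shapiro(r)
    # KS against a normal with the sample's own mean/sd (Lilliefors caveat applies)
    d, p_ks = stats.kstest(r, "norm", args=(r.mean(), r.std(ddof=1)))
    print(f"{name}: Shapiro p={p_shapiro:.3f}, KS p={p_ks:.3f}, "
          f"skew={stats.skew(r):.2f}, excess kurtosis={stats.kurtosis(r):.2f}")
```

For the skewed vector, Shapiro-Wilk rejects decisively and the skewness statistic is strongly positive, matching the rules of thumb above.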
Detecting Heteroscedasticity

Residual Plots
Plot residuals (or standardized residuals) on the y-axis against fitted values on the x-axis. You're looking for patterns in the spread:
- Constant band of points: Suggests homoscedasticity. This is what you want.
- Fan or funnel shape: The spread of residuals widens (or narrows) as fitted values increase. This is the classic sign of heteroscedasticity.
- Diamond shape: Variance is largest in the middle range of fitted values; a bow-tie shape is the reverse, pinched in the middle and widest at the extremes. Both are less common than the funnel, but still violations.
If you have multiple predictors, also plot residuals against each individual predictor to identify which variable might be driving the non-constant variance.
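A numeric companion to the residual plot (on hypothetical data): split the fitted values into thirds and compare the residual spread in each third. A large ratio between the widest and narrowest band is the funnel shape expressed in numbers:

```python
import numpy as np

# Generate data with noise sd proportional to x (a funnel), fit by OLS,
# then measure the residual spread in the low/middle/high fitted-value thirds.
rng = np.random.default_rng(7)
x = rng.uniform(1, 10, 300)
y = 3.0 + 1.5 * x + rng.normal(0, 0.4 * x)

slope, intercept = np.polyfit(x, y, 1)
fitted = slope * x + intercept
resid = y - fitted

order = np.argsort(fitted)
thirds = np.array_split(order, 3)              # low / middle / high fitted values
spreads = [np.std(resid[idx]) for idx in thirds]
print("residual SD by fitted-value third:", [f"{s:.2f}" for s in spreads])
print(f"max/min spread ratio: {max(spreads) / min(spreads):.1f}")
```

A ratio near 1 is consistent with a constant band; here the spread in the top third is several times that of the bottom third, the funnel in numeric form.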
Statistical Tests
- Breusch-Pagan test: Regresses the squared residuals on the original predictors. The null hypothesis is homoscedasticity. A p-value below 0.05 suggests heteroscedasticity is present. The test assumes a linear relationship between variance and the predictors; the studentized (Koenker) variant is more reliable when the errors themselves are non-normal.
- White's test: A more general alternative that doesn't assume a specific functional form for the heteroscedasticity. It includes squared terms and cross-products of predictors, so it can detect more complex patterns of non-constant variance. The tradeoff is reduced power due to the additional terms.
- Goldfeld-Quandt test: Splits the data into two subsamples (typically based on the values of a suspect predictor), fits the regression separately to each, and compares the residual variances using an F-test. If the variances differ significantly, heteroscedasticity is present with respect to that variable.
Consequences of Violated Assumptions
Impact on Coefficient Estimates and Inferences
Under non-normality, OLS coefficient estimates remain unbiased (unbiasedness rests on linearity and exogeneity of the errors, not on their normality). However, the estimates may no longer be the most efficient, and the standard hypothesis tests lose their exact distributional justification. In severe cases with small samples, confidence intervals and p-values can be substantially misleading.
Under heteroscedasticity, OLS estimates are still unbiased but no longer BLUE (Best Linear Unbiased Estimators). The real damage is to standard errors: they become biased, which means your t-statistics, F-statistics, confidence intervals, and p-values are all unreliable. You might conclude a predictor is significant when it isn't, or miss a genuinely significant relationship.
Remedial Measures
When diagnostics reveal violations, several remedies are available:
- Variance-stabilizing transformations: Applying a logarithmic, square root, or Box-Cox transformation to the dependent variable can sometimes stabilize the variance and improve normality simultaneously. A log transformation is especially useful when the variance grows proportionally with the mean.
- Weighted Least Squares (WLS): If you can model or estimate how the variance changes, WLS down-weights observations with higher variance and up-weights those with lower variance, producing more efficient estimates.
- Robust (Huber-White) standard errors: Also called heteroscedasticity-consistent standard errors. These adjust the standard errors to account for heteroscedasticity without changing the coefficient estimates themselves. This is often the simplest fix when heteroscedasticity is present but you want to keep the OLS framework.
- Generalized Least Squares (GLS): A broader framework that handles both heteroscedasticity and correlated errors by transforming the model to restore the standard assumptions.
The choice of remedy depends on the severity of the violation and your goals. For mild heteroscedasticity with a reasonably large sample, robust standard errors are often sufficient. For severe non-normality or strong heteroscedasticity, a transformation or WLS may be more appropriate.