ANCOVA in linear modeling
ANCOVA (Analysis of Covariance) combines the group-comparison logic of ANOVA with regression on one or more continuous covariates. The result is a model that compares group means on an outcome variable after adjusting for covariates that might otherwise confound the comparison. This matters most when random assignment isn't possible and groups already differ on characteristics that influence the dependent variable.
Purpose and applications of ANCOVA
ANCOVA serves two related goals: it removes bias from group comparisons by statistically controlling for covariates, and it increases statistical power by explaining away within-group variability that would otherwise inflate the error term.
- When the covariate accounts for a meaningful share of within-group variance, the residual error shrinks, making it easier to detect true group differences.
- ANCOVA is especially valuable in quasi-experimental designs where intact groups (classrooms, clinics, regions) are compared and pre-existing differences are likely.
Common applications include:
- Comparing treatment effects while controlling for baseline scores (e.g., adjusting post-test means for pre-test performance)
- Examining group differences while accounting for demographic confounds such as age or years of education
- Increasing power in randomized experiments where a strong covariate is available (e.g., using baseline blood pressure in a drug trial)
Advantages of using ANCOVA
- Adjusted group comparisons. ANCOVA shifts each group's mean to reflect what it would be if all groups had the same covariate value, producing a fairer comparison.
- Greater statistical power. Removing covariate-related variance from the error term lowers $MS_{\text{error}}$, which increases the $F$-ratio for the group effect.
- Flexibility. You can study the effect of a categorical independent variable while simultaneously accounting for one or more continuous covariates, all within a single linear model.
Components of ANCOVA
Variables in the ANCOVA model
- Dependent variable (Y): The continuous outcome you want to compare across groups (e.g., exam score, systolic blood pressure).
- Independent variable (X): The categorical grouping factor with two or more levels (e.g., treatment vs. control, three different curricula). In the linear model this is represented by dummy or effect codes.
- Covariate (C): A continuous variable related to $Y$ but not of primary interest. Pre-test scores and age are classic examples. The covariate is included so the model can adjust each group's mean to a common covariate value (typically the grand mean of $C$).

ANCOVA model equation and parameters
For a single-factor design with one covariate, the model is:
$$Y_i = \beta_0 + \beta_1 X_i + \beta_2 C_i + \varepsilon_i$$
- $\beta_0$ is the intercept (the predicted $Y$ when $X$ is at its reference level and $C$ equals zero, or the grand mean, depending on coding).
- $\beta_1$ is the effect of the grouping variable, representing the adjusted difference between groups after controlling for the covariate.
- $\beta_2$ is the regression slope for the covariate, capturing how much $Y$ changes per one-unit increase in $C$, pooled across groups.
- $\varepsilon_i$ is the residual error, assumed to be independently distributed as $N(0, \sigma^2)$.
With more than two groups, the grouping term $\beta_1 X_i$ expands into a set of dummy-coded (or effect-coded) terms, one for each degree of freedom among the groups.
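As an illustration of that expansion, a three-group design under dummy coding could be written as follows. The indicator names $D_{1i}$ and $D_{2i}$ and the coefficient labels $\beta_{1,1}$ and $\beta_{1,2}$ are notation assumed here for the sketch, with group 1 taken as the reference level.

```latex
Y_i = \beta_0 + \beta_{1,1} D_{1i} + \beta_{1,2} D_{2i} + \beta_2 C_i + \varepsilon_i

D_{1i} = \begin{cases} 1 & \text{if observation } i \text{ is in group 2} \\ 0 & \text{otherwise} \end{cases}
\qquad
D_{2i} = \begin{cases} 1 & \text{if observation } i \text{ is in group 3} \\ 0 & \text{otherwise} \end{cases}
```

Under this coding, $\beta_{1,1}$ is the adjusted difference between group 2 and the reference group, and $\beta_{1,2}$ is the adjusted difference between group 3 and the reference group, each at the same covariate value.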
Adjusted means (also called least-squares means or estimated marginal means) are the predicted values of $Y$ for each group when the covariate is held at its grand mean. These are the quantities you actually compare in ANCOVA, not the raw group means.
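To make the adjustment concrete, here is a minimal sketch for the one-covariate case. It relies on the fact that, with a common slope, the least-squares covariate slope equals the pooled within-group slope, so each adjusted mean is the raw group mean minus the slope times the group's distance from the covariate grand mean. The function name `adjusted_means` and the pre-test/post-test numbers are hypothetical, chosen only for illustration.

```python
from statistics import mean

def adjusted_means(groups):
    """groups maps a group label to a list of (covariate, outcome) pairs."""
    # Grand mean of the covariate over all observations
    grand_c = mean(c for obs in groups.values() for c, _ in obs)

    # Pooled within-group slope: summed within-group cross-products
    # divided by summed within-group covariate sums of squares
    sxy = sxx = 0.0
    group_means = {}
    for label, obs in groups.items():
        c_bar = mean(c for c, _ in obs)
        y_bar = mean(y for _, y in obs)
        group_means[label] = (c_bar, y_bar)
        sxy += sum((c - c_bar) * (y - y_bar) for c, y in obs)
        sxx += sum((c - c_bar) ** 2 for c, _ in obs)
    slope = sxy / sxx

    # Move each raw mean to the value predicted at the covariate grand mean
    adjusted = {
        label: y_bar - slope * (c_bar - grand_c)
        for label, (c_bar, y_bar) in group_means.items()
    }
    return slope, adjusted

# Hypothetical (pre-test, post-test) scores
data = {
    "control":   [(10, 13), (12, 13), (14, 16)],
    "treatment": [(14, 20), (16, 20), (18, 23)],
}
slope, adjusted = adjusted_means(data)
print(slope)     # 0.75
print(adjusted)  # {'control': 15.5, 'treatment': 19.5}
```

In this made-up data the raw post-test means differ by 7 points (21 vs. 14), but the treatment group also started 4 points higher on the pre-test, so the adjusted difference shrinks to 4 points.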
Assumptions of ANCOVA
ANCOVA inherits the standard linear-model assumptions and adds one that is unique to the covariate-by-group structure.
Independence and normality assumptions
- Independence of observations. Each observation must be independent of every other observation. Clustering (e.g., students nested in classrooms) violates this assumption and inflates Type I error because standard errors become too small. If clustering is present, multilevel modeling is a better choice.
- Normality of residuals. The residuals should be approximately normally distributed. You can check this with a Q-Q plot or a Shapiro-Wilk test on the residuals. With large samples the $F$-test is fairly robust to moderate non-normality, but with small samples violations can distort $p$-values and confidence intervals.
Homogeneity and linearity assumptions
- Homogeneity of variance (homoscedasticity). The variance of the residuals should be roughly equal across all groups. Levene's test or a residuals-vs.-fitted plot can diagnose this. Heteroscedasticity biases standard errors, which in turn makes $F$-tests and confidence intervals unreliable.
- Linearity. The relationship between the covariate and the dependent variable must be linear within each group. A scatterplot of $Y$ vs. $C$ (color-coded by group) is the simplest diagnostic. If the relationship is curved, the model will mis-estimate adjusted means, potentially reversing the direction of group differences.

Additional assumptions and considerations
- Homogeneity of regression slopes. This is the assumption unique to ANCOVA. The slope relating the covariate to the outcome must be the same in every group. In model terms, there should be no $X \times C$ interaction. If the slopes differ across groups, a single pooled slope cannot correctly adjust the means, and the adjusted group differences will depend on where along the covariate you evaluate them. You can test this by fitting a model that includes the $X \times C$ interaction term and checking whether it is significant.
If the homogeneity-of-slopes assumption fails, standard ANCOVA is not appropriate. Instead, keep the interaction term in the model and report group differences at specific covariate values (the Johnson-Neyman technique identifies the covariate ranges over which the groups differ significantly), or fit separate regression lines for each group.
- Reliability of the covariate. Measurement error in the covariate attenuates its slope (the estimated covariate slope $\hat{\beta}_2$ is biased toward zero), which means the adjustment is incomplete. The group effect estimate then absorbs leftover covariate-related variance, leading to biased adjusted means. Using a highly reliable measure for the covariate (or correcting for attenuation) reduces this problem.
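The homogeneity-of-slopes test described above can be sketched as a comparison of two models: one in which every group has its own slope (the interaction model) and one in which all groups share the pooled slope (the ANCOVA model). The sketch below computes only the $F$ statistic and its degrees of freedom for the one-covariate case. The function name `slopes_f_statistic` and the data are hypothetical.

```python
from statistics import mean

def slopes_f_statistic(groups):
    """groups maps a group label to a list of (covariate, outcome) pairs.
    Returns (F, df1, df2) for the test that all group slopes are equal."""
    sxx_tot = sxy_tot = syy_tot = 0.0
    sse_full = 0.0  # residual SS when each group has its own slope
    n_total = 0
    for obs in groups.values():
        c_bar = mean(c for c, _ in obs)
        y_bar = mean(y for _, y in obs)
        sxx = sum((c - c_bar) ** 2 for c, _ in obs)
        sxy = sum((c - c_bar) * (y - y_bar) for c, y in obs)
        syy = sum((y - y_bar) ** 2 for _, y in obs)
        sse_full += syy - sxy ** 2 / sxx
        sxx_tot += sxx
        sxy_tot += sxy
        syy_tot += syy
        n_total += len(obs)

    # Residual SS when all groups share one pooled slope (the ANCOVA model)
    sse_reduced = syy_tot - sxy_tot ** 2 / sxx_tot

    k = len(groups)
    df1 = k - 1            # extra slope parameters in the interaction model
    df2 = n_total - 2 * k  # one intercept and one slope per group
    f_stat = ((sse_reduced - sse_full) / df1) / (sse_full / df2)
    return f_stat, df1, df2

# Hypothetical data: group A has a visibly steeper slope than group B
data = {
    "A": [(1, 2), (2, 3), (3, 5), (4, 6)],
    "B": [(1, 3), (2, 3), (3, 4), (4, 4)],
}
f_stat, df1, df2 = slopes_f_statistic(data)
print(round(f_stat, 3), df1, df2)  # 25.0 1 4
```

The statistic is referred to an $F$ distribution with `df1` and `df2` degrees of freedom; if SciPy is available, `scipy.stats.f.sf(f_stat, df1, df2)` gives the corresponding $p$-value. A large value, as in this made-up example, suggests the common-slope model is not adequate.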
ANCOVA appropriateness
Research question and data requirements
Before choosing ANCOVA, confirm three things:
- Your research question asks whether group means on a continuous outcome differ after controlling for one or more continuous covariates.
- The independent variable is categorical (two or more groups) and the dependent variable is continuous.
- You have identified at least one covariate that is theoretically related to the outcome and measured on a continuous scale. The covariate should be measured before the treatment or at least not be affected by it; otherwise the adjustment can remove part of the treatment effect itself.
Checking assumptions and considering alternatives
Work through the assumptions in a logical order:
- Independence: consider the study design; no statistical test can fully verify this.
- Linearity: plot $Y$ vs. $C$ within each group.
- Homogeneity of regression slopes: fit the interaction model and test the $X \times C$ term.
- Normality of residuals: inspect a Q-Q plot of residuals from the ANCOVA model.
- Homogeneity of variance: check a residuals-vs.-fitted plot or run Levene's test.
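The last two checks both work on residuals from the fitted ANCOVA model. As a rough sketch for the one-covariate case, the residuals can be computed from the common-slope fit and their spread compared across groups. This is an informal look at the variances, not a substitute for Levene's test or a plot. The function name `ancova_residuals` and the data are hypothetical.

```python
from statistics import mean, variance

def ancova_residuals(groups):
    """groups maps a group label to a list of (covariate, outcome) pairs.
    Returns a dict of residuals per group from the common-slope fit."""
    sxy = sxx = 0.0
    centers = {}
    for label, obs in groups.items():
        c_bar = mean(c for c, _ in obs)
        y_bar = mean(y for _, y in obs)
        centers[label] = (c_bar, y_bar)
        sxy += sum((c - c_bar) * (y - y_bar) for c, y in obs)
        sxx += sum((c - c_bar) ** 2 for c, _ in obs)
    slope = sxy / sxx  # pooled within-group slope

    # Residual = deviation from the group mean not explained by the covariate
    return {
        label: [(y - centers[label][1]) - slope * (c - centers[label][0])
                for c, y in obs]
        for label, obs in groups.items()
    }

# Hypothetical data in which group B is much noisier than group A
data = {
    "A": [(1, 2), (2, 3), (3, 5), (4, 6)],
    "B": [(1, 1), (2, 4), (3, 3), (4, 8)],
}
residuals = ancova_residuals(data)
resid_var = {label: variance(r) for label, r in residuals.items()}
ratio = max(resid_var.values()) / min(resid_var.values())
print(resid_var, ratio)
```

The same residual lists can be passed to a Q-Q plotting routine or a formal variance test in whatever statistics package is in use.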
If assumptions are severely violated, alternatives include:
- Multiple regression with interaction terms when slopes are unequal across groups.
- Robust or nonparametric methods when normality or homoscedasticity fails badly.
- Multilevel (mixed) models when observations are clustered.
Sample size and power considerations
Statistical power in ANCOVA depends on several factors:
- Covariate strength. A covariate that correlates strongly with $Y$ removes more error variance, boosting power. The effective error variance is approximately $\sigma_Y^2(1 - \rho^2)$, where $\sigma_Y^2$ is the within-group variance of $Y$ and $\rho$ is the within-group correlation between the covariate and the outcome.
- Number of covariates relative to sample size per group. Each additional covariate consumes one error degree of freedom, so with small groups, adding weak covariates can reduce power rather than increase it. Include only covariates with a meaningful relationship to $Y$.
- Effect size. Consider both statistical significance and practical significance when planning sample size. A statistically significant but trivially small adjusted mean difference may not be meaningful.
A formal power analysis (using software such as G*Power or simulation) that accounts for the expected covariate-outcome correlation $\rho$, the number of groups, and the target effect size is the best way to determine the required sample size before data collection.
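For a quick sense of how much a covariate can help before running a formal analysis, the variance-reduction factor $1 - \rho^2$ can be applied directly. The sketch below is a back-of-the-envelope calculation, not a power analysis: the sample-size function uses a rough approximation (scale the unadjusted per-group $n$ by $1 - \rho^2$, then add one observation per covariate for the lost degree of freedom), which is an assumption of this example. Both function names and all input values are hypothetical.

```python
import math

def adjusted_error_variance(within_group_var, rho):
    """Approximate error variance after adjusting for one covariate."""
    return within_group_var * (1 - rho ** 2)

def rough_n_per_group(n_unadjusted, rho, n_covariates=1):
    """Rough per-group n for ANCOVA given the n needed without a covariate.
    Assumption: scale by (1 - rho^2), then add one per covariate."""
    return math.ceil(n_unadjusted * (1 - rho ** 2)) + n_covariates

# Hypothetical planning values
print(adjusted_error_variance(100.0, 0.5))  # 75.0
print(rough_n_per_group(64, 0.5))           # 49
print(rough_n_per_group(64, 0.1))           # 65
```

The last line illustrates the point about weak covariates: with $\rho = 0.1$ the variance reduction is about 1%, so the extra degree of freedom costs more than the covariate saves.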