One-Way ANOVA

One-way ANOVA in Statistical Software
One-way ANOVA compares the means of three or more groups to determine whether any of them differ significantly. Think of it as an extension of the two-sample t-test, but for situations where you have multiple groups (different treatment conditions, age brackets, etc.).
The hypotheses are:
- H₀: All group means are equal (μ₁ = μ₂ = ⋯ = μₖ)
- Hₐ: At least one group mean differs from the others
Before running the test, three assumptions must hold:
- Independence: Observations are independent of each other
- Normality: The residuals (not the raw data) are approximately normally distributed
- Equal variances (homoscedasticity): The population variances across groups are roughly equal
Here's how to conduct a one-way ANOVA in statistical software:
- Input your data so each row is one observation, with a column for the response variable (continuous) and a column for the group variable (categorical).
- Specify the dependent variable (your measured outcome) and the independent variable (the grouping factor).
- Run the one-way ANOVA, which produces an omnibus F-test.
- Check assumptions using residual plots, a normality test (like Shapiro-Wilk), and a variance homogeneity test (like Levene's test).
- If assumptions hold, interpret the output. If they don't, consider data transformations or a non-parametric alternative like the Kruskal-Wallis test.
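The steps above can be sketched in Python with SciPy; the three treatment groups below are invented purely for illustration.

```python
# One-way ANOVA workflow sketch with made-up data for three groups.
from scipy import stats

group_a = [20.1, 21.3, 19.8, 22.0, 20.5]
group_b = [23.2, 24.1, 22.8, 23.9, 24.5]
group_c = [19.5, 18.9, 20.2, 19.1, 18.7]

# Step 3: the omnibus F-test
f_stat, p_value = stats.f_oneway(group_a, group_b, group_c)

# Step 4: assumption checks on the residuals and the group variances
residuals = []
for g in (group_a, group_b, group_c):
    group_mean = sum(g) / len(g)
    residuals.extend(x - group_mean for x in g)
_, p_normality = stats.shapiro(residuals)                 # Shapiro-Wilk
_, p_equal_var = stats.levene(group_a, group_b, group_c)  # Levene's test

# Step 5: fall back to Kruskal-Wallis if assumptions look violated
if p_normality < 0.05 or p_equal_var < 0.05:
    h_stat, p_value = stats.kruskal(group_a, group_b, group_c)
```

Note the residuals are each observation minus its own group mean, matching the normality assumption stated above.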

Interpretation of ANOVA Results
The ANOVA output centers on an ANOVA table that displays sums of squares, degrees of freedom, mean squares, the F-statistic, and the p-value. Here's what each piece tells you.
F-statistic: This is the ratio of between-group variability to within-group variability. A larger F means the group means are more spread out relative to the variation inside each group. An F near 1 suggests the groups don't differ much.
Degrees of freedom determine the shape of the F-distribution you're comparing against:
- df₁ (numerator) = number of groups − 1
- df₂ (denominator) = total sample size − number of groups
P-value: The probability of observing an F-statistic this large (or larger) if H₀ were true. If the p-value is less than your significance level (typically 0.05), you reject H₀ and conclude that at least one group mean differs.
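To make the link between F, the degrees of freedom, and the p-value concrete, here is a small sketch using SciPy's F-distribution; every number in it is hypothetical.

```python
# P-value from a hypothetical F-statistic and its degrees of freedom.
from scipy import stats

k, n_total = 3, 30       # hypothetical: 3 groups, 30 observations total
df1 = k - 1              # numerator df = 2
df2 = n_total - k        # denominator df = 27
f_stat = 4.5             # hypothetical F from the ANOVA table

# Survival function gives P(F >= f_stat) under the null hypothesis
p_value = stats.f.sf(f_stat, df1, df2)
```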
Grand mean is the overall average of all observations across every group. The ANOVA essentially asks whether individual group means deviate from this grand mean more than random chance would predict.
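These pieces can all be computed by hand. The sketch below decomposes variability around the grand mean for three tiny made-up groups and rebuilds the F-statistic from the sums of squares.

```python
# Manual F computation: between- vs within-group variability.
groups = [[2.0, 3.0, 4.0], [5.0, 6.0, 7.0], [8.0, 9.0, 10.0]]

all_obs = [x for g in groups for x in g]
grand_mean = sum(all_obs) / len(all_obs)        # overall average

group_means = [sum(g) / len(g) for g in groups]
ss_between = sum(len(g) * (m - grand_mean) ** 2
                 for g, m in zip(groups, group_means))
ss_within = sum((x - m) ** 2
                for g, m in zip(groups, group_means) for x in g)

df1 = len(groups) - 1              # k - 1
df2 = len(all_obs) - len(groups)   # N - k
f_stat = (ss_between / df1) / (ss_within / df2)  # ratio of mean squares
```

Here the group means (3, 6, 9) sit far from the grand mean (6) relative to the spread inside each group, so F comes out large.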
Effect size tells you how large the differences are, not just whether they exist. Common measures include:
- η² (eta-squared): The proportion of total variability explained by group membership. For example, η² = 0.14 means 14% of the variation in your outcome is accounted for by group differences.
- Cohen's f: Another standardized measure where 0.10, 0.25, and 0.40 correspond roughly to small, medium, and large effects.
A significant p-value with a tiny effect size means the difference is real but may not be practically meaningful.
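Both measures fall out directly from the ANOVA table's sums of squares. A minimal sketch, using hypothetical values:

```python
# Effect sizes from hypothetical sums of squares.
import math

ss_between, ss_within = 54.0, 6.0      # hypothetical ANOVA table values
ss_total = ss_between + ss_within

eta_squared = ss_between / ss_total    # proportion of variability explained
cohens_f = math.sqrt(eta_squared / (1 - eta_squared))
```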

Post-hoc Tests for Group Comparisons
A significant F-test tells you something differs, but not what. Post-hoc tests fill that gap by making pairwise comparisons between specific group means while controlling the Type I error inflation that comes with multiple testing.
Tukey's HSD (Honestly Significant Difference) is the most common post-hoc test:
- It compares every possible pair of group means.
- It computes an HSD critical value using the studentized range distribution.
- If the absolute difference between two group means exceeds the HSD value, those means are significantly different.
- It controls the family-wise error rate, keeping the overall chance of any false positive at your chosen α.
For example, if you ran ANOVA on test scores across low, medium, and high anxiety groups, Tukey's HSD might reveal that the high-anxiety group scored significantly lower than both other groups, while low and medium did not differ from each other.
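That anxiety example can be sketched with SciPy's `tukey_hsd` (available in recent SciPy versions); the scores below are invented so that the high-anxiety group clearly lags the other two.

```python
# Tukey's HSD on made-up test scores for three anxiety groups.
from scipy import stats

low = [85, 88, 90, 84, 87]
medium = [82, 86, 84, 83, 85]
high = [70, 72, 68, 74, 71]

result = stats.tukey_hsd(low, medium, high)
# result.pvalue[i][j] is the adjusted p-value for comparing groups i and j
p_low_vs_medium = result.pvalue[0][1]   # expected: not significant
p_low_vs_high = result.pvalue[0][2]     # expected: significant
p_medium_vs_high = result.pvalue[1][2]  # expected: significant
```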
Alternative post-hoc tests serve different purposes:
- Bonferroni correction: Divides α by the number of comparisons. Simple but conservative, especially with many groups.
- Dunnett's test: Compares each group to a single control group rather than all pairs. Useful in experiments with a clear baseline condition.
- Scheffé's test: More conservative than Tukey's but allows testing any linear contrast among means, not just pairwise differences.
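The Bonferroni approach is simple enough to sketch directly as pairwise t-tests against an adjusted α; the group data here are the same invented anxiety scores used for illustration.

```python
# Bonferroni-corrected pairwise t-tests on invented group data.
from itertools import combinations
from scipy import stats

groups = {
    "low": [85, 88, 90, 84, 87],
    "medium": [82, 86, 84, 83, 85],
    "high": [70, 72, 68, 74, 71],
}

alpha = 0.05
pairs = list(combinations(groups, 2))
adjusted_alpha = alpha / len(pairs)   # 0.05 / 3 comparisons

significant = {}
for name_a, name_b in pairs:
    t_stat, p = stats.ttest_ind(groups[name_a], groups[name_b])
    significant[(name_a, name_b)] = p < adjusted_alpha
```

Dunnett's and Scheffé's tests need specialized routines and critical values, so they are not shown here.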
Always interpret post-hoc results alongside the overall ANOVA. Report which specific pairs differ and by how much.

Advanced ANOVA Concepts
These topics go beyond one-way ANOVA but are worth knowing at the honors level:
- Factorial ANOVA extends the one-way design to examine two or more independent variables simultaneously (e.g., both treatment type and dosage level).
- Interaction effects occur when the effect of one independent variable on the outcome depends on the level of another. For instance, a drug might improve scores in younger patients but not older ones.
- Multiple comparisons is the broader term for any method that controls Type I error when you test many pairs or contrasts at once. Tukey's, Bonferroni, and Scheffé's are all examples.
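A minimal numeric sketch of an interaction, using hypothetical cell means for the drug-by-age example above:

```python
# Interaction check via simple effects on hypothetical cell means.
cell_means = {
    ("drug", "younger"): 78, ("placebo", "younger"): 65,
    ("drug", "older"): 66, ("placebo", "older"): 64,
}

# Simple effect of the drug within each age group
effect_younger = (cell_means[("drug", "younger")]
                  - cell_means[("placebo", "younger")])
effect_older = (cell_means[("drug", "older")]
                - cell_means[("placebo", "older")])

# If the drug's effect were the same at every age, this would be zero;
# a large difference between the simple effects signals an interaction.
interaction = effect_younger - effect_older
```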