Fiveable

📉Statistical Methods for Data Science Unit 6 Review

6.1 One-way ANOVA

Written by the Fiveable Content Team • Last updated August 2025

One-way ANOVA compares means across multiple groups, extending the t-test's capabilities. It analyzes between-group and within-group variability to determine if differences among groups are statistically significant. This powerful tool helps researchers understand the impact of categorical variables on continuous outcomes.

ANOVA uses the F-statistic, the ratio of between-group to within-group variability. By calculating sums of squares, mean squares, and degrees of freedom, researchers can assess whether the observed differences among group means are larger than chance alone would plausibly produce. Post-hoc tests like Tukey's HSD then pinpoint which specific groups differ.

ANOVA Basics

Overview of Analysis of Variance (ANOVA)

  • Analysis of Variance (ANOVA) is a statistical method used to compare means across multiple groups or conditions
  • Determines if there are significant differences between the means of three or more independent groups
  • Extends the independent samples t-test, which can only compare means between two groups, to allow for comparisons among multiple groups
  • Commonly used in experimental research designs to analyze the effect of categorical independent variables on a continuous dependent variable

Components of Variability

  • Between-group variability measures the differences between the group means and the grand mean (mean of all observations)
    • Reflects the effect of the independent variable on the dependent variable
    • Larger between-group variability, relative to within-group variability, is evidence that the independent variable affects the dependent variable
  • Within-group variability measures the differences between individual scores and their respective group means
    • Represents the random variability or individual differences within each group
    • Smaller within-group variability indicates that the groups are more homogeneous, which makes differences between group means easier to detect

F-Statistic and Degrees of Freedom

  • F-statistic is the ratio of the between-group variability to the within-group variability
    • Calculated as: $F = \frac{\text{Between-group variability}}{\text{Within-group variability}}$, where each variability is expressed as a mean square (a sum of squares divided by its degrees of freedom)
    • An F-value close to 1 is what we expect when the group means are equal; F-values well above 1 indicate that between-group variability is large relative to within-group variability, which is evidence of an effect of the independent variable
  • Degrees of freedom (df) represent the number of independent pieces of information used to calculate the statistic
    • Between-group df = number of groups - 1
    • Within-group df = total number of observations - number of groups
    • Total df = total number of observations - 1
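
The three degrees-of-freedom formulas above can be sketched in a few lines of Python. The function name and the group sizes are made up for illustration; the point is that the between-group and within-group df always add up to the total df.

```python
def anova_degrees_of_freedom(group_sizes):
    """Return the one-way ANOVA degrees of freedom for the given group sizes."""
    k = len(group_sizes)        # number of groups
    n_total = sum(group_sizes)  # total number of observations
    return {
        "between": k - 1,
        "within": n_total - k,
        "total": n_total - 1,
    }

# Hypothetical design: three groups with 5, 7, and 6 observations.
df = anova_degrees_of_freedom([5, 7, 6])
print(df)
```

Groups do not need to be the same size for these formulas to hold.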

ANOVA Calculations

Sum of Squares and Mean Square

  • Sum of squares (SS) quantifies variability as a sum of squared deviations from a mean
    • Total SS = $\sum (X - \bar{X})^2$, where $X$ is each individual score and $\bar{X}$ is the grand mean
    • Between-group SS = $\sum_j n_j(\bar{X}_j - \bar{X})^2$, where $n_j$ is the sample size of group $j$, $\bar{X}_j$ is the mean of group $j$, and $\bar{X}$ is the grand mean
    • Within-group SS = Total SS - Between-group SS
  • Mean square (MS) is the sum of squares divided by the respective degrees of freedom
    • Between-group MS = Between-group SS / Between-group df
    • Within-group MS = Within-group SS / Within-group df
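
As a rough illustration of the formulas above, the following sketch computes the sums of squares, mean squares, and the resulting F-statistic in plain Python. The function name and the three small groups are made up for the example; they are not taken from any real study.

```python
def one_way_anova(groups):
    """Compute SS, MS, and F for a list of groups (each a list of scores)."""
    all_scores = [x for group in groups for x in group]
    grand_mean = sum(all_scores) / len(all_scores)

    # Total SS: squared deviations of every score from the grand mean.
    ss_total = sum((x - grand_mean) ** 2 for x in all_scores)
    # Between-group SS: squared deviations of group means, weighted by group size.
    ss_between = sum(
        len(group) * (sum(group) / len(group) - grand_mean) ** 2
        for group in groups
    )
    # Within-group SS by subtraction, as in the definition above.
    ss_within = ss_total - ss_between

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within

    return {
        "ss_total": ss_total,
        "ss_between": ss_between,
        "ss_within": ss_within,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "f": ms_between / ms_within,
    }

# Made-up data: three groups of three observations each.
groups = [[4, 5, 6], [7, 8, 9], [10, 11, 12]]
result = one_way_anova(groups)
print(result)
```

For this data the grand mean is 8, so Total SS = 60, Between-group SS = 54, and Within-group SS = 6, giving mean squares of 27 and 1 and an F-statistic of 27.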

Effect Size and Eta-Squared

  • Effect size measures the magnitude of the difference between groups
    • Indicates the practical significance of the results, beyond just statistical significance
    • Eta-squared ($\eta^2$) is a common effect size measure for ANOVA
      • Calculated as: $\eta^2 = \frac{\text{Between-group SS}}{\text{Total SS}}$
      • Ranges from 0 to 1, with larger values indicating a stronger effect
      • Conventional interpretation benchmarks: roughly 0.01 for a small effect, 0.06 for a medium effect, and 0.14 for a large effect
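
A minimal sketch of the eta-squared calculation follows. The sums of squares are made-up values, and the function names and the "negligible" label for values below 0.01 are assumptions added for the example; the cutoffs are the rule-of-thumb benchmarks listed above, not hard boundaries.

```python
def eta_squared(ss_between, ss_total):
    """Proportion of total variability attributable to group membership."""
    return ss_between / ss_total

def effect_label(eta_sq):
    """Map eta-squared onto the conventional small/medium/large benchmarks."""
    if eta_sq >= 0.14:
        return "large"
    if eta_sq >= 0.06:
        return "medium"
    if eta_sq >= 0.01:
        return "small"
    return "negligible"  # label assumed for this sketch

# Made-up sums of squares: Between-group SS = 54, Total SS = 60.
eta_sq = eta_squared(54.0, 60.0)
print(eta_sq, effect_label(eta_sq))
```

Here 90% of the total variability lies between groups, which is far larger than effects typically seen in practice.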

Hypothesis Testing

Null and Alternative Hypotheses

  • Null hypothesis ($H_0$) states that all group population means are equal
    • Example: $H_0: \mu_1 = \mu_2 = \mu_3$, where $\mu_1$, $\mu_2$, and $\mu_3$ are the population means for groups 1, 2, and 3, respectively
  • Alternative hypothesis ($H_1$ or $H_a$) states that at least one group population mean differs from the others
    • Example: $H_1: \text{At least one } \mu_i \text{ is different}$, where $i$ represents the group index
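
One common way to choose between these hypotheses is to compare the observed F-statistic with a critical value from an F table. The sketch below assumes a significance level of 0.05 and a design with three groups and nine observations (df = 2 and 6), for which the tabled critical value is about 5.14. The function name and the observed F are made up for illustration.

```python
# Critical value from a standard F table for alpha = 0.05 with
# df_between = 2 and df_within = 6 (three groups, nine observations).
F_CRITICAL_05_DF_2_6 = 5.14

def reject_null(f_statistic, f_critical):
    """Return True when the observed F exceeds the critical value."""
    return f_statistic > f_critical

# Observed F worked out by hand for the made-up groups
# [4, 5, 6], [7, 8, 9], [10, 11, 12].
f_observed = 27.0
decision = reject_null(f_observed, F_CRITICAL_05_DF_2_6)
print("Reject H0" if decision else "Fail to reject H0")
```

Rejecting $H_0$ only says that at least one mean differs; it does not say which one.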

Post-hoc Tests and Tukey's HSD

  • Post-hoc tests are conducted after a significant ANOVA result to determine which specific group means differ from each other
    • Necessary because a significant ANOVA result only indicates that at least one group mean differs, not which groups differ
  • Tukey's Honestly Significant Difference (HSD) test is a commonly used post-hoc test
    • Compares all possible pairs of group means while controlling for the familywise error rate
    • Calculates the HSD statistic: $HSD = q\sqrt{\frac{\text{Within-group MS}}{n}}$, where $q$ is the studentized range statistic (looked up for the number of groups and the within-group df) and $n$ is the sample size per group, assumed equal across groups
    • If the absolute difference between two group means is greater than the HSD, the means are considered significantly different
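
The HSD comparison can be sketched as follows. All inputs are made-up summary values: three groups of $n = 3$, a within-group MS of 1.0, and hypothetical group means. The value $q \approx 4.34$ is the tabled studentized range value for $\alpha = 0.05$ with 3 groups and 6 within-group df; the function name is an assumption for this example.

```python
from itertools import combinations
from math import sqrt

def tukey_hsd(q, ms_within, n_per_group):
    """HSD threshold for equal group sizes: q * sqrt(MS_within / n)."""
    return q * sqrt(ms_within / n_per_group)

# Made-up summary statistics for three groups of equal size.
group_means = {"A": 5.0, "B": 7.0, "C": 11.0}
Q_CRITICAL = 4.34  # tabled value: alpha = 0.05, 3 groups, 6 within-group df
hsd = tukey_hsd(Q_CRITICAL, ms_within=1.0, n_per_group=3)

# Compare every pair of group means against the HSD threshold.
significant = {}
for g1, g2 in combinations(group_means, 2):
    diff = abs(group_means[g1] - group_means[g2])
    significant[(g1, g2)] = diff > hsd
    print(g1, g2, diff, significant[(g1, g2)])
```

With these values the threshold is about 2.51, so A and B (difference 2) are not flagged as different, while A versus C and B versus C are.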