🎲Intro to Statistics Unit 13 Review

13.2 The F Distribution and the F-Ratio

Written by the Fiveable Content Team • Last updated August 2025

The F Distribution

The F distribution is a right-skewed probability distribution used in ANOVA to determine whether the means of three or more groups are significantly different. It works by comparing two types of variance: the spread between group means and the spread within each group. If the between-group variance is much larger than the within-group variance, that's evidence the groups aren't all the same.

How the F-Ratio Works

The F-ratio is the core test statistic in one-way ANOVA. It asks a simple question: is the variation between groups large relative to the variation inside each group?

F = \frac{MS_{between}}{MS_{within}}

  • MS_{between} (mean square between groups) captures how spread out the group means are from each other.
  • MS_{within} (mean square within groups) captures the typical variability of individual observations around their own group mean.

When the null hypothesis is true (all group means are equal), both the numerator and denominator estimate the same underlying variance, so the F-ratio hovers around 1. When group means genuinely differ, MS_{between} grows larger than MS_{within}, pushing the F-ratio well above 1.
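This contrast can be seen with SciPy's `f_oneway` (a sketch, assuming SciPy is installed; the small data sets below are invented for illustration):

```python
from scipy.stats import f_oneway  # assumes SciPy is available

# Three groups whose means barely differ: the between-group and
# within-group variance estimates roughly agree, so F lands near 1.
f_null, _ = f_oneway([1, 2, 3], [2, 3, 4], [3, 2, 4])

# Shift the third group's mean far away: between-group variance
# dominates the within-group variance, and F climbs well above 1.
f_alt, _ = f_oneway([1, 2, 3], [2, 3, 4], [9, 10, 11])
```

For these particular numbers the first F-ratio works out to exactly 1 and the second to 57, matching the intuition above.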

F-ratio calculation from variances, R Tutorial Series: One-Way Omnibus ANOVA

Calculating the F-Ratio Step by Step

Here's the full process for computing the F-ratio in a one-way ANOVA:

Step 1: Find the grand mean. The grand mean (\bar{x}) is the mean of all observations across every group.

Step 2: Calculate SS_{between}.

SS_{between} = \sum_{i=1}^{k} n_i(\bar{x}_i - \bar{x})^2

  • k = number of groups
  • n_i = sample size of group i
  • \bar{x}_i = mean of group i

This measures how far each group mean is from the grand mean, weighted by group size.

Step 3: Calculate SS_{within}.

SS_{within} = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2

  • x_{ij} = the j-th observation in group i

This measures how much individual data points vary around their own group mean.

Step 4: Compute degrees of freedom.

  • df_{between} = k - 1
  • df_{within} = N - k, where N is the total number of observations across all groups

Step 5: Compute mean squares.

  • MS_{between} = \frac{SS_{between}}{df_{between}}
  • MS_{within} = \frac{SS_{within}}{df_{within}}

Step 6: Compute the F-ratio.

F = \frac{MS_{between}}{MS_{within}}
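The six steps above can be collected into one small function. This is an illustrative sketch in plain Python; the function name and the example data are invented, not taken from the text:

```python
def one_way_anova_f(groups):
    """Compute the one-way ANOVA F-ratio from a list of groups
    (each group a list of numeric observations)."""
    all_obs = [x for g in groups for x in g]
    N = len(all_obs)                                    # total observations
    k = len(groups)                                     # number of groups
    grand_mean = sum(all_obs) / N                       # Step 1
    group_means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2     # Step 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((x - m) ** 2                        # Step 3
                    for g, m in zip(groups, group_means)
                    for x in g)
    df_between, df_within = k - 1, N - k                # Step 4
    ms_between = ss_between / df_between                # Step 5
    ms_within = ss_within / df_within
    return ms_between / ms_within                       # Step 6

# Example: three groups of three observations each (k = 3, N = 9).
f = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
```

Tracing the example by hand: the group means are 2, 3, and 6, the grand mean is 11/3, SS_between = 26, SS_within = 6, so MS_between = 13, MS_within = 1, and F = 13.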

F-ratio calculation from variances, Test of Two Variances · Statistics

Interpreting the F-Statistic

Once you have the F-ratio, you need to decide whether it's large enough to reject the null hypothesis. There are two equivalent ways to do this:

Using a critical value: Look up the critical F-value from an F-distribution table using df_{between} and df_{within} at your chosen significance level (typically \alpha = 0.05).

  • If your F-ratio > critical value, reject H_0. At least one group mean is significantly different.
  • If your F-ratio ≤ critical value, fail to reject H_0. There isn't enough evidence to say the group means differ.

Using a p-value: The p-value tells you the probability of getting an F-ratio at least as large as yours if the null hypothesis were true.

  • If p-value < \alpha, reject H_0.
  • If p-value ≥ \alpha, fail to reject H_0.

One thing to keep in mind: the F-distribution is always right-skewed and only takes positive values, so ANOVA is always a right-tailed test. You're only looking at whether the F-ratio is unusually large, not unusually small.
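Both decision rules can be checked with SciPy's F distribution (a sketch, assuming SciPy; the observed F-ratio and degrees of freedom below are hypothetical stand-ins):

```python
from scipy.stats import f as f_dist  # assumes SciPy is available

df_between, df_within = 2, 6   # e.g. k = 3 groups, N = 9 observations
f_stat = 13.0                  # hypothetical observed F-ratio
alpha = 0.05

# Rule 1: compare against the right-tail critical value
# (right-tailed because the F-distribution only takes positive values).
crit = f_dist.ppf(1 - alpha, df_between, df_within)

# Rule 2: compute the p-value, P(F >= f_stat) under H0.
p_value = f_dist.sf(f_stat, df_between, df_within)

# The two rules always reach the same decision.
reject = f_stat > crit
```

Here the critical value F(0.05; 2, 6) is about 5.14, so an F-ratio of 13 leads to rejecting H_0 under either rule.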

Hypothesis Testing in ANOVA

Every one-way ANOVA follows the same hypothesis structure:

  • Null hypothesis (H_0): All group population means are equal (\mu_1 = \mu_2 = \dots = \mu_k).
  • Alternative hypothesis (H_a): At least one group mean differs from the others.

Notice that the alternative doesn't say which mean is different or how many differ. ANOVA only tells you that at least one group stands apart. If you reject H_0, you'd need follow-up tests (called post-hoc tests) to figure out which specific groups differ.
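As one illustration of such a follow-up (the text doesn't prescribe a specific method), SciPy's `tukey_hsd` runs Tukey's HSD over every pair of groups; the data here are invented so that only the third group stands apart:

```python
from scipy.stats import tukey_hsd  # assumes SciPy >= 1.8

group_a = [1, 2, 3]
group_b = [2, 3, 4]
group_c = [10, 11, 12]   # clearly shifted away from the other two

# Tukey's HSD tests every pairwise difference in means while
# controlling the family-wise error rate across all comparisons.
res = tukey_hsd(group_a, group_b, group_c)
# res.pvalue[i][j] is the adjusted p-value for group i vs. group j.
```

With this data, the comparisons involving group_c come out significant while group_a vs. group_b does not, pinpointing which group drove the overall ANOVA result.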

How Sample Size Affects the F-Distribution

The shape of the F-distribution depends entirely on its two degrees of freedom: df_{between} and df_{within}.

  • With small degrees of freedom, the F-distribution is heavily right-skewed with a long tail. This means the critical value for rejection is relatively high, making it harder to reach significance.
  • As degrees of freedom increase, the distribution becomes more symmetric and concentrated around 1. The critical value drops, so smaller differences between groups can reach significance.

Since df_{within} = N - k, increasing your total sample size directly increases df_{within}. This has two practical effects:

  • MS_{within} becomes a more precise estimate of within-group variance.
  • The F-distribution tightens, so the test becomes more sensitive to real differences between group means.

In short, larger samples give you more statistical power to detect genuine differences. With very small samples, only large differences between groups will produce a significant F-ratio.
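The dropping critical value can be seen directly (a sketch, assuming SciPy; the sample sizes are arbitrary picks for k = 3 groups):

```python
from scipy.stats import f as f_dist  # assumes SciPy is available

df_between = 2  # k = 3 groups, so df_between = k - 1 = 2
# Right-tail critical values at alpha = 0.05 as df_within = N - k grows,
# e.g. total sample sizes N = 9, 30, and 300.
crits = [f_dist.ppf(0.95, df_between, dfw) for dfw in (6, 27, 297)]
```

The critical value falls from about 5.14 at df_within = 6 toward roughly 3.0 as df_within grows large, which is why smaller between-group differences can reach significance with bigger samples.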