Test of Two Variances
The F-test for two variances lets you determine whether two populations have the same spread (variance). This matters because many statistical procedures, including pooled t-tests and ANOVA, assume equal variances across groups. If that assumption fails, your results from those procedures can be unreliable.

F-Ratio for Variance Comparison
The F-ratio compares the variances of two independent samples drawn from normally distributed populations:

F = s₁² / s₂²

where:
- s₁² = sample variance of the first sample
- s₂² = sample variance of the second sample
By convention, you place the larger sample variance in the numerator and the smaller in the denominator. This guarantees the F-ratio is always ≥ 1, which simplifies looking up critical values.
Each sample contributes its own degrees of freedom:
- Numerator degrees of freedom: df₁ = n₁ − 1, where n₁ is the sample size associated with the larger variance
- Denominator degrees of freedom: df₂ = n₂ − 1, where n₂ is the sample size associated with the smaller variance
Be careful here: df₁ always belongs to whichever sample you placed in the numerator, not necessarily "sample 1" from the problem statement. Mixing these up is a common mistake.
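To make the convention concrete, here is a minimal Python sketch (the two samples are made-up numbers for illustration) that computes the F-ratio with the larger sample variance on top and returns the matching degrees of freedom:

```python
import statistics

def f_ratio(sample_a, sample_b):
    """Return (F, df_num, df_den), placing the larger sample variance on top."""
    var_a = statistics.variance(sample_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(sample_b)
    if var_a >= var_b:
        return var_a / var_b, len(sample_a) - 1, len(sample_b) - 1
    return var_b / var_a, len(sample_b) - 1, len(sample_a) - 1

a = [4.1, 5.2, 6.3, 5.8, 4.9, 5.5]            # hypothetical data
b = [3.0, 7.1, 2.4, 8.3, 5.0, 6.6, 4.2]       # hypothetical data
F, df1, df2 = f_ratio(a, b)
print(F, df1, df2)  # F >= 1; df1 goes with the larger-variance sample (here b)
```

Note that `df1` here belongs to sample `b`, even though `b` was the "second" sample, because `b` has the larger variance.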
Interpretation of F-Ratio Results
The hypotheses for this test are:
- Null hypothesis H₀: σ₁² = σ₂² (the population variances are equal, sometimes called homogeneity of variances)
- Alternative hypothesis Hₐ: σ₁² ≠ σ₂² (the population variances are not equal)
An F-ratio close to 1 means the two sample variances are similar, which supports H₀. An F-ratio much larger than 1 suggests the variances differ meaningfully.
Decision rule:
- Choose a significance level α (typically α = 0.05).
- Find the critical F-value from an F-distribution table using df₁ and df₂.
- If your calculated F-ratio > critical F-value, reject H₀ and conclude the population variances are not equal.
- If your calculated F-ratio ≤ critical F-value, fail to reject H₀. You don't have enough evidence to say the variances differ.
Note that when you always place the larger variance in the numerator, you're effectively running a one-tailed test in the right tail. If the research question calls for a true two-tailed test (testing whether either variance could be larger), you need to halve α before looking up the critical value, or use the p-value approach and compare the p-value to α.
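As an illustration, `scipy.stats.f` can stand in for a printed F-table; the F-ratio and degrees of freedom below are hypothetical stand-ins, not values from a specific problem:

```python
from scipy import stats

F, df1, df2 = 8.30, 6, 5   # hypothetical test statistic and degrees of freedom
alpha = 0.05

# Right-tail critical value (larger variance is already in the numerator)
crit = stats.f.ppf(1 - alpha, df1, df2)

# Two-tailed p-value: double the right-tail area, capped at 1
p_two_tailed = min(2 * stats.f.sf(F, df1, df2), 1.0)

print(f"critical value: {crit:.2f}, two-tailed p: {p_two_tailed:.4f}")
print("reject H0" if F > crit else "fail to reject H0")
```

The `sf` (survival function) call gives the right-tail area directly, which avoids the precision loss of computing `1 - cdf` for large F-ratios.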

Appropriateness of the F-Test
The F-test for equality of two variances requires three conditions:
- The two samples are independent of each other
- Both populations are normally distributed
- Sample sizes are relatively small (generally n < 30)
The biggest limitation of this test is that it's very sensitive to non-normality. Even mild skewness or heavy tails can inflate the Type I error rate, especially with small samples. This is different from the t-test, which is fairly robust to non-normality. With the variance F-test, the normality assumption really matters.
If you suspect non-normality, consider alternatives like Levene's test or the Brown-Forsythe test, both of which are more resistant to departures from normality.
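Both alternatives are available through `scipy.stats.levene`, where `center='median'` gives the Brown-Forsythe variant. A quick sketch using simulated skewed (exponential) data, which is exactly the situation where the classic F-test misbehaves:

```python
import random
from scipy import stats

random.seed(42)
# Skewed, heavy-tailed samples: the case where the variance F-test is unreliable
a = [random.expovariate(1.0) for _ in range(25)]
b = [random.expovariate(0.5) for _ in range(25)]

lev_stat, lev_p = stats.levene(a, b, center='mean')    # Levene's test
bf_stat, bf_p = stats.levene(a, b, center='median')    # Brown-Forsythe
print(f"Levene p = {lev_p:.3f}, Brown-Forsythe p = {bf_p:.3f}")
```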
For larger samples (n ≥ 30), the sampling distribution of the sample variance becomes more stable, making the F-test somewhat more robust. Still, you should check normality before applying the test, using:
- Graphical methods: histograms, Q-Q plots
- Formal tests: Shapiro-Wilk test
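For the formal check, `scipy.stats.shapiro` runs the Shapiro-Wilk test on one sample at a time (the data below are simulated purely for illustration):

```python
import random
from scipy import stats

random.seed(0)
sample = [random.gauss(10, 2) for _ in range(20)]  # simulated normal data

stat, p = stats.shapiro(sample)
# A small p-value (say, below 0.05) is evidence against normality,
# and therefore a reason to avoid the variance F-test for this sample.
print(f"W = {stat:.3f}, p = {p:.3f}")
```

Run the check on each sample separately; both must look plausibly normal before the F-test is defensible.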
Statistical Errors and Power
Two types of errors can occur with any hypothesis test, including this one:
- Type I error: Rejecting H₀ when the population variances actually are equal. The probability of this equals your significance level α.
- Type II error: Failing to reject H₀ when the population variances actually differ. The probability of this is denoted β.
Statistical power, 1 − β, is the probability of correctly rejecting a false H₀. Power increases when:
- Sample sizes are larger
- The true difference between the variances (effect size) is larger
- You use a higher significance level (though this also raises Type I error risk)
For the variance F-test specifically, power tends to be low unless the difference in variances is quite large or sample sizes are substantial. This is worth keeping in mind: a "fail to reject" result doesn't necessarily mean the variances are equal; you may simply lack the power to detect a real difference.
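The low-power claim is easy to check by simulation. This sketch (assuming normal populations and a chosen ratio of population standard deviations) estimates power by counting how often the two-tailed F-test rejects across repeated draws:

```python
import random
import statistics
from scipy import stats

def f_test_power(n1, n2, sd_ratio, alpha=0.05, reps=2000, seed=1):
    """Monte Carlo power estimate for the two-tailed F-test of equal variances.

    sd_ratio is the true ratio of population standard deviations:
    sd_ratio = 1.0 means H0 is true, so the estimate should sit near alpha.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        x = [rng.gauss(0, 1.0) for _ in range(n1)]
        y = [rng.gauss(0, sd_ratio) for _ in range(n2)]
        vx, vy = statistics.variance(x), statistics.variance(y)
        if vx >= vy:
            F, d1, d2 = vx / vy, n1 - 1, n2 - 1
        else:
            F, d1, d2 = vy / vx, n2 - 1, n1 - 1
        # Larger variance on top, so use the alpha/2 right-tail cutoff
        if F > stats.f.ppf(1 - alpha / 2, d1, d2):
            rejections += 1
    return rejections / reps

print(f_test_power(10, 10, 2.0))  # SD doubled: power is still well short of 1
print(f_test_power(10, 10, 1.0))  # H0 true: rejection rate near alpha
```

Even with the population standard deviation doubled (variance quadrupled), samples of 10 leave the test far below the conventional 80% power target, which illustrates why "fail to reject" is weak evidence of equal variances.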