Test for Independence

In AP Statistics, the chi-square test for independence uses one sample classified by two categorical variables in a two-way table to test whether the variables are associated in the population, comparing observed counts to expected counts with the chi-square statistic (Topics 8.4 and 8.6).

Verified for the 2027 AP Statistics exam. Last updated June 2026.

What is the Test for Independence?

The test for independence answers one question. You took one sample from one population, sorted each individual by two categorical variables (say, exercise frequency and stress level), and now you want to know if those variables are actually related in the population or if the pattern in your two-way table is just sampling noise.

The logic is a comparison between what you saw and what "no relationship" would look like. The null hypothesis says the two variables are independent. Under that assumption, you calculate the expected count for each cell using (row total)(column total)/table total (LO 8.4.A). Then the chi-square statistic, χ² = Σ(Observed − Expected)²/Expected, measures the total gap between reality and the no-association model (LO 8.6.A). Degrees of freedom are (rows − 1)(columns − 1), and the p-value is the area under the chi-square distribution with those degrees of freedom at or above your statistic. A tiny p-value means a table this lopsided would rarely happen by chance alone if the variables were independent, so you reject independence and conclude the variables are associated.
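To make the arithmetic concrete, here is a minimal Python sketch of the expected-count and chi-square formulas described above. The table of counts is hypothetical (made up for illustration, not taken from any exam), and the function names are just labels chosen for this example.

```python
# Hypothetical two-way table (invented counts):
# rows = exercise frequency (regular, rare), columns = stress level (low, medium, high).
observed = [
    [30, 20, 10],
    [10, 20, 30],
]

def expected_counts(table):
    """Expected count per cell under independence: (row total)(column total)/table total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    table_total = sum(row_totals)
    return [[r * c / table_total for c in col_totals] for r in row_totals]

def chi_square_statistic(table):
    """Add up (Observed - Expected)^2 / Expected over every cell."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )

expected = expected_counts(observed)
chi_sq = chi_square_statistic(observed)

print(expected)  # every cell is 60 * 40 / 120 = 20.0
print(chi_sq)    # four cells contribute 100 / 20 = 5.0 each, two contribute 0
```

In this made-up table every expected count is 20, so the statistic is 5 + 0 + 5 + 5 + 0 + 5 = 20, with df = (2 − 1)(3 − 1) = 2.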

Why the Test for Independence matters in AP Statistics

This test lives in Unit 8: Inference for Categorical Data: Chi-Square, specifically Topics 8.4 and 8.6. It carries five learning objectives: calculating expected counts (AP Stats 8.4.A), computing the chi-square statistic and degrees of freedom (AP Stats 8.6.A), finding the p-value (AP Stats 8.6.B), interpreting it (AP Stats 8.6.C), and justifying a conclusion about the sampled population (AP Stats 8.6.D). It's also the payoff of a long thread in the course. Way back in Unit 2 you described association between categorical variables with two-way tables and conditional proportions. The test for independence is how you finally decide whether that association is statistically significant, not just visually suggestive. On the exam, it's a regular star of the inference FRQ.

How the Test for Independence connects across the course

Chi-Square Test for Homogeneity (Unit 8)

Same formula, same degrees of freedom, same mechanics, different sampling design. Homogeneity uses separate samples from multiple populations to compare one variable's distribution; independence uses one sample classified by two variables. The data collection method, not the table, tells you which test you're running.

Contingency Tables and Association (Units 1-2)

In Unit 2 you eyeballed two-way tables and conditional distributions to describe association. The test for independence is the inference upgrade. It turns 'these conditional proportions look different' into a formal yes-or-no decision with a p-value.

Null Hypothesis and the Logic of Significance Tests (Unit 6)

The reasoning here is the same machinery you learned with proportions. Assume the null (independence), measure how surprising your data is under that assumption, and reject if the p-value falls below α. Only the statistic and distribution changed.

Expected Counts in Two-Way Tables (Unit 8)

Topic 8.4 is the setup step for this test. Expected counts are what each cell would hold if independence were exactly true, and they double as the condition check, since all expected counts must be at least 5 before you trust the chi-square approximation.
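The condition check can be sketched in a few lines of Python. Both tables below are hypothetical, chosen to show that the rule applies to expected counts, not observed counts: a table can contain an observed count below 5 and still pass.

```python
def expected_counts(table):
    """Expected count per cell under independence: (row total)(column total)/table total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    table_total = sum(row_totals)
    return [[r * c / table_total for c in col_totals] for r in row_totals]

def smallest_expected(table):
    """Smallest expected count anywhere in the table."""
    return min(e for row in expected_counts(table) for e in row)

def large_counts_ok(table, minimum=5):
    """True when every expected count is at least `minimum`."""
    return smallest_expected(table) >= minimum

# Hypothetical table 1: observed counts of 4, but every expected count is 10.
table_1 = [[4, 16], [16, 4]]
# Hypothetical table 2: one expected count is 15 * 13 / 50 = 3.9, so the check fails.
table_2 = [[12, 3], [25, 10]]

print(smallest_expected(table_1), large_counts_ok(table_1))
print(smallest_expected(table_2), large_counts_ok(table_2))
```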

Is the Test for Independence on the AP Statistics exam?

Multiple-choice questions hit the mechanics. You might compute an expected count from row and column totals, find degrees of freedom from table dimensions (a 2×5 table gives df = 4), reason about how a cell's contribution (O − E)²/E changes when E changes, or interpret a p-value like 0.056 correctly as the probability of a statistic at least as extreme as the one observed, assuming independence is true. FRQs ask for the full test. The 2026 exam, for example, gave a two-way table of professional athletes' age groups and sport played and asked whether there is an association. You're expected to name the test, state hypotheses in terms of association or independence, verify conditions (random sample, expected counts ≥ 5), show χ² and df, report the p-value, and write a conclusion in context that compares the p-value to α. Scorers reward the conclusion sentence, so practice phrasing like 'we have convincing evidence of an association between X and Y in this population.'
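For the "how does a cell's contribution change when E changes" style of question, a short Python sketch can build intuition. The numbers are hypothetical: the gap between observed and expected is held at 6 while the expected count grows.

```python
def cell_contribution(observed, expected):
    """One cell's share of the chi-square statistic: (O - E)^2 / E."""
    return (observed - expected) ** 2 / expected

gap = 6  # hypothetical: each cell sits 6 above its expected count

# Same gap, larger and larger expected counts.
contributions = {e: cell_contribution(e + gap, e) for e in (5, 10, 20, 40)}

for e, c in contributions.items():
    print(f"E = {e:>2}, O = {e + gap:>2}: contribution = {c:.1f}")
```

With the gap fixed, doubling E halves the contribution (36/5 = 7.2, then 3.6, 1.8, 0.9), so the same raw difference counts for less in a cell where a larger count was expected.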

Test for Independence vs Chi-Square Test for Homogeneity

These two tests are computationally identical, which is exactly why the AP exam loves asking you to tell them apart. Independence means one sample from one population, with each individual classified by two variables (one survey asking both exercise habits and stress level). Homogeneity means two or more separate samples from different populations, comparing the distribution of one variable across them (separate samples of younger and older gym members asked the same question). Look at how the data were collected, then state hypotheses to match: 'no association between the variables' for independence, 'same distribution across populations' for homogeneity.

Key things to remember about Test for Independence

  • The chi-square test for independence checks whether two categorical variables are associated in a population, using one sample classified by both variables in a two-way table.

  • Expected counts come from the formula (row total × column total) / table total, and every expected count must be at least 5 for the test to be valid.

  • The test statistic is χ² = Σ(Observed − Expected)²/Expected with degrees of freedom equal to (rows − 1)(columns − 1).

  • The p-value is the probability, assuming the variables really are independent, of getting a chi-square statistic as large as or larger than the one you observed.

  • If the p-value is below α, reject the null and conclude there is convincing evidence of an association; if not, you fail to reject and cannot conclude an association exists.

  • Independence and homogeneity use the exact same calculations, so the only way to pick the right test is to identify whether you have one sample (independence) or multiple samples (homogeneity).
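The p-value and decision steps in the list above can be sketched in Python using only the standard library. On the exam you would get the p-value from a calculator or table; the function below is one illustrative way to compute the same upper-tail area, building up from the known df = 1 and df = 2 cases two degrees of freedom at a time. The statistics, df values, and α = 0.05 in the usage lines are hypothetical.

```python
import math

def chi_square_p_value(statistic, df):
    """Upper-tail area: P(chi-square with `df` degrees of freedom >= statistic)."""
    if df < 1 or statistic < 0:
        raise ValueError("need df >= 1 and statistic >= 0")
    half = statistic / 2
    # Base cases with closed forms: df = 1 (odd df) or df = 2 (even df).
    if df % 2 == 1:
        k, tail = 1, math.erfc(math.sqrt(half))
    else:
        k, tail = 2, math.exp(-half)
    # Going from k to k + 2 degrees of freedom adds one term to the tail area.
    while k < df:
        tail += half ** (k / 2) * math.exp(-half) / math.gamma(k / 2 + 1)
        k += 2
    return tail

def decision(p_value, alpha=0.05):
    """Compare the p-value to alpha and phrase the conclusion."""
    if p_value < alpha:
        return "reject H0: convincing evidence of an association"
    return "fail to reject H0: no convincing evidence of an association"

# Hypothetical results: a large statistic and a modest one, both with df = 2.
p_large = chi_square_p_value(20.0, 2)
p_modest = chi_square_p_value(4.0, 2)

print(f"{p_large:.6f}", decision(p_large))
print(f"{p_modest:.4f}", decision(p_modest))
```

Because the test only looks at the upper tail, a larger statistic always gives a smaller p-value for a fixed df.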

Frequently asked questions about Test for Independence

What is the chi-square test for independence in AP Stats?

It's a significance test that uses one sample, classified by two categorical variables in a two-way table, to decide whether those variables are associated in the population. It compares observed counts to the counts you'd expect if the variables were independent, using the chi-square statistic.

How is the test for independence different from the test for homogeneity?

The math is identical; the design is not. Independence uses one sample where each person is classified by two variables, while homogeneity compares one variable's distribution across two or more separately sampled populations. Identify the sampling method before naming the test on an FRQ.

Does rejecting the null hypothesis prove the two variables cause each other?

No. Rejecting the null gives convincing evidence of an association, but association is not causation, especially since this test is usually run on survey or observational data. Only a randomized experiment supports a causal conclusion.

How do you find degrees of freedom for a test of independence?

Multiply (number of rows − 1) by (number of columns − 1). A 3×3 table of exercise frequency versus stress level has df = 4, and a 2×5 table has df = 4 as well.
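As a quick check of those two examples, here is a small Python sketch. The function name is just a label for this illustration. Count only the category rows and columns, not any "Total" row or column in the table.

```python
def degrees_of_freedom(n_rows, n_cols):
    """df for a two-way table: (rows - 1)(columns - 1), excluding total rows/columns."""
    if n_rows < 2 or n_cols < 2:
        raise ValueError("a two-way table needs at least 2 rows and 2 columns")
    return (n_rows - 1) * (n_cols - 1)

print(degrees_of_freedom(3, 3))  # (3 - 1)(3 - 1)
print(degrees_of_freedom(2, 5))  # (2 - 1)(5 - 1)
```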

What are the conditions for a chi-square test for independence?

You need a random sample, expected counts of at least 5 in every cell (calculated as row total × column total / table total), and independent observations, which usually means checking the 10% condition when sampling without replacement.