Chi-square test for independence

The chi-square test for independence is an AP Stats inference procedure that uses a two-way (contingency) table from a single sample to test whether two categorical variables are associated, by comparing observed counts to the counts expected if the variables were independent.

Verified for the 2027 AP Statistics exam. Last updated June 2026.

What is the Chi-square test for independence?

The chi-square test for independence answers one question. In a single sample where you recorded two categorical variables for each individual (say, age group and favorite sport), is there a real association between those variables, or could the pattern in your two-way table just be chance?

Here's the logic. The null hypothesis says the two variables are independent, meaning knowing one tells you nothing about the other. From that assumption you calculate expected counts for each cell, using (row total × column total) / grand total. Then the chi-square statistic adds up, across every cell, (observed − expected)² / expected. If observed counts sit far from what independence predicts, the statistic gets big, the p-value gets small, and you reject independence in favor of an association. Degrees of freedom are (rows − 1)(columns − 1), and the usual conditions apply, namely random sampling, independence (the 10% condition if sampling without replacement), and all expected counts at least 5.
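As a minimal sketch of that arithmetic, the Python below computes expected counts, the chi-square statistic, and df for a made-up 2×3 table of age group by favorite sport. The counts and the function names `expected_counts` and `chi_square_statistic` are illustrative assumptions, not data from a real study.

```python
# Hypothetical observed counts: rows are age groups, columns are favorite sports.
observed = [
    [30, 20, 10],  # under 30
    [20, 20, 20],  # 30 and over
]


def expected_counts(table):
    """Expected counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square_statistic(table):
    """Sum (observed - expected)^2 / expected over every cell."""
    expected = expected_counts(table)
    return sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )


expected = expected_counts(observed)
statistic = chi_square_statistic(observed)
df = (len(observed) - 1) * (len(observed[0]) - 1)

print(expected)             # [[25.0, 20.0, 15.0], [25.0, 20.0, 15.0]]
print(round(statistic, 2))  # 5.33
print(df)                   # 2
```

Every expected count in this made-up table is at least 5, so the large counts condition would hold here.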

Why the Chi-square test for independence matters in AP Statistics

This test lives in Unit 8 of AP Statistics, the chi-square unit, alongside the goodness-of-fit test and the test for homogeneity. It's the capstone of your work with categorical data. Way back in Unit 1 you learned to read two-way tables and compare conditional distributions; this test is the formal inference version of that skill. Instead of just eyeballing whether row percentages look different, you now have a p-value to back up your claim. It also reinforces the full hypothesis-testing framework from Units 6 and 7 (hypotheses, conditions, calculations, conclusion in context), just applied to counts instead of proportions or means. Expect to see it in the inference FRQ and in multiple-choice questions about choosing the right test.

How the Chi-square test for independence connects across the course

Contingency Table (Units 1 & 8)

The contingency table (two-way table) is the raw material for this test. The chi-square test for independence is basically Unit 1's conditional-distribution comparison upgraded into a formal hypothesis test.

Chi-Square Statistic (Unit 8)

The test statistic is the same Σ(observed − expected)²/expected formula used in all three chi-square tests. What changes between tests is how the data were collected and what the hypotheses claim, not the math.

Degrees of Freedom (Unit 8)

For a test of independence, df = (rows − 1)(columns − 1). Getting df wrong gives you the wrong p-value, so this small calculation is a frequent point on rubrics.
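To see why a wrong df costs you the p-value, here is a small sketch that evaluates the same chi-square statistic under two different df values. The helper names are hypothetical, the statistic 5.33 is made up, and `p_value_even_df` relies on a closed-form tail probability that holds only when df is even. On the exam you would use your calculator's chi-square cdf instead.

```python
import math


def degrees_of_freedom(n_rows, n_cols):
    """df for a test of independence; count categories only, not totals."""
    return (n_rows - 1) * (n_cols - 1)


def p_value_even_df(statistic, df):
    """Upper-tail chi-square probability, valid only for even df.

    For df = 2m: P(X > x) = exp(-x/2) * sum over k = 0..m-1 of (x/2)^k / k!
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch handles positive even df only")
    half = statistic / 2
    terms = sum(half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * terms


statistic = 5.33  # hypothetical chi-square value from a 2x3 table

print(degrees_of_freedom(2, 3))                 # 2
print(round(p_value_even_df(statistic, 2), 4))  # 0.0696 with the correct df
print(round(p_value_even_df(statistic, 4), 4))  # 0.2551 if df were mistakenly 4
```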

Hypothesis Test & Null Hypothesis (Units 6-8)

This test follows the exact four-part structure you learned for proportions and means. The null hypothesis is that the two variables are independent (no association); the alternative is that an association exists.

Is the Chi-square test for independence on the AP Statistics exam?

Chi-square tests show up reliably on the AP Stats exam, both in multiple choice and in the inference FRQ. The 2026 exam's FRQ 5, for example, asked about an association between age group and type of sport played among professional athletes, which is textbook test-of-independence territory. To earn full credit you have to do the whole procedure: state hypotheses in terms of association/independence (not in symbols like μ or p), name the test, check conditions (random sample, 10% condition, all expected counts ≥ 5), compute the chi-square statistic and df, find the p-value, and write a conclusion in context that links the p-value to the significance level. Multiple-choice stems often test whether you can pick the right chi-square test, compute an expected count, or interpret what a large chi-square value means. A classic trap is concluding 'the variables are independent' when you fail to reject; the correct phrasing is that you don't have convincing evidence of an association.
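The last step, linking the p-value to the significance level, follows a simple decision rule. Here is a hedged sketch of it; the function name `conclusion` and the wording it returns are illustrative only, and a real FRQ answer must also name the variables and the population in context.

```python
def conclusion(p_value, alpha=0.05):
    """Return a conclusion that ties the p-value to the significance level."""
    if p_value < alpha:
        return (
            f"Because the p-value {p_value} is less than alpha = {alpha}, "
            "reject H0. There is convincing evidence of an association."
        )
    return (
        f"Because the p-value {p_value} is not less than alpha = {alpha}, "
        "fail to reject H0. There is not convincing evidence of an association."
    )


print(conclusion(0.0695))  # fail-to-reject wording
print(conclusion(0.012))   # reject wording
```

Note that the fail-to-reject branch never claims the variables are independent; it only reports a lack of convincing evidence.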

The Chi-square test for independence vs Chi-square test for homogeneity

These two tests use the identical statistic, df formula, and table, so the difference is all in the study design. Independence means ONE sample, two variables recorded per individual (survey 500 people, record age group and sport preference). Homogeneity means SEPARATE samples from two or more populations, comparing the distribution of one variable across them (sample 200 basketball players and 200 football players separately). On the exam, read how the data were collected before you name the test.

Key things to remember about the Chi-square test for independence

  • The chi-square test for independence checks whether two categorical variables measured on one sample are associated, using a two-way table of counts.

  • Expected counts come from assuming independence: (row total × column total) divided by the grand total for each cell.

  • Degrees of freedom equal (number of rows − 1) times (number of columns − 1).

  • Conditions to check are a random sample, independence (the 10% condition if sampling without replacement), and all expected counts at least 5.

  • Failing to reject the null does not prove the variables are independent; it only means you lack convincing evidence of an association.

  • The test for independence uses one sample with two variables, while the test for homogeneity uses separate samples from different populations. The data collection method, not the math, tells them apart.

Frequently asked questions about the Chi-square test for independence

What is the chi-square test for independence in AP Stats?

It's a Unit 8 hypothesis test that uses a two-way table from a single sample to decide whether two categorical variables are associated. It compares observed counts to the counts you'd expect if the variables were independent, using the statistic Σ(observed − expected)²/expected.

What's the difference between the chi-square test for independence and homogeneity?

Independence uses one sample with two categorical variables recorded per individual; homogeneity uses separate samples from two or more populations and compares the distribution of one variable across them. The calculations are identical, so the design of the study is what determines which test you name.

Does failing to reject the null prove the two variables are independent?

No. Failing to reject only means you don't have convincing evidence of an association at your significance level. Writing 'the variables are independent' as a conclusion is a common way to lose FRQ points.

How do you find degrees of freedom for a chi-square test for independence?

Multiply (rows − 1) by (columns − 1), counting only the category rows and columns in your two-way table and leaving out any totals row or column. A 3×4 table, for example, gives df = 2 × 3 = 6.

What conditions do I need to check for a chi-square test for independence?

You need a random sample, independent observations (the 10% condition if sampling without replacement), and a large counts condition requiring every expected count to be at least 5. Note that the large counts condition uses expected counts, not observed counts.
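The two numeric conditions can be checked mechanically. The sketch below assumes hypothetical helper names (`large_counts_ok`, `ten_percent_ok`) and made-up tables; the random-sample condition has to come from reading how the data were collected, so it is not something code can verify.

```python
def expected_counts(table):
    """Expected counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def large_counts_ok(table, minimum=5):
    """True when every EXPECTED count (not observed count) is at least `minimum`."""
    return all(exp >= minimum for row in expected_counts(table) for exp in row)


def ten_percent_ok(sample_size, population_size):
    """True when the sample is no more than 10% of the population."""
    return sample_size <= 0.10 * population_size


table_a = [[3, 27], [17, 13]]  # hypothetical: one observed count is below 5
table_b = [[2, 8], [3, 37]]    # hypothetical: small column total

print(large_counts_ok(table_a))  # True: expected counts are 10, 20, 10, 20
print(large_counts_ok(table_b))  # False: expected counts are 1, 9, 4, 36
print(ten_percent_ok(60, 5000))  # True: 60 is under 10% of 5000
```

The first table passes even though one observed count is 3, because the condition looks at expected counts only.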