AP Statistics Unit 8 Review: Chi-Squares

Verified for the 2027 exam | Compiled by AP educators | ~2–5% of the exam

AP Statistics Unit 8, Inference for Categorical Data: Chi-Square, covers chi-square tests across 7 topics and makes up 2-5% of the AP exam, with the chi-square statistic as the central tool for analyzing categorical data. You'll work through three distinct tests: goodness-of-fit, independence, and homogeneity. Each one compares observed counts to expected counts in a different context, from single distributions to two-way tables. AP Stats Unit 8 also trains you to pick the right procedure, which is its own tested skill.

Unit 8 review

AP Statistics Unit 8 covers chi-square tests, the inference procedures for categorical data with more than two categories. The single biggest idea is comparing observed counts to expected counts. If real data sits far from what a null hypothesis predicts, the chi-square statistic gets big and you have evidence against that hypothesis. Unit 8 makes up 2-5% of the AP exam, and it comes with three closely related tests (goodness of fit, independence, and homogeneity) that you have to tell apart.

What this unit covers

The chi-square statistic and its distribution

  • Every test in this unit uses the same statistic, χ² = Σ (Observed − Expected)² / Expected. It adds up, cell by cell, how far the observed counts stray from the counts the null hypothesis predicts, scaled by the expected counts.
  • Squaring means every discrepancy counts as positive, so χ² is always at least zero. There is no such thing as a negative chi-square value.
  • Chi-square distributions are skewed right, take only positive values, and form a family of curves indexed by degrees of freedom. As degrees of freedom increase, the skew fades and the curve looks more symmetric.
  • The p-value is always the area in the right tail. A bigger χ² means observed counts sit farther from expected counts, which means stronger evidence against the null. You never do a two-sided chi-square test.
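
The statistic and its right-tail p-value can be sketched in a few lines of Python. This is a minimal illustration, not a replacement for the table or calculator: the function names and the counts are hypothetical, and the tail area is built from the closed forms for df = 1 and df = 2 plus a standard recurrence that raises df by 2 at a time.

```python
import math

def chi_square_statistic(observed, expected):
    """Sum of (Observed - Expected)^2 / Expected over all categories or cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def chi_square_right_tail(x, df):
    """Area to the right of x under the chi-square curve with integer df >= 1."""
    if x <= 0:
        return 1.0  # the whole curve lies to the right of zero
    half = x / 2
    if df % 2 == 1:
        k, tail = 1, math.erfc(math.sqrt(half))  # closed form for df = 1
    else:
        k, tail = 2, math.exp(-half)             # closed form for df = 2
    while k < df:
        # Stepping from k to k + 2 degrees of freedom adds one term.
        tail += half ** (k / 2) * math.exp(-half) / math.gamma(k / 2 + 1)
        k += 2
    return tail

# Hypothetical counts: three categories, each expected 20 times.
observed = [22, 18, 20]
expected = [20, 20, 20]
stat = chi_square_statistic(observed, expected)
print(stat)                                      # 0.4
print(round(chi_square_right_tail(stat, 2), 4))  # 0.8187
```

A small statistic like 0.4 leaves most of the curve to its right, so the p-value is large and there is no evidence against the null.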

The goodness-of-fit test (one categorical variable)

  • Goodness of fit checks whether one categorical variable follows a claimed distribution. Think "are these dice fair?" or "do M&M colors match the company's stated percentages?"
  • The null hypothesis specifies a proportion for every category. The alternative says at least one of those proportions is wrong. Not all of them, just at least one.
  • Expected counts come from the formula (sample size)(null proportion). Expected counts are usually not whole numbers, and that is fine. Do not round them to integers.
  • Degrees of freedom equal the number of categories minus 1.
  • Conditions to verify in writing are random sampling (or a randomized experiment), the 10% condition when sampling without replacement, and the large counts condition that all expected counts are at least 5.
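
The setup work for a goodness-of-fit test (expected counts, degrees of freedom, and the large counts check) might look like the sketch below. The function name, the sample counts, and the claimed proportions are all made up for illustration.

```python
def goodness_of_fit_setup(observed, null_proportions):
    """Return expected counts, df, and whether every expected count is at least 5."""
    n = sum(observed)                                   # sample size
    expected = [n * p for p in null_proportions]        # (sample size)(null proportion)
    df = len(observed) - 1                              # categories minus 1
    large_counts_ok = all(e >= 5 for e in expected)     # check expected, not observed
    return expected, df, large_counts_ok

# Hypothetical sample of 45 items and claimed proportions 50%, 30%, 20%.
observed = [20, 15, 10]
expected, df, ok = goodness_of_fit_setup(observed, [0.5, 0.3, 0.2])
print([round(e, 2) for e in expected], df, ok)
```

Notice that the expected counts here are not all whole numbers, and the sketch leaves them unrounded on purpose.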

Tests for two-way tables: homogeneity and independence

  • Both tests use a two-way table, the same χ² formula, and degrees of freedom equal to (rows − 1)(columns − 1). What differs is the question and the data collection design.
  • Homogeneity compares the distribution of one categorical variable across two or more populations or treatments. The data come from a stratified random sample (separate samples from each population) or a randomized experiment. The null hypothesis is that there is no difference in the distributions across the groups.
  • Independence asks whether two categorical variables are associated in a single population. The data come from one simple random sample, classified two ways. The null hypothesis is that the two variables are independent (no association).
  • Expected counts in a two-way table come from the formula (row total)(column total) / table total. This is what the cell counts would look like if the null hypothesis were exactly true.
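  • The conditions mirror those for goodness of fit. Check the random condition that matches the design (one random sample for independence, separate random samples or random assignment for homogeneity), the 10% condition when sampling without replacement, and the large counts condition that all expected counts are at least 5.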
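
For a two-way table, the expected counts and degrees of freedom can be computed as in this short sketch. The helper names and the 2×2 table of counts are hypothetical.

```python
def expected_counts(table):
    """Expected count for each cell: (row total)(column total) / table total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]

def two_way_df(table):
    """Degrees of freedom: (rows - 1)(columns - 1)."""
    return (len(table) - 1) * (len(table[0]) - 1)

# Hypothetical 2x2 table of observed counts.
table = [[30, 20],
         [20, 30]]
exp = expected_counts(table)
print(exp)                # [[25.0, 25.0], [25.0, 25.0]]
print(two_way_df(table))  # 1
print(all(e >= 5 for row in exp for e in row))  # True: large counts condition met
```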

Choosing the right procedure

  • The last topic of the unit is pure decision-making. After Units 6, 7, and 8, you have a toolbox of inference procedures, and you have to pick the right one from the setup of a problem.
  • The quick sorting questions are these. How many categorical variables are there, and how many categories does each have? One variable with two categories points to a one-proportion z procedure. One variable with three or more categories points to goodness of fit. Two variables (or one variable across several groups) point to a two-way table test.
  • For two-way tables, look at how the data were collected. One sample classified two ways means independence. Separate samples from several populations, or a randomized experiment with treatment groups, means homogeneity.
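
The sorting questions above can be written as a small decision function. This is only a sketch of the logic in this list: the function name and its parameters are invented for illustration, and it deliberately ignores procedures from other units (for example, a two-proportion z-test for two groups with two categories).

```python
def choose_procedure(num_variables, num_categories, num_groups=1):
    """Pick a categorical-data procedure from the study design.

    num_variables:  categorical variables recorded on each individual (1 or 2)
    num_categories: number of categories of the variable of interest
    num_groups:     separate samples or treatment groups in the design
    """
    if num_groups >= 2:
        # Separate samples or treatment groups compared on one variable.
        return "chi-square test for homogeneity"
    if num_variables == 2:
        # One sample classified two ways.
        return "chi-square test for independence"
    if num_categories == 2:
        return "one-proportion z procedure"
    return "chi-square goodness-of-fit test"

print(choose_procedure(1, 6))                # one sample, six categories
print(choose_procedure(2, 3))                # one sample, two variables
print(choose_procedure(1, 3, num_groups=4))  # four treatment groups
```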

Unit 8: Chi-Squares at a glance

| Test | Question it answers | Data setup | Expected counts | df | Null hypothesis |
|---|---|---|---|---|---|
| Goodness of fit | Does one variable match a claimed distribution? | One sample, one categorical variable, 2+ categories | n × (null proportion) | categories − 1 | Each category proportion equals its claimed value |
| Independence | Are two variables associated in one population? | One simple random sample, classified by two variables | (row total)(col total) / table total | (rows − 1)(cols − 1) | The two variables are independent (no association) |
| Homogeneity | Is the distribution the same across groups? | Stratified random samples or randomized experiment | (row total)(col total) / table total | (rows − 1)(cols − 1) | No difference in distributions across populations or treatments |

Why Unit 8: Chi-Squares matters in AP Stats

Unit 8 completes your inference toolkit for categorical data. Unit 6 covers one or two proportions, but real categorical variables often have many categories (eye color, political party, grade level), and chi-square is the procedure built for that. The unit also pushes the course's central skill, which is matching the right procedure to the question and the data design, not just running calculations.

  • It reinforces the core logic of all significance testing. State hypotheses, check conditions, compute a statistic and p-value, then make a decision in context. Chi-square is just a new statistic plugged into the same framework.
  • It turns the descriptive idea of association between categorical variables into a formal test you can use to back up a claim about a population.
  • The observed-versus-expected mindset, asking "is this variation just random chance or something real," is the heart of statistical thinking, and Unit 8 makes it explicit.

How this unit connects across the course

  • Two-way tables, marginal distributions, and conditional distributions first show up in exploring two-variable data (Unit 2). Chi-square tests for independence and homogeneity are the inference versions of those same tables, so a segmented bar graph from Unit 2 often previews what the test will conclude.
  • The design distinction between one sample classified two ways and separate samples from several groups comes straight from data collection (Unit 3). Whether the test is independence or homogeneity depends entirely on that sampling design.
  • The hypothesis test framework (hypotheses, conditions, p-value, conclusion at α) was built with proportions (Unit 6) and means (Unit 7). Chi-square reuses all of it. A 2×2 homogeneity test is even closely related to a two-proportion z-test from Unit 6.
  • The "pick the right procedure" skill expands again in inference for slopes (Unit 9), and the full AP exam expects you to choose among all of them.
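
The link between a 2×2 chi-square test and a two-proportion z-test noted in the list above can be checked numerically: for a 2×2 table, χ² equals the square of the pooled two-proportion z statistic. The sketch below uses hypothetical counts and hypothetical function names.

```python
import math

def chi_square_2x2(table):
    """Chi-square statistic for a two-way table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat

def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z statistic for H0: p1 = p2."""
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (x1 / n1 - x2 / n2) / se

# Hypothetical data: 30 of 50 successes in group 1, 20 of 50 in group 2.
table = [[30, 20],   # group 1: successes, failures
         [20, 30]]   # group 2: successes, failures
chi = chi_square_2x2(table)
z = two_proportion_z(30, 50, 20, 50)
print(round(chi, 6), round(z ** 2, 6))  # the two values agree
```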

Key formulas and procedures

  • χ² = Σ (Observed − Expected)² / Expected, the test statistic for all three chi-square tests. It measures total discrepancy between data and the null hypothesis.
  • Expected count (goodness of fit) = (sample size)(null proportion). Compute one for every category.
  • Expected count (two-way table) = (row total)(column total) / table total. Compute one for every cell.
  • df = (number of categories) − 1 for goodness of fit.
  • df = (rows − 1)(columns − 1) for independence and homogeneity.
  • p-value = area to the right of the χ² statistic under the chi-square curve with the correct df, found from the table or technology (χ²GOF-Test and χ²-Test on the calculator).
  • Conditions checklist for every chi-square test. Verify the random condition (matched to the test design), the 10% condition when sampling without replacement, and the large counts condition that all expected counts are at least 5.
  • The four-step test structure. State hypotheses in context, name the test and check conditions, calculate χ² and the p-value, then compare the p-value to α and write a conclusion about the population in context.

Unit 8: Chi-Squares on the AP exam

Chi-square content is 2-5% of the AP exam. In the multiple-choice section, expect questions that ask you to pick the correct test for a scenario, compute an expected count or degrees of freedom, identify correct hypotheses, or interpret a p-value or conclusion. The "which procedure?" question is a favorite because independence and homogeneity look identical on the surface.

On the free-response section, chi-square commonly appears as a full significance test. You name the test, state hypotheses in words (chi-square hypotheses are usually verbal, not symbolic), verify conditions including showing expected counts are at least 5, report the statistic, df, and p-value, and write a conclusion linked to α in the context of the problem. Free-response questions also like to pair a chi-square test with earlier skills, such as describing an association from a two-way table or segmented bar graph before testing it, or asking why the test is homogeneity rather than independence based on how the data were collected. Writing only "chi-square test" without specifying which one can cost you credit, so always name the exact test.

Essential questions

  • When counts in my data differ from what a model predicts, how do I tell random variation from a real effect?
  • How can one statistic, χ², handle questions about a single distribution, an association between two variables, and a comparison across several groups?
  • Why does the way data were collected, not the data table itself, determine whether the right test is independence or homogeneity?
  • How do I choose among all the inference procedures for categorical data once I know more than one?

Key terms to know

  • Chi-square statistic (χ²): A measure of how far observed counts fall from expected counts, summed across all categories or cells and scaled by the expected counts.
  • Observed count: The actual count recorded in a category or cell of your data.
  • Expected count: The count a category or cell would have if the null hypothesis were exactly true.
  • Chi-square distribution: A right-skewed family of curves, taking only positive values, whose shape depends on degrees of freedom.
  • Degrees of freedom: The number that picks which chi-square curve to use, equal to categories minus 1 (goodness of fit) or (rows − 1)(columns − 1) (two-way tables).
  • Goodness-of-fit test: A test of whether one categorical variable's distribution matches a set of claimed proportions.
  • Test for independence: A test of whether two categorical variables are associated in a single population, using one sample classified two ways.
  • Test for homogeneity: A test of whether the distribution of a categorical variable is the same across several populations or treatments.
  • Two-way table: A table of counts classified by two categorical variables, with row and column totals used to find expected counts.
  • Large counts condition: The requirement that every expected count be at least 5 so the chi-square distribution is a good model for the statistic.
  • p-value: The probability, assuming the null hypothesis and model are true, of getting a χ² statistic as large as or larger than the one observed.
  • Null distribution: The distribution of the test statistic assuming the null hypothesis is true, which can be theoretical (chi-square) or built by randomization.
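
A null distribution built by randomization can be sketched for a goodness-of-fit setting: simulate many samples under the null proportions, compute χ² for each, and see how often the simulated statistic is at least as large as the observed one. The function name, the number of simulations, the seed, and the die-roll counts are assumptions chosen for illustration.

```python
import random

def simulated_p_value(observed, null_proportions, num_sims=2000, seed=0):
    """Estimate the p-value from a simulated null distribution of chi-square."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = sum(observed)
    expected = [n * p for p in null_proportions]

    def statistic(counts):
        return sum((o - e) ** 2 / e for o, e in zip(counts, expected))

    actual = statistic(observed)
    at_least_as_large = 0
    for _ in range(num_sims):
        # Draw n category labels using the null proportions as weights.
        draws = rng.choices(range(len(observed)), weights=null_proportions, k=n)
        counts = [0] * len(observed)
        for category in draws:
            counts[category] += 1
        if statistic(counts) >= actual:
            at_least_as_large += 1
    return at_least_as_large / num_sims

# Hypothetical die rolled 60 times, tested against a fair-die null.
observed = [8, 12, 9, 11, 13, 7]
print(simulated_p_value(observed, [1 / 6] * 6))
```

With enough simulated samples and all expected counts at least 5, this estimate should land close to the right-tail area under the theoretical chi-square curve.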

Common mix-ups

  • Independence versus homogeneity is the classic trap. Both use the same table, statistic, and df. Check the design. One random sample classified by two variables means independence. Separate samples from different populations, or groups in an experiment, means homogeneity.
  • The large counts condition is about expected counts, not observed counts. A cell with an observed count of 2 is fine as long as its expected count is at least 5.
  • Degrees of freedom use the table dimensions, not the sample size. A 3×4 table with 10,000 people still has df = (3 − 1)(4 − 1) = 6.
  • The alternative hypothesis for goodness of fit is "at least one proportion differs," not "all proportions differ." Rejecting the null does not tell you every category is off, and a follow-up look at which cells contribute most to χ² tells you where the discrepancy lives.
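
The follow-up look at which cells contribute most to χ² is just the individual terms of the sum, kept separate instead of added together. A minimal sketch with hypothetical counts and a hypothetical function name:

```python
def cell_contributions(observed, expected):
    """Each category's (Observed - Expected)^2 / Expected term, kept separate."""
    return [(o - e) ** 2 / e for o, e in zip(observed, expected)]

# Hypothetical counts: four categories, each expected 15 times.
observed = [25, 10, 10, 15]
expected = [15, 15, 15, 15]
parts = cell_contributions(observed, expected)
largest = max(range(len(parts)), key=lambda i: parts[i])

print([round(p, 2) for p in parts])  # per-category contributions
print(round(sum(parts), 2))          # total chi-square statistic
print(largest)                       # index of the biggest contributor
```

Here the first category supplies most of the statistic, so that is where the data disagree most with the claimed distribution.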

Frequently Asked Questions

What topics are covered in AP Stats Unit 8?

AP Statistics Unit 8 covers 7 topics focused on chi-square inference for categorical data: introducing unexpected results (8.1), setting up and carrying out a chi-square goodness of fit test (8.2-8.3), expected counts in two-way tables (8.4), setting up and carrying out chi-square tests for homogeneity or independence (8.5-8.6), and selecting the right inference procedure (8.7). The big idea is learning to compare observed counts to expected counts using the chi-square statistic. You'll work with three test types: goodness of fit, homogeneity, and independence. See AP Stats Unit 8 for matched practice on each topic.

How much of the AP Stats exam is Unit 8?

AP Statistics Unit 8 makes up 2-5% of the AP exam, which typically translates to a small number of multiple-choice questions and occasional FRQ appearances. The unit covers inference for categorical data using chi-square tests, including goodness of fit, homogeneity, and independence. Because the percentage is on the lower end, many students underestimate this unit. But chi-square questions are very learnable once you know how to set up hypotheses and calculate expected counts, so the payoff for focused study is high.

What's on the AP Stats Unit 8 progress check (MCQ and FRQ)?

The AP Stats Unit 8 progress check includes both MCQ and FRQ parts drawn from chi-square inference topics. The MCQ section tests your ability to identify correct hypotheses, calculate expected counts, interpret chi-square statistics, and choose between goodness of fit, homogeneity, and independence tests. The FRQ part typically asks you to set up and carry out a full chi-square test with a conclusion in context. The progress check pulls heavily from topics 8.2 through 8.7, so make sure you're solid on two-way tables (8.4), the conditions for chi-square tests, and selecting the right procedure (8.7). Practice with questions matched to each topic at AP Stats Unit 8.

How do I practice AP Stats Unit 8 FRQs?

AP Stats Unit 8 FRQs almost always ask you to carry out a full chi-square test, either goodness of fit (topics 8.2-8.3) or homogeneity/independence (topics 8.5-8.6), with a written conclusion in context. A typical prompt gives you a two-way table or a claimed distribution and asks you to state hypotheses, check conditions, calculate the chi-square statistic, find a p-value, and make a decision. To practice effectively, work through each step separately before putting them together. Focus on writing hypotheses correctly (goodness of fit involves one categorical variable in one sample, independence involves two variables in one sample, and homogeneity compares one variable across several groups), and always state your conclusion in terms of the original context. Find practice FRQs matched to these topics at AP Stats Unit 8.

Where can I find AP Stats Unit 8 practice questions?

The best place to find AP Stats Unit 8 practice questions, including multiple-choice and practice test sets, is AP Stats Unit 8. You'll find MCQ questions covering expected counts, chi-square calculations, and selecting the correct test type, plus FRQ practice for goodness of fit and homogeneity/independence tests. For a practice test feel, work through questions from all 7 topics in order. Pay extra attention to topic 8.7, which tests your ability to choose between procedures, since that skill shows up across both MCQ and FRQ formats.

How should I study AP Stats Unit 8?

Start AP Stats Unit 8 by making sure you understand why chi-square tests exist: they measure how far observed counts stray from expected counts. From there, study the three test types in order: goodness of fit (8.2-8.3), then homogeneity and independence (8.5-8.6), then practice choosing between them (8.7). Here's a concrete study plan:

  • Build expected counts by hand from two-way tables (8.4) before relying on a calculator.
  • Write out hypotheses in words for each test type until the difference feels automatic.
  • Practice full FRQ responses, checking that every conclusion names the p-value and refers back to the context.
  • Use topic 8.7 as a self-quiz: given a scenario, can you name the correct procedure without hints?

Visit AP Stats Unit 8 for topic-by-topic practice to track where you need more work.