Fiveable

📊AP Statistics Unit 8 Review

8.7 Skills Focus: Selecting an Appropriate Inference Procedure for Categorical Data

Written by the Fiveable Content Team • Last updated June 2026
Verified for the 2027 exam

The hardest part of chi-square inference is picking the right test from your toolbox. Use the chi-square goodness of fit test for one categorical variable in one sample, the chi-square test for independence for two categorical variables in one sample, and the chi-square test for homogeneity to compare the distribution of one categorical variable across two or more samples or treatments.

Why This Matters for the AP Statistics Exam

By Unit 8 you have a full set of inference procedures for categorical data, so the exam expects you to choose correctly before you compute anything. Picking the wrong test sets you up to lose credit on everything after it, even if your arithmetic is perfect.

You will likely see multiple-choice questions that ask you to identify the right procedure from a described scenario. The same skill shows up on free-response questions, where you name the test, state hypotheses in context, check conditions, calculate the statistic and p-value, and write a conclusion linked to that p-value. Getting comfortable with the selection step makes the rest of the problem faster and cleaner.

Key Takeaways

  • Goodness of fit: one sample, one categorical variable. You compare an observed distribution to a claimed or theoretical set of proportions.
  • Independence: one sample, two categorical variables measured on the same individuals. You ask whether the variables are associated.
  • Homogeneity: two or more samples, populations, or treatment groups, one categorical variable. You ask whether the distributions are the same across the groups.
  • A one-way table points toward goodness of fit. A two-way table points toward independence or homogeneity.
  • To split independence from homogeneity, count your samples: one sample with two variables is independence, separate samples or treatment groups compared on one variable is homogeneity.
  • For any chi-square test, the conditions are a random sample or randomized experiment, the 10% condition when sampling without replacement, and large counts, meaning all expected counts are at least 5.

How to Use This on the AP Statistics Exam

Choosing the Procedure

Work through the scenario in order:

  1. Confirm the data are categorical. If they are, you are choosing between a z-procedure for proportions and a chi-square test. A one-proportion or two-proportion z-test only fits when each variable has just two categories.
  2. Count the categorical variables and how many categories each has. Multiple categories push you toward chi-square.
  3. Check the table type. A one-way frequency table suggests goodness of fit. A two-way (r by c) table suggests independence or homogeneity.
  4. Count the samples or populations. One sample with two variables measured on it is independence. Separate samples, populations, or treatment groups compared on one variable is homogeneity.
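
The steps above can be sketched as a small decision function. This is only an illustration of the selection logic, not an official rubric: the function name `choose_procedure` and its three parameters are hypothetical, and the sketch assumes the data have already been confirmed to be categorical.

```python
def choose_procedure(num_variables, num_groups, max_categories):
    """Suggest an inference procedure for categorical data.

    All three parameters are hypothetical names used for illustration:
    num_variables  -- categorical variables recorded on each individual (1 or 2)
    num_groups     -- independent samples, populations, or treatment groups
    max_categories -- largest number of categories in any one variable
    """
    if num_variables == 1 and num_groups == 1:
        # One sample, one variable: a z-test fits only with two categories.
        if max_categories == 2:
            return "one-proportion z-test"
        return "chi-square goodness of fit"
    if num_variables == 1 and num_groups >= 2:
        # Several groups compared on one variable.
        if num_groups == 2 and max_categories == 2:
            return "two-proportion z-test"
        return "chi-square test for homogeneity"
    if num_variables == 2 and num_groups == 1:
        # One sample classified by two variables.
        return "chi-square test for independence"
    raise ValueError("scenario not covered by this sketch")


# One sample of seniors classified by two variables, up to three categories.
print(choose_procedure(num_variables=2, num_groups=1, max_categories=3))
```

The two-category branches return a z-test because that is usually the simpler choice; a chi-square test on the same data is not wrong, so treat those branches as a preference rather than a rule.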

Free Response

After you select the test, write it out fully:

  • Name the test you are running.
  • State hypotheses in words and in context, using the population from the question. For independence, the null is that the two variables are not associated. For homogeneity, the null is that the distributions are the same across the groups. For goodness of fit, the null specifies the proportions for each category.
  • Verify conditions: random sample or randomized experiment, the 10% condition when sampling without replacement, and all expected counts at least 5. Say expected counts, not observed counts.
  • Show the statistic clearly: $\chi^2 = \sum \dfrac{(\text{Observed} - \text{Expected})^2}{\text{Expected}}$.
  • Use the right degrees of freedom: number of categories minus 1 for goodness of fit, and $(r-1)(c-1)$ for a two-way table with $r$ rows and $c$ columns.
  • Find the p-value and write a conclusion in context tied to that p-value.
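
To make the calculation steps concrete, here is a minimal sketch of a goodness-of-fit computation using only Python's standard library. The observed counts and the claimed equal proportions are made-up numbers for illustration, and the helper names `chi_square_gof` and `chi_square_p_value` are hypothetical. On the exam you would get the p-value from a calculator or table rather than from code like this.

```python
import math


def chi_square_gof(observed, claimed_proportions):
    """Return (statistic, degrees of freedom, expected counts)."""
    n = sum(observed)
    # Expected count for each category = sample size * claimed proportion.
    expected = [n * p for p in claimed_proportions]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1  # number of categories minus 1
    return statistic, df, expected


def chi_square_p_value(statistic, df):
    """Upper-tail area of the chi-square distribution with df degrees of freedom."""
    z = statistic / 2
    if df % 2 == 1:
        a, tail = 0.5, math.erfc(math.sqrt(z))  # exact tail for df = 1
    else:
        a, tail = 1.0, math.exp(-z)             # exact tail for df = 2
    # Step up two degrees of freedom at a time:
    # Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1)
    while a < df / 2:
        tail += z ** a * math.exp(-z) / math.gamma(a + 1)
        a += 1
    return tail


# Made-up example: 100 observations, claim of equal proportions in 4 categories.
observed = [18, 22, 20, 40]
claimed = [0.25, 0.25, 0.25, 0.25]

statistic, df, expected = chi_square_gof(observed, claimed)
large_counts_ok = all(e >= 5 for e in expected)  # condition uses expected counts
p_value = chi_square_p_value(statistic, df)

print(round(statistic, 2), df, large_counts_ok)  # statistic is 12.32 with df = 3
print(p_value < 0.05)
```

Note that the large-counts check runs on `expected`, not `observed`, which mirrors the wording you need in a free-response answer.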

Common Trap

When the p-value is large, do not say there is "no association" or that the distributions are "the same." That wrongly accepts the null. Use language like "the data do not provide convincing evidence of an association" or "insufficient evidence of a difference in distributions."

Worked Example

A released free-response question gave a two-way table of gender and job experience for high school seniors in a district, based on one random sample.

  • The data are categorical, so the choice is between a z-procedure for proportions and a chi-square test. With two variables that have two to three categories each, the proportion z-tests are ruled out, because they only work with two categories.
  • The two-way table narrows it to independence or homogeneity.
  • One sample was taken and each person was classified by both gender and job experience, so this is association within one group, not a comparison of separate populations.

That makes it a chi-square test for independence, with hypotheses such as:

  • $H_0$: There is no association between gender and job experience for high school seniors in the district.
  • $H_a$: There is an association between gender and job experience for high school seniors in the district.

Practice Problems

(1) A researcher wants to know whether the distribution of favorite ice cream flavors among college students is the same as the distribution among the general population. They survey a random sample of 500 college students (280 chocolate, 120 vanilla, 50 strawberry, 50 mint) and a separate random sample of 1000 people from the general population (400 chocolate, 300 vanilla, 200 strawberry, 100 mint).

Which test should the researcher use and why?

(2) A scientist runs a clinical trial with 100 patients split into a treatment group and a control group, and wants to compare the occurrence of a disease across these groups.

Which test should the scientist use and why?

(3) A travel company wants to know whether the distribution of vacation package choices made by its customers fits a theoretical distribution based on previous years. They survey 1000 customers (400 beach, 300 mountain, 200 city, 100 rural).

Which test should the travel company use and why?

Answers

(1) Use a chi-square test for homogeneity. There are two separate samples (college students and the general population), and you are comparing the distribution of one categorical variable, favorite flavor, across them. Goodness of fit would fit only if you compared one sample to a fixed theoretical distribution. Independence would fit only if you had one sample classified by two variables.

(2) Use a chi-square test for homogeneity. The patients were split into a treatment group and a control group, and you compare the distribution of disease occurrence across those groups. If disease occurrence has only two categories (disease or no disease), a two-proportion z-test is also appropriate. Goodness of fit would fit only if you compared one group to a theoretical distribution. Independence would fit only if a single sample were classified by two variables rather than split into separate treatment groups.

(3) Use a chi-square test for goodness of fit. There is one sample and one categorical variable, vacation package choice, and you are comparing the observed counts to a single theoretical distribution. Independence and homogeneity both require a two-way table, which you do not have here.
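
The problems only ask you to name the test, but it can help to see what the homogeneity calculation for problem (1) would look like. The sketch below uses the counts given in that problem and the standard expected-count rule for a two-way table (row total times column total, divided by the table total). The helper names `expected_counts` and `chi_square_two_way` are hypothetical.

```python
def expected_counts(table):
    """Expected counts for a two-way table: row total * column total / table total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square_two_way(table):
    """Return (statistic, degrees of freedom, expected counts) for an r by c table."""
    expected = expected_counts(table)
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)  # (r - 1)(c - 1)
    return statistic, df, expected


# Rows: college students, general population.
# Columns: chocolate, vanilla, strawberry, mint (counts from problem 1).
ice_cream = [
    [280, 120, 50, 50],
    [400, 300, 200, 100],
]

statistic, df, expected = chi_square_two_way(ice_cream)
large_counts_ok = all(e >= 5 for row in expected for e in row)

print(round(statistic, 2), df, large_counts_ok)  # prints: 43.11 3 True
```

The arithmetic is identical for a test of independence. Only the study design, and therefore the name of the test and the wording of the hypotheses, changes.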

Common Misconceptions

  • "Independence and homogeneity are the same test." They use the same statistic and formula, but the setup differs. Independence comes from one sample classified by two variables. Homogeneity comes from separate samples or treatment groups compared on one variable.
  • "Comparing two groups always means independence." Comparing the distribution of one variable across two or more groups is homogeneity. In the ice cream problem, for example, the two separate samples make it homogeneity, not independence.
  • "I check that observed counts are at least 5." The large-counts condition is about expected counts, not observed counts.
  • "A big p-value proves the variables are unrelated." A large p-value means you fail to reject the null, not that you accept it. Say the evidence is insufficient, not that there is no association.
  • "Any categorical data means chi-square." If you have one or two proportions with only two categories each, a one-proportion or two-proportion z-test can be the better fit.
  • "Degrees of freedom are always the same." Goodness of fit uses categories minus 1, while a two-way table uses $(r-1)(c-1)$.

Frequently Asked Questions

How do I choose the right chi-square test in AP Stats?

Start by identifying the number of categorical variables and samples. Use goodness-of-fit for one categorical variable in one sample, independence for two categorical variables measured in one sample, and homogeneity for comparing one categorical variable across two or more samples or groups.

When do I use a chi-square goodness-of-fit test?

Use a chi-square goodness-of-fit test when you have one sample and one categorical variable, and you want to compare the observed counts to a claimed or theoretical distribution.

When do I use a chi-square test for independence?

Use a chi-square test for independence when one random sample is classified by two categorical variables and you want to know whether the variables are associated in the population.

When do I use a chi-square test for homogeneity?

Use a chi-square test for homogeneity when you have two or more samples, populations, or treatment groups and you want to compare the distribution of one categorical variable across those groups.

How do I tell independence from homogeneity?

Both tests use a two-way table, so focus on how the data were collected. One sample measured on two categorical variables is independence. Separate samples or treatment groups compared on one categorical variable is homogeneity.

When should I use a z-test for proportions instead of chi-square?

Use a one-proportion or two-proportion z-test when the question is about one or two proportions with two outcome categories. Chi-square tests are better when there are multiple categories or a larger r by c table.
