Fiveable

📊AP Statistics Unit 9 Review

9.6 Skills Focus: Selecting an Appropriate Inference Procedure

Written by the Fiveable Content Team • Last updated September 2025
Verified for the 2026 exam

One of the most important skills in AP Statistics is being able to identify the best inference procedure to use in order to complete a hypothesis test or confidence interval. We have covered all of the following types of procedures: 📄

  • One Proportion Z Test
  • One Proportion Z Interval
  • One Sample T Test
  • One Sample T Interval
  • Matched Pairs T Test
  • Two Proportion Z Test
  • Two Proportion Z Interval
  • Two Sample T Test
  • Two Sample T Interval
  • Chi Squared Goodness of Fit Test
  • Chi Squared Test for Independence
  • Chi Squared Test for Homogeneity
  • Linear Regression T Interval
  • Linear Regression T Test

For example, if you are given a problem involving one of the linear regression t procedures, you will most commonly be given computer output and asked to draw a conclusion or construct an interval.

Here are a couple of illustrative flowchart "cheat sheets" for picking the right inference procedure. Good luck! ⭐

[Flowchart 1. Source: Mr. Sardinha]
[Flowchart 2. Source: Reddit]

Example 1

Here is a computer output similar to what you would see on the AP test. This is based on a study with a sample size of 30. 

Remember from Unit 2 that we focus only on the inference values associated with the slope, which appear in the row labeled “Sick Days.”

Confidence Interval

In order to construct a confidence interval like we discussed in Section 9.2, we will need the point estimate (sample slope), t-score and standard error.

Everything except our t-score is given in the computer output, so we have to calculate our t-score based on our confidence level and sample size. We first calculate our degrees of freedom, df = n − 2 = 30 − 2 = 28, and then use that with the invT function to find the critical t-score. We get a t-score of about 2.05 for a 95% confidence level.

For the computer output above, our confidence interval would be:

0.96 ± 2.05(0.12)

Which comes out to be (0.714, 1.206).
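To double-check the arithmetic, here is a minimal Python sketch of the same interval. The function name is just an illustration, and the critical value 2.05 is typed in from the invT result quoted above rather than computed.

```python
def slope_confidence_interval(slope, std_error, t_star):
    """Return (lower, upper) for the interval slope ± t* × SE."""
    margin = t_star * std_error
    return slope - margin, slope + margin

# Values taken from the text: sample slope 0.96, standard error 0.12, n = 30.
n = 30
df = n - 2          # regression slope inference uses n - 2 degrees of freedom
t_star = 2.05       # rounded invT value for 95% confidence with df = 28

lower, upper = slope_confidence_interval(0.96, 0.12, t_star)
print(f"df = {df}, interval = ({lower:.3f}, {upper:.3f})")
# df = 28, interval = (0.714, 1.206)
```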

In this case, we have convincing evidence that the two variables of interest (sick days and wellness visits) are linearly related: 0 is not contained in our interval, so a slope of 0 (no linear relationship) is not a plausible value. This is also supported by our high r value, which can be computed from the R2 value by taking the square root and attaching the sign of the slope.

Hypothesis Test

The other option for inference would be to use the p-value to make a judgment on the hypothesis test. In this example, our p-value for the slope is 0.02, which is below the commonly used significance level of 0.05, so we would reject our null hypothesis.

In this instance, our conclusion would be:

  • Since our p-value of 0.02 < 0.05, we reject the null hypothesis. We have significant evidence that the true slope of the regression model between the number of sick days taken and the number of wellness visits is not 0.

Again, since we have some evidence that the slope is not 0, this shows that these two things are correlated, which is also evidenced by the R2 and resulting correlation coefficient.
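The two approaches above can be sketched side by side in Python. This is only an illustration: the helper names are made up, and the 0.05 significance level is the usual default rather than something stated in the computer output.

```python
def decide(p_value, alpha=0.05):
    """Compare the p-value to the significance level alpha."""
    return "reject H0" if p_value < alpha else "fail to reject H0"

def interval_excludes(value, lower, upper):
    """True when `value` falls outside the confidence interval."""
    return not (lower <= value <= upper)

# p-value and 95% interval quoted in the example above.
print(decide(0.02))                         # reject H0
print(interval_excludes(0, 0.714, 1.206))   # True
```

Both checks point the same way here: the two-sided test at the 0.05 level rejects a slope of 0, and the 95% interval does not contain 0.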

Example 2: Pick a Test!

(1) A marketing research firm is interested in determining whether the proportion of adults in the United States who use a certain brand of toothpaste is significantly different from 50%. They survey a random sample of 500 adults and find that 270 of them use the toothpaste. Which of the following tests is/are appropriate to use?

(2) A high school statistics teacher wants to determine whether the mean score on a certain statistics exam is significantly different from 80. They administer the exam to a random sample of 25 students and find that the mean score is 78. Which of the following tests is/are appropriate to use?

(3) A psychology researcher is interested in determining whether there is a significant difference in anxiety levels between a treatment group and a control group. They measure anxiety levels in both groups before and after an intervention and find that the mean difference in anxiety levels between the two groups is -5. Which of the following tests is/are appropriate to use?

(4) A political pollster is interested in determining whether the proportion of registered voters who support a certain candidate is significantly different from 40%. They survey a random sample of 1000 registered voters and find that 400 of them support the candidate. They also survey a random sample of 1000 registered voters from a different region and find that 300 of them support the candidate. Which of the following tests is/are appropriate to use?

(5) A nutritionist is interested in determining whether the mean daily caloric intake of a certain population is significantly different from 2000 calories. They collect data from a random sample of 50 individuals from the population and find that the mean caloric intake is 1950 calories. Which of the following tests is/are appropriate to use?

(6) A historian is interested in determining whether the distribution of birth months among a group of people is significantly different from a uniform distribution. They collect data on the birth months of a random sample of 100 people and find that there are more births in the summer months than in the winter months. Which of the following tests is/are appropriate to use?

(7) A sociologist is interested in determining whether there is a significant association between the type of car a person drives and their political party affiliation. They collect data on the car types and political party affiliations of a random sample of 100 people and find that there are more Democrats who drive sedans than Republicans. Which of the following tests is/are appropriate to use?

(8) A medical researcher is interested in determining whether there is a significant difference in the effectiveness of two different treatments for a certain medical condition. They randomly assign patients to receive either treatment A or treatment B and measure the percentage of patients who show improvement after receiving each treatment. They find that the percentage of patients who show improvement is significantly higher for treatment A than for treatment B. Which of the following tests is/are appropriate to use?

(9) A real estate agent is interested in determining whether there is a significant relationship between the size of a house (in square feet) and its sale price. They collect data on the sizes and sale prices of a random sample of houses and find that there is a positive relationship between the two variables. Which of the following tests is/are appropriate to use?

Answers

(1) One Proportion Z-Test, One Proportion Z-Interval

(2) One Sample T-Test, One Sample T-Interval

(3) Matched Pairs T-Test

(4) Two Proportion Z-Test, Two Proportion Z-Interval

(5) One Sample T-Test, One Sample T-Interval

(6) Chi Squared Goodness of Fit Test

(7) Chi-Squared Test for Independence

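(8) Chi-Squared Test for Homogeneity (with only two treatments and an improved/not improved outcome, a Two Proportion Z-Test is equivalent)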

(9) Linear Regression T-Test, Linear Regression T-Interval
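Once the procedure is picked, the calculation itself is short. As a rough sketch, here is problem (1) carried through as a One Proportion Z-Test in Python, using only the numbers in the question (270 out of 500, null value 0.50). The function name is hypothetical, and the two-sided alternative is an assumption based on the phrase "significantly different."

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_z_test(successes, n, p0):
    """Two-sided one-proportion z-test; returns (z, p_value)."""
    p_hat = successes / n
    se = sqrt(p0 * (1 - p0) / n)      # standard error uses the null value p0
    z = (p_hat - p0) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

z, p_value = one_proportion_z_test(270, 500, 0.50)
print(f"z = {z:.2f}, two-sided p-value = {p_value:.3f}")
```

Remember that the conditions (random sample, 10% rule, success–failure) still have to be checked before trusting the result.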

Frequently Asked Questions

How do I know when to use a z-test vs a t-test for means?

Use a z-test for a mean only when the population standard deviation σ is known (rare in practice) and the sampling distribution of the mean is approximately normal. If σ is unknown (the usual case) you must use a t-test, which replaces σ with the sample SD s and uses a t-distribution with df = n−1 (one sample) or the Welch df for two samples (pooled t only when you can assume equal variances).

Quick decision flow:

  • One-sample or two-sample mean, σ unknown → use t-test (or paired t for matched pairs).
  • σ known → z-test for mean (textbook case only).
  • For large n (CLT) the t and z give similar results, but AP expects t when σ is unknown.

Check normality (histogram/CLT) and independence (random sample, 10% rule). Use Table B (t critical values) on the AP formula sheet. Want targeted practice on picking procedures? See the Topic 9.6 study guide (https://library.fiveable.me/ap-statistics/unit-9/selecting-an-appropriate-inference-procedure/study-guide/TYrF9PsFWB1lTadg0ljQ) and try more problems (https://library.fiveable.me/practice/ap-statistics).

What's the difference between a one-sample and two-sample test?

One-sample vs two-sample tests, short version: a one-sample test compares a sample statistic to a single population value (e.g., test whether μ = 10 or p = 0.5). A two-sample test compares the difference between two independent groups (e.g., μ1 − μ2 or p1 − p2) to see if that difference is zero or some other value.

Key AP details you should check before picking a test (Skill 1 in the CED):

  • Are you comparing one group to a known value? → one-sample t or z for a mean/proportion.
  • Are you comparing two independent groups? → two-sample t (pooled rarely used) or two-proportion z.
  • Are the two measurements matched/paired (same subjects before/after)? → use paired (matched pairs) t (that’s effectively a one-sample test on differences).
  • Conditions: CLT/normality for means (or use t), success–failure for proportions, equal-variance assumption when considering pooled vs unpooled t, independence/random sampling.

For more examples and a decision flow for choosing procedures, check the Topic 9.6 study guide (https://library.fiveable.me/ap-statistics/unit-9/selecting-an-appropriate-inference-procedure/study-guide/TYrF9PsFWB1lTadg0ljQ). For extra practice, Fiveable has tons of practice problems (https://library.fiveable.me/practice/ap-statistics).

When do I use a chi-square test instead of a t-test?

Use a chi-square test when your data are categorical; use a t-test when your data are quantitative (means).

Quick checklist:

  • Chi-square tests (goodness-of-fit or test of independence): the variable or variables are categorical (one variable for goodness-of-fit, two for independence). Use χ² = Σ[(obs − exp)²/exp]. Conditions: counts are from a random sample, expected counts (usually) ≥ 5 for each cell, and degrees of freedom = (rows−1)(cols−1) for independence or k−1 for goodness-of-fit. AP covers these in Unit 8, and Topic 9.6 asks you to choose among all the inference procedures. See the Topic 9.6 study guide (https://library.fiveable.me/ap-statistics/unit-9/selecting-an-appropriate-inference-procedure/study-guide/TYrF9PsFWB1lTadg0ljQ).
  • t-tests (one-sample, two-sample, paired): response is quantitative and you’re testing a mean (μ or μ1−μ2). Conditions: random sample or experiment, approximately normal sampling distribution (CLT helps if n is large), and consider equal-variances vs unpooled two-sample t and correct df. Use paired t for matched pairs.

Also note: for proportions use z-tests/CI for p (success–failure condition). For more practice problems, go to Fiveable practice (https://library.fiveable.me/practice/ap-statistics).

I'm confused about when to use paired t-test vs two-sample t-test - can someone explain?

Use a paired (matched-pairs) t when your data are two measurements on the same unit (before/after, left/right, matched subjects). You analyze the n differences with a one-sample t on those differences: check independence of pairs, approximate normality of the differences (or large n), and use t(df = n−1). Use a two-sample t when you have two independent groups (different people in each group). Then compare group means with a t for (x̄1 − x̄2). Check: independent samples, each group’s distribution approx. normal (or large n), and 10% condition if sampling without replacement. Decide pooled vs unpooled: only pool variances when you have good reason to assume equal variances (homoscedasticity); otherwise use the unpooled Welch t (more common). Degrees of freedom for Welch t are adjusted (calculator/computer gives df). On the AP exam you must “select statistical methods” (Skill 1) and verify conditions—practice choosing between paired vs two-sample problems (see the Topic 9.6 study guide: https://library.fiveable.me/ap-statistics/unit-9/selecting-an-appropriate-inference-procedure/study-guide/TYrF9PsFWB1lTadg0ljQ). For extra practice, try problems at (https://library.fiveable.me/practice/ap-statistics).
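To see why the paired vs two-sample choice matters, here is a small Python sketch that computes both t statistics. The before/after scores are made-up numbers for illustration only, and the function names are hypothetical; the formulas are the standard one-sample t on differences and the unpooled (Welch) t with its adjusted df.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(before, after):
    """One-sample t on the differences (after - before); returns (t, df)."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1

def welch_t(x1, x2):
    """Unpooled two-sample t for mean(x1) - mean(x2); returns (t, df)."""
    v1 = stdev(x1) ** 2 / len(x1)
    v2 = stdev(x2) ** 2 / len(x2)
    t = (mean(x1) - mean(x2)) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (len(x1) - 1) + v2 ** 2 / (len(x2) - 1))
    return t, df

# Hypothetical scores for the same six subjects, before and after.
before = [72, 65, 80, 58, 77, 69]
after = [75, 69, 82, 63, 80, 71]

t_paired, df_paired = paired_t(before, after)
t_welch, df_welch = welch_t(after, before)
print(f"paired:     t = {t_paired:.2f}, df = {df_paired}")
print(f"two-sample: t = {t_welch:.2f}, df = {df_welch:.1f}")
```

With these made-up numbers the paired statistic is much larger, because pairing removes the subject-to-subject variation. Treating paired data as two independent samples would throw that advantage away.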

How do I decide if I need a one-tailed or two-tailed test?

Decide by the research question: the alternative hypothesis (Ha) determines one- vs two-tailed. If you want to detect any difference (no specified direction), use a two-tailed test: Ha: parameter ≠ value (rejecting H0 means “different”). If you’re specifically testing for an increase or a decrease, use a one-tailed test: Ha: parameter > value (right-tailed) or Ha: parameter < value (left-tailed).

Key rules:

  1) Pick the tail BEFORE looking at the data.
  2) Your p-value is computed according to that tail: two-tailed doubles the one-sided tail area for symmetric tests.
  3) Be explicit in context (e.g., Ha: μ > 50 means you’re testing for an increase in the population mean).

On the AP exam you’ll be asked to identify Ha and the correct tail, compare p to α, and state conclusions in context (Topic 9.6 skills). For more guidance and practice, see the Topic 9.6 study guide (https://library.fiveable.me/ap-statistics/unit-9/selecting-an-appropriate-inference-procedure/study-guide/TYrF9PsFWB1lTadg0ljQ) and the Unit 9 overview (https://library.fiveable.me/ap-statistics/unit-9). Practice problems are at (https://library.fiveable.me/practice/ap-statistics).
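Here is a short Python sketch of how the tail changes the p-value for a z statistic. The function name and the tail labels are illustrative choices, not calculator syntax.

```python
from statistics import NormalDist

def p_value_from_z(z, tail):
    """p-value for a z statistic.

    tail: "left" (Ha: <), "right" (Ha: >), or "two" (Ha: ≠).
    """
    cdf = NormalDist().cdf
    if tail == "left":
        return cdf(z)
    if tail == "right":
        return 1 - cdf(z)
    if tail == "two":
        return 2 * (1 - cdf(abs(z)))
    raise ValueError("tail must be 'left', 'right', or 'two'")

z = 1.5  # hypothetical test statistic
for tail in ("left", "right", "two"):
    print(tail, round(p_value_from_z(z, tail), 4))
```

For a positive z, the two-tailed p-value is exactly double the right-tailed one, which is why the tail has to be chosen before looking at the data.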

What are all the conditions I need to check before doing any inference test?

Before you pick an inference test, check these conditions (quick checklist by test):

  • General first: randomness (random sample or randomized experiment) and independence (sample ≤ 10% of population when sampling without replacement).
  • Proportions (z-test / CI): success–failure: np ≥ 10 and n(1–p) ≥ 10 (or use bootstrap/permutation if not). Use pooled p̂ for two-proportion tests when H0 assumes equal p.
  • Means (one/two-sample t, paired t): roughly normal sampling distribution, meaning either the population is approximately normal or n ≥ 30 (CLT). For two-sample t check equal-variances if you plan to pool; otherwise use unpooled (Welch). For paired use differences and treat as one-sample t on differences.
  • Regression slope (t for β): linear relationship, residuals ~ Normal for each x, constant variance (homoscedasticity), independence of observations, and sufficient sample size (df = n−2). Check residual plot.
  • Chi-square tests: expected count ≥ 5 (or mostly ≥5) in each cell for goodness-of-fit / independence; independence/random sample.
  • ANOVA (one-way): independent groups, roughly normal groups (or large n), equal variances.

If conditions fail, use bootstrap CIs or permutation/randomization tests. For more AP-aligned practice and summaries see the Topic 9.6 study guide (https://library.fiveable.me/ap-statistics/unit-9/selecting-an-appropriate-inference-procedure/study-guide/TYrF9PsFWB1lTadg0ljQ) and unit overview (https://library.fiveable.me/ap-statistics/unit-9). For extra practice problems go to (https://library.fiveable.me/practice/ap-statistics).
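A couple of these checks are simple enough to script. The sketch below uses hypothetical helper names and the cutoffs quoted above (10 for success–failure, 10% of the population, expected counts of at least 5); it covers only the numeric conditions, not randomness.

```python
def check_proportion_conditions(n, p, population_size=None):
    """Success-failure check, plus the 10% rule when a population size is given."""
    results = {"success_failure": n * p >= 10 and n * (1 - p) >= 10}
    if population_size is not None:
        results["ten_percent"] = n <= 0.10 * population_size
    return results

def expected_counts_ok(expected_counts, minimum=5):
    """Chi-square condition: every expected count is at least `minimum`."""
    return all(count >= minimum for count in expected_counts)

# Problem (1) from Example 2: n = 500 with null proportion 0.50.
print(check_proportion_conditions(500, 0.50))   # {'success_failure': True}

# Problem (6): 100 people spread uniformly over 12 birth months.
print(expected_counts_ok([100 / 12] * 12))      # True
```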

When do I use a proportion test vs a mean test?

Use a proportion test when your response is categorical (yes/no, success/failure) and you’re estimating or testing a population proportion p (one-sample z for p, two-sample z for p1−p2). Use a mean test when your response is quantitative and you’re estimating or testing a population mean μ (one-sample t, two-sample t, paired t for matched data, or t for slope in regression).

Key checklist from the CED:

  • Data type: categorical → proportion test; quantitative → mean test.
  • Sample size/conditions: for proportions check the success–failure rule (np and n(1−p) large) so CLT gives approx normal; for means check normality of the population or that n is large (CLT) or use t if σ unknown.
  • If two related measurements on same unit → paired t. If two independent samples → two-sample t (decide pooled vs unpooled if variances assumed equal).

On the AP exam you’ll pick the test, state H0/Ha, verify conditions, compute statistic and p-value (tables/calculator allowed). For more practice and a quick decision guide see the Topic 9.6 study guide (https://library.fiveable.me/ap-statistics/unit-9/selecting-an-appropriate-inference-procedure/study-guide/TYrF9PsFWB1lTadg0ljQ) and lots of problems at (https://library.fiveable.me/practice/ap-statistics).

How do I know if I should use a confidence interval or a hypothesis test?

Use a confidence interval when your goal is estimation: you want a plausible range for a population parameter (mean, slope, difference, proportion). Use a hypothesis test when your goal is decision: you’re testing a specific claim (H0 vs Ha) and want a p-value or reject/do-not-reject conclusion.

Quick checklist (AP-style):

  • Type of parameter: mean → t (or z if σ known), slope → t for regression, proportion → z (use success–failure), categorical association → χ², more than 2 means → ANOVA.
  • Paired data → paired t; two independent samples → two-sample t (check equal-variances rule for pooled vs unpooled).
  • Conditions: CLT/normality for means or large n, success–failure for proportions, expected cell counts for χ², linearity and equal variance for regression slope. If conditions fail, use bootstrap CIs or permutation tests (AP recognizes these).
  • On the exam: show hypotheses or CI formula, check conditions, use correct t/z/χ²/ANOVA table or calculator. You’ll get formula sheets and tables (CED).

For more targeted choosing practice and examples, see the Topic 9.6 study guide (https://library.fiveable.me/ap-statistics/unit-9/selecting-an-appropriate-inference-procedure/study-guide/TYrF9PsFWB1lTadg0ljQ) and the Unit 9 overview (https://library.fiveable.me/ap-statistics/unit-9). Need practice? Thousands of problems are at (https://library.fiveable.me/practice/ap-statistics).

What's the step-by-step process for choosing the right inference procedure?

Quick step-by-step you can use on the exam:

  1) Identify the parameter and data type: mean (quantitative), proportion (categorical), difference of means/proportions, slope, or association (chi-square/ANOVA).
  2) Design and pairing: Is it one sample, two independent samples, or paired/matched? Paired → paired t. Independent → one- or two-sample procedures.
  3) Pick the method: one-sample t / two-sample t (or pooled t if equal-variances and justified) for means; z-test/CI for proportions; t for slope (regression); chi-square GOF or independence for categorical tables; ANOVA for >2 group means; bootstrap/permutation when conditions fail or for simulation-based inference.
  4) Check conditions: randomization/independence, CLT/normality (or look at skew and n), success–failure for proportions, equal-variances for pooled t, expected counts for chi-square. Use t df or z as required.
  5) If any condition fails, switch to bootstrap or permutation/randomization methods.

Want a checklist and examples? See the Topic 9.6 study guide (https://library.fiveable.me/ap-statistics/unit-9/selecting-an-appropriate-inference-procedure/study-guide/TYrF9PsFWB1lTadg0ljQ) and try practice problems (https://library.fiveable.me/practice/ap-statistics).

I don't understand when to use ANOVA vs just a regular t-test?

Use a t-test when you’re comparing means for two groups; use one-way ANOVA when you’re comparing means for three or more groups.

Quick rules tied to the CED:

  • One-sample t: compare one mean to a value.
  • Two-sample t (independent): compare means of two independent groups. Choose pooled t if equal-variances (homoscedasticity) is plausible; otherwise use unpooled (Welch) t.
  • Paired t: compare two related measurements (matched pairs).
  • One-way ANOVA: test H0: all group means are equal vs Ha: at least one differs; appropriate when k ≥ 3 groups. The ANOVA F statistic follows an F distribution and assumes independent samples, approximate normality within groups, and equal variances across groups (check those conditions).

On the AP exam you must identify the right test from context (Skill 1) and verify conditions (normality, CLT, equal-variances when required). For practice picking procedures and checking conditions, see the Topic 9.6 study guide (https://library.fiveable.me/ap-statistics/unit-9/selecting-an-appropriate-inference-procedure/study-guide/TYrF9PsFWB1lTadg0ljQ) and hundreds of practice questions (https://library.fiveable.me/practice/ap-statistics).

How do I solve problems where they don't tell me which test to use?

Start with a short checklist every time; it’ll make the choice automatic.

  1. What parameter are you testing? mean, proportion, slope (regression), or association (chi-square/ANOVA)?
  2. How many groups/samples? one-sample, two independent samples, or paired/matched pairs?
  3. Data type: quantitative → t-tests/ANOVA/regression; categorical → z-test for proportions, chi-square goodness-of-fit or independence.
  4. Conditions: For proportions check success–failure (np ≥ 10 and n(1−p) ≥ 10). For means check normality or CLT (n ≥ 30) and whether variances are equal (pooled vs. unpooled t). For slope use t-test for b with linearity, normal residuals, equal variance. If conditions fail, use bootstrap or permutation/randomization tests.
  5. If >2 groups for means, use one-way ANOVA; if >2 categorical levels and testing independence, use χ² test.

On the AP exam you’re being scored on Skill 1 (selecting methods), so explicitly state these decisions and which conditions you checked. For practice and guided examples see the Topic 9.6 study guide (https://library.fiveable.me/ap-statistics/unit-9/selecting-an-appropriate-inference-procedure/study-guide/TYrF9PsFWB1lTadg0ljQ) and the Unit 9 overview (https://library.fiveable.me/ap-statistics/unit-9). For extra practice try the 1000+ AP practice problems (https://library.fiveable.me/practice/ap-statistics).

When do I need to use a chi-square goodness of fit vs independence test?

Use a chi-square goodness-of-fit when you have one categorical variable and you want to test whether the observed counts match a specific expected distribution (e.g., "are birthdays equally likely across months?" or "do voters follow the party proportions the poll claims?"). Goodness-of-fit compares observed vs expected counts; df = k − 1 (k = # categories).

Use a chi-square test of independence when you have two categorical variables and you want to test whether they’re associated (e.g., "is smoking status related to exercise level?"). That uses a contingency table and df = (rows − 1)(cols − 1).

Common AP/CED checks for both: data are counts from a random sample, expected count in each cell/category ≥ 5 (if not, consider combining categories or using an exact/simulation method), and you compute χ² = Σ (obs − exp)²/exp. On the AP exam you’ll be expected to pick the right test, state H0/Ha in context, check conditions, and interpret p-value. For a quick refresher, see the Topic 9.6 study guide (https://library.fiveable.me/ap-statistics/unit-9/selecting-an-appropriate-inference-procedure/study-guide/TYrF9PsFWB1lTadg0ljQ); for more practice problems go to (https://library.fiveable.me/practice/ap-statistics).
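The difference between the two tests shows up in how the expected counts and df are computed. Here is a minimal Python sketch of both; the function names and the counts are made up for illustration, and only the statistic and df are computed (the p-value would come from a χ² table or calculator).

```python
def chi_square_gof(observed, expected_props):
    """Goodness-of-fit: one categorical variable vs claimed proportions."""
    n = sum(observed)
    expected = [n * p for p in expected_props]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, len(observed) - 1          # df = k - 1

def chi_square_independence(table):
    """Independence: two categorical variables in a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / n   # (row total)(col total)/n
            stat += (obs - exp) ** 2 / exp
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

# Hypothetical counts: four categories claimed to be equally likely.
print(chi_square_gof([30, 20, 25, 25], [0.25] * 4))   # (2.0, 3)

# Hypothetical 2x2 table of counts.
print(chi_square_independence([[20, 30], [30, 20]]))  # (4.0, 1)
```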

What's the difference between all these different t-tests and when do I use each one?

Quick guide: when to use which t-test and why:

  • One-sample t-test: use when you compare a sample mean to a known population value (H0: μ = value). Conditions: random, independent (10% rule), sample distribution ≈ normal or n big enough (CLT). Uses t with df = n−1.
  • Two-sample (independent) t-test: compare means from two independent groups (H0: μ1 = μ2). Use unpooled (Welch) t by default unless you can justify equal variances (homoscedasticity). SE = sqrt(s1^2/n1 + s2^2/n2); df depends on variances/sample sizes.
  • Paired (matched pairs) t-test: use when observations are paired (before/after, matched subjects). Convert to one-sample test on differences; df = n_pairs − 1.

Key AP stuff to remember: check normality (plots/outliers), independence, and success-failure only for proportions (z-tests). Use pooled vs. unpooled only if equal-variance condition is justified. If assumptions fail or samples are small, use bootstrap CI or permutation/randomization tests (also tested on the AP). For categorical comparisons use z for proportions or chi-square tests and ANOVA for >2 group mean comparisons. For more AP-aligned practice and quick review see the Topic 9.6 study guide (https://library.fiveable.me/ap-statistics/unit-9/selecting-an-appropriate-inference-procedure/study-guide/TYrF9PsFWB1lTadg0ljQ), the Unit 9 overview (https://library.fiveable.me/ap-statistics/unit-9), and tons of practice questions (https://library.fiveable.me/practice/ap-statistics).

Can someone make a flowchart for picking the right statistical test?

Quick flowchart you can keep in your head:

1) What’s your response? Categorical (counts/proportions) or quantitative (means/slopes)?

  • If categorical → consider z-tests for proportions (one- or two-sample) or chi-square (goodness-of-fit or independence for tables).
      • Check success–failure: np and n(1−p) ≥ 10 for z-approx.
      • For tables use expected counts ≥ 5 (chi-square condition).
  • If quantitative → are you comparing means or modeling a relationship?
      • Comparing two related measurements → matched-pairs t (paired).
      • Comparing two independent groups → two-sample t (check normality or large n; check equal-variances for pooled vs unpooled).
      • One group vs a value → one-sample t.
      • More than two groups → ANOVA (one-way).
      • Regression slope → t-test for slope (check linearity, normal residuals, constant variance).

2) Are model assumptions violated or sample small? → use permutation/randomization tests or bootstrap CIs.

3) Always state H0/Ha, check CLT/normality, and report df/SE/z or t and p-value.

For AP-aligned practice and a tidy flowchart, see the Topic 9.6 study guide (https://library.fiveable.me/ap-statistics/unit-9/selecting-an-appropriate-inference-procedure/study-guide/TYrF9PsFWB1lTadg0ljQ) and Unit 9 overview (https://library.fiveable.me/ap-statistics/unit-9). For lots of practice problems go to (https://library.fiveable.me/practice/ap-statistics).
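If it helps to see the flowchart as explicit branches, here is one possible Python sketch. It is a simplified illustration, not an official AP decision rule: the function name, parameter names, and the way the question is encoded are all assumptions made for this example, and it ignores condition checking entirely.

```python
def select_procedure(response, samples=1, paired=False,
                     categories=2, two_variables=False):
    """Pick a procedure family from a simplified description of the setup.

    response:      "categorical" or "quantitative"
    samples:       number of independent samples or groups
    paired:        True for two related measurements on the same units
    categories:    number of levels of a categorical response
    two_variables: True when relating two variables measured on one sample
    """
    if response == "categorical":
        if two_variables:
            return "Chi Squared Test for Independence"
        if samples == 1:
            if categories > 2:
                return "Chi Squared Goodness of Fit"
            return "One Proportion Z"
        if samples == 2 and categories == 2:
            return "Two Proportion Z"
        return "Chi Squared Test for Homogeneity"
    if response == "quantitative":
        if two_variables:
            return "Linear Regression T"
        if paired:
            return "Matched Pairs T"
        if samples == 1:
            return "One Sample T"
        if samples == 2:
            return "Two Sample T"
        return "One-Way ANOVA"
    raise ValueError("response must be 'categorical' or 'quantitative'")

# A few of the Example 2 problems, encoded by hand:
print(select_procedure("categorical"))                        # problem (1)
print(select_procedure("quantitative"))                       # problem (2)
print(select_procedure("categorical", categories=12))         # problem (6)
print(select_procedure("categorical", two_variables=True))    # problem (7)
print(select_procedure("quantitative", two_variables=True))   # problem (9)
```

The hard part on the exam is the encoding step: reading the context carefully enough to know the data type, the number of samples, and whether the data are paired.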

Why does my calculator keep giving me different p-values for different tests?

Your calculator gives different p-values because each test uses a different sampling distribution, standard error, and degrees of freedom, so the same data plugged into different procedures yields different null-model probabilities.

Quick checklist of why p-values change:

  • Test choice changes the statistic and SE: one-sample t uses s/√n; unpooled two-sample t uses √(s1²/n1 + s2²/n2), while pooled t uses a single combined variance estimate; z for proportions uses a pooled p̂ in the SE when H0 assumes equality. Those different SEs change the z/t value and p.
  • Paired test = one-sample test on differences (usually smaller SE → different p).
  • t vs z: t has thicker tails and df matters, so p depends on df.
  • Conditions matter: CLT/normality and success–failure affect whether z, t, or a simulation (bootstrap/permutation) is appropriate. Simulated/randomization tests give p-values that can differ from parametric tests.
  • Also check calculator settings: order of samples, direction of Ha (one- vs two-tailed), and rounding.

On the AP exam you must pick the correct procedure (paired vs two-sample, pooled vs unpooled, z vs t, chi-square, ANOVA, or a simulation-based test) and justify conditions (CLT, normality, success–failure). For a focused review, see the Topic 9.6 study guide (https://library.fiveable.me/ap-statistics/unit-9/selecting-an-appropriate-inference-procedure/study-guide/TYrF9PsFWB1lTadg0ljQ) and practice problems (https://library.fiveable.me/practice/ap-statistics).