Pooled Sample

In AP Stats, a pooled sample combines two independent samples into one combined proportion, p̂c = (n₁p̂₁ + n₂p̂₂)/(n₁ + n₂), used in a two-proportion z-test because the null hypothesis assumes p₁ = p₂, so both samples estimate the same shared proportion.

Verified for the 2027 AP Statistics exam. Last updated June 2026.

What is a Pooled Sample?

A pooled sample is what you get when you merge two independent samples into one big sample to estimate a single shared proportion. The combined (or pooled) proportion is p̂c = (n₁p̂₁ + n₂p̂₂)/(n₁ + n₂). Because n₁p̂₁ and n₂p̂₂ are just the success counts in each sample, this means you add up all the successes from both groups and divide by the total number of individuals sampled.

Why would you ever do this? Because of the null hypothesis. In a two-proportion z-test, H₀ says p₁ = p₂ (the two populations have the same true proportion). If you're assuming the proportions are equal, it makes no sense to estimate them separately. Your single best estimate of that shared proportion is the pooled one, p̂c. That pooled value then gets plugged into the standard error formula for the test statistic, z = (p̂₁ - p̂₂) / √[p̂c(1-p̂c)(1/n₁ + 1/n₂)], and into the Large Counts check for the test.
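As a minimal sketch of the arithmetic above, the Python below computes p̂c and the pooled z-statistic. The function names and the counts (60 successes out of 100 versus 45 out of 100) are made up for illustration and do not come from any AP problem.

```python
import math

def pooled_proportion(x1, n1, x2, n2):
    """Total successes from both samples divided by total sample size."""
    return (x1 + x2) / (n1 + n2)

def two_prop_z(x1, n1, x2, n2):
    """z-statistic for H0: p1 = p2, using the pooled standard error."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    pc = pooled_proportion(x1, n1, x2, n2)
    se = math.sqrt(pc * (1 - pc) * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / se

# Hypothetical counts, chosen only for illustration.
x1, n1 = 60, 100
x2, n2 = 45, 100

pc = pooled_proportion(x1, n1, x2, n2)  # 105 / 200 = 0.525
z = two_prop_z(x1, n1, x2, n2)

# Two-sided P-value from the standard normal: P(|Z| >= |z|).
p_value = math.erfc(abs(z) / math.sqrt(2))

print(f"pooled proportion = {pc:.3f}")
print(f"z = {z:.2f}, two-sided P-value = {p_value:.4f}")
```

On the exam you would get the P-value from a calculator or table; the erfc call here is just a standard-library way to get the same normal tail area.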

Why Pooled Sample matters in AP Statistics

Pooling lives in Topic 6.10, Setting Up a Test for the Difference of Two Population Proportions, in Unit 6 (Inference for Categorical Data: Proportions). It directly supports learning objective 6.10.C, which requires you to verify conditions for inference. Per essential knowledge VAR-6.J.1, when you check that the sampling distribution of p̂₁ - p̂₂ is approximately normal for a test, you use the combined (pooled) proportion p̂c, not the two separate sample proportions. It also connects to 6.10.A and 6.10.B, because pooling only makes logical sense when the null hypothesis H₀: p₁ = p₂ is on the table. The big conceptual payoff is understanding that hypothesis tests are built on the assumption that H₀ is true, and pooling is that assumption written in math.

How Pooled Sample connects across the course

Null Hypothesis (Unit 6)

Pooling exists because of H₀: p₁ = p₂. If the null says both populations share one true proportion, then your best estimate of it uses all the data at once. Pooling is the null hypothesis turned into arithmetic.

Standard Error (Units 5-7)

The pooled proportion p̂c replaces the separate p̂₁ and p̂₂ inside the standard error of the test statistic. That gives the formula √[p̂c(1-p̂c)(1/n₁ + 1/n₂)], which is the version that shows up in the z-test for two proportions.

Large Counts Condition (Unit 6)

For a two-proportion test, you check normality with the pooled proportion. You need n₁p̂c, n₁(1-p̂c), n₂p̂c, and n₂(1-p̂c) all at least 10. Using the separate sample proportions here is a classic condition-checking mistake.
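The four-count check can be sketched in a few lines of Python. The helper name pooled_large_counts and both data sets are hypothetical, picked so that one passes and one fails.

```python
def pooled_large_counts(x1, n1, x2, n2, threshold=10):
    """Return the four pooled counts and whether all of them meet the threshold."""
    pc = (x1 + x2) / (n1 + n2)
    counts = {
        "n1 * pc": n1 * pc,
        "n1 * (1 - pc)": n1 * (1 - pc),
        "n2 * pc": n2 * pc,
        "n2 * (1 - pc)": n2 * (1 - pc),
    }
    return counts, all(c >= threshold for c in counts.values())

# Hypothetical data: 60 of 100 and 45 of 100 successes.
counts_ok, passes_ok = pooled_large_counts(60, 100, 45, 100)

# Hypothetical small-sample data: 3 of 40 and 5 of 50 successes.
counts_small, passes_small = pooled_large_counts(3, 40, 5, 50)

for label, value in counts_small.items():
    print(f"{label} = {value:.2f}")
print("condition met:", passes_small)
```

In the small-sample case the pooled proportion is 8/90, so both pooled success counts fall below 10 and the condition is not met.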

Confidence Interval (Units 6-7)

Confidence intervals for p₁ - p₂ do NOT pool. There's no null hypothesis claiming the proportions are equal, so each sample keeps its own p̂ in the standard error. Test = pool, interval = don't pool. This contrast is one of the most-tested distinctions in Unit 6.

Is Pooled Sample on the AP Statistics exam?

Multiple-choice questions often give you the two-proportion z-statistic formula and ask what the p̂c in the denominator represents. The answer is the pooled (combined) sample proportion, the overall success rate when both samples are merged. Other stems ask what the pooled proportion is used for, and the answer has two parts: it goes into the standard error of the test statistic, and it's the proportion you use to verify the Large Counts condition for the test. On an FRQ that asks you to carry out a two-proportion z-test, you earn condition-checking credit by computing p̂c and showing all four pooled counts are at least 10. Watch for the trap question that mixes tests and intervals. If the problem asks for a confidence interval for p₁ - p₂ (like comparing voter support between two counties), you do not pool, because no null hypothesis is being assumed.

Pooled Sample vs Unpooled (separate) proportions in a confidence interval

Pooling is only justified when you assume p₁ = p₂, which is exactly what H₀ says in a significance test. A confidence interval for p₁ - p₂ makes no such assumption, so each sample keeps its own proportion in the standard error: √[p̂₁(1-p̂₁)/n₁ + p̂₂(1-p̂₂)/n₂]. Quick rule of thumb to memorize before the exam: hypothesis test for two proportions, pool. Confidence interval for two proportions, don't pool.
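For contrast with the test, here is a rough Python sketch of the unpooled standard error and the resulting interval. The function names and the data (60 of 100 versus 45 of 100) are assumptions for illustration, and z_star = 1.96 is the usual critical value for 95% confidence.

```python
import math

def unpooled_se(x1, n1, x2, n2):
    """Standard error for a confidence interval: each sample keeps its own p-hat."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    return math.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)

def two_prop_interval(x1, n1, x2, n2, z_star=1.96):
    """Confidence interval for p1 - p2 (no pooling)."""
    diff = x1 / n1 - x2 / n2
    margin = z_star * unpooled_se(x1, n1, x2, n2)
    return diff - margin, diff + margin

# Hypothetical counts, chosen only for illustration.
se = unpooled_se(60, 100, 45, 100)
lower, upper = two_prop_interval(60, 100, 45, 100)

print(f"unpooled SE = {se:.4f}")
print(f"95% CI for p1 - p2: ({lower:.3f}, {upper:.3f})")
```

Notice that p̂c never appears in this code. That absence is the whole difference between the interval and the test.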

Key things to remember about Pooled Sample

  • The pooled sample proportion is p̂c = (n₁p̂₁ + n₂p̂₂)/(n₁ + n₂), which is just total successes from both samples divided by total sample size.

  • You pool because the null hypothesis H₀: p₁ = p₂ assumes one shared population proportion, and p̂c is your best estimate of it using all the data.

  • The pooled proportion goes into the standard error of the two-proportion z-test statistic: z = (p̂₁ - p̂₂) / √[p̂c(1-p̂c)(1/n₁ + 1/n₂)].

  • When checking the Large Counts condition for a two-proportion test, use p̂c, so verify that n₁p̂c, n₁(1-p̂c), n₂p̂c, and n₂(1-p̂c) are all at least 10.

  • Confidence intervals for p₁ - p₂ never use the pooled proportion, because intervals don't assume the null hypothesis is true.

  • Pooling does not remove the need for the independence checks; you still need two independent random samples (or a randomized experiment) and, when sampling without replacement, the 10% condition for each sample.

Frequently asked questions about Pooled Sample

What is a pooled sample proportion in AP Stats?

It's the combined proportion from two samples, p̂c = (n₁p̂₁ + n₂p̂₂)/(n₁ + n₂), used in a two-proportion z-test. It treats both samples as one big sample because the null hypothesis assumes both populations share the same true proportion.

Why do we pool for a hypothesis test but not for a confidence interval?

A test assumes H₀: p₁ = p₂ is true, so both samples estimate one common proportion and pooling makes sense. A confidence interval assumes nothing about the proportions being equal, so each sample keeps its own p̂ in the standard error.

Do I use the pooled proportion to check the Large Counts condition?

Yes, but only for a test. Per the CED (VAR-6.J.1), you check that n₁p̂c, n₁(1-p̂c), n₂p̂c, and n₂(1-p̂c) are all at least 10 using the pooled proportion. For a confidence interval, you check Large Counts with the separate sample proportions instead.

Is the pooled proportion just the average of the two sample proportions?

No, not unless the sample sizes are equal. It's a weighted average, so the bigger sample pulls p̂c toward its proportion. The safest way to compute it is total successes divided by total sample size.
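A quick Python check with made-up, unequal sample sizes shows the difference between a simple average and the pooled value. The numbers are hypothetical and chosen so the gap is easy to see.

```python
import math

# Hypothetical samples with very different sizes.
x1, n1 = 80, 400   # p1_hat = 0.20
x2, n2 = 50, 100   # p2_hat = 0.50

p1_hat = x1 / n1
p2_hat = x2 / n2

simple_average = (p1_hat + p2_hat) / 2               # ignores sample sizes
weighted = (n1 * p1_hat + n2 * p2_hat) / (n1 + n2)   # formula for p-hat-c
pooled = (x1 + x2) / (n1 + n2)                       # total successes / total n

print(f"simple average = {simple_average:.2f}")
print(f"weighted form  = {weighted:.2f}")
print(f"pooled         = {pooled:.2f}")
print("weighted equals pooled:", math.isclose(weighted, pooled))
```

The larger sample (n₁ = 400) pulls the pooled value to 0.26, well below the simple average of 0.35.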

How is the pooled proportion different from p̂₁ - p̂₂?

p̂₁ - p̂₂ is the observed difference between the two samples and sits in the numerator of the z-statistic. The pooled proportion p̂c is a single combined estimate that sits inside the standard error in the denominator. One measures the gap; the other helps measure how surprising that gap would be if H₀ were true.