A two-sample z test for a difference of proportions checks whether two population proportions are really different or whether the gap you see could just be random chance. You calculate a pooled proportion, find the z test statistic, get a p-value, then compare that p-value to your significance level to decide whether to reject the null hypothesis.
Why This Matters for the AP Statistics Exam
This is the payoff step of the two-proportion test. Earlier topics had you set up hypotheses and check conditions; here you actually run the numbers and state a conclusion. On multiple-choice questions, you might compute or interpret a test statistic, identify the pooled proportion, or match a p-value to a correct decision. On free-response, a full inference question often expects you to show the test statistic, report the p-value, compare it to alpha, and justify a claim about the two populations in context. Getting comfortable with this process also sets you up for two-sample mean tests and chi-square tests later, which follow the same logic.

Key Takeaways
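- The test statistic uses a pooled (combined) proportion: $\hat{p}_c = \dfrac{n_1\hat{p}_1 + n_2\hat{p}_2}{n_1 + n_2}$.
- The two-sample z-statistic is $z = \dfrac{(\hat{p}_1 - \hat{p}_2) - 0}{\sqrt{\hat{p}_c(1 - \hat{p}_c)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$.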
- The p-value is calculated assuming the null hypothesis is true, meaning the two population proportions are equal.
- Decision rule: if p-value $\le \alpha$, reject $H_0$; if p-value $> \alpha$, fail to reject $H_0$.
- Always state your conclusion in context and connect it back to the research question about both populations.
- A calculator's "2-PropZTest" gives you both the z-statistic and the p-value, so you mostly need to interpret the output correctly.
Calculating the Test Statistic and p-value
You need two values to carry out this test: a z test statistic and a p-value. Both should appear in your written work on an inference free-response question.
The Pooled Proportion
Because the null hypothesis says the two population proportions are equal, you combine both samples into one estimate called the pooled proportion:
$$\hat{p}_c = \frac{n_1\hat{p}_1 + n_2\hat{p}_2}{n_1 + n_2}$$
This is the same as adding up all the successes and dividing by the total of both sample sizes. You use $\hat{p}_c$ in the standard error because, under the null, both groups are assumed to share the same true proportion.
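As a rough illustration of the pooling step, here is a minimal Python sketch. The function name `pooled_proportion` and the counts (45 of 100 and 30 of 100) are hypothetical, chosen only to show that the weighted-average form and the "total successes over total trials" form agree.

```python
def pooled_proportion(x1, n1, x2, n2):
    """Combined proportion of successes across both samples."""
    return (x1 + x2) / (n1 + n2)

# Hypothetical counts, for illustration only.
x1, n1 = 45, 100
x2, n2 = 30, 100

p_c = pooled_proportion(x1, n1, x2, n2)

# The same value written as a weighted average of the sample proportions.
p1_hat = x1 / n1
p2_hat = x2 / n2
weighted = (n1 * p1_hat + n2 * p2_hat) / (n1 + n2)

print(p_c, weighted)
```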
The z Test Statistic
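The test statistic follows the general pattern of (statistic minus null value) divided by the standard deviation of the statistic:
$$z = \frac{(\hat{p}_1 - \hat{p}_2) - 0}{\sqrt{\hat{p}_c(1 - \hat{p}_c)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$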
The formula looks heavy, but a graphing calculator's "2-PropZTest" reports the z-statistic for you. This formula is not printed on the AP formula sheet, but you can rebuild it from the general test statistic formula and the standard error pieces that are provided.
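If you want to see what the calculator is doing, the formula can be sketched in a few lines of Python. The helper name `two_prop_z` and the sample counts are assumptions for illustration; this is not a replacement for showing the formula and values in your written work.

```python
import math

def two_prop_z(x1, n1, x2, n2):
    """z-statistic for H0: p1 = p2, using the pooled standard error."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    p_c = (x1 + x2) / (n1 + n2)  # pooled proportion
    se = math.sqrt(p_c * (1 - p_c) * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / se

# Hypothetical counts: 45 of 100 in group 1, 30 of 100 in group 2.
z = two_prop_z(45, 100, 30, 100)
print(round(z, 2))  # about 2.19
```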
The p-value
The p-value uses the standard normal curve and your z-statistic to find the probability of getting a difference as extreme or more extreme than what you observed, assuming the null hypothesis is true. A calculator handles this quickly. Match the tail to your alternative hypothesis:
- For $H_a: p_1 > p_2$, use the area above your z-statistic.
- For $H_a: p_1 < p_2$, use the area below your z-statistic.
- For $H_a: p_1 \ne p_2$, use both tails (twice the area beyond $|z|$).
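The tail-matching rules above can be sketched with Python's standard library. The function name `p_value_from_z` and the labels `"greater"`, `"less"`, and `"two-sided"` are assumptions made for this example; `statistics.NormalDist` supplies the standard normal curve.

```python
from statistics import NormalDist

def p_value_from_z(z, alternative):
    """Area under the standard normal curve matching the alternative."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if alternative == "greater":    # Ha: p1 > p2, area above z
        return 1 - std_normal.cdf(z)
    if alternative == "less":       # Ha: p1 < p2, area below z
        return std_normal.cdf(z)
    if alternative == "two-sided":  # Ha: p1 != p2, both tails
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")

# Same z-statistic, three different alternatives.
for alt in ("greater", "less", "two-sided"):
    print(alt, round(p_value_from_z(1.5, alt), 4))
```

Notice that the same z-statistic gives very different p-values depending on the alternative, which is why the tail has to match the hypotheses you wrote down.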
Concluding the Test
Once you have the p-value, compare it to the significance level to decide on the null hypothesis.
Using the p-value
The significance level $\alpha$ is the probability of rejecting the null hypothesis when it is actually true, and it is often set at 0.05. Compare your p-value to $\alpha$:
- If the p-value $\le \alpha$, reject $H_0$. The difference is statistically significant and unlikely to be due to chance, so you have evidence supporting the alternative.
- If the p-value $> \alpha$, fail to reject $H_0$. The difference could reasonably be due to chance, so you do not have convincing evidence for the alternative.
A quick way to remember it: if the p is low, reject the $H_0$.
Remember that failing to reject is not the same as proving the null is true. It just means you do not have enough evidence to support the alternative.
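The decision rule is simple enough to write as a tiny function. This is a sketch only; the name `decide` and the returned strings are hypothetical, and on the exam the decision still has to be followed by a conclusion in context.

```python
def decide(p_value, alpha=0.05):
    """Apply the decision rule: reject H0 when p-value <= alpha."""
    if p_value <= alpha:
        return "reject H0"
    return "fail to reject H0"

print(decide(0.03))              # below the default alpha of 0.05
print(decide(0.03, alpha=0.01))  # same p-value, stricter alpha
print(decide(0.20))
```

The second call shows why the significance level given in the problem matters: the same p-value can lead to different decisions under different values of alpha.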
Reading the z-statistic
If you only have the z-statistic, you can still judge how extreme the result is. On a normal curve, about 95% of values fall within two standard deviations of the mean and about 99.7% fall within three. A z-statistic far out in the tail (large in absolute value) points to a result that would be rare under the null, which supports rejecting it. The p-value is still the cleaner way to make and report the decision.
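To check the two-and-three-standard-deviation figures quoted above, a short sketch using Python's `statistics.NormalDist` follows. The variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

# Area within 2 and within 3 standard deviations of the mean.
within_two = std_normal.cdf(2) - std_normal.cdf(-2)
within_three = std_normal.cdf(3) - std_normal.cdf(-3)

print(round(within_two, 3))    # about 0.954
print(round(within_three, 3))  # about 0.997
```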
Worked Example: MJ vs. LeBron
Suppose Michael Jordan made 836 of 1623 shots and LeBron James made 622 of 1493 shots. You want to test whether the data give convincing statistical evidence that Jordan's true proportion of made shots is higher than LeBron's.
The hypotheses are $H_0: p_1 = p_2$ and $H_a: p_1 > p_2$, where $p_1$ is Jordan's true proportion of made shots (group 1) and $p_2$ is LeBron's (group 2).
Running the Test
Rather than grinding through the formula by hand, use a graphing calculator's "2-PropZTest." Enter the successes and sample sizes for each player, then select the correct alternative ($p_1 > p_2$). The output gives you the z-statistic and the p-value.
The two values to record are the z-statistic and the p-value.
Stating the Conclusion
Use the p-value to write the conclusion:
Since our p-value is less than 0.05, we reject $H_0$. We have convincing evidence that the true proportion of shots made by Jordan is higher than the true proportion of shots made by LeBron.
In this example the z-statistic comes out around 5.5, which is far in the tail, and the p-value is essentially 0. Both point to the same decision: reject the null in favor of the alternative. This also matches what a confidence interval would tell you: a 95% confidence interval for $p_1 - p_2$ built from these data lies entirely above 0.
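For readers who want to verify the calculator output, the worked example can be reproduced with a short Python sketch. The shot counts come from the example; the variable names are illustrative, and the upper-tail p-value is computed as the area below $-z$, which equals the area above $z$ by symmetry.

```python
import math
from statistics import NormalDist

# Counts from the worked example: group 1 is Jordan, group 2 is LeBron.
x1, n1 = 836, 1623
x2, n2 = 622, 1493

p1_hat = x1 / n1
p2_hat = x2 / n2
p_c = (x1 + x2) / (n1 + n2)  # pooled proportion under H0: p1 = p2

se = math.sqrt(p_c * (1 - p_c) * (1 / n1 + 1 / n2))
z = (p1_hat - p2_hat) / se

# Ha: p1 > p2, so the p-value is the area above z.
p_value = NormalDist().cdf(-z)

print(f"pooled proportion = {p_c:.4f}")
print(f"z = {z:.2f}")          # roughly 5.5, matching the text
print(f"p-value = {p_value:.2e}")
```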
How to Use This on the AP Statistics Exam
Free Response
- Show the test statistic value and the p-value, not just a final yes/no answer.
- Explicitly compare the p-value to $\alpha$ before stating your decision.
- Write the conclusion in context, naming both populations and the direction of the alternative.
- Tie the result back to the original research question about the two groups.
MCQ
- Be ready to compute or identify the pooled proportion $\hat{p}_c$.
- Know that the p-value is found assuming the two population proportions are equal.
- Match a given p-value and $\alpha$ to the correct reject or fail-to-reject decision.
- Match the tail of the p-value to the alternative hypothesis.
Common Trap
Writing "we accept the null" or "we proved there is no difference" loses the point. Failing to reject only means there is not enough evidence for the alternative, never proof of the null.
Common Misconceptions
- Using separate sample proportions in the standard error instead of the pooled proportion. For the test, you assume the proportions are equal under the null, so you pool. The unpooled version belongs to the confidence interval, not the test.
- Thinking a small p-value proves a huge real-world difference. A small p-value means the result is statistically significant, but statistical significance is not the same as a large or practically important difference.
- Believing "fail to reject" proves the null hypothesis. It only means the evidence was not strong enough to support the alternative.
- Forgetting to state the conclusion in context. A bare "reject $H_0$" is incomplete; name the populations and what the difference means.
- Mixing up the tail for the p-value. A one-sided alternative uses one tail; a two-sided alternative ($p_1 \ne p_2$) uses both tails.
- Treating the significance level as fixed at 0.05. It is a common default, but the problem can set a different $\alpha$, and your decision depends on the one given.
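To make the first misconception concrete, the sketch below computes both standard errors for the same data. The counts (45 of 100 and 30 of 100) are hypothetical. The pooled version is the one the test uses, and the unpooled version is the one a confidence interval for the difference uses.

```python
import math

# Hypothetical counts, for illustration only.
x1, n1 = 45, 100
x2, n2 = 30, 100
p1_hat = x1 / n1
p2_hat = x2 / n2

# Pooled standard error: used in the z test, where H0 assumes p1 = p2.
p_c = (x1 + x2) / (n1 + n2)
se_pooled = math.sqrt(p_c * (1 - p_c) * (1 / n1 + 1 / n2))

# Unpooled standard error: used in a confidence interval for p1 - p2.
se_unpooled = math.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)

print(round(se_pooled, 4), round(se_unpooled, 4))
```

The two values are close but not equal, so using the wrong one shifts the z-statistic slightly.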
Vocabulary
The following words are mentioned explicitly in the College Board Course and Exam Description for this topic.

| Term | Definition |
|---|---|
| difference in sample proportions | The difference between two sample proportions (p̂₁ - p̂₂) used to compare proportions from two different samples. |
| difference of two population proportions | The comparison between two population proportions, expressed as p₁ - p₂, to determine if they differ significantly. |
| null hypothesis | The initial claim or assumption being tested in a hypothesis test, typically stating that there is no effect or no difference. |
| p-value | The probability of observing a test statistic as extreme as or more extreme than the one calculated from the sample data, assuming the null hypothesis is true. |
| pooled proportion | A combined estimate of the population proportion calculated from both samples when assuming the null hypothesis is true: p̂c = (n₁p̂₁ + n₂p̂₂)/(n₁ + n₂). |
| population proportion | The true proportion or percentage of a characteristic in an entire population, typically denoted as p. |
| reject the null hypothesis | The decision made when the p-value is less than or equal to the significance level, indicating sufficient evidence against the null hypothesis. |
| significance level | The threshold probability (α) used to determine whether to reject the null hypothesis in a significance test. |
| significance test | A statistical procedure used to determine whether there is sufficient evidence to reject the null hypothesis based on sample data. |
| standard error | The standard deviation of a sampling distribution, which measures the variability of a sample statistic across repeated samples. |
| test statistic | A calculated value used to determine whether to reject the null hypothesis in a hypothesis test, computed from sample data. |
Frequently Asked Questions
What is a two-proportion z-test?
A two-proportion z-test checks whether two population proportions differ by comparing the observed sample difference to what would be expected if the null hypothesis were true.
Why do you use a pooled proportion in a two-proportion z-test?
You use a pooled proportion because the null hypothesis assumes the two population proportions are equal, so the samples are combined to estimate that shared proportion.
What does the z-statistic measure in a two-proportion z-test?
The z-statistic measures how many standard errors the observed difference between sample proportions is from the null value, usually zero.
How do you interpret the p-value?
The p-value is the probability of getting a difference as extreme as or more extreme than the observed one, assuming the null hypothesis is true. Smaller p-values give stronger evidence against the null.
When do you reject the null hypothesis?
Reject the null hypothesis when the p-value is less than or equal to the significance level. Otherwise, fail to reject the null hypothesis.
How should I write the conclusion on an AP Statistics free-response question?
Compare the p-value to alpha, state reject or fail to reject, and write the conclusion in context using the two populations and the alternative hypothesis.