Comparing Two Independent Population Proportions
This technique tests whether two groups differ in the proportion of "successes" they produce. You'll use it whenever you're comparing rates or percentages across two independent samples, such as whether a new drug has a higher cure rate than a placebo, or whether two factories produce different rates of defective parts.
The workflow follows the same hypothesis testing framework you already know: set up hypotheses, check conditions, compute a test statistic, find a p-value, and make a decision. The new piece here is the pooled proportion, which combines both samples under the assumption that the null hypothesis is true.

Hypothesis Tests for Population Proportions
Setting up the hypotheses:
The null hypothesis always assumes no difference between the two population proportions:
H₀: p₁ = p₂
Your alternative hypothesis depends on the research question:
- Hₐ: p₁ ≠ p₂ (two-tailed) — you're testing for any difference, higher or lower
- Hₐ: p₁ < p₂ (left-tailed) — you suspect proportion 1 is lower
- Hₐ: p₁ > p₂ (right-tailed) — you suspect proportion 1 is higher
Choosing a significance level:
Set α before you collect data or run the test. Common choices are 0.01, 0.05, and 0.10. Remember, α is the probability of committing a Type I error (rejecting H₀ when it's actually true). A smaller α means you need stronger evidence to reject.
Checking conditions:
Before running the test, verify these assumptions so the normal approximation holds:
- The two samples are independent — drawn from separate populations with no overlap or influence on each other.
- The sample sizes are large enough. Check all four of these:
- n₁p_c ≥ 5 and n₁(1 − p_c) ≥ 5
- n₂p_c ≥ 5 and n₂(1 − p_c) ≥ 5, where p_c is the pooled proportion defined in the next section. This ensures at least 5 expected successes and 5 expected failures in each sample.
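A quick sketch of the four condition checks (variable names are illustrative; p_c is the pooled proportion defined in the next section):

```python
x1, n1 = 30, 100  # sample 1: successes, sample size (illustrative values)
x2, n2 = 40, 120  # sample 2

p_c = (x1 + x2) / (n1 + n2)  # pooled proportion under H0

# Expected successes and failures in each sample must all be at least 5.
checks = [n1 * p_c, n1 * (1 - p_c), n2 * p_c, n2 * (1 - p_c)]
print(all(count >= 5 for count in checks))  # True: normal approximation OK
```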
Making the decision:
- If p-value < α (or the test statistic falls beyond the critical value), reject H₀.
- If p-value ≥ α (or the test statistic does not reach the critical value), fail to reject H₀.

Pooled Proportion and Test Statistic
Here's the step-by-step calculation process:
Step 1: Compute the pooled proportion.
The pooled proportion estimates the common proportion under H₀ by combining both samples:
p_c = (x₁ + x₂) / (n₁ + n₂)
where x₁ and x₂ are the numbers of successes, and n₁ and n₂ are the sample sizes.
Example: Sample 1 has 30 successes out of 100 observations. Sample 2 has 40 successes out of 120. Then p_c = (30 + 40) / (100 + 120) = 70/220 ≈ 0.318.
You use the pooled proportion (rather than the individual sample proportions) in the standard error because you're assuming H₀ is true. Under that assumption, both samples come from populations with the same proportion, so pooling gives the best single estimate.
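In code, Step 1 is a one-liner (illustrative values from the example):

```python
x1, n1 = 30, 100  # sample 1: successes, sample size
x2, n2 = 40, 120  # sample 2

p_c = (x1 + x2) / (n1 + n2)  # pooled proportion: 70/220
print(round(p_c, 3))  # 0.318
```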
Step 2: Compute the standard error.
The standard error measures how much variability you'd expect in p̂₁ − p̂₂ from sample to sample:
SE = √[p_c(1 − p_c)(1/n₁ + 1/n₂)]
Larger sample sizes shrink the SE, giving you more precision and more power to detect real differences.
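Continuing the example, the standard error works out as follows (a sketch; numbers carried over from Step 1):

```python
import math

n1, n2 = 100, 120
p_c = 70 / 220  # pooled proportion from Step 1

se = math.sqrt(p_c * (1 - p_c) * (1 / n1 + 1 / n2))
print(round(se, 3))  # about 0.063
```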
Step 3: Compute the test statistic.
The z-score tells you how many standard errors the observed difference sits from zero (the value H₀ predicts):
z = (p̂₁ − p̂₂) / √[p_c(1 − p_c)(1/n₁ + 1/n₂)]
Example: If p̂₁ = 0.30, p̂₂ ≈ 0.333, and SE ≈ 0.063, then z ≈ (0.30 − 0.333) / 0.063 ≈ −0.53.
Under H₀, this test statistic follows a standard normal distribution, N(0, 1).
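Putting Steps 1-3 together (a sketch, using the example's numbers):

```python
import math

x1, n1 = 30, 100  # sample 1: successes, sample size
x2, n2 = 40, 120  # sample 2

p1_hat, p2_hat = x1 / n1, x2 / n2
p_c = (x1 + x2) / (n1 + n2)                          # pooled proportion
se = math.sqrt(p_c * (1 - p_c) * (1 / n1 + 1 / n2))  # pooled standard error

z = (p1_hat - p2_hat) / se
print(round(z, 2))  # about -0.53
```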
Step 4: Find the p-value.
Use the standard normal distribution:
- Two-tailed: p-value = 2·P(Z ≥ |z|)
- Left-tailed: p-value = P(Z ≤ z)
- Right-tailed: p-value = P(Z ≥ z)
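All three p-value calculations can be sketched with Python's standard-library normal distribution (z carried over from the running example):

```python
from statistics import NormalDist

z = -0.53          # test statistic from the running example
Z = NormalDist()   # standard normal, mean 0 and standard deviation 1

p_two = 2 * (1 - Z.cdf(abs(z)))   # two-tailed
p_left = Z.cdf(z)                 # left-tailed
p_right = 1 - Z.cdf(z)            # right-tailed
print(round(p_two, 3), round(p_left, 3), round(p_right, 3))
```

A two-tailed p-value near 0.6 is far above any common α, so this example would fail to reject H₀.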

Interpreting Two-Proportion z-Tests
When you reject :
If the p-value is less than α, you conclude there is sufficient evidence of a significant difference between p₁ and p₂, in the direction your Hₐ specifies.
Example: Testing Hₐ: p₁ > p₂ at α = 0.05. You get a p-value of 0.02. Since 0.02 < 0.05, reject H₀ and conclude there is sufficient evidence that p₁ is greater than p₂.
When you fail to reject :
If the p-value is greater than or equal to α, you conclude there is not enough evidence to support a significant difference.
Example: Testing Hₐ: p₁ ≠ p₂ at α = 0.05. You get a p-value of 0.07. Since 0.07 ≥ 0.05, fail to reject H₀. There is not sufficient evidence of a difference between p₁ and p₂.
A critical distinction: failing to reject H₀ does not prove that p₁ = p₂. It only means your data weren't convincing enough to rule out equality.
Contextual interpretation matters. Always tie your conclusion back to the real-world scenario. For instance, if you're comparing defective product rates at two plants and find a significant difference, that signals a need to investigate quality control at the worse-performing plant.
Also consider:
- Practical significance vs. statistical significance. A statistically significant difference might be very small in absolute terms. Evaluate the effect size (the actual difference p̂₁ − p̂₂) to judge whether the difference matters in practice.
- Study limitations. Non-random sampling, differences in data collection between groups, or confounding variables can all undermine the reliability of your results.
Additional Considerations
- You can construct a confidence interval for p₁ − p₂ to estimate the range of plausible values for the true difference. Note: the CI formula uses the individual sample proportions in the standard error, not the pooled proportion (since you're no longer assuming H₀ is true).
- A power analysis before collecting data helps you determine the sample size needed to detect a meaningful difference at your chosen α.
- The chi-square test for independence is an equivalent alternative when comparing two proportions with a two-sided alternative. For a 2×2 contingency table, the chi-square statistic equals z² from the two-proportion z-test, and the results will match.
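The first and last of these points can be sketched in code, using the running example's numbers (variable names are illustrative): the confidence interval uses the unpooled standard error, and the Pearson chi-square statistic for the 2×2 table reproduces z².

```python
import math
from statistics import NormalDist

x1, n1 = 30, 100  # sample 1: successes, sample size
x2, n2 = 40, 120  # sample 2

p1, p2 = x1 / n1, x2 / n2

# 95% CI for p1 - p2: note the UNpooled standard error.
z_crit = NormalDist().inv_cdf(0.975)
se_unpooled = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
lo = (p1 - p2) - z_crit * se_unpooled
hi = (p1 - p2) + z_crit * se_unpooled

# Two-proportion z statistic (pooled SE) vs. Pearson chi-square.
p_c = (x1 + x2) / (n1 + n2)
se_pooled = math.sqrt(p_c * (1 - p_c) * (1 / n1 + 1 / n2))
z = (p1 - p2) / se_pooled

table = [[x1, n1 - x1], [x2, n2 - x2]]
total = n1 + n2
rows = [sum(r) for r in table]
cols = [table[0][0] + table[1][0], table[0][1] + table[1][1]]
chi2 = sum((table[i][j] - rows[i] * cols[j] / total) ** 2
           / (rows[i] * cols[j] / total)
           for i in range(2) for j in range(2))

print((round(lo, 3), round(hi, 3)))         # CI straddles 0, consistent with the test
print(round(z ** 2, 6) == round(chi2, 6))   # chi-square equals z squared
```

Because the interval contains 0, it agrees with the z-test's failure to reject H₀ at α = 0.05.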