Fiveable

📊Honors Statistics Unit 8 Review

8.3 A Population Proportion

Written by the Fiveable Content Team • Last updated August 2025

Confidence Intervals and Sample Size for Population Proportions

Confidence intervals for proportions

When you collect sample data, you rarely know the true population proportion $p$. A confidence interval gives you a range of plausible values for $p$ based on what you observed in your sample.

The formula for a confidence interval for a population proportion:

$$\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

  • $\hat{p}$ is the sample proportion (number of successes divided by $n$)
  • $z^*$ is the critical value from the standard normal distribution, determined by your confidence level
  • $n$ is the sample size

Common critical values you should know:

| Confidence Level | $z^*$ |
| --- | --- |
| 90% | 1.645 |
| 95% | 1.96 |
| 99% | 2.576 |
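
The critical values above can be recovered from the standard normal distribution rather than memorized. A minimal Python sketch, assuming the standard library's `statistics.NormalDist` (Python 3.8 or later); the helper name `z_star` is made up for illustration:

```python
from statistics import NormalDist

def z_star(confidence_level):
    """Critical value that leaves (1 - confidence_level) / 2 in each tail."""
    tail_area = (1 - confidence_level) / 2
    # inv_cdf returns the z-value with the given area to its left.
    return NormalDist().inv_cdf(1 - tail_area)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z* = {z_star(level):.3f}")
```

The two-tailed split is the key step: a 95% level leaves 2.5% in each tail, so the lookup uses the 97.5th percentile.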

Conditions that must be met before using this formula:

  1. Random sample: The data must come from a random sampling method.
  2. Independence (10% condition): The population must be at least 10 times larger than the sample size ($N \geq 10n$). This ensures that sampling without replacement doesn't meaningfully affect the results.
  3. Success-failure condition (Large Counts): $n\hat{p} \geq 10$ and $n(1-\hat{p}) \geq 10$. This is the condition that actually justifies using the normal approximation. A generic "$n \geq 30$" rule does not apply here; what matters is having enough successes and enough failures in your sample.

The Central Limit Theorem is what makes this work: when these conditions are satisfied, the sampling distribution of $\hat{p}$ is approximately normal, centered at $p$ with standard deviation $\sqrt{\frac{p(1-p)}{n}}$.
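
As a rough illustration, the interval formula and the two checkable conditions can be put into one small Python function. The function name `proportion_ci`, its parameters, and the sample counts (520 successes out of 1000) are invented for this sketch. The random-sample condition depends on how the data were collected, so code cannot verify it.

```python
import math

def proportion_ci(successes, n, z_star=1.96, population_size=None):
    """Return (low, high) for a one-proportion z-interval after checking conditions."""
    failures = n - successes
    # Success-failure (Large Counts) condition: n*p_hat and n*(1 - p_hat) are
    # just the counts of successes and failures.
    if successes < 10 or failures < 10:
        raise ValueError("need at least 10 successes and 10 failures")
    # 10% condition, checked only when the population size is known.
    if population_size is not None and population_size < 10 * n:
        raise ValueError("population should be at least 10 times the sample size")
    p_hat = successes / n
    margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Hypothetical poll: 520 of 1000 respondents answer "yes".
low, high = proportion_ci(successes=520, n=1000)
print(f"95% CI: ({low:.3f}, {high:.3f})")
```

Raising an error when a condition fails is one design choice; returning a warning alongside the interval would work too.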

Interpretation of margin of error

The margin of error (ME) is the "$\pm$" part of your confidence interval:

$$ME = z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

It tells you how far your sample proportion could reasonably be from the true population proportion. A smaller margin of error means a more precise estimate. For example, a poll reporting "52% ± 2%" is much more useful than one reporting "52% ± 8%."

The standard error of the sample proportion is the piece inside the margin of error without the $z^*$:

$$SE = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

This measures the typical amount that $\hat{p}$ varies from sample to sample. The margin of error is just the standard error scaled up by $z^*$.

Three factors control the width of your interval:

  • Confidence level: Higher confidence means a larger $z^*$, which widens the interval. You're casting a wider net to be more sure you've captured $p$.
  • Sample size: Larger $n$ shrinks the standard error, narrowing the interval. Notice that $n$ is under a square root, so you need to quadruple the sample size to cut the margin of error in half.
  • Sample proportion: The product $\hat{p}(1-\hat{p})$ is largest when $\hat{p} = 0.5$, so proportions near 50% produce the widest intervals.
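
These three effects are easy to see numerically. The short Python sketch below uses made-up inputs, and the function names `standard_error` and `margin_of_error` are illustrative only. It changes one factor at a time from a baseline of $\hat{p} = 0.5$, $n = 400$, and 95% confidence.

```python
import math

def standard_error(p_hat, n):
    """Typical sample-to-sample variation of p_hat."""
    return math.sqrt(p_hat * (1 - p_hat) / n)

def margin_of_error(p_hat, n, z_star=1.96):
    """Standard error scaled up by the critical value."""
    return z_star * standard_error(p_hat, n)

baseline = margin_of_error(0.5, 400)
print("baseline:            ", round(baseline, 4))
print("n quadrupled (1600): ", round(margin_of_error(0.5, 1600), 4))
print("99% confidence:      ", round(margin_of_error(0.5, 400, z_star=2.576), 4))
print("p_hat = 0.2:         ", round(margin_of_error(0.2, 400), 4))
```

Quadrupling $n$ halves the margin, raising the confidence level widens it, and moving $\hat{p}$ away from 0.5 narrows it.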

Interpreting a confidence level correctly: A 95% confidence level means that if you repeated the sampling process many times and built a confidence interval each time, about 95% of those intervals would contain the true population proportion. It does not mean there's a 95% probability that $p$ falls in your particular interval.
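
The "repeated sampling" reading can be checked with a small simulation. This Python sketch is illustrative only: the true proportion (0.4), sample size (200), number of repetitions, and function names are all arbitrary assumptions. Unlike in real life, the simulation knows the true $p$, so it can count how often the interval captures it.

```python
import math
import random

def wald_interval(successes, n, z_star=1.96):
    """Interval p_hat +/- z* * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n
    margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

def coverage(true_p, n, reps, seed=0):
    """Fraction of simulated samples whose interval contains true_p."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(reps):
        # Draw one random sample of size n and count the successes.
        successes = sum(rng.random() < true_p for _ in range(n))
        low, high = wald_interval(successes, n)
        if low <= true_p <= high:
            hits += 1
    return hits / reps

print("share of intervals containing p:", coverage(true_p=0.4, n=200, reps=2000))
```

The printed share should land near 0.95, though usually not exactly on it. Each individual interval either contains $p$ or it doesn't; the 95% describes the long-run behavior of the method.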

Sample size for proportion estimates

Before collecting data, you often need to determine how large your sample should be to achieve a desired margin of error $E$. Setting the margin of error formula equal to $E$ and solving for $n$:

$$n = \frac{(z^*)^2\,\hat{p}(1-\hat{p})}{E^2}$$

Here's how to use this in practice:

  1. Choose your desired confidence level and find the corresponding $z^*$.
  2. Choose your desired margin of error $E$ (e.g., 0.03 for ± 3%).
  3. Plug in an estimate for $\hat{p}$. If you have a prior study or pilot data, use that value. If you have no prior estimate, use $\hat{p} = 0.5$. This is the conservative choice because $0.5 \times 0.5 = 0.25$ is the maximum value of $\hat{p}(1-\hat{p})$, so it guarantees your sample will be large enough.
  4. Always round up to the next whole number. If you get $n = 1067.1$, you need 1068 people.

Example: You want a 95% confidence interval with a margin of error of 4%, and you have no prior estimate of $\hat{p}$.

$$n = \frac{(1.96)^2(0.5)(0.5)}{(0.04)^2} = \frac{(3.8416)(0.25)}{0.0016} = \frac{0.9604}{0.0016} = 600.25$$

You'd need a sample of at least 601 people.
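
The same calculation can be wrapped in a short Python helper. This is only a sketch: the name `required_sample_size` and its parameters are invented here, and it defaults to the conservative guess of 0.5 when no prior estimate is supplied.

```python
import math

def required_sample_size(margin, z_star=1.96, p_guess=0.5):
    """Smallest whole n whose margin of error is at most `margin`."""
    raw = z_star ** 2 * p_guess * (1 - p_guess) / margin ** 2
    # Always round up: rounding down would leave the margin slightly too wide.
    return math.ceil(raw)

print(required_sample_size(0.04))               # the 4% example above
print(required_sample_size(0.03))               # a tighter 3% margin
print(required_sample_size(0.03, p_guess=0.3))  # with a prior estimate of 0.3
```

Supplying a prior estimate away from 0.5 lowers the required sample size, which is why pilot data can save effort.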

Finite population correction: If your population size $N$ is known and your calculated sample size $n$ is more than 5% of $N$, apply this adjustment:

$$n_{\text{adj}} = \frac{n}{1 + \frac{n-1}{N}}$$

This reduces the required sample size because sampling a large fraction of a finite population gives you more information than the standard formula accounts for. For instance, if $N = 2000$ and your initial calculation gives $n = 601$:

$$n_{\text{adj}} = \frac{601}{1 + \frac{600}{2000}} = \frac{601}{1.3} \approx 462.3$$

So you'd only need 463 people.
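
A minimal Python sketch of the adjustment, including the 5% rule of thumb for when to apply it. The function name `finite_population_adjust` is hypothetical.

```python
import math

def finite_population_adjust(n, population_size):
    """Apply the finite population correction when n exceeds 5% of N."""
    if n <= 0.05 * population_size:
        return n  # sample is a small share of the population; no adjustment
    adjusted = n / (1 + (n - 1) / population_size)
    return math.ceil(adjusted)  # round up, as with any sample size

print(finite_population_adjust(601, 2000))     # the example above
print(finite_population_adjust(601, 1000000))  # large population: unchanged
```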

Statistical Inference and Hypothesis Testing

Confidence intervals are one form of statistical inference, which means using sample data to draw conclusions about a population. The other major form is hypothesis testing, where you test a specific claim about a population parameter.

In hypothesis testing for proportions, you formulate a null hypothesis (e.g., $H_0: p = 0.5$) and an alternative hypothesis, then use sample data to calculate a test statistic and p-value. The p-value tells you how likely your observed result (or something more extreme) would be if the null hypothesis were true.

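Confidence intervals and hypothesis tests are closely related: if a hypothesized value of $p$ falls outside your 95% confidence interval, a two-sided test would reject that value at the $\alpha = 0.05$ significance level. For proportions the match is approximate rather than exact, because the test computes its standard error from the hypothesized value while the interval uses $\hat{p}$.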
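
To make the connection concrete, here is a hedged Python sketch of a two-sided one-proportion z-test next to the matching 95% interval. The data (560 successes in 1000 trials) and the function names are invented for illustration. The test statistic uses the standard form $z = (\hat{p} - p_0) / \sqrt{p_0(1-p_0)/n}$, which this section does not spell out.

```python
import math
from statistics import NormalDist

def one_proportion_z_test(successes, n, p0):
    """Two-sided z-test of H0: p = p0. Returns (z, p_value)."""
    p_hat = successes / n
    se_null = math.sqrt(p0 * (1 - p0) / n)  # standard error assuming H0 is true
    z = (p_hat - p0) / se_null
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

def wald_interval(successes, n, z_star=1.96):
    """Confidence interval using p_hat in the standard error."""
    p_hat = successes / n
    margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Hypothetical data: 560 successes in 1000 trials, testing H0: p = 0.5.
z, p_value = one_proportion_z_test(560, 1000, p0=0.5)
low, high = wald_interval(560, 1000)
print(f"z = {z:.2f}, p-value = {p_value:.5f}")
print(f"95% CI: ({low:.3f}, {high:.3f})")
print("reject H0 at alpha = 0.05:", p_value < 0.05)
print("0.5 outside the interval:", not (low <= 0.5 <= high))
```

With these numbers both approaches agree: the p-value is below 0.05 and 0.5 lies outside the interval.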