When you collect sample data, you rarely know the true population proportion $p$. A confidence interval gives you a range of plausible values for $p$ based on what you observed in your sample.

The formula for a confidence interval for a population proportion:

$\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$

$\hat{p}$ is the sample proportion (number of successes divided by $n$)
$z^*$ is the critical value from the standard normal distribution, determined by your confidence level
$n$ is the sample size

Common critical values you should know:

Confidence Level	$z^*$
90%	1.645
95%	1.96
99%	2.576
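
The critical values in the table can be reproduced rather than memorized. The sketch below assumes Python's standard-library `statistics.NormalDist`; the helper name `critical_value` is made up for illustration.

```python
from statistics import NormalDist


def critical_value(confidence: float) -> float:
    """Two-sided z* for a confidence level given as a fraction (e.g. 0.95)."""
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    # A two-sided interval leaves (1 - confidence) / 2 in each tail.
    tail = (1 - confidence) / 2
    return NormalDist().inv_cdf(1 - tail)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z* = {critical_value(level):.3f}")
# 90%: z* = 1.645
# 95%: z* = 1.960
# 99%: z* = 2.576
```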

Conditions that must be met before using this formula:

Random sample: The data must come from a random sampling method.
Independence (10% condition): The population must be at least 10 times as large as the sample ($N \geq 10n$). This ensures that sampling without replacement doesn't meaningfully affect the results.
Success-failure condition (Large Counts): $n\hat{p} \geq 10$ and $n(1-\hat{p}) \geq 10$. This is the condition that actually justifies using the normal approximation. A generic "$n \geq 30$" rule does not apply here; what matters is having enough successes and enough failures in your sample.

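The Central Limit Theorem is what makes this work: when these conditions are satisfied, the sampling distribution of $\hat{p}$ is approximately normal, centered at $p$ with standard deviation $\sqrt{\frac{p(1-p)}{n}}$. Because $p$ is unknown in practice, the interval formula substitutes $\hat{p}$ for $p$ inside the square root.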
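
As a rough illustration, the formula and the success-failure check can be combined in one function. This is a minimal sketch, not a reference implementation: the function name `proportion_ci` and the data (120 successes out of 300) are invented for the example, and the random-sample and 10% conditions still have to be checked by hand because they depend on how the data were collected.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes: int, n: int, confidence: float = 0.95):
    """Return (lower, upper) for the one-sample z interval for a proportion."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    failures = n - successes
    # Success-failure condition: n*p_hat and n*(1 - p_hat) are just the
    # observed counts of successes and failures, so compare them directly.
    if successes < 10 or failures < 10:
        raise ValueError("success-failure condition not met")
    p_hat = successes / n
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Hypothetical sample: 120 successes in 300 trials, so p_hat = 0.40.
lower, upper = proportion_ci(successes=120, n=300)
print(f"95% CI: ({lower:.3f}, {upper:.3f})")  # 95% CI: (0.345, 0.455)
```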

Interpretation of margin of error

The margin of error (ME) is the "$\pm$" part of your confidence interval:

$ME = z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$

It tells you how far your sample proportion could reasonably be from the true population proportion. A smaller margin of error means a more precise estimate. For example, a poll reporting "52% ± 2%" is much more useful than one reporting "52% ± 8%."

The standard error of the sample proportion is the piece inside the margin of error without the $z^*$:

$SE = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$

This measures the typical amount that $\hat{p}$ varies from sample to sample. The margin of error is just the standard error scaled up by $z^*$.

Three factors control the width of your interval:

Confidence level: Higher confidence means a larger $z^*$, which widens the interval. You're casting a wider net to be more sure you've captured $p$.
Sample size: Larger $n$ shrinks the standard error, narrowing the interval. Notice that $n$ is under a square root, so you need to quadruple the sample size to cut the margin of error in half.
Sample proportion: The product $\hat{p}(1-\hat{p})$ is largest when $\hat{p} = 0.5$, so proportions near 50% produce the widest intervals.
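
The three factors above can be seen by changing one input at a time. The sketch below uses made-up inputs ($\hat{p} = 0.5$, $n = 400$ as the baseline) and a hypothetical helper named `margin_of_error`.

```python
from math import sqrt


def margin_of_error(p_hat: float, n: int, z_star: float = 1.96) -> float:
    """Margin of error: z* times the standard error of p_hat."""
    return z_star * sqrt(p_hat * (1 - p_hat) / n)


baseline = margin_of_error(0.5, 400)                   # 95%, n = 400, p_hat = 0.5
higher_conf = margin_of_error(0.5, 400, z_star=2.576)  # 99% confidence instead
four_times_n = margin_of_error(0.5, 1600)              # quadrupled sample size
off_center = margin_of_error(0.2, 400)                 # p_hat away from 0.5

print(f"baseline:        {baseline:.4f}")      # 0.0490
print(f"99% confidence:  {higher_conf:.4f}")   # 0.0644 (wider)
print(f"n = 1600:        {four_times_n:.4f}")  # 0.0245 (half the baseline)
print(f"p_hat = 0.2:     {off_center:.4f}")    # 0.0392 (narrower)
```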

Interpreting a confidence level correctly: A 95% confidence level means that if you repeated the sampling process many times and built a confidence interval each time, about 95% of those intervals would contain the true population proportion. It does not mean there's a 95% probability that $p$ falls in your particular interval. The value of $p$ is fixed; once an interval is computed, it either contains $p$ or it doesn't, and the 95% describes the long-run success rate of the method.
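
A small simulation can make the repeated-sampling interpretation concrete. This is only a sketch under assumed values: the true proportion (0.4), the sample size (200), the number of repetitions (2000), and the function name `coverage_rate` are all chosen for illustration. In real data you never know $p$, which is why this can only be checked by simulation.

```python
import random
from math import sqrt


def coverage_rate(p: float, n: int, reps: int,
                  z_star: float = 1.96, seed: int = 0) -> float:
    """Fraction of simulated intervals that contain the true proportion p."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(reps):
        # Draw one random sample of size n and count the successes.
        successes = sum(rng.random() < p for _ in range(n))
        p_hat = successes / n
        margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - margin <= p <= p_hat + margin:
            hits += 1
    return hits / reps


rate = coverage_rate(p=0.4, n=200, reps=2000)
print(f"{rate:.1%} of the simulated intervals contained p")
```

The printed rate should land close to 95%, though not exactly on it: the normal approximation is only approximate, and a finite number of repetitions adds its own noise.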

Sample size for proportion estimates

Before collecting data, you often need to determine how large your sample should be to achieve a desired margin of error $E$. Setting the margin of error formula equal to $E$ and solving for $n$:

$n = \frac{(z^*)^2\,\hat{p}(1-\hat{p})}{E^2}$

Here's how to use this in practice:

Choose your desired confidence level and find the corresponding $z^*$.
Choose your desired margin of error $E$ (e.g., 0.03 for ±3%).
Plug in an estimate for $\hat{p}$. If you have a prior study or pilot data, use that value. If you have no prior estimate, use $\hat{p} = 0.5$. This is the conservative choice because $0.5 \times 0.5 = 0.25$ is the maximum value of $\hat{p}(1-\hat{p})$, so it guarantees your sample will be large enough.
Always round up to the next whole number. If you get $n = 1067.1$, you need 1068 people.

Example: You want a 95% confidence interval with a margin of error of 4%, and you have no prior estimate of the proportion, so you use $\hat{p} = 0.5$.

$n = \frac{(1.96)^2(0.5)(0.5)}{(0.04)^2} = \frac{(3.8416)(0.25)}{0.0016} = \frac{0.9604}{0.0016} = 600.25$

You'd need a sample of at least 601 people.
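
The same calculation can be written as a short function. This is a sketch; the name `required_sample_size` and its parameter names are invented here, and it uses the rounded table value $z^* = 1.96$ by default so that it matches the arithmetic above.

```python
from math import ceil


def required_sample_size(margin: float, z_star: float = 1.96,
                         p_guess: float = 0.5) -> int:
    """Smallest whole-number n that achieves the desired margin of error."""
    if not 0 < margin < 1:
        raise ValueError("margin must be a proportion between 0 and 1")
    if not 0 < p_guess < 1:
        raise ValueError("p_guess must be between 0 and 1")
    n = z_star ** 2 * p_guess * (1 - p_guess) / margin ** 2
    # Always round up: rounding down would give a margin slightly above target.
    return ceil(n)


print(required_sample_size(0.04))  # 601
print(required_sample_size(0.03))  # 1068
```

Passing a `p_guess` from pilot data (anything other than 0.5) gives a smaller required sample, which is the trade-off described in the steps above.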

Finite population correction: If your population size $N$ is known and your calculated sample size $n$ is more than 5% of $N$, apply this adjustment:

$n_{\text{adj}} = \frac{n}{1 + \frac{n-1}{N}}$

This reduces the required sample size because sampling a large fraction of a finite population gives you more information than the standard formula accounts for. For instance, if $N = 2000$ and your initial calculation gives $n = 601$:

$n_{\text{adj}} = \frac{601}{1 + \frac{600}{2000}} = \frac{601}{1.3} \approx 462.3$

So you'd only need 463 people.
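
The adjustment is a one-line computation. In the sketch below, the function name `finite_population_adjust` is hypothetical, and the inputs reproduce the example above.

```python
from math import ceil


def finite_population_adjust(n: int, population: int) -> int:
    """Apply the finite population correction and round up."""
    if n <= 0 or population <= 0:
        raise ValueError("n and population must be positive")
    return ceil(n / (1 + (n - 1) / population))


print(finite_population_adjust(601, 2000))       # 463
print(finite_population_adjust(601, 1_000_000))  # 601 (correction is negligible)
```

The second call shows why the correction is usually skipped when the sample is a small fraction of the population: the adjusted value rounds back to the original $n$.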

Statistical Inference and Hypothesis Testing

Confidence intervals are one form of statistical inference, which means using sample data to draw conclusions about a population. The other major form is hypothesis testing, where you test a specific claim about a population parameter.

In hypothesis testing for proportions, you formulate a null hypothesis (e.g., $H_0: p = 0.5$ ) and an alternative hypothesis, then use sample data to calculate a test statistic and p-value. The p-value tells you how likely your observed result (or something more extreme) would be if the null hypothesis were true.

Confidence intervals and hypothesis tests are closely related: if a hypothesized value of $p$ falls outside your 95% confidence interval, a two-sided test would reject that value at the $\alpha = 0.05$ significance level.
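
To see the connection numerically, the sketch below runs a two-sided test and builds a 95% interval from the same made-up sample (120 successes out of 300). The text above does not give a formula for the test statistic, so this assumes the usual one-proportion z statistic, $z = (\hat{p} - p_0) / \sqrt{p_0(1-p_0)/n}$; the function name is also invented. Because the test computes its standard error from $p_0$ while the interval uses $\hat{p}$, the two can occasionally disagree for hypothesized values right at the edge of the interval.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes: int, n: int, p0: float):
    """Two-sided z test of H0: p = p0. Returns (z, p_value)."""
    p_hat = successes / n
    # Under H0 the standard deviation of p_hat uses p0, not p_hat (assumed form).
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


successes, n, p0 = 120, 300, 0.5  # hypothetical data and null value
z, p_value = one_proportion_z_test(successes, n, p0)

# 95% interval from the same sample, for comparison.
p_hat = successes / n
margin = 1.96 * sqrt(p_hat * (1 - p_hat) / n)
inside_interval = p_hat - margin <= p0 <= p_hat + margin

print(f"z = {z:.2f}, p-value = {p_value:.4f}")
print(f"p0 inside 95% CI: {inside_interval}; reject at 0.05: {p_value < 0.05}")
```

Here the hypothesized value 0.5 lies outside the interval and the p-value is below 0.05, so both approaches point the same way.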