Confidence Intervals for Population Proportions
Confidence intervals for population proportions let you estimate the true percentage of a characteristic in a population using sample data. Instead of reporting a single number, you calculate a range that likely contains the actual proportion, which gives you a built-in measure of how uncertain your estimate is.
Confidence Intervals for Proportions
A confidence interval for a proportion gives you a range of plausible values for the true population proportion (the parameter you can't directly measure) at a specified confidence level, such as 90%, 95%, or 99%.
To build one, you need three ingredients:
- Sample proportion ($\hat{p}$): the proportion of successes in your sample
- Sample size ($n$): how many observations you collected
- Critical value ($z^*$): the z-score that corresponds to your chosen confidence level (1.645 for 90%, 1.96 for 95%, 2.576 for 99%)
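The formula is:
$$\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
The square-root term, $\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$, is the standard error of the sample proportion. It measures how much $\hat{p}$ would typically vary from sample to sample. The entire $z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$ portion is the margin of error.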
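As a rough illustration, the formula can be computed in a few lines of Python. This is a minimal sketch, not a reference implementation: the function name `proportion_ci` and the example counts (550 successes out of 1,000) are made up for illustration, and the critical value comes from the standard library's `statistics.NormalDist` rather than a rounded table value.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, confidence=0.95):
    """Return (lower, upper) for a one-sample z-interval for a proportion."""
    p_hat = successes / n
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    standard_error = sqrt(p_hat * (1 - p_hat) / n)
    margin_of_error = z_star * standard_error
    return p_hat - margin_of_error, p_hat + margin_of_error


# Hypothetical sample: 550 successes out of 1,000 observations.
low, high = proportion_ci(550, 1000)
print(f"95% CI: ({low:.3f}, {high:.3f})")  # roughly (0.519, 0.581)
```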
Before you use this formula, three conditions must be met:
- Random sample: The data must come from a random sampling method so results generalize to the population.
- Independence (10% condition): The population must be at least 10 times as large as the sample ($N \geq 10n$), so that sampling without replacement does not meaningfully break the independence of the observations.
- Success-failure condition: Both $n\hat{p} \geq 10$ and $n(1-\hat{p}) \geq 10$. This ensures the sampling distribution of $\hat{p}$ is approximately normal, which is what justifies using a z-score.
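The two numeric conditions can be checked mechanically before computing an interval. The sketch below assumes a hypothetical helper named `check_conditions` and made-up counts. Whether the sample is random depends on how the data were collected, so code cannot verify that condition.

```python
def check_conditions(successes, n, population_size=None):
    """Check the two conditions that can be verified numerically."""
    failures = n - successes
    return {
        # 10% condition; None means the population size is unknown.
        "independence": (
            None if population_size is None else population_size >= 10 * n
        ),
        # n * p_hat is the success count; n * (1 - p_hat) is the failure count.
        "success_failure": successes >= 10 and failures >= 10,
    }


# Hypothetical samples drawn from a population of 50,000.
print(check_conditions(550, 1000, population_size=50_000))
print(check_conditions(3, 40, population_size=50_000))  # too few successes
```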

Interpretation of Proportion Intervals
Getting the right interpretation matters a lot on exams. Suppose a 95% confidence interval for voter support of a candidate is (0.52, 0.58). The correct interpretation is: We are 95% confident that the true proportion of voters who support this candidate is between 0.52 and 0.58.
What "95% confident" really means: if you repeated this sampling process many times and built an interval each time, about 95% of those intervals would capture the true proportion. It does not mean there's a 95% probability the true proportion is in this specific interval.
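The repeated-sampling idea can be illustrated with a small simulation. This is a sketch under stated assumptions: the true proportion (0.55), sample size (500), and number of repetitions (2,000) are arbitrary choices, and `coverage_rate` is a hypothetical name. In a real study the true proportion is unknown; it is fixed here only so the intervals can be checked against it.

```python
import random
from math import sqrt


def coverage_rate(true_p, n, repetitions, z_star=1.96, seed=0):
    """Fraction of simulated 95% intervals that contain true_p."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(repetitions):
        # Draw one sample of size n and count the successes.
        successes = sum(rng.random() < true_p for _ in range(n))
        p_hat = successes / n
        margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - margin <= true_p <= p_hat + margin:
            hits += 1
    return hits / repetitions


# Assumed scenario: true proportion 0.55, samples of 500, 2,000 repetitions.
print(coverage_rate(true_p=0.55, n=500, repetitions=2000))
```

The printed fraction should land close to 0.95. Each individual interval either contains 0.55 or it does not, so the 95% describes the procedure rather than any single interval.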
A few practical takeaways about interval width:
- A narrow interval means your estimate is more precise (usually from a large sample or a proportion near 0 or 1).
- A wide interval means more uncertainty (usually from a small sample or a proportion near 0.5).
- If confidence intervals for two groups overlap substantially, that suggests the difference between them may not be statistically significant.
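To see how sample size and the sample proportion drive interval width, the following sketch tabulates the width for a few hypothetical combinations. The helper name `margin_of_error` and the chosen values are assumptions, and the 95% critical value is hard-coded as 1.96.

```python
from math import sqrt


def margin_of_error(p_hat, n, z_star=1.96):
    """Half-width of the z-interval for a proportion."""
    return z_star * sqrt(p_hat * (1 - p_hat) / n)


# Hypothetical combinations of sample proportion and sample size.
for p_hat in (0.1, 0.5):
    for n in (100, 400, 1600):
        width = 2 * margin_of_error(p_hat, n)
        print(f"p_hat={p_hat}, n={n}: width = {width:.3f}")
```

Quadrupling the sample size halves the width, because the sample size sits under a square root in the standard error.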

Sample Size for Proportion Estimates
Sometimes you need to plan ahead: How large a sample do I need to get a margin of error no bigger than some target? The formula for the required sample size is:
$$n = \left(\frac{z^*}{E}\right)^2 \hat{p}(1-\hat{p})$$
where $E$ is your desired margin of error and $\hat{p}$ is your best guess at the population proportion.
Steps to use this:
- Choose your confidence level and find the corresponding $z^*$.
- Set your desired margin of error (for example, 0.03 for ±3%).
- Plug in an estimate for $\hat{p}$. If you have no prior information, use $\hat{p} = 0.5$, which is the most conservative choice because $\hat{p}(1-\hat{p})$ is maximized at 0.5, giving you the largest (safest) sample size.
- Calculate $n$ and always round up to the next whole number. Rounding down would give you a margin of error slightly larger than your target.
For example, to estimate a proportion within ±4% at 95% confidence with no prior estimate: $n = \left(\frac{1.96}{0.04}\right)^2 (0.5)(0.5) = 600.25$, so you'd need $n = 601$.
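The planning steps above can be sketched as a short Python function. The name `required_sample_size` and its parameters are assumptions made for illustration. The function uses the unrounded critical value (about 1.95996 rather than 1.96), so the intermediate result differs slightly from a hand calculation, though it rounds up to the same answer in this example.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(margin_of_error, confidence=0.95, p_guess=0.5):
    """Smallest n whose margin of error does not exceed the target."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = (z_star / margin_of_error) ** 2 * p_guess * (1 - p_guess)
    return ceil(n)  # always round up, never down


print(required_sample_size(0.04))               # 601
print(required_sample_size(0.04, p_guess=0.3))  # smaller, thanks to the prior guess
```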
Connection to Statistical Inference and Hypothesis Testing
Confidence intervals are one of two main tools in statistical inference (the other being hypothesis testing). Both use sample data to draw conclusions about a population, but they approach the question differently.
- A confidence interval estimates what the parameter might be by giving a range of plausible values.
- A hypothesis test asks whether the parameter equals a specific value (the null hypothesis) and uses a significance level ($\alpha$) as the threshold for rejecting that claim.
These tools are connected: if a hypothesized value falls outside your confidence interval, you would reject the null hypothesis in a two-sided test at the corresponding significance level. For instance, if a 95% confidence interval for a proportion is (0.52, 0.58), you would reject $H_0: p = 0.50$ at $\alpha = 0.05$ because 0.50 is not in the interval.
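The link between the two tools can be checked numerically. The sketch below uses a hypothetical sample (581 successes out of 1,056) chosen so the 95% interval comes out close to (0.52, 0.58); `interval_and_test` is an assumed name. It computes the interval and a two-sided one-proportion z-test side by side.

```python
from math import sqrt
from statistics import NormalDist


def interval_and_test(successes, n, p_null, alpha=0.05):
    """Return the (1 - alpha) interval and the two-sided z-test p-value."""
    p_hat = successes / n
    z_star = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    interval = (p_hat - margin, p_hat + margin)
    # The test's standard error uses the null value, not p_hat.
    z_stat = (p_hat - p_null) / sqrt(p_null * (1 - p_null) / n)
    p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))
    return interval, p_value


interval, p_value = interval_and_test(581, 1056, p_null=0.50)
print(f"interval = ({interval[0]:.3f}, {interval[1]:.3f})")
print(f"p-value = {p_value:.4f}, reject H0: {p_value < 0.05}")
```

Because the interval's standard error is built from the sample proportion while the test's is built from the null value, the two can occasionally disagree when the hypothesized value sits very near an endpoint of the interval.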