Confidence intervals let you quantify how uncertain a sample-based estimate really is. Every time you estimate a parameter from data (a mean strength, a failure rate, a proportion of defective parts), you need to communicate how much that estimate could vary. CIs do exactly that, connecting directly to concepts like sampling distributions, the Central Limit Theorem, and hypothesis testing.
Don't just memorize the formulas. Know when each interval type applies and why the mechanics work. Can you explain why larger samples shrink your interval? Why you switch from Z to t? Why a 99% CI is wider than a 95% CI? These conceptual questions show up constantly, and the formulas alone won't save you. Master the underlying logic: confidence level trade-offs, sample size effects, distributional assumptions, and the duality between CIs and hypothesis tests.
Before diving into specific interval formulas, you need a solid understanding of what a confidence interval actually means and what controls its width. The confidence level determines how often the procedure captures the true parameter; the margin of error determines how precise your estimate is.
A confidence interval is a range of plausible values for an unknown population parameter, constructed from sample data. It is not a probability statement about where the parameter lies after you've computed it.
The confidence level (e.g., 95%) means that if you repeated the sampling procedure many times, about 95% of the resulting intervals would contain the true parameter. The width of the interval reflects your uncertainty: wider intervals mean less precision, narrower intervals mean your point estimate is more tightly pinned down.
The margin of error (ME) is the "±" portion of your estimate, equal to half the total interval width. It quantifies the maximum expected sampling error.
ME depends on three factors: the critical value (set by the confidence level), the variability of the data (σ or s), and the sample size n.
In practice, you'll often specify a required ME first, then solve for the sample size needed to achieve it.
Compare: Confidence level vs. margin of error. Both affect interval width, but confidence level is chosen based on risk tolerance while margin of error reflects data quality. If a problem asks you to "improve precision," think sample size. If it asks about "reliability of the procedure," think confidence level.
The critical decision in constructing any CI is selecting the appropriate sampling distribution. Your choice depends on what you know about the population and how large your sample is.
Use the Z-distribution when σ (the population standard deviation) is known and either the population is normal or n ≥ 30 (so the CLT applies).
Z critical values are fixed for each confidence level: 1.645 for 90%, 1.960 for 95%, and 2.576 for 99%.
This situation is common in quality control settings where historical process data provides a reliable value of σ.
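These fixed critical values come straight from the inverse CDF of the standard normal distribution, which Python's standard library exposes. A minimal sketch (the helper name `z_critical` is ours, not a library function):

```python
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided critical value z_{alpha/2} for a given confidence level."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z = {z_critical(level):.3f}")
```

Running this reproduces the table values: 1.645, 1.960, and 2.576.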
Use the t-distribution when σ is unknown and you estimate it with the sample standard deviation s.
Degrees of freedom (df = n − 1) control the shape. Smaller samples produce heavier tails, which makes the interval wider to account for the extra uncertainty in estimating σ. As n → ∞, the t-distribution converges to the Z-distribution. Using t is always valid when σ is unknown, even for large samples.
When you need a CI for a population variance σ² (not a mean), you use the χ² (chi-squared) distribution. This works because the quantity (n − 1)s²/σ² follows a chi-squared distribution with n − 1 degrees of freedom when the data is normal.
The resulting interval is asymmetric:

(n − 1)s²/χ²_{α/2} ≤ σ² ≤ (n − 1)s²/χ²_{1−α/2}

Note the "flipped" critical values in the denominators: the larger (upper-tail) χ² value produces the lower limit. The normality assumption is critical here; variance intervals are much more sensitive to non-normality than mean intervals.
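A quick numeric sketch of the mechanics. The sample values here are hypothetical, and the χ² critical values are hardcoded from a standard table (for df = 9) since Python's standard library has no chi-squared inverse CDF:

```python
# 95% CI for a population variance with n = 10 (df = 9).
# Chi-squared critical values from a table (assumed inputs):
chi2_upper = 19.023   # chi^2_{0.025, 9}
chi2_lower = 2.700    # chi^2_{0.975, 9}

n, s2 = 10, 4.0       # hypothetical sample size and sample variance

# Note the "flip": the larger critical value gives the LOWER limit.
lo = (n - 1) * s2 / chi2_upper
hi = (n - 1) * s2 / chi2_lower
print(f"95% CI for sigma^2: ({lo:.3f}, {hi:.3f})")
```

The interval is visibly asymmetric around s² = 4: it stretches much further above the point estimate than below it.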
Compare: Z vs. t intervals. Both estimate population means, but t accounts for the extra uncertainty from estimating σ. On exams, if σ is given, use Z. If you see "sample standard deviation" or just s, use t.
These are your workhorse formulas. The structure is always the same: point estimate ± (critical value) × (standard error).
The interval is x̄ ± z_{α/2}·(σ/√n). The term σ/√n is the standard error of the mean. This formula requires the sampling distribution of x̄ to be approximately normal, which holds when the population is normal or n ≥ 30.
This scenario is rare in practice since σ is seldom truly known, but it's common on exams to test your understanding of the Z framework.
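A minimal sketch of the known-σ interval, with hypothetical numbers (x̄ = 52.1, σ = 3.0, n = 36):

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, sigma, n, confidence=0.95):
    """CI for a mean when the population standard deviation is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    me = z * sigma / sqrt(n)          # margin of error
    return xbar - me, xbar + me

# Hypothetical example: historical sigma = 3.0, sample of 36 parts
lo, hi = z_interval(52.1, 3.0, 36)
print(f"95% CI: ({lo:.3f}, {hi:.3f})")
```

Note that everything except x̄ is fixed before the data arrive: σ is historical and n is chosen, so the width is known in advance.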
The interval is x̄ ± t_{α/2, n−1}·(s/√n). Here s replaces σ, and the t critical value depends on df = n − 1. Always check your t-table or calculator with the correct degrees of freedom.
This is the default real-world scenario. When in doubt on an exam and σ isn't explicitly given, use t.
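The same mechanics with a t critical value; since the Python standard library has no t-distribution, the critical value is taken from a t-table, and the sample numbers are hypothetical:

```python
from math import sqrt

# Critical value from a t-table: t_{0.025, 11} = 2.201 (n = 12, df = 11)
t_crit = 2.201

# Hypothetical sample: n = 12, xbar = 405.0 MPa, s = 8.4 MPa
n, xbar, s = 12, 405.0, 8.4

me = t_crit * s / sqrt(n)             # margin of error
lo, hi = xbar - me, xbar + me
print(f"95% CI: ({lo:.2f}, {hi:.2f})")
```

With only 12 observations, t_{0.025,11} = 2.201 is noticeably larger than z_{0.025} = 1.960, so the interval is wider than the Z version would suggest.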
For independent samples with known variances:

(x̄₁ − x̄₂) ± z_{α/2}·√(σ₁²/n₁ + σ₂²/n₂)
For unknown variances (the more common case), use a t-interval (pooled or Welch's); in Welch's form:

(x̄₁ − x̄₂) ± t_{α/2, df}·√(s₁²/n₁ + s₂²/n₂)
If the resulting interval contains zero, you cannot conclude the means differ at that confidence level. This connects directly to two-sample hypothesis tests.
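A sketch of the unknown-variance case with hypothetical data from two processes; since both samples are large, the z critical value is used as a stand-in for the Welch t value:

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical summary statistics for two processes
x1, s1, n1 = 20.5, 2.1, 50
x2, s2, n2 = 19.2, 2.4, 60

# Standard error of the difference combines both samples' variability
se = sqrt(s1**2 / n1 + s2**2 / n2)
z = NormalDist().inv_cdf(0.975)       # ~1.96 for 95%

lo = (x1 - x2) - z * se
hi = (x1 - x2) + z * se
contains_zero = lo <= 0 <= hi
print(f"95% CI for mu1 - mu2: ({lo:.3f}, {hi:.3f}); contains 0: {contains_zero}")
```

Here the interval excludes zero, so at the 95% level the data support a real difference between the two process means.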
Compare: Single mean vs. difference of means. Both use the same logic (point estimate ยฑ ME), but the standard error for differences combines variability from both samples. Exam problems frequently ask whether zero falls in the difference interval.
Proportion intervals apply when your data is categorical (success/failure, defective/non-defective). The normal approximation requires checking sample size conditions before you use these formulas.
p̂ ± z_{α/2}·√(p̂(1 − p̂)/n), where p̂ is the sample proportion.
Check conditions before using this formula: np̂ ≥ 10 and n(1 − p̂) ≥ 10. These ensure the normal approximation to the binomial is reasonable. This interval is used extensively to estimate defect rates, failure probabilities, and compliance percentages.
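A sketch that builds the interval and enforces the normal-approximation check first (the defect counts are hypothetical, and the threshold of 10 successes/failures is one common rule of thumb):

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(x, n, confidence=0.95):
    """Normal-approximation (Wald) CI for a proportion.

    Refuses to run when n*p_hat or n*(1 - p_hat) is below 10,
    a common rule of thumb for the approximation to be reasonable.
    """
    p = x / n
    if n * p < 10 or n * (1 - p) < 10:
        raise ValueError("normal approximation not justified")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    me = z * sqrt(p * (1 - p) / n)
    return p - me, p + me

# Hypothetical: 24 defective parts out of 400 inspected
lo, hi = proportion_interval(24, 400)
print(f"95% CI for defect rate: ({lo:.4f}, {hi:.4f})")
```

With 24 successes and 376 failures the check passes comfortably; with, say, 3 defects in 400 it would raise instead of silently producing a misleading interval.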
Both samples must independently satisfy the normal approximation conditions (np̂ ≥ 10 and n(1 − p̂) ≥ 10 for each group). If zero is in the interval, you cannot conclude the proportions differ.
Compare: Mean intervals vs. proportion intervals. Means use σ/√n or s/√n for the standard error, while proportions use √(p̂(1 − p̂)/n). The proportion formula has no separate "known vs. unknown" cases because the variance depends entirely on p̂ itself.
Often you work backwards: given a desired precision, how much data do you need? Sample size determination is about controlling the margin of error before collecting data.
Larger n shrinks the interval because the standard error has √n in the denominator. But the relationship is a square root, so the returns diminish quickly: cutting the margin of error in half requires four times the sample size. Solving the ME formula for n (σ known) gives n = (z_{α/2}·σ/ME)², rounded up to the next whole number.
This means there are real resource constraints. You must balance statistical precision against time, cost, and feasibility.
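The diminishing returns are easy to demonstrate numerically; a sketch for the known-σ case with hypothetical inputs (σ = 5.0):

```python
from math import ceil
from statistics import NormalDist

def required_n(sigma, me, confidence=0.95):
    """Smallest n that achieves margin of error <= me when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sigma / me) ** 2)

# Halving the margin of error roughly quadruples the required sample size
n1 = required_n(sigma=5.0, me=1.0)
n2 = required_n(sigma=5.0, me=0.5)
print(n1, n2)
```

Because n appears under a square root in the ME formula, each additional unit of precision costs quadratically more data, which is exactly where the cost/precision trade-off bites.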
Two-sided intervals (the most common type) bound the parameter above and below: "we're 95% confident μ is between A and B."
One-sided intervals provide only an upper or lower bound: "we're 95% confident μ is at least A" or "we're 95% confident μ is at most B." Use one-sided intervals when the research question is directional; for example, "does the new process reduce defects?" only cares about the lower bound on improvement.
Compare: One-sided vs. two-sided. A one-sided 95% CI uses z_{0.05} = 1.645 while a two-sided 95% CI uses z_{0.025} = 1.960. One-sided intervals are narrower but only answer directional questions.
Valid inference requires meeting assumptions, and CIs connect directly to hypothesis testing. Understanding these relationships separates strong exam performance from formula memorization.
Random sampling and independence are non-negotiable. Biased samples produce meaningless intervals regardless of formula correctness.
A two-sided 100(1 − α)% CI corresponds to a two-tailed test at significance level α. If the null value falls outside the CI, you reject H₀. If it falls inside, you fail to reject.
This gives you a visual decision rule: construct the CI, then check whether the hypothesized value is inside or outside. CIs also give more information than tests alone because they show the full range of plausible values, not just a binary "reject" or "fail to reject."
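The decision rule is literally an interval-membership check; a minimal sketch with a hypothetical 95% CI:

```python
def reject_null(ci, null_value):
    """Two-tailed decision rule: reject H0 iff the hypothesized
    value lies outside the confidence interval."""
    lo, hi = ci
    return not (lo <= null_value <= hi)

ci = (47.8, 49.6)               # hypothetical 95% CI for mu
print(reject_null(ci, 50.0))    # 50 outside -> reject at alpha = 0.05
print(reject_null(ci, 48.0))    # 48 inside  -> fail to reject
```

The CI carries more information than the yes/no answer: the same interval tells you every null value you would and would not reject at that α.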
Never say "there's a 95% probability the parameter is in this interval." The parameter is fixed; the interval is what's random across repeated samples.
Correct interpretation: "We are 95% confident that [parameter] lies between [lower bound] and [upper bound]."
When reporting, state the confidence level, sample size, assumptions you checked, and what the result means in context.
Compare: CIs vs. hypothesis tests. Both make inferences about parameters, but CIs provide a range of plausible values while tests give binary decisions. If a problem asks you to "test whether μ = 50" and you've already built a 95% CI, just check if 50 is inside.
| Concept | Best Examples |
|---|---|
| Known σ (use Z) | CI for mean with historical process data, large-sample proportion intervals |
| Unknown σ (use t) | CI for mean with sample std dev, small-sample studies |
| Proportion intervals | Defect rate estimation, reliability percentages, A/B testing |
| Difference intervals | Comparing two processes, treatment vs. control, before/after studies |
| Variance/spread | CI for population variance using chi-squared distribution |
| Sample size planning | Margin of error requirements, precision vs. cost trade-offs |
| CI-hypothesis test link | Using CI to make reject/fail-to-reject decisions |
| One-sided vs. two-sided | Directional research questions vs. general estimation |
You're estimating the mean tensile strength of a new alloy. You have a small sample of specimens and calculate the sample mean and sample standard deviation in MPa. Should you use a Z or t interval, and why does this choice affect your interval width?
Compare the CI for a single proportion to the CI for the difference between two proportions. What additional complexity arises in the two-sample case, and what conditions must both samples satisfy?
An engineer wants to estimate a defect rate with a margin of error of ±2% at 95% confidence. If a pilot study suggests a value for p̂, approximately what sample size is needed? How would this change if you instead used the conservative value p̂ = 0.5?
You construct a 95% CI for μ and obtain an interval that excludes the hypothesized value μ₀. What can you conclude about a two-tailed hypothesis test of H₀: μ = μ₀ at α = 0.05? What if the interval contained μ₀?
Explain why a 99% confidence interval is wider than a 95% confidence interval for the same data. If you needed to maintain the same width while increasing confidence from 95% to 99%, what would you have to change?