Confidence intervals are the bridge between your sample data and the population you actually care about—and that's exactly what you're being tested on. Every time you calculate a CI, you're quantifying uncertainty, telling your audience "here's my best estimate, and here's how much I trust it." This connects directly to core statistical inference concepts: sampling distributions, standard error, degrees of freedom, and the trade-off between precision and confidence level.
The key insight exams test isn't whether you can plug numbers into formulas—it's whether you understand which formula to use and why. Different scenarios demand different distributions (Z, t, chi-squared, F) based on what you know about your population and what parameter you're estimating. Don't just memorize the formulas; know what assumptions each interval requires and what happens when those assumptions break down.
Estimating Single Population Means
When you're estimating a population mean from sample data, your choice of distribution depends entirely on one question: do you know the true population standard deviation, or are you estimating it from your sample?
Confidence Interval for Population Mean (Known σ)
Uses the Z-distribution—this is the rare case where you actually know the population standard deviation; the sampling distribution of the sample mean is then exactly normal for a normal population, and approximately normal for large samples by the Central Limit Theorem
Formula: $\bar{x} \pm Z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right)$, where $\bar{x}$ is the sample mean and $n$ is the sample size (a worked sketch follows this list)
Interval width shrinks with larger n—the $\sqrt{n}$ in the denominator means quadrupling your sample size cuts the margin of error in half
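To make this concrete, here is a minimal Python sketch of the known-σ interval using SciPy; the function name, the simulated sample, and σ = 4.0 are all hypothetical, invented just for illustration.

```python
import numpy as np
from scipy import stats

def z_interval_mean(sample, sigma, confidence=0.95):
    """CI for a population mean when the population std dev (sigma) is known."""
    sample = np.asarray(sample, dtype=float)
    n = len(sample)
    z_crit = stats.norm.ppf(1 - (1 - confidence) / 2)  # Z_{alpha/2}
    margin = z_crit * sigma / np.sqrt(n)               # margin of error
    return sample.mean() - margin, sample.mean() + margin

# Hypothetical example: 25 measurements, sigma known from a long history of data.
rng = np.random.default_rng(42)
data = rng.normal(loc=50, scale=4.0, size=25)
print(z_interval_mean(data, sigma=4.0))
```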
Confidence Interval for Population Mean (Unknown σ)
Uses the t-distribution—substituting the sample standard deviation s for σ introduces extra uncertainty that the t-distribution captures
Formula: $\bar{x} \pm t_{\alpha/2,\,n-1}\left(\frac{s}{\sqrt{n}}\right)$, with degrees of freedom $df = n - 1$ (see the sketch after this list)
Converges to Z as n increases—for large samples (roughly $n > 30$), the t and Z distributions become nearly identical
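And the unknown-σ version, again a sketch with made-up data: the only changes are swapping the sample standard deviation s for σ and a t critical value for Z.

```python
import numpy as np
from scipy import stats

def t_interval_mean(sample, confidence=0.95):
    """CI for a population mean when sigma is estimated by the sample std dev s."""
    sample = np.asarray(sample, dtype=float)
    n = len(sample)
    s = sample.std(ddof=1)                                    # sample standard deviation
    t_crit = stats.t.ppf(1 - (1 - confidence) / 2, df=n - 1)  # t_{alpha/2, n-1}
    margin = t_crit * s / np.sqrt(n)
    return sample.mean() - margin, sample.mean() + margin

# Hypothetical sample of 12 observations.
data = [9.8, 10.4, 11.1, 9.6, 10.0, 10.7, 9.9, 10.2, 10.5, 9.7, 10.3, 10.1]
print(t_interval_mean(data))
```

SciPy also offers a one-liner that produces the same interval: `stats.t.interval(0.95, len(data) - 1, loc=np.mean(data), scale=stats.sem(data))`.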
Compare: Known σ vs. Unknown σ—both estimate the population mean, but unknown σ uses t-distribution with fatter tails to account for estimating variability. If an FRQ gives you s instead of σ, that's your cue to use t.
Estimating Proportions
Proportion intervals rely on the normal approximation to the binomial distribution, which works when your sample is large enough that the sampling distribution of $\hat{p}$ is approximately normal.
Confidence Interval for Population Proportion
Formula: $\hat{p} \pm Z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$, where $\hat{p}$ is your sample proportion (a worked sketch follows this list)
Requires large sample verification—check that $n\hat{p} \geq 10$ and $n(1-\hat{p}) \geq 10$ for the normal approximation to hold
Standard error depends on $\hat{p}$—notice the SE is maximized when $\hat{p} = 0.5$, which is why polls often assume 50/50 splits for sample size calculations
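Here is a sketch of the proportion interval in Python; the 130-out-of-400 "poll result" is a made-up number for illustration, and the function is only a minimal Wald interval, not a drop-in for more refined methods.

```python
import math
from scipy import stats

def proportion_interval(successes, n, confidence=0.95):
    """Wald CI for a population proportion (normal approximation to the binomial)."""
    p_hat = successes / n
    # Verify the large-sample conditions before trusting the interval.
    if n * p_hat < 10 or n * (1 - p_hat) < 10:
        raise ValueError("normal approximation conditions not met")
    z_crit = stats.norm.ppf(1 - (1 - confidence) / 2)
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z_crit * se, p_hat + z_crit * se

print(proportion_interval(successes=130, n=400))  # hypothetical poll result
```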
Comparing Two Groups
Two-sample intervals answer the question: is there a real difference, or could sampling variability explain what we see? These are workhorses in A/B testing, clinical trials, and experimental design.
Confidence Interval for Difference Between Two Means
Formula (known variances): $(\bar{x}_1 - \bar{x}_2) \pm Z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$; use the t-distribution when variances are unknown (a worked sketch follows this list)
Independence assumption is critical—the samples must be drawn independently; paired data requires a different approach
If the interval contains zero—you cannot conclude the means differ at that confidence level; this directly connects to hypothesis testing
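A minimal sketch of the known-variance (Z) version appears below; when the variances are unknown you would swap in sample variances and a t critical value (Welch's approach). The group data and the two σ values are hypothetical.

```python
import numpy as np
from scipy import stats

def two_mean_z_interval(x1, x2, sigma1, sigma2, confidence=0.95):
    """CI for mu1 - mu2 when both population standard deviations are known."""
    x1, x2 = np.asarray(x1, float), np.asarray(x2, float)
    diff = x1.mean() - x2.mean()
    se = np.sqrt(sigma1**2 / len(x1) + sigma2**2 / len(x2))  # additive variances
    z_crit = stats.norm.ppf(1 - (1 - confidence) / 2)
    return diff - z_crit * se, diff + z_crit * se

# Hypothetical: two independent samples with known sigmas of 3.0 and 4.0.
rng = np.random.default_rng(7)
group_a = rng.normal(100, 3.0, size=40)
group_b = rng.normal(98, 4.0, size=35)
print(two_mean_z_interval(group_a, group_b, sigma1=3.0, sigma2=4.0))
```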
Confidence Interval for Difference Between Two Proportions
Each group needs sufficient successes and failures—verify normal approximation conditions for both samples separately
Common in A/B testing scenarios—comparing conversion rates, treatment success rates, or any binary outcome across groups
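The interval takes the same additive-variance form as the two-mean case: $(\hat{p}_1 - \hat{p}_2) \pm Z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$, with an unpooled standard error. Here is a sketch with invented A/B test counts.

```python
import math
from scipy import stats

def two_proportion_interval(x1, n1, x2, n2, confidence=0.95):
    """CI for p1 - p2 using the unpooled normal-approximation standard error."""
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z_crit = stats.norm.ppf(1 - (1 - confidence) / 2)
    diff = p1 - p2
    return diff - z_crit * se, diff + z_crit * se

# Hypothetical A/B test: 48/400 conversions for version A vs. 64/400 for version B.
print(two_proportion_interval(48, 400, 64, 400))
```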
Compare: Two-mean difference vs. Two-proportion difference—both use similar additive variance structures, but proportions use $\hat{p}(1-\hat{p})$ for variance while means use $\sigma^2$ or $s^2$. Watch the formula structure—they're testing whether you recognize the parameter type.
Estimating Variability
Sometimes you care about spread rather than center. Variance and ratio-of-variance intervals use distributions specifically designed for squared quantities.
Confidence Interval for Population Variance
Uses the chi-squared distribution—because $(n-1)s^2/\sigma^2$ follows a chi-squared distribution with $n-1$ degrees of freedom
Formula: $\left(\frac{(n-1)s^2}{\chi^2_{\alpha/2,\,n-1}},\ \frac{(n-1)s^2}{\chi^2_{1-\alpha/2,\,n-1}}\right)$ (a worked sketch follows this list)
Highly sensitive to normality—this interval is less robust than mean intervals; non-normal data can severely distort results
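A short Python sketch of the chi-squared interval; notice that the point estimate $s^2$ does not sit in the middle of the interval, because the chi-squared distribution is asymmetric. The data are simulated purely for illustration.

```python
import numpy as np
from scipy import stats

def variance_interval(sample, confidence=0.95):
    """CI for a population variance; assumes the data are approximately normal."""
    sample = np.asarray(sample, dtype=float)
    n = len(sample)
    s2 = sample.var(ddof=1)
    alpha = 1 - confidence
    chi2_upper = stats.chi2.ppf(1 - alpha / 2, df=n - 1)  # larger critical value
    chi2_lower = stats.chi2.ppf(alpha / 2, df=n - 1)      # smaller critical value
    return (n - 1) * s2 / chi2_upper, (n - 1) * s2 / chi2_lower

rng = np.random.default_rng(3)
print(variance_interval(rng.normal(20, 5, size=30)))  # true variance is 25
```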
Confidence Interval for Ratio of Two Variances
Uses the F-distribution—the ratio of two independent chi-squared variables (each divided by their df) follows an F-distribution
Tests equality of variances—if the interval contains 1, you cannot conclude the variances differ; this matters for choosing pooled vs. unpooled t-tests
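Here is a sketch of the F-based ratio interval; checking whether it contains 1 is exactly the equal-variance check mentioned above. The two samples are simulated for illustration.

```python
import numpy as np
from scipy import stats

def variance_ratio_interval(x1, x2, confidence=0.95):
    """CI for sigma1^2 / sigma2^2 via the F-distribution; assumes normal data."""
    x1, x2 = np.asarray(x1, float), np.asarray(x2, float)
    df1, df2 = len(x1) - 1, len(x2) - 1
    ratio = x1.var(ddof=1) / x2.var(ddof=1)
    alpha = 1 - confidence
    f_upper = stats.f.ppf(1 - alpha / 2, dfn=df1, dfd=df2)
    f_lower = stats.f.ppf(alpha / 2, dfn=df1, dfd=df2)
    return ratio / f_upper, ratio / f_lower

rng = np.random.default_rng(11)
a, b = rng.normal(0, 2.0, size=25), rng.normal(0, 2.5, size=30)
print(variance_ratio_interval(a, b))  # does the interval contain 1?
```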
Compare: Chi-squared (single variance) vs. F-distribution (variance ratio)—chi-squared handles one sample's variance, while F handles comparisons. Both assume normality, and both are asymmetric distributions, making these intervals asymmetric around the point estimate.
Relationship and Model Parameters
When you move beyond simple location and spread into relationships between variables, you need intervals for correlation and regression coefficients.
Confidence Interval for Correlation Coefficient
Uses Fisher's z-transformation—because the sampling distribution of r is skewed, especially near ±1
Transform, build interval, back-transform: $z' = \frac{1}{2}\ln\left(\frac{1+r}{1-r}\right)$, then $z' \pm Z_{\alpha/2} \cdot \frac{1}{\sqrt{n-3}}$ (see the sketch after this list)
Stabilizes variance across all r values—the transformation makes the standard error approximately constant regardless of the true correlation
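The transform/build/back-transform recipe is easy to follow in code: NumPy's `arctanh` is exactly Fisher's $z'$ and `tanh` undoes it. A sketch with simulated data follows; the sample size and correlation strength are arbitrary.

```python
import numpy as np
from scipy import stats

def correlation_interval(x, y, confidence=0.95):
    """CI for Pearson's r via Fisher's z-transformation."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    r = np.corrcoef(x, y)[0, 1]
    z_prime = np.arctanh(r)                    # 0.5 * ln((1 + r) / (1 - r))
    se = 1.0 / np.sqrt(n - 3)
    z_crit = stats.norm.ppf(1 - (1 - confidence) / 2)
    lo, hi = z_prime - z_crit * se, z_prime + z_crit * se
    return np.tanh(lo), np.tanh(hi)            # back-transform to the r scale

rng = np.random.default_rng(5)
x = rng.normal(size=50)
y = 0.6 * x + rng.normal(scale=0.8, size=50)   # built-in positive correlation
print(correlation_interval(x, y))
```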
Confidence Interval for Regression Coefficients
Formula: $\hat{\beta} \pm t_{\alpha/2,\,n-k} \cdot SE(\hat{\beta})$, where k is the number of parameters estimated (a worked sketch follows this list)
Directly tests predictor significance—if the interval for a slope excludes zero, that predictor has a statistically significant linear relationship with the response
Assumes standard regression conditions—linearity, independence, homoscedasticity, and normally distributed errors (LINE assumptions)
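For simple linear regression (one predictor plus an intercept, so k = 2), the slope interval can be built by hand. This is a sketch with simulated data, not a substitute for a full regression package.

```python
import numpy as np
from scipy import stats

def slope_interval(x, y, confidence=0.95):
    """CI for the slope in simple linear regression y = b0 + b1 * x."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    X = np.column_stack([np.ones(n), x])             # design matrix with intercept
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)     # [b0_hat, b1_hat]
    k = X.shape[1]                                   # parameters estimated (2 here)
    mse = np.sum((y - X @ beta) ** 2) / (n - k)      # residual variance estimate
    se_slope = np.sqrt(mse / np.sum((x - x.mean()) ** 2))
    t_crit = stats.t.ppf(1 - (1 - confidence) / 2, df=n - k)
    return beta[1] - t_crit * se_slope, beta[1] + t_crit * se_slope

rng = np.random.default_rng(9)
x = rng.uniform(0, 10, size=40)
y = 2.0 + 1.5 * x + rng.normal(scale=2.0, size=40)   # true slope is 1.5
print(slope_interval(x, y))                          # does the interval exclude zero?
```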
Compare: Correlation CI vs. Regression coefficient CI—correlation measures strength of linear association (bounded −1 to 1), while regression coefficients measure the change in Y per unit change in X. Both assess relationships, but regression gives you predictive power.
Non-Parametric Approaches
When your data violates assumptions or you're estimating complex quantities, traditional formulas may fail. That's where resampling methods shine.
Bootstrap Confidence Intervals
Resamples your data with replacement—creates thousands of "pseudo-samples" to empirically build the sampling distribution
No distributional assumptions required—works for medians, ratios, or any statistic where theoretical distributions are unknown or intractable
Multiple methods exist—percentile method (simplest), BCa (bias-corrected and accelerated), and basic bootstrap each handle bias and skewness differently
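Below is a minimal percentile-bootstrap sketch, the simplest of the three methods above; SciPy's `scipy.stats.bootstrap` provides the BCa variant if you need bias correction. The skewed sample is simulated on purpose to show a case where normal-theory formulas are awkward.

```python
import numpy as np

def bootstrap_percentile_interval(sample, stat=np.median, n_boot=10_000,
                                  confidence=0.95, seed=0):
    """Percentile bootstrap CI for any statistic of a one-sample dataset."""
    sample = np.asarray(sample, dtype=float)
    rng = np.random.default_rng(seed)
    # Resample with replacement and recompute the statistic many times.
    boot_stats = np.array([stat(rng.choice(sample, size=len(sample), replace=True))
                           for _ in range(n_boot)])
    alpha = 1 - confidence
    return (np.quantile(boot_stats, alpha / 2),
            np.quantile(boot_stats, 1 - alpha / 2))

rng = np.random.default_rng(1)
skewed = rng.exponential(scale=2.0, size=60)          # no tidy formula for a median CI
print(bootstrap_percentile_interval(skewed, stat=np.median))
```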
Compare: Traditional parametric CIs vs. Bootstrap CIs—parametric methods are more efficient when assumptions hold, but bootstrap is more flexible and robust. For small samples or weird statistics, bootstrap is often your best option.
Quick Reference Table
| Concept | Best Examples |
| --- | --- |
| Z-distribution intervals | Known σ mean, proportions, two-proportion difference |
| Two-sample comparisons | Difference in means, difference in proportions, variance ratio |
| Regression inference | Coefficient CIs, predictor significance |
Self-Check Questions
You're estimating a population mean from a sample of 25 observations, and you calculated the standard deviation from your data. Which distribution do you use, and why does it matter?
Compare and contrast the confidence interval for a single proportion versus the difference between two proportions. What's similar about their structures, and what additional consideration applies to the two-sample case?
Both chi-squared and F-distributions are used for variance-related intervals. When would you use each, and what assumption do they share that makes them sensitive to violations?
A colleague builds a 95% CI for a regression slope and finds it includes zero. Another builds a 95% CI for the correlation between the same two variables and finds it excludes zero. Is this possible? What might explain this?
When would you choose bootstrap confidence intervals over traditional parametric methods? Give two specific scenarios where bootstrap would be the better choice.