Confidence intervals are the bridge between your sample data and the population you actually care about, and that's exactly what you're being tested on. Every time you calculate a CI, you're quantifying uncertainty, telling your audience "here's my best estimate, and here's how much I trust it." This connects directly to core statistical inference concepts: sampling distributions, standard error, degrees of freedom, and the trade-off between precision and confidence level.
The key insight exams test isn't whether you can plug numbers into formulas; it's whether you understand which formula to use and why. Different scenarios demand different distributions (Z, t, chi-squared, F) based on what you know about your population and what parameter you're estimating. Don't just memorize the formulas; know what assumptions each interval requires and what happens when those assumptions break down.
Estimating Single Population Means
When you're estimating a population mean from sample data, your choice of distribution depends entirely on one question: do you know the true population standard deviation, or are you estimating it from your sample?
Confidence Interval for Population Mean (Known σ)
Uses the Z-distribution: this is the rare case where you actually know the population standard deviation, making the sampling distribution exactly normal
Formula: $\bar{x} \pm Z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$, where $\bar{x}$ is the sample mean and $n$ is sample size
Interval width shrinks with larger $n$: the $\sqrt{n}$ in the denominator means quadrupling your sample size cuts the margin of error in half (sketched in code below)
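A minimal Python sketch of this interval; the sample values and $\sigma = 4.0$ are hypothetical inputs chosen for illustration.

```python
import numpy as np
from scipy import stats

def z_interval_mean(x, sigma, confidence=0.95):
    """CI for a population mean when the population sd (sigma) is known."""
    x = np.asarray(x, dtype=float)
    z = stats.norm.ppf(1 - (1 - confidence) / 2)  # two-sided critical value
    margin = z * sigma / np.sqrt(len(x))          # margin of error
    return x.mean() - margin, x.mean() + margin

sample = [12.1, 9.8, 11.4, 10.6, 12.9, 10.2, 11.7, 9.5]  # hypothetical data
print(z_interval_mean(sample, sigma=4.0))
```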
Confidence Interval for Population Mean (Unknown σ)
Uses the t-distribution: substituting the sample standard deviation $s$ for $\sigma$ introduces extra uncertainty that the t-distribution captures
Formula: $\bar{x} \pm t_{\alpha/2,\,n-1} \cdot \frac{s}{\sqrt{n}}$, with degrees of freedom $df = n - 1$
Converges to Z as $n$ increases: for large samples (roughly $n > 30$), the t and Z distributions become nearly identical
Compare: known σ vs. unknown σ. Both estimate the population mean, but unknown σ uses the t-distribution with fatter tails to account for estimating variability. If an FRQ gives you $s$ instead of $\sigma$, that's your cue to use t (see the sketch below).
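The same calculation with an estimated standard deviation; a minimal sketch using hypothetical sample data.

```python
import numpy as np
from scipy import stats

def t_interval_mean(x, confidence=0.95):
    """CI for a population mean when sigma is estimated by the sample sd."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    s = x.std(ddof=1)  # sample standard deviation, denominator n - 1
    t_crit = stats.t.ppf(1 - (1 - confidence) / 2, df=n - 1)
    margin = t_crit * s / np.sqrt(n)
    return x.mean() - margin, x.mean() + margin

sample = [12.1, 9.8, 11.4, 10.6, 12.9, 10.2, 11.7, 9.5]  # hypothetical data
print(t_interval_mean(sample))
```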
Estimating Proportions
Proportion intervals rely on the normal approximation to the binomial distribution, which works when your sample is large enough that the sampling distribution of $\hat{p}$ is approximately normal.
Confidence Interval for Population Proportion
Formula: $\hat{p} \pm Z_{\alpha/2} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$, where $\hat{p}$ is your sample proportion
Requires large sample verification: check that $n\hat{p} \ge 10$ and $n(1-\hat{p}) \ge 10$ for the normal approximation to hold
Standard error depends on $\hat{p}$: notice the SE is maximized when $\hat{p} = 0.5$, which is why polls often assume 50/50 splits for sample size calculations (see the sketch after this list)
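A minimal sketch that bakes in the large-sample check; the counts (240 successes out of 500) are hypothetical.

```python
import math
from scipy import stats

def proportion_interval(successes, n, confidence=0.95):
    """Large-sample CI for a population proportion."""
    p_hat = successes / n
    if n * p_hat < 10 or n * (1 - p_hat) < 10:
        raise ValueError("normal approximation conditions not met")
    z = stats.norm.ppf(1 - (1 - confidence) / 2)
    se = math.sqrt(p_hat * (1 - p_hat) / n)  # largest when p_hat = 0.5
    return p_hat - z * se, p_hat + z * se

print(proportion_interval(successes=240, n=500))  # hypothetical poll result
```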
Comparing Two Groups
Two-sample intervals answer the question: is there a real difference, or could sampling variability explain what we see? These are workhorses in A/B testing, clinical trials, and experimental design.
Confidence Interval for Difference Between Two Means
Formula (known variances): $(\bar{x}_1 - \bar{x}_2) \pm Z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$; use the t-distribution when variances are unknown
Independence assumption is critical: the samples must be drawn independently; paired data requires a different approach
If the interval contains zero: you cannot conclude the means differ at that confidence level; this directly connects to hypothesis testing (a worked example follows)
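A minimal sketch of the known-variance case from summary statistics; all the numbers below are hypothetical.

```python
import math
from scipy import stats

def two_mean_z_interval(xbar1, xbar2, var1, var2, n1, n2, confidence=0.95):
    """CI for mu1 - mu2 when both population variances are known."""
    z = stats.norm.ppf(1 - (1 - confidence) / 2)
    se = math.sqrt(var1 / n1 + var2 / n2)  # variances add for independent samples
    diff = xbar1 - xbar2
    return diff - z * se, diff + z * se

# hypothetical summary statistics for two independent groups
lo, hi = two_mean_z_interval(xbar1=52.3, xbar2=48.9, var1=16.0, var2=25.0, n1=40, n2=35)
print((lo, hi), "contains zero" if lo <= 0 <= hi else "excludes zero")
```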
Confidence Interval for Difference Between Two Proportions
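Formula: $(\hat{p}_1 - \hat{p}_2) \pm Z_{\alpha/2} \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$, where each group contributes its own variance term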
Each group needs sufficient successes and failures: verify normal approximation conditions for both samples separately
Common in A/B testing scenarios: comparing conversion rates, treatment success rates, or any binary outcome across groups
Compare: two-mean difference vs. two-proportion difference. Both use similar additive variance structures, but proportions use $\hat{p}(1-\hat{p})$ for variance while means use $\sigma^2$ or $s^2$. Watch the formula structure; they're testing whether you recognize the parameter type. A sketch of the two-proportion case follows.
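A minimal A/B-testing sketch; the conversion counts are hypothetical.

```python
import math
from scipy import stats

def two_proportion_interval(x1, n1, x2, n2, confidence=0.95):
    """CI for p1 - p2 from two independent samples (unpooled SE)."""
    p1, p2 = x1 / n1, x2 / n2
    for p, n in ((p1, n1), (p2, n2)):  # check each group separately
        if n * p < 10 or n * (1 - p) < 10:
            raise ValueError("normal approximation conditions not met")
    z = stats.norm.ppf(1 - (1 - confidence) / 2)
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * se, diff + z * se

# hypothetical A/B test: 120/1000 vs. 95/1000 conversions
print(two_proportion_interval(120, 1000, 95, 1000))
```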
Estimating Variability
Sometimes you care about spread rather than center. Variance and ratio-of-variance intervals use distributions specifically designed for squared quantities.
Confidence Interval for Population Variance
Uses the chi-squared distribution: because $(n-1)s^2/\sigma^2$ follows a chi-squared distribution with $n - 1$ degrees of freedom
Formula: $\left(\frac{(n-1)s^2}{\chi^2_{\alpha/2}},\ \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}\right)$, where $\chi^2_{\alpha/2}$ and $\chi^2_{1-\alpha/2}$ are the upper- and lower-tail critical values with $n - 1$ degrees of freedom
Confidence Interval for Ratio of Two Variances
Uses the F-distribution: the ratio $s_1^2/s_2^2$, scaled by the true variance ratio, follows an F-distribution with $n_1 - 1$ and $n_2 - 1$ degrees of freedom
Tests equality of variances: if the interval for $\sigma_1^2/\sigma_2^2$ contains 1, you cannot conclude the variances differ; this matters for choosing pooled vs. unpooled t-tests
Compare: chi-squared (single variance) vs. F-distribution (variance ratio). Chi-squared handles one sample's variance, while F handles comparisons. Both assume normality, and both are asymmetric distributions, making these intervals asymmetric around the point estimate. The sketch below shows the single-variance case.
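A minimal sketch of the chi-squared interval for a single variance; the data are simulated normal draws, which this interval assumes.

```python
import numpy as np
from scipy import stats

def variance_interval(x, confidence=0.95):
    """Chi-squared CI for a population variance (assumes normal data)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    s2 = x.var(ddof=1)  # sample variance
    alpha = 1 - confidence
    lower = (n - 1) * s2 / stats.chi2.ppf(1 - alpha / 2, df=n - 1)
    upper = (n - 1) * s2 / stats.chi2.ppf(alpha / 2, df=n - 1)
    return lower, upper  # asymmetric around s2, like chi-squared itself

rng = np.random.default_rng(0)
print(variance_interval(rng.normal(loc=10, scale=3, size=30)))
```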
Relationship and Model Parameters
When you move beyond simple location and spread into relationships between variables, you need intervals for correlation and regression coefficients.
Confidence Interval for Correlation Coefficient
Uses Fisher's z-transformation: because the sampling distribution of $r$ is skewed, especially near $\pm 1$
Transform, build interval, back-transform: $z' = \frac{1}{2}\ln\left(\frac{1+r}{1-r}\right)$, then $z' \pm Z_{\alpha/2} \cdot \frac{1}{\sqrt{n-3}}$
Stabilizes variance across all $r$ values: the transformation makes the standard error approximately constant regardless of the true correlation (sketched below)
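A minimal sketch of the transform-then-back-transform recipe; $r = 0.62$ with $n = 50$ is a hypothetical result. Note that `np.arctanh` computes exactly $\frac{1}{2}\ln\left(\frac{1+r}{1-r}\right)$.

```python
import numpy as np
from scipy import stats

def correlation_interval(r, n, confidence=0.95):
    """Fisher z-transform CI for a correlation coefficient."""
    z_prime = np.arctanh(r)          # 0.5 * ln((1 + r) / (1 - r))
    se = 1 / np.sqrt(n - 3)          # approximately constant SE
    z = stats.norm.ppf(1 - (1 - confidence) / 2)
    lo, hi = z_prime - z * se, z_prime + z * se
    return np.tanh(lo), np.tanh(hi)  # back-transform to the r scale

print(correlation_interval(r=0.62, n=50))  # hypothetical study result
```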
Confidence Interval for Regression Coefficients
Formula: $\hat{\beta} \pm t_{\alpha/2,\,n-k} \cdot SE(\hat{\beta})$, where $k$ is the number of parameters estimated
Directly tests predictor significance: if the interval for a slope excludes zero, that predictor has a statistically significant linear relationship with the response
Assumes standard regression conditions: linearity, independence, homoscedasticity, and normally distributed errors (LINE assumptions)
Compare: correlation CI vs. regression coefficient CI. Correlation measures strength of linear association (bounded between $-1$ and $1$), while regression coefficients measure the change in $Y$ per unit change in $X$. Both assess relationships, but regression gives you predictive power. A simple-regression sketch follows.
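A minimal sketch for simple linear regression ($k = 2$ parameters, so $df = n - 2$); the data are simulated with a hypothetical true slope of 0.8.

```python
import numpy as np
from scipy import stats

def slope_interval(x, y, confidence=0.95):
    """t-based CI for the slope of a simple linear regression."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    slope, intercept = np.polyfit(x, y, deg=1)
    resid = y - (slope * x + intercept)
    s2 = np.sum(resid ** 2) / (n - 2)                     # residual variance
    se_slope = np.sqrt(s2 / np.sum((x - x.mean()) ** 2))  # SE of the slope
    t_crit = stats.t.ppf(1 - (1 - confidence) / 2, df=n - 2)
    return slope - t_crit * se_slope, slope + t_crit * se_slope

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=40)
y = 2.0 + 0.8 * x + rng.normal(scale=1.5, size=40)  # simulated data
lo, hi = slope_interval(x, y)
print((lo, hi), "excludes zero" if lo > 0 or hi < 0 else "contains zero")
```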
Non-Parametric Approaches
When your data violates assumptions or you're estimating complex quantities, traditional formulas may fail. That's where resampling methods shine.
Bootstrap Confidence Intervals
Resamples your data with replacement: creates thousands of "pseudo-samples" to empirically build the sampling distribution
No distributional assumptions required: works for medians, ratios, or any statistic where theoretical distributions are unknown or intractable
Multiple methods exist: the percentile method (simplest), BCa (bias-corrected and accelerated), and the basic bootstrap each handle bias and skewness differently
Compare: traditional parametric CIs vs. bootstrap CIs. Parametric methods are more efficient when assumptions hold, but bootstrap is more flexible and robust. For small samples or weird statistics, bootstrap is often your best option. The sketch below uses the percentile method.
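A minimal percentile-bootstrap sketch for a median, a statistic with no simple closed-form interval; the skewed data values are hypothetical.

```python
import numpy as np

def bootstrap_interval(x, stat=np.median, n_boot=10_000, confidence=0.95, seed=0):
    """Percentile bootstrap CI: resample with replacement, take empirical quantiles."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    boots = np.array([stat(rng.choice(x, size=len(x), replace=True))
                      for _ in range(n_boot)])
    alpha = 1 - confidence
    return np.quantile(boots, alpha / 2), np.quantile(boots, 1 - alpha / 2)

# hypothetical right-skewed sample
data = [1.2, 1.5, 1.9, 2.1, 2.4, 3.3, 4.8, 7.5, 9.9, 15.2]
print(bootstrap_interval(data))
```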
Quick Reference Table
Concept | Best Examples
Z-distribution intervals | Known σ mean, proportions, two-proportion difference
Two-sample comparisons | Difference in means, difference in proportions, variance ratio
Regression inference | Coefficient CIs, predictor significance
Self-Check Questions
You're estimating a population mean from a sample of 25 observations, and you calculated the standard deviation from your data. Which distribution do you use, and why does it matter?
Compare and contrast the confidence interval for a single proportion versus the difference between two proportions. What's similar about their structures, and what additional consideration applies to the two-sample case?
Both chi-squared and F-distributions are used for variance-related intervals. When would you use each, and what assumption do they share that makes them sensitive to violations?
A colleague builds a 95% CI for a regression slope and finds it includes zero. Another builds a 95% CI for the correlation between the same two variables and finds it excludes zero. Is this possible? What might explain this?
When would you choose bootstrap confidence intervals over traditional parametric methods? Give two specific scenarios where bootstrap would be the better choice.