🎲 Data Science Statistics

Confidence Interval Calculations


Why This Matters

Confidence intervals are the bridge between your sample data and the population you actually care about—and that's exactly what you're being tested on. Every time you calculate a CI, you're quantifying uncertainty, telling your audience "here's my best estimate, and here's how much I trust it." This connects directly to core statistical inference concepts: sampling distributions, standard error, degrees of freedom, and the trade-off between precision and confidence level.

The key insight exams test isn't whether you can plug numbers into formulas—it's whether you understand which formula to use and why. Different scenarios demand different distributions (Z, t, chi-squared, F) based on what you know about your population and what parameter you're estimating. Don't just memorize the formulas; know what assumptions each interval requires and what happens when those assumptions break down.


Estimating Single Population Means

When you're estimating a population mean from sample data, your choice of distribution depends entirely on one question: do you know the true population standard deviation, or are you estimating it from your sample?

Confidence Interval for Population Mean (Known $\sigma$)

  • Uses the Z-distribution—this is the rare case where you actually know the population standard deviation, making the sampling distribution exactly normal
  • Formula: $\bar{x} \pm Z_{\alpha/2} \left(\frac{\sigma}{\sqrt{n}}\right)$, where $\bar{x}$ is the sample mean and $n$ is the sample size (see the Python sketch below)
  • Interval width shrinks with larger $n$—the $\sqrt{n}$ in the denominator means quadrupling your sample size cuts the margin of error in half
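
To see the formula in action, here's a minimal Python sketch (assuming NumPy and SciPy are available); the sample is simulated and the "known" $\sigma = 12$ is a made-up value for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
x = rng.normal(loc=105, scale=12, size=36)   # hypothetical sample; sigma = 12 treated as known
sigma, conf = 12.0, 0.95

z_crit = stats.norm.ppf(1 - (1 - conf) / 2)  # Z_{alpha/2}, about 1.96 for 95%
margin = z_crit * sigma / np.sqrt(len(x))    # critical value times standard error
print(f"{conf:.0%} CI for the mean: ({x.mean() - margin:.2f}, {x.mean() + margin:.2f})")
```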

Confidence Interval for Population Mean (Unknown $\sigma$)

  • Uses the t-distribution—substituting the sample standard deviation $s$ for $\sigma$ introduces extra uncertainty that the t-distribution captures
  • Formula: $\bar{x} \pm t_{\alpha/2, n-1} \left(\frac{s}{\sqrt{n}}\right)$, with degrees of freedom $df = n - 1$ (sketch below)
  • Converges to Z as $n$ increases—for large samples (roughly $n > 30$), the t and Z distributions become nearly identical
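
A minimal sketch of the same calculation when you estimate the standard deviation from the data (simulated here for illustration); SciPy's built-in t.interval is shown alongside the by-hand version as a cross-check.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
x = rng.normal(loc=50, scale=8, size=25)             # hypothetical sample, sigma unknown

n, conf = len(x), 0.95
t_crit = stats.t.ppf(1 - (1 - conf) / 2, n - 1)      # t_{alpha/2, n-1}
margin = t_crit * x.std(ddof=1) / np.sqrt(n)         # s / sqrt(n) is the estimated SE
print("by hand:", (x.mean() - margin, x.mean() + margin))

# Equivalent result using scipy's built-in interval method
print("scipy:  ", stats.t.interval(conf, n - 1, loc=x.mean(), scale=stats.sem(x)))
```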

Compare: Known $\sigma$ vs. Unknown $\sigma$—both estimate the population mean, but unknown $\sigma$ uses the t-distribution with fatter tails to account for estimating variability. If an FRQ gives you $s$ instead of $\sigma$, that's your cue to use t.


Estimating Proportions

Proportion intervals rely on the normal approximation to the binomial distribution, which works when your sample is large enough that the sampling distribution of $\hat{p}$ is approximately normal.

Confidence Interval for Population Proportion

  • Formula: $\hat{p} \pm Z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$, where $\hat{p}$ is your sample proportion (worked example below)
  • Requires large-sample verification—check that $n\hat{p} \geq 10$ and $n(1-\hat{p}) \geq 10$ for the normal approximation to hold
  • Standard error depends on $\hat{p}$—notice the SE is maximized when $\hat{p} = 0.5$, which is why polls often assume 50/50 splits for sample size calculations
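
Here's a short sketch of the proportion interval with hypothetical counts (130 successes out of 400); the condition check mirrors the bullet above.

```python
import numpy as np
from scipy import stats

successes, n, conf = 130, 400, 0.95          # hypothetical counts for illustration
p_hat = successes / n

# Verify the large-sample conditions before trusting the normal approximation
assert n * p_hat >= 10 and n * (1 - p_hat) >= 10

z_crit = stats.norm.ppf(1 - (1 - conf) / 2)
se = np.sqrt(p_hat * (1 - p_hat) / n)        # estimated standard error of p_hat
print(f"{conf:.0%} CI for p: ({p_hat - z_crit * se:.4f}, {p_hat + z_crit * se:.4f})")
```

If you have statsmodels available, its proportion_confint function covers this normal-approximation interval along with Wilson and exact variants.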

Comparing Two Groups

Two-sample intervals answer the question: is there a real difference, or could sampling variability explain what we see? These are workhorses in A/B testing, clinical trials, and experimental design.

Confidence Interval for Difference Between Two Means

  • Formula (known variances): $(\bar{x}_1 - \bar{x}_2) \pm Z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$; use the t-distribution when variances are unknown (sketch after this list)
  • Independence assumption is critical—the samples must be drawn independently; paired data requires a different approach
  • If interval contains zero—you cannot conclude the means differ at that confidence level; this directly connects to hypothesis testing
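
Since real problems rarely hand you known variances, this sketch uses the t-based version with the Welch-Satterthwaite degrees of freedom; both groups are simulated, and the group sizes and spreads are arbitrary.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
a = rng.normal(100, 15, size=40)              # hypothetical group 1
b = rng.normal(94, 12, size=35)               # hypothetical group 2
conf = 0.95

n1, n2 = len(a), len(b)
v1, v2 = a.var(ddof=1), b.var(ddof=1)
se = np.sqrt(v1 / n1 + v2 / n2)               # additive variance structure

# Welch-Satterthwaite degrees of freedom (variances unknown, not assumed equal)
df = (v1 / n1 + v2 / n2) ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
t_crit = stats.t.ppf(1 - (1 - conf) / 2, df)

diff = a.mean() - b.mean()
print(f"{conf:.0%} CI for mu1 - mu2: ({diff - t_crit * se:.2f}, {diff + t_crit * se:.2f})")
```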

Confidence Interval for Difference Between Two Proportions

  • Formula: $(\hat{p}_1 - \hat{p}_2) \pm Z_{\alpha/2} \sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$ (worked example after this list)
  • Each group needs sufficient successes and failures—verify normal approximation conditions for both samples separately
  • Common in A/B testing scenarios—comparing conversion rates, treatment success rates, or any binary outcome across groups
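
A minimal A/B-test style sketch with hypothetical conversion counts; the final line shows the "does the interval contain zero?" check that links back to hypothesis testing.

```python
import numpy as np
from scipy import stats

# Hypothetical A/B test: conversions out of visitors for variants A and B
x1, n1 = 210, 1000
x2, n2 = 172, 980
conf = 0.95

p1, p2 = x1 / n1, x2 / n2
se = np.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)   # unpooled SE for the CI
z_crit = stats.norm.ppf(1 - (1 - conf) / 2)

diff = p1 - p2
lo, hi = diff - z_crit * se, diff + z_crit * se
print(f"{conf:.0%} CI for p1 - p2: ({lo:.4f}, {hi:.4f})")
print("Contains zero?", lo <= 0 <= hi)        # ties the CI back to the hypothesis test
```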

Compare: Two-mean difference vs. Two-proportion difference—both use similar additive variance structures, but proportions use $\hat{p}(1-\hat{p})$ for variance while means use $\sigma^2$ or $s^2$. Watch the formula structure—they're testing whether you recognize the parameter type.


Estimating Variability

Sometimes you care about spread rather than center. Variance and ratio-of-variance intervals use distributions specifically designed for squared quantities.

Confidence Interval for Population Variance

  • Uses the chi-squared distribution—because $(n-1)s^2/\sigma^2$ follows a chi-squared distribution with $n-1$ degrees of freedom
  • Formula: $\left(\frac{(n-1)s^2}{\chi^2_{\alpha/2, n-1}}, \frac{(n-1)s^2}{\chi^2_{1-\alpha/2, n-1}}\right)$, where $\chi^2_{\alpha/2, n-1}$ is the upper-tail critical value (sketch after this list)
  • Highly sensitive to normality—this interval is less robust than mean intervals; non-normal data can severely distort results
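
A sketch of the variance interval on simulated (normal) data; note how SciPy's lower-tail ppf quantiles map onto the textbook critical values, and how the resulting interval is not symmetric around $s^2$.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(11)
x = rng.normal(20, 4, size=30)                # hypothetical sample; normality is assumed
n, conf = len(x), 0.95
s2 = x.var(ddof=1)

alpha = 1 - conf
# scipy's ppf returns lower-tail quantiles, so the "upper critical value" is ppf(1 - alpha/2)
chi2_upper = stats.chi2.ppf(1 - alpha / 2, n - 1)
chi2_lower = stats.chi2.ppf(alpha / 2, n - 1)

lo = (n - 1) * s2 / chi2_upper                # big critical value -> lower limit
hi = (n - 1) * s2 / chi2_lower                # small critical value -> upper limit
print(f"s^2 = {s2:.2f}, {conf:.0%} CI for sigma^2: ({lo:.2f}, {hi:.2f})")
```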

Confidence Interval for Ratio of Two Variances

  • Uses the F-distribution—the ratio of two independent chi-squared variables (each divided by their df) follows an F-distribution
  • Formula: $\left(\frac{s_1^2}{s_2^2} \cdot \frac{1}{F_{\alpha/2, n_1-1, n_2-1}}, \frac{s_1^2}{s_2^2} \cdot F_{\alpha/2, n_2-1, n_1-1}\right)$, using upper-tail critical values—note the swapped degrees of freedom in the upper limit (sketch below)
  • Tests equality of variances—if the interval contains 1, you cannot conclude the variances differ; this matters for choosing pooled vs. unpooled t-tests
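
The same idea for the variance ratio, again on simulated data; dividing by SciPy's two lower-tail quantiles reproduces the interval above, including the swapped degrees of freedom in the upper limit.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
a = rng.normal(0, 3.0, size=25)               # hypothetical group 1
b = rng.normal(0, 2.5, size=30)               # hypothetical group 2
conf = 0.95

ratio = a.var(ddof=1) / b.var(ddof=1)         # point estimate of sigma1^2 / sigma2^2
df1, df2 = len(a) - 1, len(b) - 1
alpha = 1 - conf

# scipy's ppf is the lower-tail quantile; dividing by the alpha/2 and 1-alpha/2 quantiles
# is equivalent to the textbook form with upper-tail critical values and swapped df
lo = ratio / stats.f.ppf(1 - alpha / 2, df1, df2)
hi = ratio / stats.f.ppf(alpha / 2, df1, df2)
print(f"{conf:.0%} CI for sigma1^2/sigma2^2: ({lo:.3f}, {hi:.3f})")
print("Contains 1?", lo <= 1 <= hi)
```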

Compare: Chi-squared (single variance) vs. F-distribution (variance ratio)—chi-squared handles one sample's variance, while F handles comparisons. Both assume normality, and both are asymmetric distributions, making these intervals asymmetric around the point estimate.


Relationship and Model Parameters

When you move beyond simple location and spread into relationships between variables, you need intervals for correlation and regression coefficients.

Confidence Interval for Correlation Coefficient

  • Uses Fisher's z-transformation—because the sampling distribution of $r$ is skewed, especially near $\pm 1$
  • Transform, build interval, back-transform: $z' = \frac{1}{2} \ln\left(\frac{1+r}{1-r}\right)$, then $z' \pm Z_{\alpha/2} \cdot \frac{1}{\sqrt{n-3}}$, and convert the endpoints back to the $r$ scale with $r = \tanh(z')$ (sketch below)
  • Stabilizes variance across all $r$ values—the transformation makes the standard error approximately constant regardless of the true correlation
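
A sketch of the transform, interval, back-transform recipe on simulated paired data; np.arctanh and np.tanh are exactly Fisher's $z'$ and its inverse.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(9)
x = rng.normal(size=60)
y = 0.6 * x + rng.normal(scale=0.8, size=60)  # hypothetical paired data
n, conf = len(x), 0.95

r = np.corrcoef(x, y)[0, 1]
z_prime = np.arctanh(r)                       # Fisher's z: 0.5 * ln((1+r)/(1-r))
z_crit = stats.norm.ppf(1 - (1 - conf) / 2)
half_width = z_crit / np.sqrt(n - 3)

# Build the interval on the transformed scale, then back-transform with tanh
lo, hi = np.tanh(z_prime - half_width), np.tanh(z_prime + half_width)
print(f"r = {r:.3f}, {conf:.0%} CI: ({lo:.3f}, {hi:.3f})")
```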

Confidence Interval for Regression Coefficients

  • Formula: $\hat{\beta} \pm t_{\alpha/2, n-k} \cdot SE(\hat{\beta})$, where $k$ is the number of parameters estimated (sketch below)
  • Directly tests predictor significance—if the interval for a slope excludes zero, that predictor has a statistically significant linear relationship with the response
  • Assumes standard regression conditions—linearity, independence, homoscedasticity, and normally distributed errors (LINE assumptions)
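
For a simple one-predictor regression, scipy.stats.linregress returns the slope and its standard error, which is enough to build the interval by hand; the data here are simulated, and with one predictor plus an intercept, $k = 2$, so $df = n - 2$.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
x = rng.uniform(0, 10, size=40)
y = 3.0 + 1.5 * x + rng.normal(scale=2.0, size=40)   # hypothetical simple linear data
conf = 0.95

res = stats.linregress(x, y)                  # slope, intercept, stderr, ...
df = len(x) - 2                               # n - k with k = 2 (slope + intercept)
t_crit = stats.t.ppf(1 - (1 - conf) / 2, df)

lo = res.slope - t_crit * res.stderr          # res.stderr is SE(beta_hat) for the slope
hi = res.slope + t_crit * res.stderr
print(f"slope = {res.slope:.3f}, {conf:.0%} CI: ({lo:.3f}, {hi:.3f})")
print("Excludes zero?", not (lo <= 0 <= hi))  # significance of the predictor
```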

Compare: Correlation CI vs. Regression coefficient CI—correlation measures strength of linear association (bounded between $-1$ and $1$), while regression coefficients measure the change in Y per unit change in X. Both assess relationships, but regression gives you predictive power.


Non-Parametric Approaches

When your data violates assumptions or you're estimating complex quantities, traditional formulas may fail. That's where resampling methods shine.

Bootstrap Confidence Intervals

  • Resamples your data with replacement—creates thousands of "pseudo-samples" to empirically build the sampling distribution
  • No distributional assumptions required—works for medians, ratios, or any statistic where theoretical distributions are unknown or intractable
  • Multiple methods exist—percentile method (simplest), BCa (bias-corrected and accelerated), and basic bootstrap each handle bias and skewness differently
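
A minimal percentile-bootstrap sketch for a median on simulated skewed data; recent SciPy versions also provide scipy.stats.bootstrap with percentile and BCa methods if you'd rather not roll your own.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=3.0, size=40)       # hypothetical skewed sample
n_boot, conf = 10_000, 0.95

# Resample with replacement and recompute the statistic (here: the median) each time
boot_medians = np.array([
    np.median(rng.choice(x, size=len(x), replace=True)) for _ in range(n_boot)
])

alpha = 1 - conf
lo, hi = np.percentile(boot_medians, [100 * alpha / 2, 100 * (1 - alpha / 2)])
print(f"sample median = {np.median(x):.2f}, {conf:.0%} percentile CI: ({lo:.2f}, {hi:.2f})")
```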

Compare: Traditional parametric CIs vs. Bootstrap CIs—parametric methods are more efficient when assumptions hold, but bootstrap is more flexible and robust. For small samples or weird statistics, bootstrap is often your best option.


Quick Reference Table

Concept | Best Examples
Z-distribution intervals | Known $\sigma$ mean, proportions, two-proportion difference
t-distribution intervals | Unknown $\sigma$ mean, regression coefficients, two-mean difference
Chi-squared intervals | Single population variance
F-distribution intervals | Ratio of two variances
Transformation-based | Correlation coefficient (Fisher's z)
Non-parametric methods | Bootstrap (any parameter, no distributional assumptions)
Two-sample comparisons | Difference in means, difference in proportions, variance ratio
Regression inference | Coefficient CIs, predictor significance

Self-Check Questions

  1. You're estimating a population mean from a sample of 25 observations, and you calculated the standard deviation from your data. Which distribution do you use, and why does it matter?

  2. Compare and contrast the confidence interval for a single proportion versus the difference between two proportions. What's similar about their structures, and what additional consideration applies to the two-sample case?

  3. Both chi-squared and F-distributions are used for variance-related intervals. When would you use each, and what assumption do they share that makes them sensitive to violations?

  4. A colleague builds a 95% CI for a regression slope and finds it includes zero. Another builds a 95% CI for the correlation between the same two variables and finds it excludes zero. Is this possible? What might explain this?

  5. When would you choose bootstrap confidence intervals over traditional parametric methods? Give two specific scenarios where bootstrap would be the better choice.