Confidence intervals are the bridge between your sample data and the population you actually care about. Every time you calculate a CI, you're quantifying uncertainty: "here's my best estimate, and here's how much I trust it." This connects directly to core inference concepts like sampling distributions, standard error, degrees of freedom, and the trade-off between precision and confidence level.
The thing exams really test isn't whether you can plug numbers into formulas. It's whether you understand which formula to use and why. Different scenarios call for different distributions (Z, t, chi-squared, F) based on what you know about your population and what parameter you're estimating. So beyond memorizing formulas, focus on what assumptions each interval requires and what happens when those assumptions break down.
When you're estimating a population mean from sample data, your choice of distribution hinges on one question: do you know the true population standard deviation, or are you estimating it from your sample?
This is the rare case where you actually know the population standard deviation σ, so the sampling distribution of x̄ is exactly normal. You use the Z-distribution.
This is the far more common scenario. Substituting the sample standard deviation s for σ introduces extra uncertainty, and the t-distribution accounts for that with fatter tails.
Compare: Known σ vs. Unknown σ. Both estimate the population mean, but unknown σ uses the t-distribution with fatter tails to account for estimating variability. If a problem gives you s instead of σ, that's your cue to use t.
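The contrast above can be sketched in a few lines of Python. This is a minimal sketch using scipy, with made-up sample values and an assumed known σ of 0.3 purely for illustration:

```python
import numpy as np
from scipy import stats

# Hypothetical sample data (illustrative values only)
data = np.array([12.1, 11.8, 12.5, 12.0, 11.6, 12.3, 12.2, 11.9])
n = len(data)
xbar = data.mean()

# Case 1: sigma known (assumed 0.3 here) -> Z critical value
sigma = 0.3
z = stats.norm.ppf(0.975)  # 95% two-sided
z_ci = (xbar - z * sigma / np.sqrt(n), xbar + z * sigma / np.sqrt(n))

# Case 2: sigma unknown -> estimate with s, use t with n-1 df
s = data.std(ddof=1)
t = stats.t.ppf(0.975, df=n - 1)
t_ci = (xbar - t * s / np.sqrt(n), xbar + t * s / np.sqrt(n))
```

Note that the t critical value (about 2.36 at 7 degrees of freedom) exceeds the Z value of about 1.96; that gap is exactly the "fatter tails" penalty for estimating σ from the sample.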
Proportion intervals rely on the normal approximation to the binomial distribution. This works when your sample is large enough that the sampling distribution of p̂ is approximately normal (a common rule of thumb: at least 10 successes and 10 failures).
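As a sketch of how this interval is built, here is a stdlib-only Python example with hypothetical survey counts (130 successes out of 200 is an assumption for illustration):

```python
import math
from statistics import NormalDist

# Hypothetical survey: 130 successes out of 200 respondents
x, n = 130, 200
p_hat = x / n
z = NormalDist().inv_cdf(0.975)          # ~1.96 for a 95% interval
se = math.sqrt(p_hat * (1 - p_hat) / n)  # standard error of p-hat
ci = (p_hat - z * se, p_hat + z * se)    # ~ (0.584, 0.716)
```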
Two-sample intervals answer the question: is there a real difference between these groups, or could sampling variability explain what we see? These show up constantly in clinical trials, A/B testing, and experimental design.
Formula (known variances): (x̄₁ − x̄₂) ± z₍α/2₎ · √(σ₁²/n₁ + σ₂²/n₂)
When variances are unknown (the usual case), replace σ₁² and σ₂² with s₁² and s₂² and use the t-distribution.
Independence assumption is critical. The two samples must be drawn independently. If the same subjects appear in both groups (e.g., before/after measurements), you need a paired approach instead.
If the interval contains zero, you cannot conclude the means differ at that confidence level. This directly parallels a two-sided hypothesis test: containing zero is equivalent to failing to reject H₀: μ₁ = μ₂.
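The unknown-variance case can be sketched with a Welch-style interval, which also handles unequal variances. The two samples below are invented for illustration:

```python
import numpy as np
from scipy import stats

# Hypothetical independent samples (illustrative values only)
a = np.array([5.1, 4.8, 5.3, 5.0, 4.9, 5.2])
b = np.array([4.6, 4.9, 4.7, 4.5, 4.8])

diff = a.mean() - b.mean()
va, vb = a.var(ddof=1) / len(a), b.var(ddof=1) / len(b)
se = np.sqrt(va + vb)  # additive variance structure

# Welch-Satterthwaite degrees of freedom for unequal variances
df = (va + vb) ** 2 / (va**2 / (len(a) - 1) + vb**2 / (len(b) - 1))
t = stats.t.ppf(0.975, df)
ci = (diff - t * se, diff + t * se)
```

For these made-up values the interval excludes zero, so at the 95% level you would conclude the group means differ.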
Compare: Two-mean difference vs. Two-proportion difference. Both use similar additive variance structures (you add the variances from each group under the square root). The difference is that proportions use p̂(1 − p̂)/n for variance while means use σ²/n or s²/n. On exams, they're testing whether you recognize the parameter type and pick the right variance term.
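The two-proportion version of the same additive structure looks like this in stdlib Python; the A/B test counts are hypothetical:

```python
import math
from statistics import NormalDist

# Hypothetical A/B test counts (illustrative numbers)
x1, n1 = 120, 400   # group A successes / size
x2, n2 = 90, 400    # group B successes / size
p1, p2 = x1 / n1, x2 / n2

z = NormalDist().inv_cdf(0.975)
# Variances add under the square root, one p-hat(1 - p-hat)/n term per group
se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
ci = ((p1 - p2) - z * se, (p1 - p2) + z * se)
```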
Sometimes you care about spread rather than center. Variance intervals use distributions specifically designed for squared quantities.
The quantity (n − 1)s²/σ² follows a chi-squared distribution with n − 1 degrees of freedom. That's the theoretical basis for this interval.
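Inverting that pivotal quantity gives the interval for σ². A sketch with scipy, using invented measurements assumed to come from a normal population:

```python
import numpy as np
from scipy import stats

# Hypothetical measurements, assumed normal (illustrative values only)
data = np.array([9.8, 10.2, 10.1, 9.9, 10.3, 9.7, 10.0, 10.2])
n = len(data)
s2 = data.var(ddof=1)

# 95% CI for sigma^2: divide (n-1)s^2 by the UPPER chi2 quantile for the
# lower bound and by the LOWER quantile for the upper bound
lo = (n - 1) * s2 / stats.chi2.ppf(0.975, n - 1)
hi = (n - 1) * s2 / stats.chi2.ppf(0.025, n - 1)
```

Because the chi-squared distribution is right-skewed, the interval is asymmetric: s² sits closer to the lower bound than to the upper.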
The ratio of two independent chi-squared variables (each divided by its degrees of freedom) follows an F-distribution. This lets you compare variability across two groups.
Compare: Chi-squared (single variance) vs. F-distribution (variance ratio). Chi-squared handles one sample's variance; F handles comparisons between two. Both assume normality, and both produce asymmetric intervals because the underlying distributions are right-skewed.
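A quick sketch of the F-based interval for a variance ratio, again with made-up samples assumed normal:

```python
import numpy as np
from scipy import stats

# Hypothetical samples, each assumed normal (illustrative values only)
a = np.array([10.1, 9.8, 10.4, 9.9, 10.2, 10.0, 9.7, 10.3])
b = np.array([10.5, 9.2, 11.1, 9.0, 10.8, 9.5])

ratio = a.var(ddof=1) / b.var(ddof=1)   # point estimate of sigma_a^2 / sigma_b^2
df1, df2 = len(a) - 1, len(b) - 1

# 95% CI: divide the ratio by the upper and lower F quantiles
lo = ratio / stats.f.ppf(0.975, df1, df2)
hi = ratio / stats.f.ppf(0.025, df1, df2)
```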
When you move beyond location and spread into relationships between variables, you need intervals for correlation and regression coefficients.
The sampling distribution of r is skewed, especially when the true correlation is near ±1. Fisher's z-transformation fixes this by mapping r onto a scale where the sampling distribution is approximately normal.
Here's the process: transform r to z′ = ½ ln((1 + r)/(1 − r)), build a normal-based interval around z′, then back-transform both endpoints to the r scale.
The transformation stabilizes the variance so that the standard error is approximately 1/√(n − 3) regardless of the true correlation value.
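The transform-interval-back-transform steps can be sketched in stdlib Python (math.atanh is exactly the Fisher transform, and tanh undoes it); the r = 0.6, n = 50 values in the usage note are hypothetical:

```python
import math
from statistics import NormalDist

def fisher_ci(r, n, conf=0.95):
    """Approximate CI for a correlation via Fisher's z-transformation (sketch)."""
    z = math.atanh(r)                         # z' = 0.5 * ln((1+r)/(1-r))
    se = 1 / math.sqrt(n - 3)                 # approximate SE on the z scale
    crit = NormalDist().inv_cdf(0.5 + conf / 2)
    lo, hi = z - crit * se, z + crit * se
    return math.tanh(lo), math.tanh(hi)       # back-transform to the r scale
```

For example, `fisher_ci(0.6, 50)` returns an interval that is asymmetric around 0.6 and stays inside (−1, 1), which a naive symmetric interval on the r scale would not guarantee.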
Compare: Correlation CI vs. Regression coefficient CI. Correlation measures the strength of linear association and is bounded between −1 and +1. A regression coefficient measures the change in Y per unit change in X and is unbounded. Both assess relationships, but regression gives you predictive power and a scale-dependent interpretation.
When your data violates distributional assumptions or you're estimating a statistic without a known theoretical distribution, traditional formulas may not apply. That's where resampling methods come in.
The bootstrap works by resampling your observed data with replacement thousands of times. Each resample gives you a new estimate of your statistic, and together they build an empirical sampling distribution.
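The resampling loop is short enough to show in full. This is a minimal percentile-bootstrap sketch for a median CI, stdlib only, with a fixed seed so results are reproducible:

```python
import random
import statistics

def bootstrap_median_ci(data, n_boot=5000, conf=0.95, seed=42):
    """Percentile bootstrap CI for the median (a minimal sketch)."""
    rng = random.Random(seed)
    # Resample with replacement n_boot times; record each resample's median
    medians = sorted(
        statistics.median(rng.choices(data, k=len(data)))
        for _ in range(n_boot)
    )
    # Take the middle conf fraction of the empirical sampling distribution
    lo_idx = int((1 - conf) / 2 * n_boot)
    hi_idx = int((1 + conf) / 2 * n_boot) - 1
    return medians[lo_idx], medians[hi_idx]
```

No normality assumption and no closed-form standard error is needed; the empirical distribution of resampled medians does all the work.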
Compare: Traditional parametric CIs vs. Bootstrap CIs. Parametric methods are more statistically efficient when their assumptions hold, but bootstrap is more flexible. For small samples with non-normal data, or for statistics like the median where no clean formula exists, bootstrap is often your best option.
| Concept | Best Examples |
|---|---|
| Z-distribution intervals | Known mean, proportions, two-proportion difference |
| t-distribution intervals | Unknown mean, regression coefficients, two-mean difference |
| Chi-squared intervals | Single population variance |
| F-distribution intervals | Ratio of two variances |
| Transformation-based | Correlation coefficient (Fisher's z) |
| Non-parametric methods | Bootstrap (any parameter, no assumptions) |
| Two-sample comparisons | Difference in means, difference in proportions, variance ratio |
| Regression inference | Coefficient CIs, predictor significance |
You're estimating a population mean from a sample of 25 observations, and you calculated the standard deviation from your data. Which distribution do you use, and why does it matter?
Compare and contrast the confidence interval for a single proportion versus the difference between two proportions. What's similar about their structures, and what additional consideration applies to the two-sample case?
Both chi-squared and F-distributions are used for variance-related intervals. When would you use each, and what assumption do they share that makes them sensitive to violations?
A colleague builds a 95% CI for a regression slope and finds it includes zero. Another builds a 95% CI for the correlation between the same two variables and finds it excludes zero. Is this possible? What might explain this?
When would you choose bootstrap confidence intervals over traditional parametric methods? Give two specific scenarios where bootstrap would be the better choice.