AP Statistics

Confidence Interval Formulas

Why This Matters

Confidence intervals are the backbone of statistical inference on the AP Statistics exam: they show up in Units 6, 7, and 9, and you'll encounter them in both multiple-choice questions and FRQs. The College Board wants you to understand that we use intervals (not single values) to estimate population parameters because sample statistics vary from sample to sample. Every confidence interval you construct reflects this fundamental truth: we're acknowledging uncertainty while still making useful claims about populations.

Here's what you're really being tested on: knowing which interval procedure fits which situation, verifying the conditions that make each formula valid, and interpreting your results correctly. The formulas themselves follow a consistent structure, point estimate ± (critical value)(standard error), but the details change depending on whether you're estimating proportions vs. means, one sample vs. two samples, or categorical vs. quantitative relationships. Don't just memorize formulas; know what type of data and research question each one addresses.
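
As a rough illustration of that shared structure, here is a minimal Python sketch. The function names `z_critical` and `interval` are made up for this example, and only the standard library is assumed.

```python
from statistics import NormalDist

def z_critical(confidence: float) -> float:
    """Return z* so that the middle `confidence` area of N(0, 1) lies in (-z*, z*)."""
    tail = (1 - confidence) / 2
    return NormalDist().inv_cdf(1 - tail)

def interval(point_estimate: float, critical_value: float, standard_error: float):
    """point estimate ± (critical value)(standard error)."""
    margin = critical_value * standard_error
    return point_estimate - margin, point_estimate + margin

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z* = {z_critical(level):.3f}")
# 90% confidence: z* = 1.645
# 95% confidence: z* = 1.960
# 99% confidence: z* = 2.576
```

Every interval in this guide is some version of `interval(...)`; what changes is how the point estimate, critical value, and standard error are obtained.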


One-Sample Intervals for Proportions

When you have categorical data with a single sample and want to estimate the true population proportion, you'll use the one-sample z-interval. The sampling distribution of $\hat{p}$ is approximately normal when the sample size is large enough, which is why we can use the standard normal (z) distribution.

One-Sample Z-Interval for a Proportion

  • Formula: $\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$, where $\hat{p}$ is the sample proportion and $z^*$ is the critical value (e.g., $z^* = 1.96$ for 95% confidence)
  • Success-failure condition: both $n\hat{p} \geq 10$ and $n(1-\hat{p}) \geq 10$ must be satisfied to ensure the sampling distribution is approximately normal
  • Independence conditions: data must come from a random sample or randomized experiment, and the 10% condition ($n \leq 0.10N$) applies when sampling without replacement
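
The formula and the success-failure check above can be sketched in Python as follows. The function name `one_prop_z_interval` and the counts (120 successes out of 200) are hypothetical, chosen only to illustrate the arithmetic.

```python
from math import sqrt
from statistics import NormalDist

def one_prop_z_interval(successes: int, n: int, confidence: float = 0.95):
    """One-sample z-interval for a proportion, with the success-failure check."""
    failures = n - successes
    # n * p_hat equals the success count; n * (1 - p_hat) equals the failure count
    if successes < 10 or failures < 10:
        raise ValueError("success-failure condition not met")
    p_hat = successes / n
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z_star * se, p_hat + z_star * se

low, high = one_prop_z_interval(successes=120, n=200)
print(f"({low:.3f}, {high:.3f})")  # (0.532, 0.668)
```

The code cannot check randomness or the 10% condition; those depend on how the data were collected.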

One-Sample Intervals for Means

When estimating a population mean from quantitative data, the choice between z and t depends on whether you know the population standard deviation. Spoiler: you almost never know $\sigma$ in real applications, so the t-interval dominates AP Statistics.

Z-Interval for a Mean (Known $\sigma$)

  • Formula: $\bar{x} \pm z^* \left(\frac{\sigma}{\sqrt{n}}\right)$, which uses the known population standard deviation $\sigma$ in the standard error
  • Rarely used in practice because knowing $\sigma$ while not knowing $\mu$ is an unusual situation; this formula appears mainly in theoretical problems
  • Normal distribution required: either the population is normally distributed, or $n \geq 30$ for the Central Limit Theorem to apply
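
For a quick numeric feel, the z-interval for a mean might be computed like this. The summary values ($\bar{x} = 50$, $\sigma = 10$, $n = 25$) are invented for illustration, and `z_interval_mean` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist

def z_interval_mean(x_bar: float, sigma: float, n: int, confidence: float = 0.95):
    """z-interval for a mean when the population sigma is known."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = sigma / sqrt(n)  # standard error uses the known sigma
    return x_bar - z_star * se, x_bar + z_star * se

low, high = z_interval_mean(x_bar=50, sigma=10, n=25)
print(f"({low:.2f}, {high:.2f})")  # (46.08, 53.92)
```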

T-Interval for a Mean (Unknown $\sigma$)

  • Formula: $\bar{x} \pm t^* \left(\frac{s}{\sqrt{n}}\right)$, which substitutes the sample standard deviation $s$ for $\sigma$, with degrees of freedom $df = n - 1$
  • The t-distribution has heavier tails than the standard normal distribution, especially for small samples, so $t^* > z^*$ at the same confidence level; this accounts for the extra uncertainty from estimating $\sigma$
  • Conditions: random sample, independence (10% condition), and population approximately normal OR large sample size ($n \geq 30$)
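
Here is a small sketch of the t-interval using only the standard library. Python's standard library has no t-distribution, so the critical value is passed in as if read from a t-table; $t^* = 2.262$ is the 95% table value for $df = 9$. The data values are made up.

```python
from math import sqrt
from statistics import mean, stdev

def t_interval_mean(data, t_star: float):
    """One-sample t-interval; t_star must match df = len(data) - 1."""
    n = len(data)
    x_bar = mean(data)
    s = stdev(data)  # sample standard deviation (divides by n - 1)
    se = s / sqrt(n)
    return x_bar - t_star * se, x_bar + t_star * se

sample = [12, 15, 11, 14, 13, 16, 12, 15, 14, 13]  # hypothetical data, n = 10
low, high = t_interval_mean(sample, t_star=2.262)  # 95%, df = 9
print(f"({low:.3f}, {high:.3f})")  # (12.369, 14.631)
```

With $z^* = 1.96$ instead, the margin would shrink from about 1.13 to 0.98, which is the "extra uncertainty" the t-distribution builds in.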

Compare: Z-interval vs. T-interval for means. Both estimate $\mu$, but the t-interval uses $s$ instead of $\sigma$ and has heavier tails. On the AP exam, if $\sigma$ isn't explicitly given, use the t-interval. This is the default for quantitative data.


Two-Sample Intervals for Comparing Groups

Comparing two populations is where inference gets interesting. The key question: are the samples independent (two separate groups) or paired (same subjects measured twice)?

Two-Sample T-Interval for Difference of Means

  • Formula: $(\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$, where the standard error combines variability from both samples
  • Degrees of freedom: use your calculator's "2-SampTInt" function, which applies the Welch approximation; don't try to compute df by hand on the AP exam
  • Conditions: two independent random samples, 10% condition for each group, and both populations approximately normal OR both sample sizes large
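
To show what the calculator is doing behind the scenes, here is a sketch that computes the standard error and the Welch (Welch-Satterthwaite) degrees of freedom from summary statistics. The summary numbers are invented, `welch_interval` is a hypothetical name, and the critical value $t^* \approx 2.017$ for $df \approx 43$ is an assumed value of the kind you would read from software or a table.

```python
from math import sqrt

def welch_interval(mean1, s1, n1, mean2, s2, n2, t_star):
    """Two-sample t-interval for mu1 - mu2; also returns the Welch df."""
    v1, v2 = s1**2 / n1, s2**2 / n2          # per-sample variance of the mean
    se = sqrt(v1 + v2)                        # unpooled standard error
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    diff = mean1 - mean2
    return diff - t_star * se, diff + t_star * se, df

low, high, df = welch_interval(78, 8, 20, 72, 10, 25, t_star=2.017)
print(f"df = {df:.1f}, interval = ({low:.2f}, {high:.2f})")
```

The df formula is the reason the guide says not to do this by hand: it is rarely a whole number and is easy to get wrong under time pressure.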

Two-Sample Z-Interval for Difference of Proportions

  • Formula: $(\hat{p}_1 - \hat{p}_2) \pm z^* \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$; note that we use each sample's $\hat{p}$ separately (no pooling for confidence intervals)
  • Interpretation: if the interval contains zero, the data do not give convincing evidence of a difference between the proportions; if it lies entirely above or below zero, the sign tells you which group has the larger proportion
  • Success-failure condition: check all four values, $n_1\hat{p}_1$, $n_1(1-\hat{p}_1)$, $n_2\hat{p}_2$, and $n_2(1-\hat{p}_2)$; each must be $\geq 10$
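
A minimal sketch of the two-proportion interval, including the four-way success-failure check, could look like this. The counts (60 of 100 vs. 45 of 100) are hypothetical, as is the function name.

```python
from math import sqrt
from statistics import NormalDist

def two_prop_z_interval(x1: int, n1: int, x2: int, n2: int, confidence: float = 0.95):
    """Two-sample z-interval for p1 - p2 (unpooled standard error)."""
    # The four success/failure counts must each be at least 10
    if min(x1, n1 - x1, x2, n2 - x2) < 10:
        raise ValueError("success-failure condition not met")
    p1, p2 = x1 / n1, x2 / n2
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z_star * se, diff + z_star * se

low, high = two_prop_z_interval(60, 100, 45, 100)
print(f"({low:.3f}, {high:.3f})")  # (0.013, 0.287)
```

In this made-up example the interval lies entirely above zero, so the data would suggest that group 1 has the larger population proportion.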

Compare: Two-sample means vs. two-sample proportions. Both compare independent groups, but means use the t-distribution while proportions use z. If an FRQ asks you to compare two treatments with a binary outcome, you need the two-proportion z-interval.

Paired T-Interval for Mean Difference

  • Formula: $\bar{d} \pm t^* \left(\frac{s_d}{\sqrt{n}}\right)$, where $\bar{d}$ is the mean of the differences and $s_d$ is the standard deviation of the differences
  • When to use: matched pairs designs, before-and-after studies, or any situation where each observation in one sample is linked to a specific observation in the other
  • Key insight: you're reducing a two-sample problem to a one-sample problem by analyzing the differences; $df = n - 1$ where $n$ is the number of pairs
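
The "reduce to one sample" idea is easy to see in code. This sketch takes hypothetical before/after scores for 8 subjects, forms the differences, and then runs an ordinary one-sample t-interval on them; $t^* = 2.365$ is the 95% table value for $df = 7$.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t_interval(before, after, t_star: float):
    """Paired t-interval for the mean difference (after - before)."""
    diffs = [a - b for a, b in zip(after, before)]  # one difference per pair
    n = len(diffs)
    d_bar = mean(diffs)
    s_d = stdev(diffs)
    se = s_d / sqrt(n)
    return d_bar - t_star * se, d_bar + t_star * se

before = [70, 65, 80, 75, 60, 85, 72, 68]  # hypothetical scores
after = [74, 70, 83, 80, 62, 88, 75, 74]
low, high = paired_t_interval(before, after, t_star=2.365)  # 95%, df = 7
print(f"({low:.2f}, {high:.2f})")  # (2.74, 5.01)
```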

Compare: Two-sample t-interval vs. paired t-interval. The paired approach controls for individual variability and often produces narrower intervals. Watch for FRQ setups where subjects are measured twice or matched by characteristics; that's your cue to use paired procedures.


Inference for Regression Slopes

Unit 9 extends confidence intervals to linear regression. Here you're estimating the true population slope $\beta$ based on your sample slope $b$.

T-Interval for the Slope of a Regression Line

  • Formula: $b \pm t^* \cdot SE_b$, where $SE_b = \frac{s}{\sqrt{\sum(x_i - \bar{x})^2}}$ and $s$ is the residual standard deviation; your calculator or computer output provides $SE_b$ directly
  • Degrees of freedom: $df = n - 2$; you lose two degrees of freedom because you're estimating both the slope and intercept
  • Conditions (LINE): Linear relationship (check residual plot), Independence (random sample, 10% condition), Normal residuals (check histogram or Normal probability plot), Equal variance (residuals show constant spread across x-values)
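
On the exam you read $b$ and $SE_b$ from output, but it can help to see where they come from. The sketch below computes the least-squares slope, the residual standard deviation $s$, and $SE_b$ from raw data, then builds the interval. The six data points are invented, `slope_interval` is a hypothetical name, and $t^* = 2.776$ is the 95% table value for $df = 6 - 2 = 4$.

```python
from math import sqrt

def slope_interval(x, y, t_star: float):
    """t-interval for a regression slope; t_star must match df = n - 2."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx                      # sample slope
    a = y_bar - b * x_bar              # sample intercept
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    s = sqrt(sse / (n - 2))            # residual standard deviation
    se_b = s / sqrt(sxx)               # standard error of the slope
    return b, se_b, (b - t_star * se_b, b + t_star * se_b)

x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]
b, se_b, (low, high) = slope_interval(x, y, t_star=2.776)
print(f"b = {b:.3f}, SE_b = {se_b:.4f}, interval = ({low:.3f}, {high:.3f})")
```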

Compare: T-interval for slope vs. t-interval for mean. Both use the t-distribution, but slope inference has $df = n - 2$ instead of $df = n - 1$, and the conditions focus on residual behavior rather than the original data. If you see regression output on an FRQ, look for the standard error of the slope coefficient.


Advanced Intervals (Beyond Core AP Content)

These formulas occasionally appear in enrichment contexts but are not central to the AP Statistics exam. Know they exist, but prioritize the intervals above.

Confidence Interval for Population Variance

  • Formula: $\left(\frac{(n-1)s^2}{\chi^2_{\alpha/2}}, \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}\right)$, where $\chi^2_{\alpha/2}$ and $\chi^2_{1-\alpha/2}$ are the chi-squared critical values (with $df = n - 1$) that have areas $\alpha/2$ and $1 - \alpha/2$ to their right; the chi-squared distribution is right-skewed
  • Not symmetric: unlike z and t intervals, this interval is asymmetric around the point estimate
  • Strong normality assumption: the population must be normally distributed; this procedure is sensitive to departures from normality
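
A short sketch makes the asymmetry visible. The critical values are passed in as table lookups because the standard library has no chi-squared distribution; for $df = 9$ and 95% confidence the table values are 19.023 (upper) and 2.700 (lower). The sample variance $s^2 = 2.5$ with $n = 10$ is hypothetical.

```python
def variance_interval(s_squared: float, n: int, chi2_upper: float, chi2_lower: float):
    """CI for a population variance.

    chi2_upper has area alpha/2 to its right; chi2_lower has area 1 - alpha/2
    to its right. Both use df = n - 1 and come from a chi-squared table.
    """
    numerator = (n - 1) * s_squared
    return numerator / chi2_upper, numerator / chi2_lower

low, high = variance_interval(s_squared=2.5, n=10, chi2_upper=19.023, chi2_lower=2.700)
print(f"({low:.2f}, {high:.2f})")  # (1.18, 8.33)
```

Note that the point estimate 2.5 sits much closer to the lower bound than to the upper bound.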

Confidence Interval for Ratio of Two Variances

  • Formula: $\left(\frac{s_1^2}{s_2^2} \cdot \frac{1}{F_{\alpha/2}(df_1, df_2)}, \frac{s_1^2}{s_2^2} \cdot F_{\alpha/2}(df_2, df_1)\right)$, which uses the F-distribution with $df_1 = n_1 - 1$ and $df_2 = n_2 - 1$; note that the degrees of freedom swap order in the upper bound
  • Application: checking whether two populations plausibly have equal variances (a ratio of 1 inside the interval) before running a pooled two-sample t-test
  • Requires normality in both populations; rarely tested on AP Statistics but useful for understanding ANOVA assumptions
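
As a sketch of the ratio interval, the function below takes both F critical values as inputs, since they come from an F table and differ when the two degrees of freedom differ. In the hypothetical example both samples have $n = 10$, so the same upper 2.5% table value (approximately 4.026 for 9 and 9 degrees of freedom) is used twice.

```python
def variance_ratio_interval(s1_sq: float, s2_sq: float, f_upper_12: float, f_upper_21: float):
    """CI for sigma1^2 / sigma2^2.

    f_upper_12: upper alpha/2 critical value of F(df1, df2)
    f_upper_21: upper alpha/2 critical value of F(df2, df1)
    """
    ratio = s1_sq / s2_sq
    return ratio / f_upper_12, ratio * f_upper_21

low, high = variance_ratio_interval(4.0, 2.5, f_upper_12=4.026, f_upper_21=4.026)
print(f"({low:.3f}, {high:.3f})")
```

Here the interval contains 1, so these made-up data would be consistent with equal population variances.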

Quick Reference Table

| Concept | Best Examples |
|---|---|
| Estimating a single proportion | One-sample z-interval for $p$ |
| Estimating a single mean | T-interval for $\mu$ (use z only if $\sigma$ is known) |
| Comparing two independent proportions | Two-sample z-interval for $p_1 - p_2$ |
| Comparing two independent means | Two-sample t-interval for $\mu_1 - \mu_2$ |
| Comparing paired/matched data | Paired t-interval for $\mu_d$ |
| Estimating a regression slope | T-interval for $\beta$ with $df = n - 2$ |
| Intervals using z-distribution | One-proportion, two-proportion (large samples) |
| Intervals using t-distribution | One-mean, two-means, paired, regression slope |

Self-Check Questions

  1. What conditions must you verify before constructing a one-sample z-interval for a proportion, and why does each condition matter?

  2. Compare the t-interval for a single mean and the paired t-interval: what do they have in common, and when would you choose one over the other?

  3. If a confidence interval for $p_1 - p_2$ is $(0.03, 0.15)$, what can you conclude about the relationship between the two population proportions?

  4. An FRQ gives you regression output including $b = 2.4$ and $SE_b = 0.6$ with $n = 22$. What critical value would you use for a 95% confidence interval, and what are the degrees of freedom?

  5. A researcher wants to determine whether a new teaching method improves test scores. Students are tested before and after the intervention. Which confidence interval procedure is appropriate, and why would using a two-sample t-interval be incorrect here?