Fiveable

📊Honors Statistics Unit 8 Review


8.2 A Single Population Mean Using the Student's t-Distribution


Written by the Fiveable Content Team • Last updated August 2025

Confidence Intervals and the Student's t-Distribution

When the population standard deviation is unknown, you can't use a z-score to build a confidence interval. Instead, you use the Student's t-distribution, which accounts for the extra uncertainty that comes from estimating the standard deviation from your sample. The t-distribution is wider and has heavier tails than the normal distribution, giving you appropriately wider intervals when you have less information.


Confidence Intervals Using the t-Distribution

The t-distribution is used to construct confidence intervals when the population standard deviation (σ) is unknown and must be estimated from the sample using s. While textbooks often cite n < 30 as the threshold for using t, in practice you should use the t-distribution any time σ is unknown, regardless of sample size. For large n, the t- and z-intervals will be nearly identical anyway.

Steps to construct a confidence interval:

  1. Calculate the sample mean (x̄) and sample standard deviation (s) from your data.

  2. Determine the degrees of freedom: df = n − 1.

  3. Look up the critical t-value (t*) for your desired confidence level (e.g., 95%) and your df, using a t-table or calculator.

  4. Compute the margin of error: E = t* · s/√n

  5. Build the interval: x̄ ± E

The result is an interval (x̄ − E, x̄ + E) that you are, say, 95% confident contains the true population mean μ.

What affects the width of the interval?

  • Sample size (n): Larger samples shrink the margin of error because √n is in the denominator. More data means a more precise estimate.
  • Confidence level: A higher confidence level (e.g., 99% vs. 90%) requires a larger t*, which widens the interval. You're trading precision for greater confidence.
  • Sample variability (s): More spread in your data increases the margin of error.

Example: Suppose you sample 20 students and find x̄ = 74.5 and s = 8.2. For a 95% confidence interval, df = 19 and t* ≈ 2.093. The margin of error is 2.093 · 8.2/√20 ≈ 3.84. Your interval is (70.66, 78.34).
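The example above can be checked in a few lines of Python (a sketch assuming `scipy` is available; the summary numbers are the ones from the example):

```python
import math
from scipy import stats

# Summary statistics from the example
n = 20
x_bar = 74.5
s = 8.2
confidence = 0.95

df = n - 1                                       # degrees of freedom
t_star = stats.t.ppf((1 + confidence) / 2, df)   # two-sided critical value
margin = t_star * s / math.sqrt(n)               # margin of error E

lower, upper = x_bar - margin, x_bar + margin
print(f"t* = {t_star:.3f}, E = {margin:.2f}")    # t* ≈ 2.093, E ≈ 3.84
print(f"95% CI: ({lower:.2f}, {upper:.2f})")     # ≈ (70.66, 78.34)
```

Note the `(1 + confidence) / 2` argument: a 95% two-sided interval puts 2.5% in each tail, so the critical value sits at the 97.5th percentile.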


T-Scores vs. Z-Scores

Both t-scores and z-scores measure how far a sample mean is from a hypothesized population mean, in units of standard error. The key difference is which standard deviation you're using.

  • Z-score: Used when σ is known. The formula is z = (x̄ − μ) / (σ/√n).
  • T-score: Used when σ is unknown and you substitute the sample standard deviation s. The formula is:

t = (x̄ − μ) / (s/√n)

Where:

  • x̄ = sample mean
  • μ = hypothesized population mean
  • s = sample standard deviation
  • n = sample size

Because s is itself a random variable (it changes from sample to sample), the t-score has more variability than a z-score. That extra variability is exactly what the heavier tails of the t-distribution capture.

As n grows large, s becomes a very good estimate of σ, and the t-distribution converges to the standard normal (z) distribution.
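The t-score formula translates directly into code. As a sketch (the sample numbers below are hypothetical), testing whether a class mean of 74.5 is consistent with a hypothesized mean of 70:

```python
import math

def t_score(x_bar, mu, s, n):
    """t-score: distance of the sample mean from mu, in standard errors."""
    return (x_bar - mu) / (s / math.sqrt(n))

# Hypothetical sample: x̄ = 74.5, s = 8.2, n = 20, hypothesized μ = 70
t = t_score(74.5, 70.0, 8.2, 20)
print(f"t = {t:.3f}")  # ≈ 2.454
```

The same function computes a z-score if you pass the known σ in place of s, which underscores the point: the formulas are identical except for which standard deviation you plug in.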


Properties of the Student's t-Distribution

The t-distribution shares some features with the standard normal distribution but differs in important ways, especially at small sample sizes.

  • Symmetric and bell-shaped, centered at zero, just like the standard normal.
  • Heavier tails than the standard normal. This means extreme values are more likely under the t-distribution, which reflects the added uncertainty from estimating σ with s.
  • Defined by degrees of freedom (df = n − 1). The df is the single parameter that controls the shape. Lower df means heavier tails; higher df means the distribution looks more and more like the standard normal.
  • Standard deviation is greater than 1 for small df, and approaches 1 as df increases.
  • For df > 30 or so, the t-distribution is nearly indistinguishable from the standard normal. This is why older rules of thumb suggest switching to z at n = 30, but there's no harm in always using t when σ is unknown.

Why heavier tails matter: Heavier tails produce larger critical values (t*), which widen your confidence interval. This is the t-distribution's way of saying, "You have less information, so your interval should be wider to compensate."
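You can see both the heavier tails and the convergence to z by printing two-sided 95% critical values as df grows (a sketch assuming `scipy` is installed):

```python
from scipy import stats

z_star = stats.norm.ppf(0.975)  # standard normal critical value, ≈ 1.960
for df in [2, 5, 10, 30, 100, 1000]:
    t_star = stats.t.ppf(0.975, df)  # t critical value for this df
    print(f"df = {df:4d}: t* = {t_star:.3f}   (z* = {z_star:.3f})")
```

At df = 2 the critical value is over 4, at df = 30 it is about 2.04, and by df = 1000 it is essentially 1.96: small samples pay a real "width penalty," large samples almost none.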

Type I and Type II Errors

These concepts connect confidence intervals to the broader framework of hypothesis testing.

  • Type I error (false positive): Rejecting the null hypothesis when it is actually true. The probability of this is α, your significance level. If you construct a 95% confidence interval, α = 0.05, meaning there's a 5% chance the interval fails to contain the true mean.
  • Type II error (false negative): Failing to reject the null hypothesis when it is actually false. The probability of this is β.

There's a tradeoff: making α smaller (to reduce false positives) increases β (more false negatives), assuming sample size stays the same.
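The claim that a 95% interval misses the true mean about 5% of the time can be checked by simulation. A sketch using `numpy` and `scipy` (the normal population with μ = 50, σ = 10 is an arbitrary choice for illustration):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
mu, sigma, n, trials = 50.0, 10.0, 15, 10_000
t_star = stats.t.ppf(0.975, n - 1)  # 95% two-sided critical value

misses = 0
for _ in range(trials):
    sample = rng.normal(mu, sigma, n)
    x_bar, s = sample.mean(), sample.std(ddof=1)  # ddof=1 → sample std dev
    e = t_star * s / np.sqrt(n)                   # margin of error
    if not (x_bar - e <= mu <= x_bar + e):
        misses += 1  # the interval failed to cover the true mean

print(f"Miss rate: {misses / trials:.3f}")  # should be close to α = 0.05
```

Each simulated interval either captures μ or it doesn't; across many repetitions, the miss rate settles near α, which is exactly what "95% confidence" means.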

Statistical Power and Effect Size

Statistical power is the probability of correctly rejecting a false null hypothesis. It equals 1 − β. Higher power means you're more likely to detect a real effect when one exists.

Three main factors influence power:

  • Sample size: Larger n reduces the standard error (s/√n), making it easier to detect differences. This is the factor researchers have the most control over.
  • Effect size: The magnitude of the true difference between the hypothesized value and the actual population parameter. Larger effects are easier to detect. For example, detecting a 10-point difference in mean test scores is much easier than detecting a 1-point difference.
  • Significance level (α): A more stringent threshold (e.g., α = 0.01 instead of 0.05) requires stronger evidence to reject the null, which decreases power.
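The sample-size effect can be demonstrated by simulating one-sample t-tests against a false null. A sketch (the true mean 53 vs. hypothesized 50 with σ = 10 is a hypothetical setup) using `numpy` and `scipy`:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
mu0, mu_true, sigma, alpha, trials = 50.0, 53.0, 10.0, 0.05, 5_000

def power(n):
    """Fraction of simulated samples where H0: mu = mu0 is (correctly) rejected."""
    rejections = 0
    for _ in range(trials):
        sample = rng.normal(mu_true, sigma, n)       # H0 is false here
        t_stat, p = stats.ttest_1samp(sample, mu0)   # one-sample t-test
        if p < alpha:
            rejections += 1
    return rejections / trials

for n in [10, 30, 100]:
    print(f"n = {n:3d}: power ≈ {power(n):.2f}")
```

With a modest 3-point effect, power is low at n = 10 but climbs steeply as n grows, which is why increasing sample size is the standard remedy for an underpowered study.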

Effect size also helps you assess practical significance. A result can be statistically significant (small p-value) but have a tiny effect size, meaning the difference may not matter in the real world. Conversely, a meaningful effect might not reach statistical significance if the sample is too small. Both statistical and practical significance should be considered when interpreting results.