Confidence intervals let you estimate the true average home cost in a population when you can't measure every single home. Instead of a single number, you get a range of plausible values based on your sample data. This section covers how to build and interpret those intervals, what makes them wider or narrower, and when to use the t-distribution.

Confidence Intervals for Home Costs

Confidence intervals for mean home cost

A confidence interval gives you a range of values likely to contain the true population mean home cost. You build it from three ingredients: your sample mean, the variability in your data, and a critical value tied to your chosen confidence level.

The general formula is:

$\bar{x} \pm \text{critical value} \times \frac{s}{\sqrt{n}}$

$\bar{x}$ is the sample mean, your point estimate of the population mean
$s$ is the sample standard deviation
$n$ is the sample size
$\frac{s}{\sqrt{n}}$ is the standard error, which measures how much your sample mean is likely to vary from sample to sample
The entire piece after the $\pm$ is the margin of error
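As a rough illustration of the formula, the sketch below computes each ingredient from a small made-up sample. The `home_costs` values and the `confidence_interval` helper are hypothetical, not from any real dataset or library. The critical value is passed in as a parameter; a z-value for 95% confidence is used here only to show the arithmetic, and for a sample this small a t critical value would normally be the better choice.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Hypothetical sample of home costs in dollars (made-up values for illustration)
home_costs = [210_000, 245_000, 198_000, 232_000, 260_000,
              225_000, 240_000, 215_000, 228_000, 247_000]

def confidence_interval(data, critical_value):
    """Return (lower, upper, margin_of_error) for the population mean."""
    x_bar = mean(data)                 # sample mean, the point estimate
    s = stdev(data)                    # sample standard deviation (divides by n - 1)
    n = len(data)                      # sample size
    standard_error = s / sqrt(n)       # s / sqrt(n)
    margin = critical_value * standard_error
    return x_bar - margin, x_bar + margin, margin

# z critical value for 95% confidence (about 1.96)
z_95 = NormalDist().inv_cdf(0.975)

lower, upper, margin = confidence_interval(home_costs, z_95)
print(f"sample mean: {mean(home_costs):,.0f}")
print(f"95% interval: {lower:,.0f} to {upper:,.0f} (margin of error {margin:,.0f})")
```

The interval is always centered on the sample mean, so the margin of error is half the interval's width.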

Interpreting the result: If you calculate a 95% confidence interval of \$200,000 to \$250,000, that means you're 95% confident the true average home cost falls somewhere in that range. More precisely, if you repeated the sampling process many times, about 95% of the intervals you'd construct would capture the true population mean. Any single interval either contains the true mean or it doesn't; the confidence level describes the long-run success rate of the method.
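The "long-run success rate" idea can be checked with a small simulation. This is a sketch under stated assumptions: the population is taken to be normal with a made-up mean and standard deviation (`TRUE_MEAN`, `TRUE_SD`), which you would never know in a real study. The code draws many samples, builds a 95% interval from each, and counts how often the interval captures the true mean.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev

random.seed(42)  # fixed seed so the run is repeatable

TRUE_MEAN = 225_000   # assumed population mean (hypothetical)
TRUE_SD = 40_000      # assumed population standard deviation (hypothetical)
N = 50                # homes per sample
TRIALS = 2_000        # number of repeated samples

z_95 = NormalDist().inv_cdf(0.975)

hits = 0
for _ in range(TRIALS):
    sample = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(N)]
    x_bar = mean(sample)
    margin = z_95 * stdev(sample) / sqrt(N)
    # Does this particular interval contain the true mean?
    if x_bar - margin <= TRUE_MEAN <= x_bar + margin:
        hits += 1

coverage = hits / TRIALS
print(f"{coverage:.1%} of the intervals captured the true mean")
```

The printed share should land close to 95%. Each individual interval either hit or missed; only the method's overall hit rate is near 95%.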


Effects on confidence interval width

Two main factors control how wide your interval is: sample size and confidence level.

Sample size:

Larger samples produce narrower intervals. As $n$ increases, the standard error $\frac{s}{\sqrt{n}}$ shrinks, which reduces the margin of error.
Smaller samples produce wider intervals because there's more uncertainty in your estimate.

Confidence level:

Higher confidence levels (like 99%) produce wider intervals because the critical value gets larger, which increases the margin of error.
Lower confidence levels (like 90%) produce narrower intervals, but you're less certain you've captured the true mean.

The tradeoff: You can't get a narrow interval and high confidence without doing more work. To narrow your interval while keeping the same confidence level, you need to increase your sample size. If you can't collect more data, the only way to get a narrower interval is to accept a lower confidence level.
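To make the tradeoff concrete, the sketch below computes the margin of error for several sample sizes and confidence levels. It assumes a sample standard deviation of 40,000 dollars (a made-up figure) and uses z critical values for simplicity. The helpers `margin_of_error` and `sample_size_for_margin` are hypothetical names. The second one treats $s$ as fixed, which is only an approximation, since $s$ changes as more data comes in.

```python
from math import ceil, sqrt
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided z critical value, e.g. about 1.96 for confidence = 0.95."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

def margin_of_error(s, n, confidence):
    """Margin of error: critical value times the standard error s / sqrt(n)."""
    return z_critical(confidence) * s / sqrt(n)

def sample_size_for_margin(s, target_margin, confidence):
    """Smallest n whose margin of error is at most target_margin (s held fixed)."""
    return ceil((z_critical(confidence) * s / target_margin) ** 2)

s = 40_000  # assumed sample standard deviation in dollars (hypothetical)

print("Effect of sample size at 95% confidence:")
for n in (25, 100, 400):
    print(f"  n = {n:>3}: margin = {margin_of_error(s, n, 0.95):,.0f}")

print("Effect of confidence level at n = 100:")
for level in (0.90, 0.95, 0.99):
    print(f"  {level:.0%}: margin = {margin_of_error(s, 100, level):,.0f}")

needed = sample_size_for_margin(s, 5_000, 0.95)
print(f"Homes needed for a margin of 5,000 at 95% confidence: {needed}")
```

Because the standard error depends on $\sqrt{n}$, quadrupling the sample size only halves the margin of error, which is why narrow intervals at high confidence get expensive quickly.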


T-distribution in confidence intervals

When the population standard deviation is unknown (which is almost always the case in practice), you estimate it with the sample standard deviation $s$ and use the t-distribution instead of the standard normal (z) distribution. The choice matters most when your sample size is small (typically less than 30), because that is where the two distributions differ noticeably. The t-distribution has heavier tails, meaning it assigns more probability to extreme values. This accounts for the extra uncertainty that comes from estimating the population standard deviation using sample data.

Here's how to construct a confidence interval using the t-distribution:

Calculate the sample mean $\bar{x}$ and sample standard deviation $s$ from your data.
Find the degrees of freedom: $df = n - 1$.
Look up the critical t-value for your desired confidence level and degrees of freedom using a t-table or calculator.
Calculate the margin of error: $t_{\text{critical}} \times \frac{s}{\sqrt{n}}$.
Build the interval: $\bar{x} \pm \text{margin of error}$.
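The five steps above can be sketched in code as follows. The sample is made up, and `t_interval_95` is a hypothetical helper. To avoid depending on any statistics package, the critical values come from a small hard-coded excerpt of a standard t-table (two-sided, 95% confidence), so the function only works for the degrees of freedom listed there.

```python
from math import sqrt
from statistics import mean, stdev

# Excerpt of a standard t-table: two-sided 95% critical values by degrees of freedom
T_CRITICAL_95 = {4: 2.776, 9: 2.262, 14: 2.145, 19: 2.093, 24: 2.064, 29: 2.045}

def t_interval_95(data):
    """95% t-interval for the mean; raises KeyError if df is not in the table excerpt."""
    n = len(data)
    x_bar = mean(data)                      # step 1: sample mean
    s = stdev(data)                         # step 1: sample standard deviation
    df = n - 1                              # step 2: degrees of freedom
    t_critical = T_CRITICAL_95[df]          # step 3: look up the critical t-value
    margin = t_critical * s / sqrt(n)       # step 4: margin of error
    return x_bar - margin, x_bar + margin   # step 5: build the interval

# Hypothetical sample of 10 home costs in dollars (made-up values)
home_costs = [210_000, 245_000, 198_000, 232_000, 260_000,
              225_000, 240_000, 215_000, 228_000, 247_000]

lower, upper = t_interval_95(home_costs)
print(f"95% t-interval: {lower:,.0f} to {upper:,.0f}")
```

With 10 homes there are 9 degrees of freedom, so the critical value is 2.262 rather than 1.96, and the interval comes out wider than a z-interval built from the same data.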

The interpretation is the same as with a z-interval. As your sample size grows, the t-distribution looks more and more like the standard normal distribution, so for large samples the two approaches give nearly identical results.
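A quick way to see this convergence is to compare critical t-values with the z critical value. The t-values below are hard-coded from a standard t-table (two-sided, 95% confidence); only the z-value is computed.

```python
from statistics import NormalDist

# Two-sided 95% critical t-values from a standard t-table, keyed by degrees of freedom
t_values = {5: 2.571, 10: 2.228, 30: 2.042, 100: 1.984, 1000: 1.962}

z_95 = NormalDist().inv_cdf(0.975)  # about 1.96

for df, t in t_values.items():
    # The gap between t and z shrinks as degrees of freedom grow
    print(f"df = {df:>4}: t = {t:.3f}, t - z = {t - z_95:.3f}")
```

The gap is large at 5 degrees of freedom and nearly gone by 1,000, which matches the claim that the two approaches give nearly identical results for large samples.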

Confidence intervals are one form of statistical inference, which means drawing conclusions about a population based on sample data. Hypothesis testing is another form; instead of estimating a range, it tests a specific claim about a population parameter.

Both methods rely on the idea of a sampling distribution, which describes how a sample statistic (like the sample mean) varies across all possible samples of a given size. The Central Limit Theorem is what makes this work: for sufficiently large samples, the sampling distribution of the mean is approximately normal, regardless of the shape of the original population. This is why normal and t-based confidence intervals remain approximately valid even when the underlying home cost data isn't bell-shaped, as long as the sample is large enough.
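The Central Limit Theorem can be illustrated with a simulation. This sketch assumes home costs follow a right-skewed lognormal distribution with made-up parameters; real home cost data may look different. It compares the skewness of individual home costs with the skewness of sample means from samples of 50 homes. The `skewness` and `draw_home_cost` helpers are hypothetical names.

```python
import random
from statistics import mean, stdev

random.seed(0)  # fixed seed so the run is repeatable

def skewness(values):
    """Simple skewness estimate: average cubed z-score (0 for a symmetric shape)."""
    m = mean(values)
    sd = stdev(values)
    return sum(((v - m) / sd) ** 3 for v in values) / len(values)

def draw_home_cost():
    """One home cost from an assumed right-skewed (lognormal) population."""
    return random.lognormvariate(12.3, 0.5)

# Individual home costs: strongly right-skewed
individual_costs = [draw_home_cost() for _ in range(5_000)]

# Means of many samples of 50 homes each: much closer to symmetric
sample_means = [mean([draw_home_cost() for _ in range(50)]) for _ in range(2_000)]

skew_individual = skewness(individual_costs)
skew_means = skewness(sample_means)
print(f"skewness of individual costs: {skew_individual:.2f}")
print(f"skewness of sample means:     {skew_means:.2f}")
```

The sample means should show far less skew than the individual costs, which is the behavior the Central Limit Theorem predicts and the reason the interval formulas still work reasonably well for skewed data when samples are large.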