A confidence interval gives you a range of plausible values for the true population mean home cost, based on what you observe in a sample. The core idea: since we almost never have data on every home in a population, we use sample statistics plus a margin of error to build an interval that likely captures the true mean.

When the population standard deviation ( $\sigma$ ) is known, use the z-formula:

$\bar{x} \pm z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$

$\bar{x}$ : sample mean home cost
$z_{\alpha/2}$ : critical value from the standard normal distribution (1.96 for 95% confidence)
$\sigma$ : known population standard deviation
$n$ : sample size
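
As a rough illustration of the z-formula, the sketch below computes the interval with Python's standard library. The function name `z_interval` and the input values (a sample mean of 300,000, a known $\sigma$ of 60,000, and $n = 100$) are assumptions chosen for illustration, not data from a real market.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(xbar, sigma, n, confidence=0.95):
    """Return (lower, upper) for a z-based interval with known sigma."""
    alpha = 1 - confidence
    # Critical value z_{alpha/2}: the point with alpha/2 probability above it.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z_crit * sigma / sqrt(n)
    return xbar - margin, xbar + margin


# Hypothetical inputs: mean 300,000, known sigma 60,000, n = 100.
lower, upper = z_interval(300_000, 60_000, 100)
print(f"95% z-interval: {lower:,.0f} to {upper:,.0f}")
```

Computing the critical value from `NormalDist` rather than hard-coding 1.96 lets the same function handle other confidence levels.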

When the population standard deviation is unknown (which is almost always the case in practice), use the t-formula:

$\bar{x} \pm t_{\alpha/2,\, n-1} \cdot \frac{s}{\sqrt{n}}$

$s$ : sample standard deviation (used as an estimate of $\sigma$ )
$t_{\alpha/2,\, n-1}$ : critical value from the t-distribution with $n - 1$ degrees of freedom

The t-distribution has heavier tails than the z-distribution, which makes the interval slightly wider to account for the extra uncertainty of estimating $\sigma$ with $s$ . As $n$ grows large, the t-distribution approaches the standard normal.
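
To get a feel for how much wider the t-based interval is, here is a quick comparison of the two margins of error. It uses the 95% critical values quoted in this guide (1.96 for z, and roughly 1.984 for t with 99 degrees of freedom) together with illustrative values $s = 60{,}000$ and $n = 100$:

$1.96 \cdot \frac{60{,}000}{\sqrt{100}} = 11{,}760 \qquad \text{versus} \qquad 1.984 \cdot \frac{60{,}000}{\sqrt{100}} = 11{,}904$

At this sample size the t-based margin is only about 1% larger. The gap is more noticeable for small samples, where the t critical value sits further from 1.96.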

Steps to calculate a confidence interval for home costs:

Choose your confidence level and find the corresponding critical value. For 95% confidence with a large sample, $z_{\alpha/2} = 1.96$ . If using the t-distribution with, say, 99 degrees of freedom, $t_{\alpha/2} \approx 1.984$ .
Calculate the sample mean $\bar{x}$ and (if $\sigma$ is unknown) the sample standard deviation $s$ from your data.
Compute the standard error: $\frac{\sigma}{\sqrt{n}}$ or $\frac{s}{\sqrt{n}}$ .
Plug into the formula. For example, with $\bar{x} = 300{,}000$ , $s = 60{,}000$ , and $n = 100$ : $300{,}000 \pm 1.984 \cdot \frac{60{,}000}{\sqrt{100}} = 300{,}000 \pm 11{,}904$
State the interval and interpret it in context: "We are 95% confident that the true mean home cost in this population is between $\$288{,}096$ and $\$311{,}904$ ."
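
The steps above can be sketched in a few lines of Python. This is a minimal sketch, not a full implementation: the helper name `t_interval` is hypothetical, and the critical value is passed in by hand (1.984, the approximate value for 99 degrees of freedom quoted in the steps) because Python's standard library does not include a t-distribution.

```python
from math import sqrt


def t_interval(xbar, s, n, t_crit):
    """Return (margin, lower, upper) for a t-based interval.

    t_crit must be looked up for n - 1 degrees of freedom; it is passed in
    rather than computed because the standard library has no t-distribution.
    """
    standard_error = s / sqrt(n)
    margin = t_crit * standard_error
    return margin, xbar - margin, xbar + margin


# Values from the worked example; 1.984 is the approximate 95% critical
# value for 99 degrees of freedom quoted in the steps.
margin, lower, upper = t_interval(xbar=300_000, s=60_000, n=100, t_crit=1.984)
print(f"margin of error: {margin:,.0f}")
print(f"95% interval: {lower:,.0f} to {upper:,.0f}")
```

With raw data instead of summary statistics, `statistics.mean` and `statistics.stdev` would supply $\bar{x}$ and $s$ before calling the helper.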

Note that this procedure assumes the sampling distribution of $\bar{x}$ is approximately normal. That's satisfied when the population itself is roughly normal, or when the sample size is large enough for the Central Limit Theorem to apply ( $n \geq 30$ is the usual rule of thumb, though strongly right-skewed data, which home prices often are, can require a larger sample).

Interpretation of confidence intervals

Getting the interpretation right matters a lot on exams. A 95% confidence interval does not mean there's a 95% probability the true mean falls in this particular interval. The true mean is a fixed value; it's either in the interval or it isn't.

The correct interpretation: if you repeated the sampling process many times and built a 95% confidence interval each time, about 95% of those intervals would contain the true population mean.
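
A small simulation can make the repeated-sampling idea concrete. The sketch below assumes a normal population with a known standard deviation, so the z-interval applies exactly. The true mean of 300,000, $\sigma$ of 60,000, sample size of 50, and the function name `coverage_rate` are all illustrative assumptions.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def coverage_rate(true_mean, sigma, n, trials, confidence=0.95, seed=0):
    """Fraction of simulated z-intervals that contain true_mean."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Sigma is treated as known, so the margin is the same for every sample.
    margin = z_crit * sigma / sqrt(n)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        xbar = fmean(sample)
        if xbar - margin <= true_mean <= xbar + margin:
            hits += 1
    return hits / trials


rate = coverage_rate(true_mean=300_000, sigma=60_000, n=50, trials=2_000)
print(f"share of intervals containing the true mean: {rate:.3f}")
```

Each simulated interval either contains 300,000 or it doesn't. The 95% describes the long-run share of intervals that do, which should come out close to 0.95 here.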

Width of the interval tells you about precision. A narrow interval like $\$290{,}000$ to $\$310{,}000$ is a much more useful estimate than a wide one like $\$200{,}000$ to $\$400{,}000$ .

Three factors control the width:

Sample size ( $n$ ): Larger samples produce narrower intervals because the standard error $\frac{s}{\sqrt{n}}$ shrinks. Going from $n = 50$ to $n = 500$ cuts the standard error by roughly a factor of $\sqrt{10} \approx 3.16$ .
Variability in the data: More spread in home prices (larger $s$ ) means wider intervals. A market with $s = \$100{,}000$ produces a much wider interval than one with $s = \$20{,}000$ .
Confidence level: Higher confidence requires a larger critical value, which widens the interval. A 99% interval is wider than a 90% interval from the same data. You're trading precision for greater confidence.
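
The effect of each factor can be checked numerically. The sketch below changes one factor at a time from a hypothetical baseline ( $s = 60{,}000$ , $n = 50$ , 95% confidence). For simplicity it uses z critical values throughout, which is a large-sample approximation; a t-based version would give slightly wider intervals but the same pattern.

```python
from math import sqrt
from statistics import NormalDist


def interval_width(s, n, confidence):
    """Full width (2 x margin of error) of a z-based interval."""
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return 2 * z_crit * s / sqrt(n)


# Hypothetical baseline, then one factor changed at a time.
baseline = interval_width(s=60_000, n=50, confidence=0.95)
larger_n = interval_width(s=60_000, n=500, confidence=0.95)
more_spread = interval_width(s=100_000, n=50, confidence=0.95)
higher_conf = interval_width(s=60_000, n=50, confidence=0.99)

rows = [
    ("baseline", baseline),
    ("n = 500", larger_n),
    ("s = 100,000", more_spread),
    ("99% confidence", higher_conf),
]
for label, width in rows:
    print(f"{label:>15}: width {width:,.0f}")
print(f"baseline / larger-n width ratio: {baseline / larger_n:.2f}")
```

The last line should print a ratio near 3.16, matching the $\sqrt{10}$ factor described for the jump from $n = 50$ to $n = 500$ .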

Applications in housing markets

Comparing two markets: If the confidence intervals for mean home prices in two cities don't overlap at all (City A: $\$200{,}000$ to $\$250{,}000$ ; City B: $\$300{,}000$ to $\$350{,}000$ ), that's strong informal evidence of a real difference in population means. However, overlapping intervals (City A: $\$200{,}000$ to $\$300{,}000$ ; City B: $\$250{,}000$ to $\$350{,}000$ ) do not automatically mean the means are equal. Two intervals can overlap and a formal hypothesis test can still find a significant difference. Comparing individual confidence intervals is a rough tool, not a substitute for a proper two-sample test.
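
To see how overlapping intervals can coexist with a significant difference, consider the sketch below. The means and standard errors are hypothetical, the two samples are assumed to be independent, and both are assumed large enough for a z-based comparison.

```python
from math import sqrt
from statistics import NormalDist

Z_CRIT = NormalDist().inv_cdf(0.975)  # about 1.96 for 95% confidence


def interval(mean, se):
    """95% z-based interval for one city's mean."""
    return mean - Z_CRIT * se, mean + Z_CRIT * se


def two_sample_z(mean_a, se_a, mean_b, se_b):
    """z statistic for the difference of two independent sample means."""
    return (mean_b - mean_a) / sqrt(se_a**2 + se_b**2)


# Hypothetical summary statistics (not from real data).
mean_a, se_a = 250_000, 25_000
mean_b, se_b = 330_000, 25_000

low_a, high_a = interval(mean_a, se_a)
low_b, high_b = interval(mean_b, se_b)
overlap = low_b <= high_a and low_a <= high_b
z = two_sample_z(mean_a, se_a, mean_b, se_b)

print(f"City A: {low_a:,.0f} to {high_a:,.0f}")
print(f"City B: {low_b:,.0f} to {high_b:,.0f}")
print(f"intervals overlap: {overlap}")
print(f"z = {z:.2f}, significant at 0.05: {abs(z) > Z_CRIT}")
```

The reason is that the standard error of the difference, $\sqrt{SE_A^2 + SE_B^2}$ , is smaller than the sum $SE_A + SE_B$ that the overlap check implicitly uses, so the overlap check is the more conservative of the two.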

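Evaluating individual properties: If a home is listed at $\$400{,}000$ and the 95% confidence interval for the area's mean price is $\$250{,}000$ to $\$350{,}000$ , that listing sits above the plausible range for the area's mean price. This doesn't mean the price is "wrong." A confidence interval describes the mean, not the spread of individual home prices, so individual homes routinely fall outside it. The comparison tells you only that the home is priced above the estimated average for the area.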

Limitations to keep in mind:

A confidence interval is only as good as the sample it's built from. A biased or non-representative sample (e.g., only sampling luxury homes) will produce a misleading interval no matter how large $n$ is.
The method assumes observations are independent and that the normality condition is met (either a normal population or a large enough sample).
Even at 95% confidence, about 5% of intervals constructed this way will miss the true mean in the long run, and you can't tell from a single interval whether it is one of them. That's not a flaw; it's built into the method.

Statistical Inference and Hypothesis Testing

Confidence intervals are one of the two main tools of statistical inference (drawing conclusions about a population from sample data). The other is hypothesis testing.

Both methods rely on the sampling distribution of $\bar{x}$ and its standard error ( $\frac{\sigma}{\sqrt{n}}$ or $\frac{s}{\sqrt{n}}$ ). A confidence interval estimates where the parameter is. A hypothesis test evaluates whether the parameter equals a specific claimed value. They're closely connected: if a hypothesized value falls outside your 95% confidence interval, a two-sided test would reject that value at the $\alpha = 0.05$ significance level, and if it falls inside, the test would not reject it.
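
The link between the two tools can be checked directly. The sketch below reuses the summary numbers from the worked example ( $\bar{x} = 300{,}000$ , $s = 60{,}000$ , $n = 100$ , critical value 1.984) and tests two hypothetical claimed means. The function names are illustrative, and the test is assumed to be two-sided.

```python
from math import sqrt


def inside_interval(mu0, xbar, s, n, t_crit):
    """True if mu0 lies within xbar +/- t_crit * s / sqrt(n)."""
    margin = t_crit * s / sqrt(n)
    return xbar - margin <= mu0 <= xbar + margin


def rejects(mu0, xbar, s, n, t_crit):
    """True if a two-sided t-test rejects H0: mu = mu0."""
    t_stat = (xbar - mu0) / (s / sqrt(n))
    return abs(t_stat) > t_crit


# Numbers from the worked example; the two claimed means are hypothetical.
xbar, s, n, t_crit = 300_000, 60_000, 100, 1.984
for mu0 in (310_000, 315_000):
    print(
        mu0,
        "| inside interval:", inside_interval(mu0, xbar, s, n, t_crit),
        "| test rejects:", rejects(mu0, xbar, s, n, t_crit),
    )
```

For any claimed mean, the two checks give opposite answers: a value inside the interval is not rejected, and a value outside it is.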