Confidence intervals help estimate population means using sample data. They provide a range of likely values for the true population average, accounting for sampling variability and uncertainty.
With large samples or a known population standard deviation, intervals are based on the z-distribution; with small samples and an unknown standard deviation, the t-distribution is used instead. Understanding sample size requirements and interpreting results correctly are crucial for accurate business insights.
Confidence Intervals for Population Means
Confidence intervals with z-distribution
- Used when sample size is large ($n \geq 30$) or population is normally distributed, and population standard deviation ($\sigma$) is known
- Confidence interval formula: $\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
  - $\bar{x}$ represents sample mean
  - $z_{\alpha/2}$ represents critical value from standard normal distribution
  - $\alpha$ represents significance level and $1 - \alpha$ represents confidence level (95%, 99%)
  - $n$ represents sample size
- Example: Estimating average customer spending ($\sigma = \$20$, $n = 100$, $\bar{x} = \$50$, 95% confidence level)
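As a rough sketch, the customer-spending example (population standard deviation of $20, sample size 100, sample mean $50, 95% confidence) can be worked through with Python's standard library. The function name `z_confidence_interval` is an illustrative choice, not something taken from the source.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(x_bar, sigma, n, confidence=0.95):
    """Return (lower, upper) for a z-based interval with known sigma."""
    alpha = 1 - confidence
    # Two-sided critical value z_{alpha/2} from the standard normal distribution
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z_crit * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin

lower, upper = z_confidence_interval(x_bar=50, sigma=20, n=100, confidence=0.95)
print(f"95% CI: (${lower:.2f}, ${upper:.2f})")  # 95% CI: ($46.08, $53.92)
```

The margin of error here is about $3.92, so the interval runs from roughly $46.08 to $53.92.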
Confidence intervals with t-distribution
- Used when sample size is small ($n < 30$), population is normally distributed, and population standard deviation is unknown
- Sample standard deviation ($s$) used as estimate for population standard deviation
- Confidence interval formula: $\bar{x} \pm t_{\alpha/2,\,n-1} \frac{s}{\sqrt{n}}$
  - $t_{\alpha/2,\,n-1}$ represents critical value from t-distribution with $n - 1$ degrees of freedom
- t-distribution has heavier tails than standard normal distribution, accounting for additional uncertainty from using sample standard deviation
- Example: Estimating average employee satisfaction score from a small sample (sample mean, sample standard deviation, and sample size given, 90% confidence level)
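A minimal sketch of a t-based interval follows. The satisfaction scores are hypothetical values invented for illustration, and the helper names (`t_pdf`, `t_cdf`, `t_critical`, `t_confidence_interval`) are assumptions. To stay within the standard library, the critical value is found by numerically integrating the t density and bisecting; in practice a statistics library (for example SciPy's `scipy.stats.t.ppf`) would normally supply it.

```python
import math
from statistics import mean, stdev

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """CDF for x >= 0: 0.5 plus Simpson's-rule integral of the density on [0, x].

    steps must be even for Simpson's rule.
    """
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_critical(confidence, df):
    """Two-sided critical value t_{alpha/2, df}, found by bisection on the CDF."""
    target = 1 - (1 - confidence) / 2
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def t_confidence_interval(sample, confidence=0.90):
    """Return (lower, upper) for a t-based interval from raw sample values."""
    n = len(sample)
    x_bar = mean(sample)
    s = stdev(sample)  # sample standard deviation (n - 1 in the denominator)
    margin = t_critical(confidence, n - 1) * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin

# Hypothetical satisfaction scores on a 1-10 scale (invented for illustration)
scores = [7.2, 6.8, 8.1, 7.5, 6.9, 7.8, 8.0, 7.1, 6.5, 7.7, 7.3, 8.2, 6.7, 7.4, 7.6]
lower, upper = t_confidence_interval(scores, confidence=0.90)
print(f"90% CI for mean score: ({lower:.2f}, {upper:.2f})")
```

With 15 scores the critical value uses 14 degrees of freedom, which makes it larger than the corresponding z value and so widens the interval.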
Sample size for confidence intervals
- Margin of error ($E$) represents maximum acceptable difference between sample mean and population mean
- Sample size formula for known population standard deviation: $n = \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^2$
- Sample size formula for unknown population standard deviation: $n = \left(\frac{t_{\alpha/2,\,n-1}\,s}{E}\right)^2$
- Iterative process or software often used to solve for $n$ since sample size appears on both sides of equation (through degrees of freedom of t critical value)
- Example: Determining sample size for estimating average customer wait time (standard deviation and margin of error both given in minutes, 95% confidence level)
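The two sample-size calculations can be sketched as follows. The inputs (a standard deviation of 4 minutes and a margin of error of 1 minute) are hypothetical, and the function names are illustrative. For the unknown-standard-deviation case, the sketch assumes an estimate `s` is available, for instance from a pilot sample, and simply repeats the formula until the sample size stops changing. This is one simple iterative scheme, not the only one.

```python
import math
from statistics import NormalDist

def sample_size_known_sigma(sigma, margin, confidence=0.95):
    """n = (z_{alpha/2} * sigma / E)^2, rounded up to a whole observation."""
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z_crit * sigma / margin) ** 2)

def t_critical(confidence, df, steps=2000):
    """Two-sided t critical value via Simpson integration of the density plus bisection."""
    coef = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2)) / math.sqrt(df * math.pi)

    def pdf(x):
        return coef * (1 + x * x / df) ** (-(df + 1) / 2)

    def cdf(x):
        h = x / steps
        total = pdf(0.0) + pdf(x)
        for i in range(1, steps):
            total += (4 if i % 2 else 2) * pdf(i * h)
        return 0.5 + total * h / 3

    target = 1 - (1 - confidence) / 2
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if cdf(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def sample_size_unknown_sigma(s, margin, confidence=0.95, max_iter=50):
    """Repeat n = ceil((t_{alpha/2, n-1} * s / E)^2) until n stops changing."""
    n = max(sample_size_known_sigma(s, margin, confidence), 2)  # z-based starting guess
    for _ in range(max_iter):  # cap iterations in case n oscillates
        n_next = max(math.ceil((t_critical(confidence, n - 1) * s / margin) ** 2), 2)
        if n_next == n:
            break
        n = n_next
    return n

# Hypothetical inputs: standard deviation of 4 minutes, margin of error of 1 minute
n_z = sample_size_known_sigma(sigma=4, margin=1, confidence=0.95)
n_t = sample_size_unknown_sigma(s=4, margin=1, confidence=0.95)
print(n_z)  # 62
print(n_t)
```

The result is rounded up because rounding down would give a margin of error slightly larger than the target. The t-based answer is never smaller than the z-based one, since the t critical value exceeds the z critical value at any finite degrees of freedom.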
Interpretation of confidence intervals
- Provides range of plausible values for population mean
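- Confidence level (95%, 99%) represents long-run proportion of intervals, built the same way from repeated random samples, that contain true population mean; any single computed interval either contains it or does not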
- Business applications:
  - Estimating average sales, revenue, or customer satisfaction score for product or service
  - Comparing mean performance of different business units or strategies
  - Determining if process change has resulted in significant improvement in key metric
- Consider width of confidence interval and practical significance of results when making decisions based on interval estimate
- Example: Comparing average sales between two store locations ($\bar{x}_1 = \$1000$, $\bar{x}_2 = \$1200$, confidence interval for difference in means: $\$50$ to $\$350$)
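The long-run reading of the confidence level can be checked with a small simulation: draw many samples from a population whose mean is known, build a 95% z-interval from each, and count how often the interval contains the true mean. The population parameters, trial count, and function name below are assumptions chosen for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist

def coverage_rate(mu, sigma, n, confidence=0.95, trials=4000, seed=0):
    """Fraction of simulated z-intervals that contain the true mean mu."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_crit * sigma / sqrt(n)
    hits = 0
    for _ in range(trials):
        # Draw one random sample of size n and compute its mean
        x_bar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / trials

rate = coverage_rate(mu=50, sigma=20, n=100)
print(f"Share of intervals containing the true mean: {rate:.3f}")
```

The printed share should land close to 0.95, which is what the confidence level describes: a property of the procedure across many samples, not a probability statement about one particular interval.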
Additional Considerations
Assumptions and limitations
- Assumes sample is randomly selected from population
- Assumes population is normally distributed or sample size is large enough for Central Limit Theorem to apply
- Violations of assumptions may lead to inaccurate confidence intervals
- Provides estimate of population mean but does not prove causality between variables
- Example: Non-random sampling may result in biased estimate of population mean
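The effect of non-random sampling can be illustrated with a short simulation. The population below is hypothetical (spending for 10,000 customers, invented for illustration), and the non-random sample deliberately takes only the highest spenders, as might happen if only loyalty-club members were surveyed.

```python
import random
from statistics import mean

rng = random.Random(1)
# Hypothetical population: spending for 10,000 customers (invented for illustration)
population = [rng.gauss(50, 20) for _ in range(10_000)]

# Simple random sample of 100 customers
random_sample = rng.sample(population, 100)
# Non-random sample: only the 100 highest spenders
convenience_sample = sorted(population)[-100:]

print(f"Population mean:        {mean(population):.2f}")
print(f"Random sample mean:     {mean(random_sample):.2f}")
print(f"Non-random sample mean: {mean(convenience_sample):.2f}")
```

The random sample's mean stays near the population mean, while the non-random sample's mean is far above it. An interval built around the biased mean would be centered in the wrong place no matter how narrow it is.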