
🎲 Intro to Statistics

Key Statistical Inference Techniques


Why This Matters

Statistical inference is the bridge between what you observe in a sample and what you can conclude about an entire population—and that bridge is exactly what introductory statistics courses test you on. Whether you're estimating a population mean, comparing two groups, or determining if variables are related, you're using inference techniques that share common logic: sampling distributions, standard errors, test statistics, and probability. Mastering these connections will help you tackle both multiple-choice questions and free-response problems with confidence.

Here's the key insight: these techniques aren't isolated tools to memorize separately. They're variations on the same fundamental question—"Could this result have happened by chance?" Don't just memorize formulas and definitions. For each technique, know when to use it, what assumptions it requires, and how to interpret results. That conceptual understanding is what separates students who struggle from those who excel.


Estimation: Quantifying What We Don't Know

These techniques focus on estimating population parameters from sample data. The core principle is that samples vary, so our estimates carry uncertainty—and good statisticians quantify that uncertainty.

Point Estimation

  • Single-value estimates—the sample mean ($\bar{x}$), sample proportion ($\hat{p}$), and sample variance ($s^2$) serve as our best guesses for population parameters
  • No uncertainty communicated—while point estimates are useful starting points, they don't tell you how close you might be to the true value
  • Foundation for other methods—point estimates become the center of confidence intervals and the basis for test statistics in hypothesis testing
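
To make these concrete, here's a minimal sketch of computing point estimates with Python and NumPy; the sample scores are made up purely for illustration:

```python
import numpy as np

# Hypothetical sample of exam scores (made up for illustration)
scores = np.array([72, 85, 90, 68, 77, 88, 95, 81, 79, 84])

x_bar = scores.mean()              # point estimate of the population mean
s_squared = scores.var(ddof=1)     # sample variance (n - 1 in the denominator)
p_hat = np.mean(scores >= 80)      # sample proportion scoring 80 or above

print(f"x-bar = {x_bar:.2f}, s^2 = {s_squared:.2f}, p-hat = {p_hat:.2f}")
```

Each of these numbers is a single best guess; on its own, none of them says how far off it might be.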

Confidence Intervals

  • Range of plausible values—a 95% CI means if you repeated sampling many times, about 95% of intervals constructed this way would capture the true parameter
  • Width reflects precision—wider intervals indicate more uncertainty, typically caused by smaller samples or greater variability in the data
  • Margin of error—calculated as $\text{critical value} \times \text{standard error}$, this determines how far the interval extends from your point estimate
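
Here's a rough sketch of a 95% confidence interval for a mean using Python and SciPy, built directly from the margin-of-error formula above; the data are hypothetical:

```python
import numpy as np
from scipy import stats

# Hypothetical sample (made up for illustration)
scores = np.array([72, 85, 90, 68, 77, 88, 95, 81, 79, 84])
n = len(scores)

x_bar = scores.mean()
se = scores.std(ddof=1) / np.sqrt(n)       # standard error of the mean
t_crit = stats.t.ppf(0.975, df=n - 1)      # critical value for 95% confidence
margin = t_crit * se                       # margin of error

print(f"95% CI: ({x_bar - margin:.2f}, {x_bar + margin:.2f})")
```

Notice that a larger sample shrinks the standard error, which shrinks the margin of error and narrows the interval.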

Compare: Point Estimation vs. Confidence Intervals—both estimate population parameters, but point estimates give a single value while confidence intervals communicate uncertainty. If an FRQ asks you to "estimate and interpret," you'll almost always need a confidence interval, not just a point estimate.


Hypothesis Testing: Making Decisions with Data

Hypothesis testing provides a structured framework for deciding whether sample evidence is strong enough to reject a claim about a population. The logic flows from assuming the null hypothesis is true, then asking how surprising your data would be under that assumption.

Hypothesis Testing Framework

  • Null and alternative hypotheses—$H_0$ represents "no effect" or "no difference," while $H_a$ represents the claim you're trying to find evidence for
  • P-values measure surprise—the probability of observing results as extreme as yours if $H_0$ were true; small p-values (typically < 0.05) suggest evidence against $H_0$
  • Type I and Type II errors—rejecting a true null (false positive) vs. failing to reject a false null (false negative); the significance level $\alpha$ controls the Type I error rate
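
One way to internalize the significance level is a quick simulation: if $H_0$ is actually true and you test at $\alpha = 0.05$, about 5% of samples will still reject it. A sketch using NumPy and SciPy, with all numbers invented for illustration:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
alpha = 0.05
rejections = 0

# Simulate 10,000 studies in which H0 is actually true (mu really is 50)
for _ in range(10_000):
    sample = rng.normal(loc=50, scale=10, size=25)
    result = stats.ttest_1samp(sample, popmean=50)
    if result.pvalue < alpha:
        rejections += 1

# Roughly 5% of tests reject a true H0 -- that's the Type I error rate alpha
print(f"Proportion of false positives: {rejections / 10_000:.3f}")
```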

z-Tests

  • Large samples or known variance—use when $n > 30$ or when the population standard deviation $\sigma$ is known, which is rare in practice
  • Standard normal distribution—test statistic follows $z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$, compared against the standard normal curve
  • Proportions testing—one-sample z-tests for proportions are common: $z = \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}}$
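
As a sketch, here's the one-sample z-test for a proportion computed by hand with Python; the counts are hypothetical:

```python
import numpy as np
from scipy import stats

# Hypothetical question: is the true support rate different from p0 = 0.50?
p0 = 0.50
n = 400          # sample size (made up)
p_hat = 0.55     # observed sample proportion (made up)

se = np.sqrt(p0 * (1 - p0) / n)        # standard error under H0
z = (p_hat - p0) / se                  # z test statistic
p_value = 2 * stats.norm.sf(abs(z))    # two-sided p-value from the standard normal

print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```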

t-Tests

  • Unknown population variance—the realistic scenario where you estimate $\sigma$ with the sample standard deviation $s$, adding uncertainty that the t-distribution accounts for
  • Three main types—one-sample (compare sample to hypothesized mean), independent samples (compare two separate groups), paired samples (compare matched observations)
  • Degrees of freedom matter—with smaller samples, the t-distribution has heavier tails than the z-distribution; as $n$ increases, t approaches z
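
In practice you'll usually let software compute the t statistic. A minimal sketch with SciPy, using made-up data, covering the one-sample and independent-samples cases:

```python
import numpy as np
from scipy import stats

# Hypothetical measurements for two groups (made up for illustration)
group_a = np.array([23.1, 25.4, 22.8, 26.0, 24.3, 25.1, 23.7])
group_b = np.array([21.0, 22.5, 20.8, 23.1, 21.9, 22.2, 20.5])

# One-sample: does group_a's mean differ from a hypothesized value of 24?
one_sample = stats.ttest_1samp(group_a, popmean=24)

# Independent samples: do the two group means differ? (Welch's version)
two_sample = stats.ttest_ind(group_a, group_b, equal_var=False)

print(f"one-sample: t = {one_sample.statistic:.2f}, p = {one_sample.pvalue:.4f}")
print(f"two-sample: t = {two_sample.statistic:.2f}, p = {two_sample.pvalue:.4f}")
```

A paired test (stats.ttest_rel) would be the choice when the two columns are matched observations on the same subjects.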

Compare: z-Tests vs. t-Tests—both test hypotheses about means, but z-tests require known population variance (rare) while t-tests estimate it from data (common). On exams, if you're given $\sigma$, use z; if you're given $s$ or raw data, use t. When in doubt, the t-test is almost always the safer choice.


Comparing Groups: Testing for Differences

When your research question involves comparing outcomes across multiple groups or categories, these techniques help determine whether observed differences reflect real population differences or just sampling variability.

Analysis of Variance (ANOVA)

  • Comparing three or more means—extends the logic of t-tests to multiple groups simultaneously, avoiding the inflated Type I error rate from multiple pairwise t-tests
  • F-statistic logic—compares between-group variance to within-group variance; large F values suggest group means differ more than random chance would predict
  • Assumptions required—independence, normality within groups, and equal variances (homogeneity); violations matter more with unequal group sizes
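
Here's a brief one-way ANOVA sketch with SciPy; the three groups of scores are invented for illustration:

```python
import numpy as np
from scipy import stats

# Hypothetical test scores under three teaching methods (made up)
method_1 = np.array([78, 85, 82, 90, 76, 88])
method_2 = np.array([72, 70, 81, 74, 69, 77])
method_3 = np.array([85, 91, 88, 94, 83, 90])

# One-way ANOVA: F compares between-group variance to within-group variance
f_stat, p_value = stats.f_oneway(method_1, method_2, method_3)
print(f"F = {f_stat:.2f}, p-value = {p_value:.4f}")
```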

Chi-Square Tests

  • Categorical data analysis—used when both variables are categorical, not quantitative; tests whether the distribution of one variable depends on the other
  • Observed vs. expected frequencies—test statistic $\chi^2 = \sum \frac{(O - E)^2}{E}$ measures how far observed counts deviate from what independence would predict
  • Two main applications—test of independence (are two categorical variables related?) and goodness-of-fit (does observed distribution match a hypothesized one?)
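
A short sketch of a chi-square test of independence with SciPy; the two-way table of counts is hypothetical:

```python
import numpy as np
from scipy import stats

# Hypothetical counts: rows = party affiliation, columns = opinion (made up)
observed = np.array([
    [45, 30],   # Party A: support, oppose
    [25, 50],   # Party B: support, oppose
])

# Compares observed counts to the counts expected if the variables were independent
chi2, p_value, dof, expected = stats.chi2_contingency(observed)
print(f"chi-square = {chi2:.2f}, df = {dof}, p-value = {p_value:.4f}")
```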

Compare: ANOVA vs. Chi-Square—both compare groups, but ANOVA tests differences in means of a quantitative response across categorical groups, while chi-square tests associations between categorical variables. Know your variable types: quantitative response → ANOVA; categorical response → chi-square.


Modeling Relationships: Prediction and Explanation

These techniques go beyond testing for differences to model how variables relate to each other, enabling both prediction of outcomes and understanding of underlying relationships.

Regression Analysis

  • Modeling dependence—quantifies how a response variable ($y$) changes as predictor variables ($x$) change; simple linear regression uses $\hat{y} = b_0 + b_1 x$
  • Inference on slope—testing whether $\beta_1 = 0$ determines if there's a significant linear relationship; confidence intervals for slope indicate precision of the estimated effect
  • Multiple regression extends this—includes several predictors to control for confounding variables and improve predictions; interpretation requires "holding other variables constant" language
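
For simple linear regression, here's a minimal sketch using SciPy's linregress; the hours/score pairs are made up for illustration:

```python
import numpy as np
from scipy import stats

# Hypothetical data: hours studied (x) vs. exam score (y)
hours = np.array([1, 2, 3, 4, 5, 6, 7, 8])
score = np.array([55, 61, 60, 68, 72, 75, 80, 84])

result = stats.linregress(hours, score)

# Fitted line y-hat = b0 + b1*x, plus a test of H0: beta_1 = 0
print(f"y-hat = {result.intercept:.2f} + {result.slope:.2f} x")
print(f"p-value for slope: {result.pvalue:.4g}")
```

Multiple regression would typically use a library such as statsmodels, but the slope-inference logic is the same.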

Compare: ANOVA vs. Regression—surprisingly similar mathematically! ANOVA is essentially regression with categorical predictors coded as indicator variables. The key difference is framing: ANOVA emphasizes group mean comparisons, while regression emphasizes the equation and prediction. FRQs may ask you to recognize when either approach applies.


Advanced Approaches: Beyond the Basics

These techniques represent more sophisticated approaches to inference that build on foundational concepts. While less commonly tested in intro courses, understanding their logic deepens your grasp of statistical thinking.

Maximum Likelihood Estimation

  • Optimization approach—finds parameter values that make the observed data most probable; asks "what parameter values would have been most likely to generate this sample?"
  • Asymptotic properties—estimates become unbiased and achieve minimum variance as sample size grows, making MLE the preferred method for large samples
  • Underlies many methods—logistic regression, many regression techniques, and advanced models all use MLE; understanding it connects seemingly different procedures
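
To see the "most likely to generate this sample" idea in action, here's a sketch that fits a coin-flip (Bernoulli) probability by numerically maximizing the likelihood; the data are invented, and for this model the numerical answer should match the sample proportion:

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Hypothetical data: 1 = success, 0 = failure (made up for illustration)
data = np.array([1, 0, 1, 1, 0, 1, 1, 1, 0, 1])

def neg_log_likelihood(p):
    # Bernoulli log-likelihood: sum of log P(observation | p), negated for minimization
    return -np.sum(data * np.log(p) + (1 - data) * np.log(1 - p))

# Find the p that makes the observed data most probable
result = minimize_scalar(neg_log_likelihood, bounds=(0.001, 0.999), method="bounded")

print(f"Numerical MLE of p: {result.x:.3f}")
print(f"Sample proportion:  {data.mean():.3f}")  # the analytic MLE for this model
```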

Bayesian Inference

  • Incorporates prior knowledge—combines what you believed before seeing data (prior) with the evidence from data (likelihood) to produce updated beliefs (posterior)
  • Bayes' theorem foundation—$P(\theta \mid \text{data}) \propto P(\text{data} \mid \theta) \times P(\theta)$; posterior probability is proportional to likelihood times prior
  • Different interpretation of probability—treats parameters as having probability distributions rather than fixed unknown values; confidence intervals become "credible intervals" with direct probability interpretations
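
A small conjugate-prior sketch makes the prior-to-posterior update concrete. Assuming a uniform Beta(1, 1) prior and hypothetical binomial data, the posterior is also a Beta distribution:

```python
from scipy import stats

# Hypothetical data: 7 successes in 10 trials (made up for illustration)
successes, trials = 7, 10

# Prior: Beta(1, 1), i.e., uniform over the possible proportions
prior_a, prior_b = 1, 1

# With a binomial likelihood and a Beta prior, the posterior is Beta (conjugacy)
post_a = prior_a + successes
post_b = prior_b + (trials - successes)

posterior = stats.beta(post_a, post_b)
lo, hi = posterior.ppf(0.025), posterior.ppf(0.975)

print(f"Posterior mean: {posterior.mean():.3f}")
print(f"95% credible interval: ({lo:.3f}, {hi:.3f})")
```

Unlike a confidence interval, the credible interval can be read directly as "there's a 95% probability the parameter lies in this range, given the data and prior."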

Compare: Maximum Likelihood vs. Bayesian Inference—both estimate parameters, but MLE asks "what parameter maximizes the probability of this data?" while Bayesian asks "what's the probability distribution of the parameter given this data?" MLE is frequentist (parameters are fixed), Bayesian treats parameters as random variables.


Quick Reference Table

| Concept | Best Examples |
| --- | --- |
| Estimating parameters | Point Estimation, Confidence Intervals, Maximum Likelihood |
| Testing one mean | z-Test (known $\sigma$), t-Test (unknown $\sigma$) |
| Comparing two means | Independent t-Test, Paired t-Test |
| Comparing 3+ means | ANOVA |
| Categorical associations | Chi-Square Test of Independence, Chi-Square Goodness-of-Fit |
| Modeling relationships | Simple Regression, Multiple Regression |
| Quantifying uncertainty | Confidence Intervals, Bayesian Credible Intervals |
| Decision-making framework | Hypothesis Testing, p-values, Significance Level |

Self-Check Questions

  1. You want to determine if average test scores differ across four teaching methods. Which technique should you use, and why would multiple t-tests be problematic?

  2. Compare confidence intervals and hypothesis testing: How are they related, and what does it mean when a 95% CI for a mean difference doesn't include zero?

  3. A researcher has categorical data on political party affiliation and opinion on a policy issue. Which technique tests whether these variables are associated, and what does the test statistic measure?

  4. When would you choose a t-test over a z-test for comparing a sample mean to a hypothesized value? What assumption about the population makes this distinction necessary?

  5. Explain how regression analysis and ANOVA are conceptually similar. If you had a quantitative response and a single categorical predictor with three levels, could you use either approach?