Hypothesis testing is the backbone of statistical inference—it's how statisticians move from sample data to conclusions about populations. On the AP Statistics exam, you're not just being tested on whether you can plug numbers into formulas. You're being tested on whether you understand the logic of inference: why we assume the null hypothesis is true, how we measure evidence against it, and what our conclusions actually mean in context. Every free-response question involving inference expects you to demonstrate this reasoning process.
The steps of hypothesis testing connect directly to core concepts like probability, sampling distributions, Type I and Type II errors, and the interpretation of p-values. Understanding each step—and why it matters—will help you tackle everything from one-sample z-tests to chi-square tests for independence. Don't just memorize the sequence; know what each step accomplishes and how errors at any stage can invalidate your conclusions.
Before any calculations happen, you need to establish what you're testing and what would count as convincing evidence. This foundation determines everything that follows—get it wrong, and your entire analysis falls apart.
Compare: Null hypothesis vs. Alternative hypothesis—both describe population parameters, but H₀ assumes no effect while Hₐ specifies the effect you're looking for. On FRQs, always define your parameters in context before writing the hypotheses.
The validity of your inference depends entirely on whether the conditions for the test are met. This is where many students lose points—skipping conditions or stating them without verification.
Compare: Conditions for proportions vs. conditions for means—both require randomness and independence, but proportions need the Large Counts condition while means rely on the Central Limit Theorem or population normality. If an FRQ asks you to "state and check conditions," you must do both.
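The numeric parts of the proportion conditions can be checked mechanically. Below is a minimal sketch (with hypothetical values—n = 120 from a population of 5,000, null proportion 0.5—not from any actual FRQ); randomness still has to be verified from the study design itself.

```python
def check_proportion_conditions(n, p0, population_size):
    """Map each checkable condition for a one-proportion z-test to True/False."""
    return {
        "10% condition: n <= 10% of population": n <= 0.10 * population_size,
        "Large Counts: n*p0 >= 10": n * p0 >= 10,
        "Large Counts: n*(1-p0) >= 10": n * (1 - p0) >= 10,
    }

# Hypothetical sample: n = 120 drawn from a population of 5,000, H0: p = 0.5
conditions = check_proportion_conditions(n=120, p0=0.5, population_size=5000)
for name, met in conditions.items():
    print(f"{name}: {'met' if met else 'NOT met'}")
```

On the exam you must still write the verification out in words—showing the arithmetic (e.g., 120 × 0.5 = 60 ≥ 10) is what earns the point.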
This is where the math happens—but remember, the calculations serve the reasoning. You're quantifying how surprising your sample result would be if H₀ were true.
Compare: P-value approach vs. Critical value approach—both lead to the same decision, but p-values tell you how much evidence you have while critical values just give you a yes/no answer. The AP exam strongly favors the p-value approach.
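To see that the two approaches always agree, here is a sketch of a two-sided one-proportion z-test with hypothetical numbers (280 successes in 500 trials against H₀: p = 0.5), using only the standard library:

```python
import math

def one_prop_z(phat, p0, n):
    """z statistic for a one-proportion test of H0: p = p0."""
    return (phat - p0) / math.sqrt(p0 * (1 - p0) / n)

def normal_cdf(x):
    """Standard normal CDF via the error function (no SciPy needed)."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

# Hypothetical sample: 280 successes out of 500, testing H0: p = 0.5
z = one_prop_z(phat=0.56, p0=0.50, n=500)
p_value = 2 * (1 - normal_cdf(abs(z)))  # p-value approach
reject_by_p = p_value < 0.05            # compare evidence to alpha
reject_by_critical = abs(z) > 1.96      # critical-value approach at alpha = 0.05
```

Both booleans come out the same, but only the p-value tells you *how strong* the evidence is—which is why the AP exam favors reporting it.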
The final steps connect your statistical results back to the real-world question. This is where you demonstrate understanding—not just calculation ability.
Compare: Statistical significance vs. Practical significance—a drug that lowers blood pressure by 0.5 mmHg might be statistically significant with a very small p-value but practically meaningless. FRQs increasingly ask you to address whether results matter in context, not just whether they're "significant."
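A large enough sample makes even a trivial effect statistically significant. The sketch below (hypothetical numbers: the same 0.5 mmHg drop with σ = 10 mmHg assumed known) runs a one-sample z-test at two sample sizes:

```python
import math

def z_and_p_for_mean(xbar, mu0, sigma, n):
    """One-sample z statistic and two-sided p-value for a mean (sigma known)."""
    z = (xbar - mu0) / (sigma / math.sqrt(n))
    p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p

# The same 0.5 mmHg drop (hypothetical sigma = 10 mmHg), two sample sizes:
_, p_small = z_and_p_for_mean(xbar=119.5, mu0=120.0, sigma=10, n=100)
_, p_huge  = z_and_p_for_mean(xbar=119.5, mu0=120.0, sigma=10, n=200_000)
```

The effect size never changes, yet the tiny p-value at n = 200,000 would lead you to reject H₀—statistical significance reflects sample size as much as it reflects the effect itself.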
| Concept | Key Steps/Elements |
|---|---|
| Setting up hypotheses | State H₀ and Hₐ using population parameters; define parameters in context |
| Significance level | Choose α (usually 0.05); understand α as the Type I error probability |
| Conditions for inference | Random, Independence (10% rule), Large Counts or Normality |
| Test statistic | z for proportions, t for means, χ² for categorical data |
| P-value interpretation | Probability of the observed result (or more extreme) if H₀ is true |
| Decision rule | Reject H₀ if p-value < α; fail to reject H₀ if p-value ≥ α |
| Type I vs. Type II error | Type I: reject a true H₀; Type II: fail to reject a false H₀ |
| Conclusion language | "Convincing evidence" for reject; "insufficient evidence" for fail to reject |
What is the difference between the null hypothesis and the alternative hypothesis, and why do we assume H₀ is true when calculating the p-value?
A student writes "We accept H₀ because the p-value is 0.12." Identify two errors in this statement and explain how to correct them.
Compare and contrast Type I and Type II errors: Which one does α control, and how does increasing sample size affect each?
Why must you check conditions before performing a hypothesis test, and what happens to your conclusions if the conditions aren't met?
An FRQ presents a hypothesis test with p-value = 0.001 and asks whether the result is practically significant. What additional information would you need to answer this question, and why might statistical significance not imply practical importance?