A significance test is a formal inference procedure that assumes a null hypothesis is true, calculates a test statistic and p-value from sample data, and compares that p-value to a significance level α to decide whether the evidence is strong enough to reject the null in favor of the alternative.
A significance test (also called a hypothesis test) is how you answer the question "is this result real, or could it just be random chance?" You start by assuming the boring explanation, the null hypothesis, is true. Then you measure how far your sample result falls from what the null predicts, using a standardized test statistic like z = (p̂ - p₀) / √(p₀(1-p₀)/n) for a proportion or t = (x̄ - μ₀) / (s/√n) for a mean, where p₀ and μ₀ are the values claimed by the null. From that statistic you get a p-value, the probability of seeing a result at least as extreme as yours if the null really were true.
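To see the proportion formula above in action, here is a minimal Python sketch using only the standard library. The function name `one_proportion_z` and the data (224 successes in 400 trials, testing H₀: p = 0.5 against Hₐ: p > 0.5) are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_z(p_hat, p0, n):
    """Return the z statistic for a test of H0: p = p0."""
    # Standard deviation of p-hat, computed with the null value p0
    sd_null = sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / sd_null

# Hypothetical sample: 224 successes out of 400
z = one_proportion_z(224 / 400, 0.5, 400)

# Ha: p > 0.5, so the p-value is the area to the right of z
p_value = 1 - NormalDist().cdf(z)

print(round(z, 2), round(p_value, 4))
```

Here z comes out to about 2.4 and the p-value to roughly 0.008, so a sample proportion this far above 0.5 would be rare if the null were true.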
The decision rule is simple. If the p-value ≤ α, reject H₀ because results that surprising rarely happen by chance alone. If the p-value > α, fail to reject H₀. Notice you never "accept" or "prove" anything. Every test follows the same four-part structure on the AP exam: state hypotheses and parameters, name the test and verify conditions (random sampling, the 10% condition, and an approximately normal sampling distribution), compute the test statistic and p-value, and write a conclusion in context that links the p-value comparison back to the alternative hypothesis.
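The decision rule is short enough to write as a function. This is only a sketch; the name `decide` and the default α = 0.05 are assumptions for illustration, and on the exam you still have to write the conclusion in context.

```python
def decide(p_value, alpha=0.05):
    """Apply the decision rule: reject H0 when the p-value is at or below alpha."""
    if p_value <= alpha:
        return "reject H0"
    # Never "accept H0": a large p-value only means the evidence is not convincing
    return "fail to reject H0"

print(decide(0.002))  # small p-value
print(decide(0.078))  # p-value above alpha
```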
Significance tests are the backbone of Units 6 and 7. Unit 6 covers tests for one proportion and the difference of two proportions (Topics 6.5, 6.10, 6.11), and Unit 7 covers tests for one mean, matched pairs, and the difference of two means (Topics 7.5, 7.8, 7.10). The learning objectives hit every stage of the process. You calculate test statistics and p-values (AP Stats 6.5.A, 7.5.A, 6.11.A), interpret p-values correctly (AP Stats 6.5.B, 7.5.B, 6.11.B), set up hypotheses and pick the right method (AP Stats 6.10.A, 6.10.B, 7.8.A, 7.8.B), verify conditions (AP Stats 6.10.C, 7.8.C), and justify a conclusion by comparing p to α (AP Stats 6.11.C, 7.5.C). One useful relief from the CED itself: the test statistic formulas are not on the formula sheet, but you don't need to memorize them. You can build each one from the general pattern (statistic minus null value, divided by the standard deviation of the statistic) using the standard error formulas that ARE on the sheet.
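Here is a rough sketch of how that general pattern builds the Unit 7 test statistics. The helper name `standardized` and all of the sample numbers are hypothetical; the standard errors used are s/√n for one mean and √(s₁²/n₁ + s₂²/n₂) for a difference of two means.

```python
from math import sqrt

def standardized(statistic, null_value, sd_of_statistic):
    """General pattern: (statistic - null value) / standard deviation of the statistic."""
    return (statistic - null_value) / sd_of_statistic

# One-sample t: hypothetical x-bar = 52, null mean 50, s = 8, n = 64
t_one = standardized(52, 50, 8 / sqrt(64))

# Two-sample t: hypothetical means 52 and 49, s1 = 8, s2 = 6, n1 = 64, n2 = 36
# The null value for the difference mu1 - mu2 is 0
se_diff = sqrt(8**2 / 64 + 6**2 / 36)
t_two = standardized(52 - 49, 0, se_diff)

print(round(t_one, 3), round(t_two, 3))
```

Turning a t statistic into a p-value needs the t distribution with the right degrees of freedom, which this sketch leaves to your calculator.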
p-Value (Unit 6)
The p-value is the engine of every significance test. It is calculated assuming the null is true, so a small p-value means your data would be rare in a world where H₀ holds. Misinterpreting it is the single most common point-loser in Topics 6.5, 6.11, and 7.5.
Null Hypothesis and Alternative Hypothesis (Units 6-7)
Every test starts here. The null states no effect or no difference (H₀: p = p₀, H₀: p₁ = p₂, H₀: μ₁ = μ₂), and the alternative is what the researcher suspects. The direction of Hₐ (<, >, or ≠) determines exactly how the p-value is computed.
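To make the link between the direction of Hₐ and the p-value concrete, here is a small sketch for a z statistic. The function name `p_value_from_z` and the labels "less", "greater", and "two-sided" are assumptions chosen for illustration.

```python
from statistics import NormalDist

def p_value_from_z(z, alternative):
    """Return the p-value for a z statistic given the direction of Ha."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if alternative == "less":       # Ha uses <, so take the left tail
        return std_normal.cdf(z)
    if alternative == "greater":    # Ha uses >, so take the right tail
        return 1 - std_normal.cdf(z)
    if alternative == "two-sided":  # Ha uses ≠, so take both tails
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")

# Same statistic, three different p-values
for direction in ("less", "greater", "two-sided"):
    print(direction, round(p_value_from_z(1.5, direction), 4))
```

The same z = 1.5 gives a large p-value for "less", a smaller one for "greater", and double the right-tail area for "two-sided", which is why writing Hₐ correctly matters before you compute anything.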
Confidence Interval (Units 6-7)
Significance tests and confidence intervals are two sides of the same inference coin. A test gives a yes/no decision about a claimed parameter value, while an interval gives a range of plausible values. A two-sided test at α = 0.05 and a 95% confidence interval will usually agree on whether the null value is plausible. For proportions the two can disagree in borderline cases, because the test uses p₀ in the standard deviation while the interval uses p̂.
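A quick sketch of that connection for one proportion, using made-up numbers (p̂ = 0.56, n = 400, H₀: p = 0.5). It runs the two-sided test and builds the 95% interval side by side so you can compare the two verdicts.

```python
from math import sqrt
from statistics import NormalDist

p_hat, p0, n = 0.56, 0.50, 400  # hypothetical sample and null value
std_normal = NormalDist()

# Two-sided z-test: the standard deviation uses the null value p0
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
p_value = 2 * (1 - std_normal.cdf(abs(z)))

# 95% confidence interval: the standard error uses p-hat
z_star = std_normal.inv_cdf(0.975)
margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
low, high = p_hat - margin, p_hat + margin

rejects_null = p_value <= 0.05
null_outside_interval = not (low <= p0 <= high)
print(rejects_null, null_outside_interval)
```

In this example both checks point the same way: the test rejects H₀ and the interval does not contain 0.5.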
10% Condition (Units 5-7)
Before running any test, you verify conditions, and independence is one of them. When sampling without replacement, you check that the sample is at most 10% of the population (n ≤ 10%N for each sample). Skipping condition checks costs points on inference FRQs.
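Condition checks are simple arithmetic, so they are easy to sketch in code. The function names below are hypothetical, and the large counts threshold of 10 is the usual AP convention rather than something stated in this section.

```python
def ten_percent_ok(n, population_size):
    """10% condition: the sample is at most 10% of the population."""
    # Written as 10n <= N to avoid floating-point rounding in 0.10 * N
    return 10 * n <= population_size

def large_counts_ok(n, p0, threshold=10):
    """Large counts check for a one-proportion z-test (threshold of 10 assumed)."""
    return n * p0 >= threshold and n * (1 - p0) >= threshold

# Hypothetical study: 400 students sampled from a district of 5000
print(ten_percent_ok(400, 5000), large_counts_ok(400, 0.5))
```

On an FRQ you would still show the actual numbers (for example, 400 ≤ 0.10 × 5000 = 500), not just say the condition is met.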
Significance testing shows up two ways. MCQs love interpretation traps. Practice questions repeatedly test whether you know that a p-value of 0.002 provides strong evidence against H₀ but does not "prove" the alternative is true, that a p-value of 0.078 at α = 0.05 means you fail to reject (not "accept") the null, and that the p-value is NOT "the chance the null is true." It is the chance of getting data this extreme assuming the null is true. On the FRQ side, a full inference test is nearly guaranteed somewhere in the free-response section. The 2021 FRQ Q2 about walking and cholesterol levels is typical exam framing where a study design feeds into an inference question. To earn full credit you need all four components: hypotheses with defined parameters, named procedure with verified conditions, correct test statistic and p-value, and a conclusion that compares p to α and answers the research question in context. Topic 7.10 (Skills Focus) exists specifically because the exam tests whether you can select, implement, and communicate the whole procedure, not just crunch numbers.
Significance tests and confidence intervals are both inference procedures built on the same sampling distributions, but they answer different questions. A significance test starts with a specific claim (like p = 0.6) and asks whether the data give convincing evidence against it. A confidence interval starts with no claim and estimates a range of plausible parameter values. Tests give a reject/fail-to-reject decision; intervals give an estimate with a margin of error. On the exam, read the prompt's verb. "Is there convincing evidence that..." means run a test, while "estimate the proportion..." means build an interval.
A significance test assumes the null hypothesis is true, then asks how surprising the sample data would be under that assumption.
The p-value is the probability of getting a result as extreme or more extreme than the observed one, computed assuming H₀ is true. It is not the probability that H₀ is true.
The formal decision rule compares the p-value to α. If p ≤ α, reject H₀; if p > α, fail to reject H₀. You never accept or prove the null.
Use a z-test for proportions (Unit 6) and a t-test for means (Unit 7). For two means, that's a two-sample t-test; for two proportions, a two-sample z-test using the pooled proportion p̂c.
Always verify conditions before testing: random samples or random assignment, the 10% condition when sampling without replacement, and an approximately normal sampling distribution (large counts for proportions, n > 30 or roughly normal data for means).
Test statistic formulas aren't on the formula sheet, but you can rebuild them from the general pattern: (sample statistic minus null value) divided by the standard deviation of the statistic.
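The two-sample z-test with a pooled proportion mentioned in the list above can be sketched like this. The function name `two_proportion_z` and the counts (60 of 100 versus 45 of 100) are invented for illustration, and the test shown is two-sided (Hₐ: p₁ ≠ p₂).

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z(x1, n1, x2, n2):
    """Return the z statistic for H0: p1 = p2 using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    # Pooled proportion: combine both samples, since H0 says they share one p
    p_c = (x1 + x2) / (n1 + n2)
    se = sqrt(p_c * (1 - p_c) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# Hypothetical counts: 60 successes of 100 in group 1, 45 of 100 in group 2
z = two_proportion_z(60, 100, 45, 100)
p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided
print(round(z, 3), round(p_value, 4))
```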
A significance test is a formal procedure for deciding whether sample data provide convincing evidence against a null hypothesis. You compute a test statistic and p-value assuming H₀ is true, then reject H₀ if the p-value is at or below the significance level α (commonly 0.05).
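A small p-value does not prove the alternative hypothesis. A p-value like 0.002 gives strong evidence against the null, but significance tests never prove anything. There's always a chance the result happened by random variation alone even though H₀ is true, and rejecting a true null in that situation is exactly what a Type I error is.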
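A small simulation can make the Type I error idea concrete. This is only a sketch under assumed settings (n = 100, true p = p₀ = 0.5, α = 0.05, 2000 simulated samples, a fixed seed); the function name `simulate_type_i_rate` is hypothetical. Every sample comes from a world where H₀ is true, so each rejection is a Type I error.

```python
import random
from math import sqrt
from statistics import NormalDist

def simulate_type_i_rate(n=100, p0=0.5, alpha=0.05, trials=2000, seed=1):
    """Estimate how often a two-sided z-test rejects H0 when H0 is actually true."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    sd_null = sqrt(p0 * (1 - p0) / n)
    rejections = 0
    for _ in range(trials):
        # Draw a sample where the true proportion equals the null value
        successes = sum(rng.random() < p0 for _ in range(n))
        z = (successes / n - p0) / sd_null
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
        if p_value <= alpha:
            rejections += 1  # a Type I error: H0 is true but was rejected
    return rejections / trials

rate = simulate_type_i_rate()
print(rate)
```

The rejection rate should land near α, though not exactly on it, because counts are discrete and the simulation itself varies from run to run.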
A significance test evaluates a specific claim about a parameter and ends with a reject/fail-to-reject decision, while a confidence interval estimates a range of plausible values for the parameter. A two-sided test at α = 0.05 and a 95% confidence interval will reach consistent conclusions about the null value.
When the p-value is greater than α, you fail to reject the null hypothesis. For example, with p = 0.078 and α = 0.05, you say the data do not provide convincing evidence for the alternative. You never say you "accept" H₀ or that H₀ is true.
You do not need to memorize the test statistic formulas. The CED explicitly says these formulas can be constructed from the general form (sample statistic minus null value, divided by the standard deviation of the statistic) plus the standard error formulas already on the AP formula sheet.