Two-tailed test

A two-tailed test is a significance test whose alternative hypothesis is "not equal to" (Hₐ: μ ≠ μ₀), so it detects evidence that the parameter differs from the null value in either direction, with the p-value computed using both tails of the sampling distribution.

Verified for the 2027 AP Statistics exam. Last updated June 2026.

What is a two-tailed test?

A two-tailed test is a significance test where the alternative hypothesis says "different from," not "greater than" or "less than." For a population mean, that means Hₐ: μ ≠ μ₀. You're not betting on a direction. You just want to know whether the true mean has moved away from the hypothesized value at all, up or down.

The "two tails" part shows up when you compute the p-value. You still calculate the test statistic the same way, t = (x̄ - μ₀)/(s/√n) with n - 1 degrees of freedom (AP Stats 7.5.A). But because extreme results in either direction count as evidence against H₀, the p-value is the probability in both tails of the t-distribution. Because the t-distribution is symmetric, that means doubling the one-tail probability beyond |t|. Same data, same t-score, but the two-tailed p-value is twice the one-tailed p-value (assuming the sample landed on the side the one-tailed alternative predicted). That doubling is exactly why the choice of alternative hypothesis matters so much.
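To make the calculation concrete, here is a minimal Python sketch that uses only the standard library. The summary numbers (x̄ = 52.1, μ₀ = 50, s = 5, n = 25) are made up for illustration, and the helper names t_pdf, upper_tail, and two_tailed_t_test are hypothetical. The tail area is approximated with Simpson's rule; on the exam you would get it from a calculator or a table instead.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def upper_tail(t_abs, df, steps=2000):
    """Approximate P(T >= t_abs) for t_abs >= 0 using Simpson's rule."""
    if t_abs == 0:
        return 0.5
    h = t_abs / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(t_abs, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # Area from 0 to t_abs is total * h / 3; the rest of the right half is the tail.
    return 0.5 - total * h / 3

def two_tailed_t_test(xbar, mu0, s, n):
    """Return (t, df, two-tailed p-value) for H0: mu = mu0 vs Ha: mu != mu0."""
    t = (xbar - mu0) / (s / math.sqrt(n))
    df = n - 1
    p = 2 * upper_tail(abs(t), df)  # double the one-tail area beyond |t|
    return t, df, p

# Made-up summary statistics, for illustration only.
t, df, p = two_tailed_t_test(xbar=52.1, mu0=50, s=5.0, n=25)
print(f"t = {t:.3f}, df = {df}, two-tailed p = {p:.4f}")
```

With these made-up numbers, t = 2.1 on 24 degrees of freedom, and the two-tailed p-value lands between 0.04 and 0.05.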

Why the two-tailed test matters in AP Statistics

This term lives in Topic 7.5 (Carrying Out a Test for a Population Mean) in Unit 7, and it touches all three learning objectives there. You need the t-statistic (AP Stats 7.5.A), a correct p-value interpretation that accounts for "as extreme or more extreme in either direction" (AP Stats 7.5.B), and a decision comparing p to α (AP Stats 7.5.C). The bigger payoff is that the one-tailed vs. two-tailed choice isn't a Topic 7.5 quirk. It applies to the z- and t-tests you run throughout Units 6 through 9, from proportions to slopes (chi-square tests are the exception, since they always use the right tail). Pick the wrong tail structure and your p-value, your decision, and your conclusion can all change. The choice is made by the research question, before you see the data, never after.

How the two-tailed test connects across the course

One-tailed test (Unit 7)

The one-tailed test is the directional sibling. If the research question asks whether the mean increased or decreased, you use one tail; if it asks whether the mean changed, you use two. With the same t-statistic, the two-tailed p-value is double the one-tailed p-value, which can be the difference between rejecting and failing to reject H₀.

Alternative Hypothesis (Units 6-7)

The alternative hypothesis is what makes a test two-tailed in the first place. The inequality sign in Hₐ (≠ versus > or <) is decided by the research question, and it tells you which region of the t-distribution counts as "evidence against H₀."

P-value (Units 6-7)

In a two-tailed test, "as extreme or more extreme" means extreme in both directions. So a p-value of 0.042 in a two-tailed test means there's a 4.2% chance of getting a sample mean at least this far from μ₀ on either side, assuming H₀ is true (DAT-3.E.1).
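As a rough formalization of that interpretation, the two-tailed p-value can be written as below. The symbols t_obs (the observed test statistic) and T_{n-1} (a random variable following the t-distribution with n - 1 degrees of freedom) are notation introduced here for illustration, not taken from the course description.

```latex
p\text{-value}
  = P\big(|T_{n-1}| \ge |t_{\text{obs}}| \;\big|\; H_0 \text{ true}\big)
  = 2\,P\big(T_{n-1} \ge |t_{\text{obs}}|\big)
```

The second equality relies on the t-distribution being symmetric about 0.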

Significance Level (Units 6-7)

The decision rule is the same regardless of tails. If p ≤ α, reject H₀; if p > α, fail to reject (DAT-3.F.1). But because two-tailed tests produce larger p-values for the same data, a result that's significant one-tailed at α = 0.05 may not be significant two-tailed.
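The decision rule can be sketched in a few lines of Python. The function name decide is hypothetical, and the one-tail area of 0.045 is an illustrative value, not a result from real data.

```python
def decide(p_value, alpha=0.05):
    """Apply the decision rule: reject H0 when p <= alpha."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"

one_tail_area = 0.045             # illustrative one-tailed p-value
two_tailed_p = 2 * one_tail_area  # same data, both tails counted

print("one-tailed:", decide(one_tail_area))
print("two-tailed:", decide(two_tailed_p))
```

The rule itself never changes; only the p-value fed into it does, which is why the same data can lead to different decisions under the two alternatives.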

Is the two-tailed test on the AP Statistics exam?

Multiple-choice questions love to hand you hypotheses like H₀: μ = 50 and Hₐ: μ ≠ 50 with a t-score and p-value, then ask you to spot a flawed conclusion. Two classic traps show up in practice questions on this exact setup. First, a p-value above α (like p = 0.09 with α = 0.05) means you fail to reject H₀, never that you've "proven the mean equals 50." Second, p = 0.042 < 0.05 does not mean you're "95% confident the mean differs from 50"; confidence language belongs to intervals, not tests. On FRQs, a full significance test asks you to state hypotheses (where the ≠ sign signals two-tailed), check conditions, compute t and the p-value, and write a conclusion in context. You can also be asked to read the research question and decide whether a one-tailed or two-tailed test fits. "Has the average changed?" means two-tailed; "has it increased?" means one-tailed.

Two-tailed test vs One-tailed test

Both use the same t-statistic; the difference is the alternative hypothesis and where the p-value comes from. A one-tailed test (Hₐ: μ > μ₀ or μ < μ₀) only counts extreme results in one direction, while a two-tailed test (Hₐ: μ ≠ μ₀) counts both, so its p-value is twice as large for the same data. The research question decides which one you use. "Did the new method change scores?" is two-tailed; "did it raise scores?" is one-tailed. You can't peek at the data and then pick the tail that gives a smaller p-value.
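A small Python sketch can show how one test statistic yields three different p-values depending on the alternative. It assumes you already have the upper-tail area P(T ≥ t) from a calculator or table; the function name p_value, the alternative labels, and the area 0.021 are all hypothetical choices for illustration.

```python
def p_value(upper_area, alternative):
    """p-value from the upper-tail area P(T >= t) for a symmetric distribution.

    alternative: "greater" (Ha: mu > mu0), "less" (Ha: mu < mu0),
                 or "two-sided" (Ha: mu != mu0).
    """
    if alternative == "greater":
        return upper_area
    if alternative == "less":
        return 1 - upper_area
    if alternative == "two-sided":
        # Double the smaller tail so both directions are counted.
        return 2 * min(upper_area, 1 - upper_area)
    raise ValueError(f"unknown alternative: {alternative}")

upper_area = 0.021  # illustrative area to the right of a positive t
for alt in ("greater", "less", "two-sided"):
    print(alt, round(p_value(upper_area, alt), 3))
```

The two-tailed value is double the one-tailed value only when the one-tailed alternative points the way the sample went ("greater" here). Against the "less" alternative, the same data give a p-value near 1.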

Key things to remember about the two-tailed test

  • A two-tailed test uses the alternative hypothesis Hₐ: μ ≠ μ₀, meaning you're looking for a difference in either direction, not just an increase or a decrease.

  • For the same data, the two-tailed p-value is double the one-tailed p-value, because extreme results on both sides of the distribution count against H₀.

  • The research question, not the data, determines whether the test is one-tailed or two-tailed; words like "changed" or "different" signal two-tailed.

  • The decision rule is unchanged: reject H₀ if p ≤ α, fail to reject if p > α, and never say you "proved" or "accepted" the null hypothesis.

  • The test statistic is the same either way, t = (x̄ - μ₀)/(s/√n) with n - 1 degrees of freedom; only the p-value calculation differs.

  • A correct p-value interpretation assumes H₀ is true and describes the chance of a result at least as extreme in either direction.

Frequently asked questions about the two-tailed test

What is a two-tailed test in AP Stats?

It's a significance test where the alternative hypothesis is "not equal to" (like Hₐ: μ ≠ 50), so evidence in either direction counts against the null. The p-value comes from both tails of the t-distribution, which usually means doubling the one-tail area.

How do I know whether to use a one-tailed or two-tailed test?

Read the research question. "Is the mean different/has it changed?" calls for a two-tailed test (≠); "is the mean greater/less than?" calls for a one-tailed test (> or <). You decide before looking at the data, based on what the question actually asks.

Does a two-tailed test give a bigger p-value than a one-tailed test?

Yes. As long as the sample result falls on the side the one-tailed alternative predicted, the two-tailed p-value for the same data is twice the one-tailed p-value, since the t-distribution is symmetric and you're adding the matching area in the other tail. That can flip your decision, like a one-tailed p of 0.045 becoming a two-tailed p of 0.09, which is no longer significant at α = 0.05.

If a two-tailed test gives p = 0.09 with α = 0.05, have we proven the mean equals the null value?

No. Failing to reject H₀ never proves H₀ is true; it just means the data didn't provide convincing evidence against it. The correct conclusion is "we fail to reject H₀," never "we proved μ = 50."

Does p < 0.05 in a two-tailed test mean we're 95% confident the mean is different?

No, that mixes up significance tests with confidence intervals. The right statement is that since p ≤ α, we reject H₀ and have convincing evidence the mean differs from the null value. "95% confident" language belongs to confidence intervals, not p-values.