Fiveable

📊AP Statistics Unit 6 Review

6.5 Interpreting p-Values

Written by the Fiveable Content Team • Last updated September 2025
Verified for the 2026 exam

What is a p-value?

The p-value of a significance test is the probability of obtaining a test statistic at least as extreme as the one observed, assuming the null hypothesis is true. In other words, it's the proportion of all possible samples of a given size that would produce a result at least as extreme as ours (at or below it, at or above it, or at least as far from the null value in either direction, depending on the alternative hypothesis) if the null hypothesis were true.

It is used to help determine whether the observed results are statistically significant or not. 

  • If the p-value is small, a sample like ours would be unlikely to occur by chance alone if the null hypothesis were true, which provides evidence against the null hypothesis.
  • On the other hand, if the p-value is large, the observed sample is consistent with what we would expect to see by chance alone when the null hypothesis is true, so it does not provide strong evidence against the null hypothesis.

Here's how the College Board defines p-values:

The p-value is the "proportion of values for the null distribution that are as extreme or more extreme than the observed value of the test statistic." This is

  1. The proportion at or above the observed value of the test statistic, if the alternative is >.
  2. The proportion at or below the observed value of the test statistic, if the alternative is <.
  3. The proportion less than or equal to the negative of the absolute value of the test statistic plus the proportion greater than or equal to the absolute value of the test statistic, if the alternative is ≠.
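The three cases in the College Board definition can be sketched in Python, assuming the null distribution is the standard normal curve (as in a z-test). The function name `p_value_from_z` and the labels "greater", "less", and "two-sided" are illustrative choices, not College Board notation.

```python
from statistics import NormalDist

def p_value_from_z(z, alternative):
    """Tail area of the standard normal null distribution beyond z.

    alternative: "greater" (Ha: >), "less" (Ha: <), or "two-sided" (Ha: ≠).
    These labels are hypothetical names chosen for this sketch.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if alternative == "greater":
        return 1 - std_normal.cdf(z)      # proportion at or above z
    if alternative == "less":
        return std_normal.cdf(z)          # proportion at or below z
    if alternative == "two-sided":
        # proportion at or below -|z| plus proportion at or above |z|
        return std_normal.cdf(-abs(z)) + (1 - std_normal.cdf(abs(z)))
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
```

For any single z, the "greater" and "less" p-values add up to 1, while the two-sided p-value counts both tails beyond |z|.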

How do we interpret a p-value?

If our p-value is low, this means that a sample like ours would be highly unlikely if the null hypothesis were true. This could be due to one of three things:

  1. Legitimate random chance. (I mean, someone wins the lottery every now and then, right?)
  2. Some form of sampling bias. (This is why we check that our sample is random before proceeding to a significance test!)
  3. Our hypothesized value in the null hypothesis is actually false. This is what we are checking with our significance test.

To rule out random chance, we could check 2 or 3 more random samples. If we get the same result each time, the consistency leads us to believe we are onto something (no one wins the lottery three times in a row). To rule out sampling bias, we make sure we have random samples. If both checks pass, the most plausible explanation is that the null hypothesis value is wrong.

Again, it's important to remember that the p-value is computed under the assumption that the null hypothesis is true. Therefore, when interpreting the p-value of a significance test, it's important to consider the context in which the test was conducted and the implications of the null hypothesis being true!

For example, in a one-sample proportion test, the null hypothesis states that the true population proportion is equal to a particular hypothesized value (for example, 0.5). If the p-value is small, it suggests that the observed sample proportion is significantly different from the hypothesized value, and provides evidence against the null hypothesis. In this case, you might conclude that there is convincing evidence that the true population proportion is different from the hypothesized value.

On the other hand, if the p-value is large, it suggests that the observed sample proportion is not significantly different from the hypothesized value, and does not provide strong evidence against the null hypothesis. In this case, you would conclude that there is insufficient evidence to reject the null hypothesis. That is not the same as concluding that the true population proportion is equal to the hypothesized value: a large p-value means the data are consistent with the null hypothesis, not that the null hypothesis has been shown to be true.

Example

In a recent issue of Sports Unlimited, Jackie reads that a right-handed hockey player scores on approximately 5% of their shots. To test this claim, Jackie watches 15 random hockey games and records 921 shots from random, right-handed hockey players. She finds that they scored on 60 of those shots. After calculating her z-score and p-value, she finds that her p-value is approximately 0.017. Interpret this p-value.

This p-value means that, if right-handed players really do score on 5% of their shots, approximately 1.7% of all possible samples of 921 shots from right-handed players would include at least 60 goals. The sample was random according to the given information, so there is no obvious sampling bias. It could be that Jackie just hit the jackpot and happened to watch players with an unusually high scoring percentage. She could check this by repeating the study a few times.

The other option is that the 5% isn't actually correct. Maybe the true percentage is a bit higher...
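Jackie's calculation can be checked with a short Python sketch. It assumes a one-sided alternative (Ha: p > 0.05), which the problem does not state explicitly but which is consistent with a p-value of about 0.017; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p0 = 0.05      # claimed scoring rate (null value)
n = 921        # shots observed
goals = 60     # shots that scored

p_hat = goals / n                     # sample proportion, about 0.065
se_null = sqrt(p0 * (1 - p0) / n)     # standard deviation of p_hat if H0 is true
z = (p_hat - p0) / se_null            # standardized test statistic
p_value = 1 - NormalDist().cdf(z)     # right-tail area (assumed Ha: p > 0.05)

print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

The z-statistic comes out a little above 2.1, and the right-tail area beyond it is roughly 0.017, in line with the value in the example.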

🎥 Watch: AP Stats - Inference: Hypothesis Tests for Proportions

Practice Problem

A political campaign is trying to determine whether the proportion of registered voters in their district who support their candidate is significantly different from the overall national proportion of 50%. They conduct a survey by randomly sampling 1000 registered voters in their district and ask whether they support their candidate. They find that 540 out of the 1000 respondents support their candidate.

a) Write the null and alternative hypotheses for this scenario.

b) After conducting a one-sample z-test to determine whether the proportion of registered voters in the district who support the candidate is significantly different from the national proportion of 50%, you find that the p-value for this one-sample z-test is approximately 0.011. Based on the results of the z-test and the p-value, what can the campaign conclude about the proportion of registered voters in the district who support their candidate? What are the limitations of this conclusion?

Answer

a) Null hypothesis: The proportion of registered voters in the district who support the candidate is equal to the national proportion of 50%.

H0: p = 0.50

Alternative hypothesis: The proportion of registered voters in the district who support the candidate is different from the national proportion of 50%.

Ha: p ≠ 0.50

b) Based on the results of the z-test, the campaign can conclude that there is convincing evidence that the proportion of registered voters in the district who support their candidate is different from the national proportion of 50%, because the p-value of 0.011 is smaller than the commonly used significance level of 0.05.

Because the sample proportion (0.54) is above 0.50, the data suggest that support in the district is higher than the national proportion. However, this conclusion has limitations. The p-value is computed under the assumption that the null hypothesis (p = 0.50) is true, so it is not the probability that the null hypothesis is true, and there is still a chance of a Type I error. A statistically significant result also doesn't tell us how large the difference is, and the conclusion applies only to the population that was randomly sampled: registered voters in this district.

Vocabulary

The following words are mentioned explicitly in the College Board Course and Exam Description for this topic.

Term | Definition
alternative hypothesis | The claim that contradicts the null hypothesis, representing what the researcher is trying to find evidence for.
null distribution | The probability distribution of the test statistic under the assumption that the null hypothesis is true.
null hypothesis | The initial claim or assumption being tested in a hypothesis test, typically stating that there is no effect or no difference.
one-sample proportion | A confidence interval or hypothesis test that estimates or tests a single population proportion based on data from one sample.
p-value | The probability of observing a test statistic as extreme as or more extreme than the one calculated from the sample data, assuming the null hypothesis is true.
population proportion | The true proportion or percentage of a characteristic in an entire population, typically denoted as p.
probability model | A mathematical framework that describes the probability distribution of outcomes under specified assumptions.
sample statistic | A numerical value calculated from sample data that is used to estimate the corresponding population parameter.
significance level | The threshold probability (α) used to determine whether to reject the null hypothesis in a significance test.
significance test | A statistical procedure used to determine whether there is sufficient evidence to reject the null hypothesis based on sample data.
standard error | The standard deviation of a sampling distribution, which measures the variability of a sample statistic across repeated samples.
test statistic | A calculated value used to determine whether to reject the null hypothesis in a hypothesis test, computed from sample data.
theoretical distribution | A probability distribution based on a mathematical model, such as the normal distribution, used to approximate the distribution of a test statistic.
z-statistic | A standardized test statistic for a population proportion calculated as (sample statistic - null value) divided by the standard deviation of the statistic.
z-test | A hypothesis test that uses the standard normal distribution to determine whether a sample statistic differs significantly from a population parameter.

Frequently Asked Questions

What is a p-value and how do I calculate it for a proportion test?

A p-value is the probability, assuming the null hypothesis and the probability model are true, of getting a test statistic as extreme or more extreme than what you observed (so it measures how surprising the data are under H0). For a one-sample proportion z-test use the standardized statistic z = (p̂ − p0) / sqrt[ p0(1 − p0) / n ], where p̂ is your sample proportion and p0 is the null proportion (CED VAR-6.G.3). Then get the p-value from the standard normal (z) null distribution:

  • If Ha: p > p0, p-value = P(Z ≥ zobs)
  • If Ha: p < p0, p-value = P(Z ≤ zobs)
  • If Ha: p ≠ p0, p-value = P(|Z| ≥ |zobs|) (two-tailed)

Check conditions (approximately normal: np0 and n(1 − p0) large enough, or use randomization). Compare the p-value to α to decide (p ≤ α → reject H0). More practice and topic guidance on interpreting p-values: Fiveable study guide (https://library.fiveable.me/ap-statistics/unit-6/interpreting-p-values/study-guide/b0FEXf5MDjyQtz4Skz70) and unit overview (https://library.fiveable.me/ap-statistics/unit-6). For lots of practice problems, see (https://library.fiveable.me/practice/ap-statistics).

How do I know if my p-value means I reject or fail to reject the null hypothesis?

Compare the p-value to your chosen significance level α. If p ≤ α, reject H0 (the result is statistically significant); if p > α, fail to reject H0 (not enough evidence). Remember what the p-value actually is: it's the probability, assuming the null hypothesis is true, of getting a test statistic as extreme or more extreme than observed (CED DAT-3.A.1–2). How you find it depends on the alternative:

  • Ha: p > p0 → p-value = proportion at or above the observed test statistic
  • Ha: p < p0 → p-value = proportion at or below the observed test statistic
  • Ha: p ≠ p0 → p-value = tail area ≤ −|z| plus tail area ≥ |z|

Significance level α may be given or chosen by the researcher; common choices are 0.05 or 0.01. For more practice and worked examples see the Topic 6.5 study guide (https://library.fiveable.me/ap-statistics/unit-6/interpreting-p-values/study-guide/b0FEXf5MDjyQtz4Skz70) and more problems at the Unit 6 page (https://library.fiveable.me/ap-statistics/unit-6) or the practice bank (https://library.fiveable.me/practice/ap-statistics).

What's the formula for the z-test statistic when testing population proportions?

For a one-sample z-test for a population proportion, the test statistic is the standardized value z = (p̂ − p0) / sqrt( p0(1 − p0) / n ), where p̂ is your sample proportion, p0 is the null value of the population proportion, and n is the sample size. This follows the general test-statistic form (statistic − parameter) / (SD of statistic) from the CED (VAR-6.G.2–6.G.3). Remember: when you perform the test you compute the p-value from the standard normal null distribution and interpret it as the probability of seeing a test statistic as extreme (or more) assuming H0 is true (DAT-3.A.1–A.2). The AP exam doesn’t give these test-statistic formulas on the formula sheet, but you can build them from the provided standard-error formulas. For review and practice, see the Topic 6.5 study guide (https://library.fiveable.me/ap-statistics/unit-6/interpreting-p-values/study-guide/b0FEXf5MDjyQtz4Skz70), the Unit 6 overview (https://library.fiveable.me/ap-statistics/unit-6), and practice problems (https://library.fiveable.me/practice/ap-statistics).

I'm confused about interpreting p-values - does a small p-value mean my null hypothesis is true or false?

Short answer: a small p-value is evidence against the null hypothesis, but it doesn't prove the null is false. Why: by definition the p-value is the probability of getting a test statistic as extreme (or more) as what you observed assuming the null hypothesis and the probability model are true (CED VAR-6.G.1 & DAT-3.A.2). So a small p-value (e.g., 0.01) means your observed result would be very unlikely if H0 were true. That's why you "reject H0" when p ≤ α: the data are inconsistent with H0 at that significance level. But remember: rejection is probabilistic, not proof. A small p-value means data like yours would be unusual if H0 were true, not impossible, so you can still make a Type I error (false positive). Also, a large p-value doesn't prove H0 true; it just means the data don't provide strong evidence against H0. For practice and AP-style wording, see the Topic 6.5 study guide (https://library.fiveable.me/ap-statistics/unit-6/interpreting-p-values/study-guide/b0FEXf5MDjyQtz4Skz70) and try problems at (https://library.fiveable.me/practice/ap-statistics).

When do I use a one-tailed vs two-tailed test and how does that change my p-value calculation?

Use a one-tailed test when your alternative hypothesis is directional (Ha: p > p0 or Ha: p < p0). Use a two-tailed test when you just want to detect any difference (Ha: p ≠ p0).

Calculation difference: compute the z-statistic the same way, z = (p̂ − p0) / √[p0(1−p0)/n]. Then get tail probabilities from the null (standard normal) distribution:

  • For Ha: p > p0, p-value = P(Z ≥ observed z), the area to the right.
  • For Ha: p < p0, p-value = P(Z ≤ observed z), the area to the left.
  • For Ha: p ≠ p0, p-value = 2 × P(Z ≥ |observed z|), both tails combined.

Example: if z = 2.1, one-tailed p ≈ 1 − Φ(2.1) ≈ 0.018, two-tailed p ≈ 2(0.018) ≈ 0.036. Remember AP expects you to state H0 and Ha, compute z using the null p0 (CED VAR-6.G.3), and interpret the p-value in context (DAT-3.A.2). For a quick review see the Topic 6.5 study guide (https://library.fiveable.me/ap-statistics/unit-6/interpreting-p-values/study-guide/b0FEXf5MDjyQtz4Skz70) and more practice problems (https://library.fiveable.me/practice/ap-statistics).
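The z = 2.1 example can be checked with Python's standard library, where `NormalDist().cdf` plays the role of Φ. This is only a quick numerical check, not an AP-required method.

```python
from statistics import NormalDist

z = 2.1
one_tailed = 1 - NormalDist().cdf(z)   # P(Z >= 2.1), the right-tail area
two_tailed = 2 * one_tailed            # both tails, using the symmetry of the normal curve

print(round(one_tailed, 3), round(two_tailed, 3))
```

Rounded to three decimal places, the two results are 0.018 and 0.036.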

How do I find the p-value on my calculator after I get the z-statistic?

On a graphing calculator (TI-83/84 family) use the normalcdf function with the standard normal (mean 0, sd 1). Let z be your observed z-statistic.

  • Left-tailed (Ha: p < p0): p-value = normalcdf(-1E99, z, 0, 1)
  • Right-tailed (Ha: p > p0): p-value = normalcdf(z, 1E99, 0, 1)
  • Two-tailed (Ha: p ≠ p0): p-value = 2 * normalcdf(z, 1E99, 0, 1) if z > 0 (or 2 * normalcdf(-1E99, z, 0, 1) if z < 0). Equivalently: p = 2 * (1 - Φ(|z|)).

Quick TI tip: normalcdf(a,b,0,1) where a = -1E99 and b = z gives Φ(z) (area left of z). Compare that p-value to α to decide whether to reject H0 (AP requirement: state conclusion in context). For help with examples and AP-style practice, see the Topic 6.5 study guide (https://library.fiveable.me/ap-statistics/unit-6/interpreting-p-values/study-guide/b0FEXf5MDjyQtz4Skz70) and extra practice (https://library.fiveable.me/practice/ap-statistics).
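If you want to double-check calculator output on a computer, a rough Python stand-in for normalcdf might look like the sketch below. The helper `normalcdf` is a hypothetical function written here to mirror the calculator's argument order, and z = -1.5 is a made-up test statistic.

```python
from statistics import NormalDist

def normalcdf(lower, upper, mean=0.0, sd=1.0):
    """Area under a normal curve between lower and upper (calculator-style argument order)."""
    dist = NormalDist(mean, sd)
    return dist.cdf(upper) - dist.cdf(lower)

BIG = 1e99   # same stand-in for infinity that is typed on the calculator

z = -1.5     # hypothetical observed z-statistic
left_tailed = normalcdf(-BIG, z)           # Ha: p < p0
right_tailed = normalcdf(z, BIG)           # Ha: p > p0
two_tailed = 2 * normalcdf(abs(z), BIG)    # Ha: p ≠ p0

print(left_tailed, right_tailed, two_tailed)
```

Using |z| in the two-tailed line avoids having to pick a different formula depending on the sign of z.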

What's the difference between the p-value and the significance level alpha?

The p-value is a probability computed from your null distribution that measures how extreme your observed test statistic is assuming H0 is true (VAR-6.G.4; DAT-3.A.1–2). It answers: “If the null were true, what's the chance of data this extreme?” Alpha (α) is a fixed significance level you choose before testing—a cutoff for deciding when results are too unlikely under H0. Practically: if p ≤ α, reject H0; if p > α, fail to reject H0. α also represents the maximum long-run probability of a Type I error (wrongly rejecting a true null). So p is data-driven and varies by sample; α is chosen by the researcher and does not change with the data. For AP-style questions, always state the p-value interpretation in context and compare p to α when justifying conclusions (see Topic 6.5 study guide (https://library.fiveable.me/ap-statistics/unit-6/interpreting-p-values/study-guide/b0FEXf5MDjyQtz4Skz70)). For more practice, try problems at (https://library.fiveable.me/practice/ap-statistics).

Can someone explain step by step how to interpret a p-value in context of the problem?

Step-by-step:

  1. State H0 and Ha in context (e.g., H0: p = 0.50; Ha: p > 0.50).
  2. Compute the test statistic (for proportions use z = (p̂ − p0)/√(p0(1−p0)/n)). Remember the CED says you can build this from the general test-statistic formula.
  3. Find the p-value: the probability, assuming H0 is true, of getting a test statistic as extreme or more extreme than observed. For Ha: ">" use area at or above z; for "<" use area at or below z; for "≠" double the tail. (CED DAT-3.A.1–A.2, VAR-6.G.4)
  4. Compare p-value to α: if p ≤ α, reject H0 (evidence for Ha); if p > α, fail to reject H0 (not enough evidence).
  5. Write a conclusion in context: state whether the data provide convincing statistical evidence and tie it to the population parameter. Example: "p = 0.02 < 0.05, so there is convincing evidence that the true proportion is greater than 0.50."

For more practice and phrasing examples see the Topic 6.5 study guide (https://library.fiveable.me/ap-statistics/unit-6/interpreting-p-values/study-guide/b0FEXf5MDjyQtz4Skz70) and unit review (https://library.fiveable.me/ap-statistics/unit-6). For drills, try the practice problems (https://library.fiveable.me/practice/ap-statistics).
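Steps 2 through 4 can be sketched as one small Python function. The function name, the `alternative` labels, and the data (58 successes in 100 trials) are hypothetical choices for illustration; stating the hypotheses and writing the conclusion in context still has to be done in words.

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_z_test(successes, n, p0, alternative, alpha=0.05):
    """Return (z, p_value, decision) for a one-sample z-test for a proportion."""
    p_hat = successes / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)    # step 2: test statistic
    std_normal = NormalDist()
    if alternative == "greater":                  # step 3: tail area for Ha
        p_value = 1 - std_normal.cdf(z)
    elif alternative == "less":
        p_value = std_normal.cdf(z)
    elif alternative == "two-sided":
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    # step 4: compare the p-value to alpha
    decision = "reject H0" if p_value <= alpha else "fail to reject H0"
    return z, p_value, decision

# Hypothetical data: 58 successes in 100 trials, H0: p = 0.50, Ha: p > 0.50
z, p_value, decision = one_proportion_z_test(58, 100, 0.50, "greater")
print(f"z = {z:.2f}, p-value = {p_value:.3f}, decision: {decision}")
```

With these made-up numbers z is 1.6 and the p-value is about 0.055, so at α = 0.05 the decision is to fail to reject H0, even though the sample proportion (0.58) is above 0.50.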

I don't understand what "as extreme or more extreme" means when calculating p-values.

Think of "as extreme or more extreme" as looking at how far your observed test statistic lies from what the null hypothesis predicts, using the null distribution (randomization or theoretical z). If your alternative is ">", the p-value is the probability under H0 of getting a test statistic at least as large as the one you saw (the right tail). If the alternative is "<", it's the left tail (at or below). If the alternative is "≠", it's both tails: the probability of getting a value as far or farther from the null in either direction (≤ −|z| plus ≥ |z|).

Practically: compute z = (p̂ − p0)/√(p0(1−p0)/n), find where that z sits on the null (z) curve, then sum the tail area(s) beyond it. That tail area is the p-value used to compare to α for your AP-style conclusion (DAT-3.A). More practice: see the Topic 6.5 study guide (https://library.fiveable.me/ap-statistics/unit-6/interpreting-p-values/study-guide/b0FEXf5MDjyQtz4Skz70) and tons of practice problems (https://library.fiveable.me/practice/ap-statistics).
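A randomization-style null distribution makes "as extreme or more extreme" concrete. The sketch below simulates many samples assuming H0 is true and counts how often the simulated result is at least as large as the observed one. All of the numbers (p0 = 0.50, n = 100, 60 observed successes, 10,000 repetitions) are hypothetical.

```python
import random

random.seed(1)            # fixed seed so the run is repeatable

p0 = 0.50                 # hypothetical null proportion
n = 100                   # hypothetical sample size
observed_successes = 60   # hypothetical observed count, with Ha: p > 0.50
reps = 10_000             # number of simulated samples

# Simulate the null distribution of success counts, assuming H0 is true.
null_counts = [sum(random.random() < p0 for _ in range(n)) for _ in range(reps)]

# For Ha: >, "as extreme or more extreme" means at or above the observed count.
p_value_sim = sum(count >= observed_successes for count in null_counts) / reps
print(p_value_sim)
```

The simulated p-value is the fraction of null-distribution samples that land in the tail beyond the observed result; it will vary slightly with the seed and the number of repetitions.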

How do I write a proper conclusion using the p-value for a hypothesis test about proportions?

Write a short, specific conclusion in four parts: (1) answer the question in context, (2) compare p-value to α, (3) state the statistical decision (reject/fail to reject H0), and (4) interpret what that decision means about the population proportion, remembering the p-value assumes H0 is true. Templates you can copy:

  • Two-sided example: "Because p = 0.012 < α = 0.05, we reject H0. There is convincing statistical evidence that the population proportion differs from 0.30."
  • One-sided (greater) example: "Because p = 0.03 < α = 0.05, we reject H0. The data provide convincing evidence that the population proportion is greater than 0.30."
  • If p > α: "Because p = 0.18 > α = 0.05, we fail to reject H0. The data do not provide convincing evidence that the population proportion differs from 0.30."

Always tie the wording to the context (what p and p̂ refer to) and avoid saying "accept H0." For more practice and wording examples, see the Topic 6.5 study guide (https://library.fiveable.me/ap-statistics/unit-6/interpreting-p-values/study-guide/b0FEXf5MDjyQtz4Skz70), the Unit 6 overview (https://library.fiveable.me/ap-statistics/unit-6), and tons of practice problems (https://library.fiveable.me/practice/ap-statistics).

What does it mean that the p-value assumes the null hypothesis is true?

Saying the p-value "assumes the null is true" means you compute the probability of your observed (or more extreme) test statistic under the null distribution—that is, you act as if the population parameter equals the null value (e.g., p = p0) when you build the null/randomization distribution. For a one-sample z-test for a proportion you use z = (p̂ − p0) / sqrt[p0(1−p0)/n] (VAR-6.G.3) and find the tail area(s) of that distribution (VAR-6.G.1, DAT-3.A.1). The p-value is that tail probability; it’s NOT the probability the null hypothesis is true. Instead it’s the probability of the data (or more extreme) if the null model were correct. If p ≤ α you reject H0; if p > α you fail to reject (DAT-3.A.2). For a quick review of interpreting p-values on the AP, see the Topic 6.5 study guide (https://library.fiveable.me/ap-statistics/unit-6/interpreting-p-values/study-guide/b0FEXf5MDjyQtz4Skz70) and try practice problems (https://library.fiveable.me/practice/ap-statistics).

Why do I double my p-value for two-tailed tests but not one-tailed tests?

Because a p-value measures how extreme your test statistic is under H0, what “extreme” means depends on the alternative hypothesis. For a one-sided alternative (Ha: p > p0 or Ha: p < p0) you only care about one tail of the null distribution—the probability at or beyond the observed z in that one direction (CED DAT-3.A.1a–b). For a two-sided alternative (Ha: p ≠ p0) you care about extremes in both directions, so you add the probability in the tail on the observed side and the symmetric tail on the opposite side. That’s why you take the one-tail probability for |z| and double it (CED DAT-3.A.1c). On the AP exam, be explicit: state H0 and Ha, compute z = (p̂ − p0)/√(p0(1−p0)/n) (VAR-6.G.3), find one-tail area from the standard normal, and double it only when Ha is ≠ (see the Topic 6.5 study guide for examples: https://library.fiveable.me/ap-statistics/unit-6/interpreting-p-values/study-guide/b0FEXf5MDjyQtz4Skz70). For extra practice, try problems at (https://library.fiveable.me/practice/ap-statistics).

I keep getting confused about whether to use normalcdf or invNorm when finding p-values - which one do I use?

Use normalcdf to get probabilities (p-values); use invNorm to find cutoff z* or critical values for a given alpha. Steps for a one-sample z-test for a proportion (CED VAR-6.G.3):

  1. Compute z = (p̂ − p0) / sqrt[p0(1−p0)/n].
  2. Get the p-value with normalcdf:
     • Ha: p > p0 (right-tail) → p-value = normalcdf(z, +∞).
     • Ha: p < p0 (left-tail) → p-value = normalcdf(−∞, z).
     • Ha: p ≠ p0 (two-sided) → p-value = 2·normalcdf(|z|, +∞) (or 2·(1−Φ(|z|))).
  3. Use invNorm when you need the critical z for a significance level (e.g., z* = invNorm(1−α) for a right-tail test or invNorm(1−α/2) for two-sided).

On the calculator, enter 1E99 for +∞ and -1E99 for −∞. Remember AP rules: calculators are allowed for these (use normalcdf and invNorm), and your p-value interpretation must assume the null hypothesis is true (CED DAT-3.A.2). For practice, check the Topic 6.5 study guide (https://library.fiveable.me/ap-statistics/unit-6/interpreting-p-values/study-guide/b0FEXf5MDjyQtz4Skz70) and more problems at (https://library.fiveable.me/practice/ap-statistics).
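The same split shows up in Python's standard library: `NormalDist().cdf` takes a z-value and returns a probability (like normalcdf), while `NormalDist().inv_cdf` takes a probability and returns a z-value (like invNorm). The sketch below uses α = 0.05 and a made-up z of 2.0 purely for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist()   # mean 0, standard deviation 1
alpha = 0.05

# invNorm-style: probability in, z cutoff out (critical values)
z_star_right = std_normal.inv_cdf(1 - alpha)      # right-tail test, about 1.645
z_star_two = std_normal.inv_cdf(1 - alpha / 2)    # two-sided test, about 1.960

# normalcdf-style: z in, probability out (p-value)
z = 2.0                                           # hypothetical observed z-statistic
p_right = 1 - std_normal.cdf(z)                   # right-tail p-value, about 0.023

print(z_star_right, z_star_two, p_right)
```

Either route leads to the same decision for a right-tail test: z = 2.0 is beyond the critical value of about 1.645, and its p-value of about 0.023 is below α = 0.05.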

How do I know if my sample size is large enough to use the z-test for proportions?

Use the z-test for a one-sample proportion only when the null-model sampling distribution is approximately normal. For AP Stats (CED VAR-6), that means check the success–failure counts using the null proportion p0 (not p̂):

  • np0 ≥ 10
  • n(1 − p0) ≥ 10

If both hold, the test statistic z = (p̂ − p0) / sqrt[p0(1−p0)/n] is appropriate (VAR-6.G.3). If either count is < 10, the normal approximation may be poor, so use a randomization test or exact binomial methods instead (CED VAR-6.G.1). Remember: for hypothesis tests you always plug p0 into the standard error; for confidence intervals you use p̂. The AP exam expects you to verify these conditions before using a z-test (and you can build the formula from the formula sheet). For a focused review, see the Topic 6.5 study guide (https://library.fiveable.me/ap-statistics/unit-6/interpreting-p-values/study-guide/b0FEXf5MDjyQtz4Skz70) and more unit practice at (https://library.fiveable.me/ap-statistics/unit-6). For extra practice problems, go to (https://library.fiveable.me/practice/ap-statistics).
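The success–failure check is simple enough to sketch as a tiny Python helper. The function name `large_counts_ok` is hypothetical; the first call uses the hockey example from earlier in this guide (n = 921, p0 = 0.05), and the second uses a made-up small sample.

```python
def large_counts_ok(n, p0, threshold=10):
    """Check np0 >= threshold and n(1 - p0) >= threshold using the null proportion p0."""
    expected_successes = n * p0
    expected_failures = n * (1 - p0)
    return expected_successes >= threshold and expected_failures >= threshold

print(large_counts_ok(921, 0.05))   # hockey example: about 46 and 875 expected
print(large_counts_ok(40, 0.05))    # hypothetical small sample: only 2 expected successes
```

The hockey data pass the check, while the small hypothetical sample does not, so a z-test would not be appropriate for it.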

What's the connection between the test statistic formula and the standard error formula on the formula sheet?

Short answer: the test statistic formula is just the general “standardized” formula applied to the sample proportion using the standard error from the formula sheet. The CED’s rule (VAR-6.G.2) says test statistic = (sample statistic − null value) / (standard deviation or standard error of the statistic). For one-sample proportion tests you plug in the sampling SE that assumes H0 is true: SE = sqrt[p0(1 − p0)/n], so z = (p̂ − p0) / sqrt[p0(1 − p0)/n]. The AP formula sheet gives the general standardized test line and the sampling-distribution SE for proportions; you don’t have to memorize every test formula—you build it from that general form (CED VAR-6.G.2–G.3). For more review and worked examples see the Topic 6.5 study guide (https://library.fiveable.me/ap-statistics/unit-6/interpreting-p-values/study-guide/b0FEXf5MDjyQtz4Skz70) and plenty of practice problems (https://library.fiveable.me/practice/ap-statistics).