What is an Error?
No matter how carefully we design our test, perform our calculations, and follow the correct procedures, we are still prone to error in our tests. This doesn't necessarily mean we did something wrong in our sampling or our calculations; it means the sample itself led us to an incorrect conclusion. There is always a small random chance of drawing a rare sample that points us to the wrong answer, but there are ways we can minimize this effect.

In inferential statistics, there are two types of errors that can occur in a hypothesis test: a Type I error and a Type II error.
Type I Error
A Type I error occurs when we reject the null hypothesis when it is actually true. This error is also known as a "false positive." It happens when a low p-value leads us to reject Ho, but in reality we simply drew an extremely rare sample from our population. The probability of a Type I error is the significance level α: the probability that we will reject the null hypothesis when it is true. In general, we set α to a small value, such as 0.01 or 0.05, in order to keep the probability of a Type I error low.
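Because α is a long-run probability, a short simulation can make it concrete. Below is a minimal sketch, assuming a two-sided one-proportion z-test and a made-up scenario (Ho: p = 0.50 is true, n = 100); none of these numbers come from the text, and the simulated rejection rate should land near α:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(42)
p0, n, alpha, trials = 0.50, 100, 0.05, 100_000

# Sample repeatedly from a population where Ho really is true (p = 0.50).
p_hats = rng.binomial(n, p0, size=trials) / n

# Two-sided one-proportion z-test: reject Ho when |z| is past the critical value.
z = (p_hats - p0) / np.sqrt(p0 * (1 - p0) / n)
type_i_rate = np.mean(np.abs(z) > norm.ppf(1 - alpha / 2))

print(f"Simulated Type I error rate: {type_i_rate:.3f}")  # close to alpha = 0.05
```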
Type II Error
A Type II error occurs when we fail to reject the null hypothesis when it is actually false. This error is also known as a "false negative." It happens when the p-value is not low enough to reject Ho, even though Ho is not actually true and there should have been convincing evidence for Ha. The probability of a Type II error is β: the probability that we will fail to reject the null hypothesis when it is false.
Again, α = 0.05 is usually a good significance level: it is large enough to keep the probability of a Type II error down, while still requiring convincing evidence before we reject Ho. The probability of making a Type II error is β. This is easy to remember because the probabilities of a Type I and Type II error are α and β, respectively.
The probability of a Type II error decreases when any of the following occurs, provided the others do not change (the sketch after this list demonstrates each effect):
- Sample size(s) increases.
- Significance level (α) of a test increases.
- Standard error decreases.
- True parameter value is farther from the null.
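Here is a minimal sketch of those four effects for a right-tailed one-proportion z-test, using the normal approximation. All the numbers (p0 = 0.50, true p = 0.60, n = 100, α = 0.05) are hypothetical, and the standard-error effect shows up through the larger sample size (the standard error shrinks as n grows):

```python
from math import sqrt
from scipy.stats import norm

def beta(p0, p1, n, alpha):
    """P(Type II error) for Ho: p = p0 vs Ha: p > p0 when the true proportion is p1."""
    # Smallest sample proportion that would lead us to reject Ho.
    cutoff = p0 + norm.ppf(1 - alpha) * sqrt(p0 * (1 - p0) / n)
    # Chance the sample proportion still lands below the cutoff when Ha is true.
    return norm.cdf((cutoff - p1) / sqrt(p1 * (1 - p1) / n))

base = dict(p0=0.50, p1=0.60, n=100, alpha=0.05)
print(f"baseline:              beta = {beta(**base):.3f}")
print(f"larger sample (n=400): beta = {beta(**{**base, 'n': 400}):.3f}")    # beta drops
print(f"larger alpha (0.10):   beta = {beta(**{**base, 'alpha': 0.10}):.3f}")  # beta drops
print(f"truth farther (0.70):  beta = {beta(**{**base, 'p1': 0.70}):.3f}")  # beta drops
```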
Power
There are several ways to reduce the probability of Type I and Type II errors. One is to use a larger sample size, which increases the power of the test: the probability of correctly rejecting the null hypothesis when it is false. We can also choose a more stringent significance level, such as α = 0.01, in order to decrease the probability of a Type I error. However, this also increases the probability of a Type II error (β = 1 − power).
A significance level of 0.05 is usually a good middle ground: it keeps the risk of a Type I error small without inflating the risk of a Type II error.
The complement of β is known as the power of our test. Power is basically a way of saying how strong our test is, because it is the probability of NOT making a Type II error. We can increase our power by increasing our sample size. Remember, the larger our sample, the closer our estimate tends to be to the population parameter, so the less likely we are to make a mistake.
Test Pointers
Common AP exam questions about types of errors and power typically ask the following:
Identify Error
The first thing AP is likely to ask is how to identify either a Type I or Type II error. This is basically writing out the definitions above in the context of the given problem. Learn the definitions using the trick above, and this part is easy.
Consequence of Error
Past AP Statistics exams have also loved asking about the consequence(s) of an error. If we rejected a null hypothesis when we shouldn't have, what are the consequences (in context) of making such an error?
Increase Power
The last thing AP likes to ask about regarding errors and power is how we can increase power. The safest answer is to increase the sample size (increasing α also raises power, but at the cost of more Type I errors).
Example
In a recent study, a researcher was testing the claim that 85% of people are satisfied with their current personal reading goals and achievements. The researcher has reason to believe that this proportion is lower and that people are actually not happy with their personal reading plans and need a new library to borrow from. Therefore, the researcher tests the following hypotheses to see if opening a new public library would help people reach their personal reading goals:
Ho: p = 0.85
Ha: p < 0.85
a) Describe a Type II error in the context of the problem. Also, list a consequence of making this type of error.
If the researcher makes a Type II error in this problem, he/she has failed to reject Ho when in fact it should have been rejected. This means that the researcher concluded there was not convincing evidence that the true population proportion is less than 0.85 when, in fact, it is less than 0.85. A consequence of this error is that people will likely remain largely unhappy with their reading achievement when a new library may have helped them reach their reading goals.
b) What can the researcher do to increase the power of this test?
The researcher can increase the power of this test, and therefore decrease the probability of making a Type II error, by increasing the sample size used in the study.
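To make this concrete, here is a sketch of the power calculation for the researcher's left-tailed test (Ho: p = 0.85, Ha: p < 0.85). The sample sizes and the assumed true proportion (0.78) are hypothetical; the problem does not give them:

```python
from math import sqrt
from scipy.stats import norm

def power_left_tailed(p0, p_true, n, alpha=0.05):
    """P(correctly rejecting Ho: p = p0 against Ha: p < p0) when p_true < p0."""
    # Largest sample proportion that still leads us to reject Ho.
    cutoff = p0 + norm.ppf(alpha) * sqrt(p0 * (1 - p0) / n)  # ppf(alpha) is negative
    return norm.cdf((cutoff - p_true) / sqrt(p_true * (1 - p_true) / n))

for n in (200, 400):
    print(f"n = {n}: power = {power_left_tailed(0.85, 0.78, n):.3f}")
# Doubling the sample size raises the power (and lowers beta = 1 - power).
```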
🎥 Watch: AP Stats - Inference: Errors and Powers of Test
Vocabulary
The following words are mentioned explicitly in the College Board Course and Exam Description for this topic.
| Term | Definition |
|---|---|
| null hypothesis | The initial claim or assumption being tested in a hypothesis test, typically stating that there is no effect or no difference. |
| parameter | A numerical summary that describes a characteristic of an entire population. |
| power of a test | The probability that a statistical test will correctly reject a false null hypothesis. |
| sample size | The number of observations or data points collected in a sample, denoted as n. |
| significance level | The threshold probability (α) used to determine whether to reject the null hypothesis in a significance test. |
| standard error | The standard deviation of a sampling distribution, which measures the variability of a sample statistic across repeated samples. |
| Type I error | An error that occurs when a null hypothesis is rejected when it is actually true; the probability of committing this error is equal to the significance level (α). |
| Type II error | An error that occurs when a null hypothesis is not rejected when it is actually false. |
Frequently Asked Questions
What's the difference between Type I and Type II errors?
Type I vs. Type II error—short version: a Type I error (false positive) happens when the null hypothesis is actually true but you reject it. Its probability is the significance level α (e.g., α = 0.05 means a 5% chance of a Type I error). A Type II error (false negative) happens when the null is false but you fail to reject it; its probability is β, and power = 1 − β (the chance your test correctly rejects a false null). Things that lower β (so increase power): larger sample size, larger α, smaller standard error, or the true parameter being farther from the null. Which error matters more depends on context—choose α based on how bad a false positive would be (CED UNC-5). For a quick review see the Topic 6.7 study guide (https://library.fiveable.me/ap-statistics/unit-6/potential-errors-when-performing-tests/study-guide/YxhrmoLje3YYOwcocJrS) and more Unit 6 resources (https://library.fiveable.me/ap-statistics/unit-6). For practice, try problems at (https://library.fiveable.me/practice/ap-statistics).
How do I remember which error is which - Type I or Type II?
Quick rule: Type I = false positive (reject H0 when H0 is true). Type II = false negative (fail to reject H0 when H0 is false). Mnemonics that help:
- “I” in Type I → I reject the null (but it was true). Think “false alarm” or “innocent convicted.”
- Type II → II looks like two eyes closed → you missed something (failed to detect a real effect).
Link to AP CED facts: α is the probability of a Type I error (your chosen significance level). Power = 1 − β, so β is the probability of a Type II error. Remember how to change them: bigger sample size or larger α lowers β (so increases power). On the exam, always compare p-value to α (if p < α you reject—risk of Type I is α). For a quick AP review, check the Topic 6.7 study guide (https://library.fiveable.me/ap-statistics/unit-6/potential-errors-when-performing-tests/study-guide/YxhrmoLje3YYOwcocJrS) and try practice questions at (https://library.fiveable.me/practice/ap-statistics).
When do I use alpha vs beta in hypothesis testing?
Use α when you're deciding how willing you are to make a Type I error (reject H0 when it’s true). α is the significance level (common choices: 0.05 or 0.01). On the AP exam you compare the p-value to α: if p ≤ α, reject H0; if p > α, fail to reject H0. This ties to UNC-5.B.1 and UNC-5.D.2 in the CED. Use β when you’re thinking about the chance of a Type II error (failing to reject a false H0). β isn’t something you “compare” to a p-value during the test—it’s used when planning studies and computing power (power = 1 − β). β depends on sample size, true effect size (distance from null), standard error, and α (UNC-5.C.1). Practical rule: pick α based on how costly a false positive would be; then, when planning sample size, evaluate β (or power) to ensure the test can detect the effect you care about. For more AP-aligned review, see the Topic 6.7 study guide (https://library.fiveable.me/ap-statistics/unit-6/potential-errors-when-performing-tests/study-guide/YxhrmoLje3YYOwcocJrS) and Unit 6 overview (https://library.fiveable.me/ap-statistics/unit-6). For practice, try problems at (https://library.fiveable.me/practice/ap-statistics).
What's the formula for calculating the probability of a Type II error?
Type II error probability is usually called β and equals 1 − power. In words: β = P(fail to reject H0 | the alternative (true) parameter is p1). For a one-sample z-test for a proportion (H0: p = p0, Ha: p > p0) you can compute β with the normal approximation. Let zα be the critical z for α and n the sample size. The critical sample proportion is
phat_c = p0 + zα·√[p0(1 − p0)/n].
Then
β = P(phat ≤ phat_c | p = p1) = Φ( (phat_c − p1) / √[p1(1 − p1)/n] ),
so explicitly
β = Φ( [p0 + zα·√(p0(1−p0)/n) − p1] / √(p1(1−p1)/n) ).
(For left- or two-sided tests the critical values change; same idea applies.) Remember on the AP you can also use the shortcut β = 1 − power and that β decreases when n increases, α increases, standard error decreases, or the true parameter moves farther from the null (CED UNC-5). For a focused review see the Topic 6.7 study guide (https://library.fiveable.me/ap-statistics/unit-6/potential-errors-when-performing-tests/study-guide/YxhrmoLje3YYOwcocJrS) and try practice problems (https://library.fiveable.me/practice/ap-statistics).
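As a sanity check, here is a short sketch that transcribes the closed form above into code and compares it with a simulation; the inputs (p0 = 0.50, p1 = 0.60, n = 100, α = 0.05) are made up:

```python
import numpy as np
from scipy.stats import norm

p0, p1, n, alpha = 0.50, 0.60, 100, 0.05
z_a = norm.ppf(1 - alpha)  # critical z for a right-tailed test

# Closed form from above: beta = Phi((phat_c - p1) / sqrt(p1(1-p1)/n)).
phat_c = p0 + z_a * np.sqrt(p0 * (1 - p0) / n)
beta_formula = norm.cdf((phat_c - p1) / np.sqrt(p1 * (1 - p1) / n))

# Monte Carlo check: sample under the alternative, count failures to reject.
rng = np.random.default_rng(0)
p_hats = rng.binomial(n, p1, size=200_000) / n
beta_sim = np.mean(p_hats <= phat_c)

print(f"formula: {beta_formula:.3f}, simulation: {beta_sim:.3f}")
# The two agree up to the error of the normal approximation.
```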
I'm confused about power - is it 1 minus Type II error probability?
Yes—power is 1 minus the Type II error probability. In AP terms: if β is the probability of a Type II error (failing to reject a false H0), then power = 1 − β, which is the probability the test correctly rejects a false null (UNC-5.B.2 and UNC-5.B.3). Remember α (the significance level) is the probability of a Type I error (rejecting a true H0, UNC-5.B.1). Also keep the trade-offs in mind: increasing sample size, increasing α, decreasing standard error, or having a true parameter farther from the null all raise power (i–iv in UNC-5.C.1). For an AP review, see the Topic 6.7 study guide (https://library.fiveable.me/ap-statistics/unit-6/potential-errors-when-performing-tests/study-guide/YxhrmoLje3YYOwcocJrS) and try practice problems (https://library.fiveable.me/practice/ap-statistics) to see power and β calculations in context.
How do I know if a Type I or Type II error is worse in a real situation?
Which error is “worse” depends on the real-world consequences—there’s no one-size answer. By definition: Type I (α) = false positive (reject H0 when it’s true); Type II (β) = false negative (fail to reject H0 when it’s false). Ask: which mistake causes bigger harm or cost?
- Examples: In medicine, a Type I (claim a drug works when it doesn’t) can expose patients to harm—so you’d keep α small. In airport security, a Type II (miss a real threat) is worse, so you’d tolerate a larger α to reduce β.
- Use the CED idea: α is the probability of a Type I error, so choose α based on consequences (UNC-5.B.1, UNC-5.D.2).
Remember the trade-off: decreasing α usually increases β, but you can reduce β by increasing sample size, reducing standard error, or increasing α (UNC-5.C.1). For AP exam prep, practice deciding which error is worse in context and justify your choice (see the Topic 6.7 study guide for examples) (https://library.fiveable.me/ap-statistics/unit-6/potential-errors-when-performing-tests/study-guide/YxhrmoLje3YYOwcocJrS). For more review or practice, check the Unit 6 overview (https://library.fiveable.me/ap-statistics/unit-6) and 1000+ practice problems (https://library.fiveable.me/practice/ap-statistics).
What does it mean when they say "reject a true null hypothesis"?
“Reject a true null hypothesis” means you concluded from your sample that there is an effect or difference when in fact the null hypothesis (no effect) is true. That mistake is a Type I error (a false positive). The CED calls this UNC-5.A.1—and the chance you make that mistake (assuming the null really is true) is the significance level α that you set before testing (common choices: 0.05 or 0.01). So if α = 0.05, you’ll wrongly reject a true H0 about 5% of the time in the long run. Whether that error matters depends on context (UNC-5.D.1)—e.g., false positives are worse in some settings (like approving a harmful drug). Remember the trade-off: lowering α reduces Type I errors but can raise Type II errors (β) and lower power (1 − β). For a quick review, see the Topic 6.7 study guide (https://library.fiveable.me/ap-statistics/unit-6/potential-errors-when-performing-tests/study-guide/YxhrmoLje3YYOwcocJrS) and try practice problems (https://library.fiveable.me/practice/ap-statistics).
Can someone explain false positive vs false negative in hypothesis testing?
Think of hypothesis testing like a medical screen:
- False positive = Type I error. The null hypothesis is actually true, but you reject it. (CED: UNC-5.A.1) The probability of this happening is the significance level α—common choices: 0.05, 0.01. Example: You conclude a new drug works when it really doesn’t.
- False negative = Type II error. The null is false, but you fail to reject it. (CED: UNC-5.A.2) Its probability is β, and the test’s power = 1 − β (CED: UNC-5.B.2, UNC-5.B.3). Example: You miss a real effect and say the drug doesn’t work.
Key trade-offs (CED: UNC-5.C): raising α lowers β (more chance to detect real effects but more false alarms). β decreases when sample size increases, standard error decreases, or the true effect is farther from the null. Which error matters more depends on context (CED: UNC-5.D.1–.2). For an AP review, see the Topic 6.7 study guide (https://library.fiveable.me/ap-statistics/unit-6/potential-errors-when-performing-tests/study-guide/YxhrmoLje3YYOwcocJrS) and practice problems (https://library.fiveable.me/practice/ap-statistics).
How does increasing sample size affect Type I and Type II error rates?
Increasing the sample size doesn't change the probability of a Type I error (α)—that’s set by your significance level (e.g., α = 0.05). But increasing n reduces the standard error (SE ∝ 1/√n), which makes your test statistic more likely to fall in the rejection region when the null is false. So larger n increases power and therefore decreases the probability of a Type II error (β = 1 − power). In short: α is controlled by your choice of significance level; increasing n lowers SE, increases power, and reduces β (you’re more likely to detect a true effect). Remember other factors affect β too: larger effect size (true parameter farther from H0) and larger α also reduce β (UNC-5.C in the CED). For a quick recap and AP-style examples, check the Topic 6.7 study guide (https://library.fiveable.me/ap-statistics/unit-6/potential-errors-when-performing-tests/study-guide/YxhrmoLje3YYOwcocJrS) and practice problems (https://library.fiveable.me/practice/ap-statistics).
What's the relationship between significance level and Type I error probability?
The significance level α is exactly the probability of making a Type I error—rejecting H0 when it’s actually true. So if you set α = 0.05, you’re accepting a 5% chance of a false positive (UNC-5.B.1). Increasing α (say from 0.01 to 0.10) raises that Type I error probability but also reduces the probability of a Type II error (increases power), all else equal (UNC-5.C.1). That trade-off means you pick α based on consequences: if a false positive is serious, choose a smaller α; if missing a real effect is worse, you might tolerate a larger α (UNC-5.D.2). For AP problems you should be able to identify α as P(Type I), explain how α and β (1 − power) move together, and note sample size, effect size, and SE also affect errors. For a quick review see the Topic 6.7 study guide (https://library.fiveable.me/ap-statistics/unit-6/potential-errors-when-performing-tests/study-guide/YxhrmoLje3YYOwcocJrS) and more Unit 6 resources (https://library.fiveable.me/ap-statistics/unit-6). For extra practice, try problems at (https://library.fiveable.me/practice/ap-statistics).
Why does decreasing alpha increase the chance of Type II error?
Alpha (α) sets how easy it is to reject H0—it’s the probability of a Type I error (false positive). If you decrease α (say from 0.05 to 0.01), you make the rejection region smaller: your critical value moves further into the tail, so sample results must be more extreme to reject H0. That reduces the test’s sensitivity to real effects, so when the null is actually false you’re more likely to fail to reject it—that’s a larger probability of a Type II error (β). Remember power = 1 − β, so lowering α (all else equal) lowers power and raises β. The CED lists this trade-off and other ways to reduce β: increase sample size, decrease standard error, or have a true parameter farther from the null (UNC-5.C). For an AP review, see the Topic 6.7 study guide (https://library.fiveable.me/ap-statistics/unit-6/potential-errors-when-performing-tests/study-guide/YxhrmoLje3YYOwcocJrS) and try practice problems (https://library.fiveable.me/practice/ap-statistics) to see the effect numerically.
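To see the effect numerically, here is a tiny sketch for a hypothetical right-tailed one-proportion test (Ho: p = 0.50, true p = 0.60, n = 100):

```python
from math import sqrt
from scipy.stats import norm

p0, p1, n = 0.50, 0.60, 100
for alpha in (0.10, 0.05, 0.01):
    # Smaller alpha pushes the rejection cutoff deeper into the tail...
    cutoff = p0 + norm.ppf(1 - alpha) * sqrt(p0 * (1 - p0) / n)
    # ...so more of the alternative's sampling distribution fails to reject.
    beta = norm.cdf((cutoff - p1) / sqrt(p1 * (1 - p1) / n))
    print(f"alpha = {alpha:.2f} -> beta = {beta:.3f}")  # beta rises as alpha falls
```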
I don't understand how to interpret power in context of a problem.
Power is the probability your test correctly rejects a false null hypothesis (CED: UNC-5.B.2). In context, say H0: p = 0.50 and the true p = 0.60; the power is the chance your sample produces a test statistic in the rejection region so you’ll conclude p ≠ 0.50 when it really is 0.60. Power = 1 − β, where β is the Type II error (CED: UNC-5.B.3). What changes power? (CED: UNC-5.C) Increasing sample size or α, decreasing standard error, or having a true value farther from the null all increase power. For example, with α = 0.05 and the same effect size (true p − null p = 0.10), a larger n raises the chance you’ll detect that 0.10 difference. When you write interpretations on the AP exam, state the parameter, the hypothesized value, the actual value under consideration, and say power is “the probability the test will reject H0 when the true parameter equals ….” For more practice and explanations, see the Topic 6.7 study guide (https://library.fiveable.me/ap-statistics/unit-6/potential-errors-when-performing-tests/study-guide/YxhrmoLje3YYOwcocJrS) and the Unit 6 overview (https://library.fiveable.me/ap-statistics/unit-6). For lots of practice problems, try (https://library.fiveable.me/practice/ap-statistics).
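A short sketch of that exact scenario (two-sided H0: p = 0.50, true p = 0.60, α = 0.05), with hypothetical sample sizes chosen to show how n drives power:

```python
from math import sqrt
from scipy.stats import norm

def power_two_sided(p0, p_true, n, alpha=0.05):
    """P(rejecting H0: p = p0) when the true proportion is p_true."""
    se0, se1 = sqrt(p0 * (1 - p0) / n), sqrt(p_true * (1 - p_true) / n)
    z = norm.ppf(1 - alpha / 2)
    lo, hi = p0 - z * se0, p0 + z * se0  # fail-to-reject region for the sample proportion
    return 1 - (norm.cdf((hi - p_true) / se1) - norm.cdf((lo - p_true) / se1))

for n in (100, 250, 500):
    print(f"n = {n}: power = {power_two_sided(0.50, 0.60, n):.3f}")  # power climbs with n
```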
How do I calculate power if I know the Type II error probability?
Power = 1 − β. Here β (Type II error probability) is the chance you fail to reject H0 when the alternative is true; power is the probability you correctly reject a false H0. So if you’re given β = 0.20, power = 1 − 0.20 = 0.80 (80%). Remember what affects β and therefore power (UNC-5.C): increasing sample size, increasing α, decreasing standard error, or having the true parameter farther from the null all decrease β and increase power. On the AP exam you might be asked to compute or compare power from a given β or describe how changing sample size or α changes β and power (Topic 6.7). For a quick refresher see the Topic 6.7 study guide (https://library.fiveable.me/ap-statistics/unit-6/potential-errors-when-performing-tests/study-guide/YxhrmoLje3YYOwcocJrS) and try practice problems (https://library.fiveable.me/practice/ap-statistics).
What factors make Type II errors more likely to happen?
Type II errors (β) happen when the null is false but you fail to reject it—i.e., low test power. They’re more likely when any of the following are true (these are the opposites of UNC-5.C.1):
- Small sample size—smaller n → larger standard error → lower power.
- Low significance level (α)—tightening α (e.g., from 0.05 to 0.01) reduces Type I risk but raises β.
- Large standard error / high variability—noisy data makes it hard to detect a real effect.
- True parameter is very close to the null value (small effect size)—tiny differences are harder to detect.
- Poor study design (nonrandom sampling, measurement error) that inflates variability or biases estimates.
Remember β = 1 − power, so anything that reduces power increases Type II risk. To reduce β: increase n, reduce variability, or accept a larger α if context allows. For AP review see the Topic 6.7 study guide (https://library.fiveable.me/ap-statistics/unit-6/potential-errors-when-performing-tests/study-guide/YxhrmoLje3YYOwcocJrS) and practice problems (https://library.fiveable.me/practice/ap-statistics).
When the true parameter is farther from the null value, what happens to error probabilities?
If the true parameter is farther from the null value (i.e., larger effect size), the test is more likely to detect that difference. Concretely: the probability of a Type II error (β) goes down and the power of the test (1 − β) goes up. The Type I error rate (α) is set by your significance level and doesn’t change just because the true parameter is farther away. Other factors that also reduce β (holding others fixed) are larger sample size and smaller standard error (CED UNC-5.C.1). This is why bigger effects + bigger n give you higher power. For more AP-aligned detail, see the Topic 6.7 study guide (Fiveable) here: (https://library.fiveable.me/ap-statistics/unit-6/potential-errors-when-performing-tests/study-guide/YxhrmoLje3YYOwcocJrS) and try practice problems at (https://library.fiveable.me/practice/ap-statistics).

