To carry out a two-sample t-test for the difference of two population means, calculate the t statistic by dividing the difference in sample means by the standard error, find the degrees of freedom with technology, get the p-value, and compare it to your significance level. If the p-value is at or below alpha, reject the null hypothesis and state your conclusion in context.
Why This Matters for the AP Statistics Exam
This topic is the calculation and conclusion stage of comparing two means, which shows up often in free-response questions that ask whether data give convincing evidence of a difference. When a question asks for convincing evidence, it is asking for a significance test, not just a description of the numbers. You need to identify the correct parameter and hypotheses, check conditions, calculate the test statistic and p-value, and then write a conclusion that links the p-value to the decision in context. Precise notation and clearly shown work are important for strong responses on both multiple-choice and free-response questions.

Key Takeaways
- The test statistic is t = ((x̄₁ - x̄₂) - (μ₁ - μ₂)) / √(s₁²/n₁ + s₂²/n₂), and under the null the (μ₁ - μ₂) term is 0.
- The standard error of the difference is √(s₁²/n₁ + s₂²/n₂).
- Degrees of freedom fall between the smaller of n1 - 1 and n2 - 1 and n1 + n2 - 2; technology gives a precise value.
- The p-value is computed by assuming the null is true, meaning the two population means are equal.
- Compare the p-value to alpha: if p ≤ alpha, reject H0; if p > alpha, fail to reject H0.
- Write down the t statistic, degrees of freedom, and p-value, and state your conclusion in context.
Calculating the Test Statistic
Once you have confirmed the conditions for a two-sample t-test are met, you can calculate the test statistic and p-value to decide whether the difference between the two means is statistically significant.
You are comparing one quantitative variable across two independent samples. The first step is finding the difference between the two sample means and dividing it by the standard error of the difference.
Degrees of Freedom
- Calculating by hand: take the smaller of the two sample sizes and subtract 1. This is the conservative approach, similar to what you did with a single sample in Unit 7.5.
- Using technology such as a graphing calculator: the degrees of freedom come with the output, and the value falls between the smaller of n1-1 and n2-1 and n1+n2-2.
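The two approaches above can be compared with a short sketch. It assumes that the technology value comes from the Welch-Satterthwaite approximation, which is what unpooled two-sample t procedures typically report. The function names and the summary statistics are hypothetical, chosen only for illustration.

```python
def conservative_df(n1, n2):
    """By-hand conservative value: the smaller sample size minus 1."""
    return min(n1, n2) - 1


def welch_df(s1, n1, s2, n2):
    """Welch-Satterthwaite approximation (assumed to match calculator output)."""
    v1 = s1**2 / n1  # estimated variance of the first sample mean
    v2 = s2**2 / n2  # estimated variance of the second sample mean
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))


# Hypothetical summary statistics, for illustration only.
s1, n1 = 4.2, 25
s2, n2 = 5.1, 30

low = conservative_df(n1, n2)   # lower bound on df
high = n1 + n2 - 2              # upper bound on df
df = welch_df(s1, n1, s2, n2)
print(f"conservative df = {low}, technology df = {df:.2f}, upper bound = {high}")
```

The approximated value always lands between the two bounds, which is a quick sanity check on calculator output.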
t Statistic
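The general test statistic formula is:
standardized test statistic = (statistic - parameter) / (standard error of the statistic)
For the difference of two population means, this becomes:
t = ((x̄₁ - x̄₂) - (μ₁ - μ₂)) / √(s₁²/n₁ + s₂²/n₂)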
The full form includes the hypothesized difference (μ₁-μ₂) in the numerator, but since the null hypothesis usually sets that difference to 0, the term drops out. You can build this from the general test statistic formula and the standard error formulas on the Formula Sheet, so you do not need to memorize it.
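As a minimal sketch of that calculation from summary statistics, the function below computes the standard error and the t statistic. The function name, its parameters, and the numbers passed to it are hypothetical; the hypothesized difference defaults to 0, as the null hypothesis usually specifies.

```python
import math


def two_sample_t(xbar1, s1, n1, xbar2, s2, n2, hypothesized_diff=0.0):
    """Return (t, standard_error) for a two-sample t-test from summary statistics."""
    # Standard error of the difference in sample means.
    se = math.sqrt(s1**2 / n1 + s2**2 / n2)
    # (statistic - parameter) / standard error
    t = ((xbar1 - xbar2) - hypothesized_diff) / se
    return t, se


# Hypothetical numbers picked so the arithmetic is easy to check by hand:
# s1^2/n1 = 9/9 = 1 and s2^2/n2 = 16/16 = 1, so SE = sqrt(2).
t, se = two_sample_t(10.0, 3.0, 9, 8.0, 4.0, 16)
print(f"SE = {se:.4f}, t = {t:.4f}")  # SE = 1.4142, t = 1.4142
```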
Calculating the P-Value
With your degrees of freedom and t statistic, you can use the table on the Formula Sheet. Find the row for your degrees of freedom, look across to find the t value closest to yours, and use the tail probability that matches.
A more exact approach is to run a two-sample t-test using technology such as a graphing calculator. You can either type in the summary statistics or enter the raw data into a list. The output gives you the t statistic, degrees of freedom, and p-value.
Remember that the p-value is computed by assuming the null hypothesis is true, which means assuming the two population means are equal.
On a free-response question, write down the t statistic, the degrees of freedom, and the p-value so your work is complete and clear.
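The table lookup described in this section can be sketched in a few lines. Note one deliberate difference: instead of picking the single closest table value, this sketch brackets the tail probability between the two neighboring columns, which is usually all a table can tell you. The row is a short excerpt for df = 10, and the names are hypothetical.

```python
# Excerpt of one t-table row (df = 10): (upper-tail probability, critical value),
# ordered by increasing critical value.
T_TABLE_DF10 = [(0.10, 1.372), (0.05, 1.812), (0.025, 2.228),
                (0.01, 2.764), (0.005, 3.169)]


def bracket_tail_probability(t, row):
    """Return (low, high) bounds on the upper-tail probability for t >= 0."""
    low, high = 0.0, 0.5  # a nonnegative t has upper-tail area at most 0.5
    for tail_prob, critical in row:
        if t >= critical:
            high = tail_prob  # t is at least this far out, so the area is no larger
        else:
            low = tail_prob   # t falls short of this column, so the area is larger
            break
    return low, high


low, high = bracket_tail_probability(2.5, T_TABLE_DF10)
print(f"one-sided p-value is between {low} and {high}")
print(f"two-sided p-value is between {2 * low} and {2 * high}")
```

For a two-sided alternative the bounds are doubled, as the second print statement shows.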
Making the Decision and Stating a Conclusion
Once you have your p-value, compare it to the significance level (often 0.05) to evaluate the null hypothesis.
- If the p-value is at or below alpha, reject H0. You have convincing evidence for the alternative hypothesis.
- If the p-value is greater than alpha, fail to reject H0. You do not have convincing evidence for the alternative.
Always state your conclusion in context. For a study comparing the mean number of green beans picked from two fields, a conclusion might read:
Since the p-value is essentially 0 and less than 0.05, we reject H0. We have convincing evidence that the true mean number of green beans picked from Field A differs from the true mean picked from Field B.
That conclusion works because it compares the p-value to the significance level, states the decision about H0, connects to the alternative hypothesis, and stays in the context of the problem.
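The decision rule and the four parts of a complete conclusion can be captured in a small helper. This is only an illustrative sketch: the function name and the wording template are hypothetical, and the p-value passed in is a made-up number, not a result from the green bean study.

```python
def write_conclusion(p_value, alpha, claim):
    """Compare p to alpha, state the decision about H0, and tie it to the context.

    `claim` is the alternative hypothesis written in the context of the problem.
    """
    if p_value <= alpha:
        return (f"Because the p-value ({p_value:.4g}) is at or below alpha ({alpha}), "
                f"we reject H0. We have convincing evidence that {claim}.")
    return (f"Because the p-value ({p_value:.4g}) is greater than alpha ({alpha}), "
            f"we fail to reject H0. We do not have convincing evidence that {claim}.")


# Hypothetical p-value, for illustration only.
print(write_conclusion(
    0.012, 0.05,
    "the true mean number of green beans picked differs between Field A and Field B",
))
```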
Worked Example: Comparing Recovery Times
Here is how the pieces fit together in a real comparison. In a study comparing mean recovery times for two surgical procedures to repair a torn ACL, one group had a sample size of 110 and the other had 100. The degrees of freedom fall between 99 (the smaller of 110 − 1 and 100 − 1) and 208 (110 + 100 − 2). Using technology, the degrees of freedom came out to about 207.18. With a test statistic of t ≈ 7.13, the one-sided p-value is the area greater than 7.13 for a t-distribution with df = 207.18; for a two-sided alternative, double that area. Either way, that very large t statistic gives a tiny p-value, which would lead you to reject the null and conclude there is convincing evidence of a difference in mean recovery times.
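To see how small that tail area is, the sketch below approximates the area to the right of t for a t-distribution using only the Python standard library. It is not how a calculator does it: it assumes t ≥ 0 and df ≥ 1, uses the substitution x = √df · tan θ to turn the tail into a finite integral, and applies Simpson's rule. The function name and the step count are hypothetical choices. The degrees-of-freedom bounds follow the rule stated earlier: the smaller of n1 - 1 and n2 - 1, up to n1 + n2 - 2.

```python
import math


def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) for t >= 0 and df >= 1 with Simpson's rule.

    After substituting x = sqrt(df) * tan(theta), the tail area equals a
    constant times the integral of cos(theta) ** (df - 1) from
    atan(t / sqrt(df)) to pi / 2.
    """
    if t < 0 or df < 1:
        raise ValueError("this sketch assumes t >= 0 and df >= 1")
    a = math.atan(t / math.sqrt(df))
    b = math.pi / 2
    h = (b - a) / steps  # steps must be even for Simpson's rule

    def g(theta):
        # Clamp at 0 to guard against a tiny negative cosine from rounding.
        return max(math.cos(theta), 0.0) ** (df - 1)

    total = g(a) + g(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(a + i * h)

    # Normalizing constant, computed with log-gamma to avoid overflow.
    log_k = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(math.pi))
    return math.exp(log_k) * total * h / 3


n1, n2 = 110, 100
df_low, df_high = min(n1, n2) - 1, n1 + n2 - 2  # bounds on the degrees of freedom
one_sided = t_upper_tail(7.13, 207.18)
two_sided = 2 * one_sided
print(f"df is between {df_low} and {df_high}")
print(f"one-sided p = {one_sided:.3g}, two-sided p = {two_sided:.3g}")
```

Both p-values come out far below any common significance level, which matches the decision to reject the null.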
How to Use This on the AP Statistics Exam
Free Response
- State hypotheses in terms of population parameters: H₀: μ₁ - μ₂ = 0 (or μ₁ = μ₂) and an alternative that matches the question.
- Name the procedure (two-sample t-test for a difference of means) and check conditions before calculating.
- Show the test statistic, degrees of freedom, and p-value.
- Compare the p-value to alpha with a numerical reference, such as "Because p < 0.05, we reject H0."
- End with a conclusion in context that connects back to the alternative hypothesis.
MCQ
- Be ready to identify the correct standard error, √(s₁²/n₁ + s₂²/n₂), and the correct test statistic.
- Know that under the null, the (μ₁ - μ₂) term equals 0.
- Recognize that degrees of freedom from technology fall between the smaller of n1 - 1 and n2 - 1 and n1 + n2 - 2.
Common Trap
- "Convincing evidence" signals a significance test, not just a description of the data.
- Rejecting H0 is not the same as proving the alternative; it means the data are unlikely under the null.
Common Misconceptions
- The two-sample t-test uses two independent samples. Do not confuse it with a matched-pairs setup, where you analyze differences as a single sample.
- The p-value is not the probability that the null hypothesis is true. It is computed assuming the null is true.
- Failing to reject H0 does not prove the means are equal. It only means you lacked convincing evidence of a difference.
- The hypotheses must be written with population parameters (μ₁ and μ₂), not sample statistics.
- A formal decision compares the p-value to alpha directly. Vague statements like "the p-value is small" without a comparison are not enough.
- Degrees of freedom for two samples are not simply n1-1. By hand you use the smaller sample size minus 1 as a conservative value, while technology gives a more exact number.
Vocabulary
The following words are mentioned explicitly in the College Board Course and Exam Description for this topic.

| Term | Definition |
|---|---|
| degrees of freedom | A parameter of the t-distribution that affects its shape; as degrees of freedom increase, the t-distribution approaches the normal distribution. |
| difference in sample means | The result of subtracting one sample mean from another sample mean, calculated as x̄₁ - x̄₂. |
| difference of population means | The difference between the mean values of two distinct populations, calculated as μ₁ - μ₂. |
| normal distribution | A probability distribution that is mound-shaped and symmetric, characterized by a population mean (μ) and population standard deviation (σ). |
| null hypothesis | The initial claim or assumption being tested in a hypothesis test, typically stating that there is no effect or no difference. |
| p-value | The probability of observing a test statistic as extreme as or more extreme than the one calculated from the sample data, assuming the null hypothesis is true. |
| population means | The average values of two distinct populations being compared, denoted as μ₁ and μ₂. |
| quantitative variable | A variable that is measured numerically and can take on a range of values, allowing for mathematical operations and statistical analysis. |
| randomized experiment | A study design where subjects are randomly assigned to treatment groups to establish cause-and-effect relationships. |
| reject the null hypothesis | The decision made when the p-value is less than or equal to the significance level, indicating sufficient evidence against the null hypothesis. |
| sampling distribution | The probability distribution of a sample statistic (such as a sample proportion) obtained from repeated sampling of a population. |
| significance level | The threshold probability (α) used to determine whether to reject the null hypothesis in a significance test. |
| significance test | A statistical procedure used to determine whether there is sufficient evidence to reject the null hypothesis based on sample data. |
| simple random sample | A sample selected from a population such that every possible sample of the same size has an equal chance of being chosen. |
| standard error | The standard deviation of a sampling distribution, which measures the variability of a sample statistic across repeated samples. |
| statistical reasoning | The logical process of using sample data and significance test results to draw conclusions about populations and answer research questions. |
| t-distribution | A probability distribution used when the population standard deviation is unknown and the sample standard deviation is used instead, characterized by heavier tails than the normal distribution. |
| test statistic | A calculated value used to determine whether to reject the null hypothesis in a hypothesis test, computed from sample data. |
| two-sample test | A significance test used to compare the means of two different populations based on sample data from each population. |
Frequently Asked Questions
What test is used for the difference of two population means?
Use a two-sample t-test when comparing the means of two populations using independent random samples or a randomized experiment and quantitative data.
What is the two-sample t-test statistic?
The test statistic is the difference in sample means minus the hypothesized difference (usually zero), divided by the standard error, which is based on the two sample standard deviations and sample sizes.
Do I need to memorize the two-sample t-test formula for AP Statistics?
No. The AP Statistics CED notes that test statistic formulas do not need to be memorized because they can be built from the general test statistic structure and formula sheet information.
How do I find degrees of freedom for a two-sample t-test?
Use technology for degrees of freedom when available. The degrees of freedom fall between the smaller of n1 - 1 and n2 - 1 and the value n1 + n2 - 2.
How do I interpret the p-value for a two-sample t-test?
The p-value is computed assuming the null hypothesis is true, usually that the two population means are equal. It gives the probability of getting a test statistic as extreme as or more extreme than the observed one by random chance alone if the null hypothesis is true.
How do I write the conclusion for a two-sample t-test?
Compare the p-value to alpha, reject or fail to reject the null hypothesis, and state the result in context of the two populations and the research question.