Most hypothesis tests you've seen so far compare two separate, unrelated groups. But what happens when the two measurements come from the same subjects or naturally linked pairs? That's where matched (or paired) samples come in. Instead of comparing two group means directly, you calculate the difference within each pair and then test whether those differences are significantly different from zero.

Characteristics of Matched Samples

Matched pairs involve two related samples where each observation in one sample has a direct counterpart in the other. Common examples include:

- Before-and-after measurements on the same person (weight before and after a diet program)
- Twin studies comparing a trait between identical twins raised in different environments
- Matched case-control studies where each case is paired with a similar control based on age, sex, or other factors

Because the observations are linked, the samples are dependent. This is the defining feature that separates matched pairs from independent samples.

The goal is to determine whether the mean difference between paired observations is significantly different from zero.

Null hypothesis ($H_0$): The population mean difference equals zero ($\mu_d = 0$). In other words, there's no real change or difference between the paired conditions.
Alternative hypothesis ($H_a$):
- Two-sided: $\mu_d \neq 0$
- One-sided: $\mu_d > 0$ or $\mu_d < 0$

Test Statistic for Matched Pairs

The first step is to compute a difference score for every pair. If you measured 15 people before and after a treatment, you'd get 15 difference values. Then you treat those differences as a single sample and run a one-sample t-test on them.

The formula for the test statistic is:

$t = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n}}$

- $\bar{d}$ = mean of the paired differences
- $\mu_d$ = hypothesized mean difference (almost always 0 under $H_0$)
- $s_d$ = standard deviation of the differences
- $n$ = number of pairs (not the total number of individual observations)

When $H_0$ is true and the differences are approximately normally distributed (or $n$ is large), this test statistic follows a t-distribution with $n - 1$ degrees of freedom.
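As a minimal sketch of the formula, the snippet below computes the difference scores and the t-statistic using only Python's standard library. The weights are made-up illustration data (not from any study), and the variable names `before`, `after`, and `diffs` are assumptions introduced here.

```python
import math
import statistics

# Hypothetical weights (lb) for 10 people before and after a diet program.
before = [200, 185, 170, 210, 195, 180, 175, 220, 190, 205]
after = [195, 183, 171, 204, 190, 178, 176, 212, 188, 199]

# One difference per pair ("after" minus "before").
diffs = [a - b for a, b in zip(after, before)]

n = len(diffs)                   # number of pairs, not the 2n individual values
d_bar = statistics.mean(diffs)   # mean of the paired differences
s_d = statistics.stdev(diffs)    # sample standard deviation (divides by n - 1)
mu_d = 0                         # hypothesized mean difference under H0

t_stat = (d_bar - mu_d) / (s_d / math.sqrt(n))
df = n - 1                       # degrees of freedom

print(f"d_bar = {d_bar:.2f}, s_d = {s_d:.3f}, t = {t_stat:.3f}, df = {df}")
```

Note that `statistics.stdev` uses the $n - 1$ divisor, which matches the sample standard deviation $s_d$ in the formula.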

Steps to Conduct a Paired t-Test

1. Calculate the difference $d_i$ for each pair (e.g., "after" minus "before"), using the same order of subtraction for every pair.
2. Find the mean of those differences ($\bar{d}$) and their standard deviation ($s_d$).
3. Plug into the formula above to get your t-statistic.
4. Determine your critical value from the t-table using $n - 1$ degrees of freedom, or calculate the p-value.
5. Decision rule: If the p-value is less than your significance level (commonly 0.05), reject $H_0$. Otherwise, fail to reject $H_0$.

This procedure is also called a dependent t-test or paired t-test.
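The steps above can be sketched as one function. This is an illustrative, standard-library-only version: the helper names `t_pdf`, `two_sided_p_value`, and `paired_t_test` are hypothetical, the p-value is approximated by numerically integrating the t density with Simpson's rule, and the data are made up. In practice a statistics library (for example SciPy's `scipy.stats.ttest_rel`) would normally handle this.

```python
import math
import statistics


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p_value(t_stat, df, steps=2000):
    """Approximate P(|T| >= |t_stat|) with Simpson's rule (steps must be even)."""
    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_t = total * h / 3          # area under the density on [0, |t|]
    return max(0.0, 1 - 2 * area_0_to_t)


def paired_t_test(before, after, alpha=0.05):
    """Two-sided paired t-test of H0: mu_d = 0. Returns (t, df, p, reject)."""
    if len(before) != len(after) or len(before) < 2:
        raise ValueError("need at least two complete pairs")
    # Step 1: one difference per pair ("after" minus "before").
    diffs = [a - b for a, b in zip(after, before)]
    # Step 2: mean and sample standard deviation of the differences.
    n = len(diffs)
    d_bar = statistics.mean(diffs)
    s_d = statistics.stdev(diffs)
    if s_d == 0:
        raise ValueError("all differences are identical; t is undefined")
    # Step 3: test statistic with hypothesized mean difference 0.
    t_stat = d_bar / (s_d / math.sqrt(n))
    # Step 4: p-value from the t-distribution with n - 1 degrees of freedom.
    df = n - 1
    p_value = two_sided_p_value(t_stat, df)
    # Step 5: decision rule.
    return t_stat, df, p_value, p_value < alpha


# Hypothetical weights (lb) before and after a diet program.
before = [200, 185, 170, 210, 195, 180, 175, 220, 190, 205]
after = [195, 183, 171, 204, 190, 178, 176, 212, 188, 199]

t_stat, df, p_value, reject = paired_t_test(before, after)
print(f"t = {t_stat:.3f}, df = {df}, p = {p_value:.4f}, reject H0: {reject}")
```

The sketch covers only the two-sided alternative. For a one-sided test you would halve the two-sided p-value when the sign of $t$ agrees with the direction stated in $H_a$.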

Matched Pairs vs. Independent Samples

Choosing the wrong test here is a common mistake on exams. The distinction comes down to whether the data points are linked.

Feature	Matched Pairs	Independent Samples
Relationship between samples	Dependent (paired observations)	Unrelated (no pairing)
What you analyze	Differences within each pair	Means of two separate groups
Degrees of freedom	$n - 1$ (where $n$ = number of pairs)	Based on both sample sizes and variances
Typical scenario	Same subjects measured twice, twins, matched controls	Comparing two distinct populations

When to use which: If there's a natural pairing or the same subjects appear in both samples, use matched pairs. If the two groups have no connection to each other, use independent samples.

Why Matched Pairs Can Be More Powerful

A matched pairs design controls for variability between subjects. Since each person serves as their own control, individual differences (like baseline fitness, age, or genetics) get canceled out when you take the difference. This reduces noise in your data, which makes it easier to detect a real effect if one exists. That's why paired designs often have greater statistical power than independent samples designs for the same number of subjects.
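To make the noise-reduction argument concrete, the sketch below analyzes the same made-up data two ways: once with the paired statistic and once with an unpooled (Welch-style) independent-samples statistic. The second calculation is shown for comparison only, since treating paired data as independent is the wrong analysis; the data and variable names are assumptions for illustration.

```python
import math
import statistics

# Hypothetical weights (lb) for the same 10 people before and after a program.
before = [200, 185, 170, 210, 195, 180, 175, 220, 190, 205]
after = [195, 183, 171, 204, 190, 178, 176, 212, 188, 199]
n = len(before)

# Paired analysis: between-person variability cancels inside each difference.
diffs = [a - b for a, b in zip(after, before)]
se_paired = statistics.stdev(diffs) / math.sqrt(n)
t_paired = statistics.mean(diffs) / se_paired

# Independent-samples statistic (unpooled variances), for comparison only:
# the standard error now carries all of the person-to-person spread.
se_indep = math.sqrt(statistics.variance(after) / n + statistics.variance(before) / n)
t_indep = (statistics.mean(after) - statistics.mean(before)) / se_indep

print(f"paired:      SE = {se_paired:.3f}, t = {t_paired:.3f}")
print(f"independent: SE = {se_indep:.3f}, t = {t_indep:.3f}")
```

Both statistics share the same numerator (the mean change), so the gap between them comes entirely from the standard error. In this example the subjects differ far more from one another than they change over time, which is exactly the situation where pairing helps most.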

The correlation between paired measurements reflects how tightly linked the two sets of scores are. A higher positive correlation means more of the between-subject variability is removed when you take the differences, which shrinks $s_d$ and further increases the precision of your test.
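
One way to see the role of correlation is the identity below. The notation is introduced here for illustration: $s_1$ and $s_2$ are the sample standard deviations of the two sets of measurements, and $r$ is the sample correlation between them.

$s_d^2 = s_1^2 + s_2^2 - 2 r s_1 s_2$

When $r$ is close to 0, $s_d^2$ is roughly the sum of the two variances and pairing gains little. As $r$ moves toward 1, the last term removes more of that variance, so the standard error $s_d / \sqrt{n}$ gets smaller and the same mean difference yields a larger $|t|$. If $r$ were negative, pairing would inflate $s_d$ instead.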