9.3 Distribution Needed for Hypothesis Testing

Written by the Fiveable Content Team • Last updated August 2025

Choosing the Appropriate Distribution for Hypothesis Testing

Distribution selection for hypothesis tests

Picking the correct distribution is the first real decision you make in a hypothesis test. The choice depends on three things: what parameter you're testing (mean vs. proportion), whether you know the population standard deviation, and your sample size.

For population means:

  • Use the t-distribution when the population standard deviation ($\sigma$) is unknown and the sample size is small ($n < 30$). You estimate $\sigma$ with the sample standard deviation $s$, and the t-distribution accounts for the extra uncertainty that introduces.
  • Use the z-distribution (standard normal) when $\sigma$ is known, or when $n \geq 30$. With large samples, the Central Limit Theorem kicks in and the t-distribution closely approximates the z-distribution anyway.

For population proportions:

  • Use the z-distribution, but only when the sample is large enough for the normal approximation to hold. The conditions are:

$$n \cdot p \geq 10 \quad \text{and} \quad n \cdot (1 - p) \geq 10$$

where $n$ is the sample size and $p$ is the hypothesized population proportion. If either product is less than 10, the normal approximation isn't reliable.
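
The selection rules above can be sketched as a small decision function. This is only an illustration of the rules of thumb in this guide, not a standard library routine; the function name `choose_distribution` and its parameters are hypothetical.

```python
from typing import Optional

def choose_distribution(parameter: str, n: int,
                        sigma_known: bool = False,
                        p0: Optional[float] = None) -> str:
    """Return 'z', 't', or 'none' following the rules of thumb in this guide."""
    if parameter == "mean":
        # Known sigma, or a large sample: use the standard normal
        if sigma_known or n >= 30:
            return "z"
        # Unknown sigma and a small sample: use t (with n - 1 degrees of freedom)
        return "t"
    if parameter == "proportion":
        if p0 is None:
            raise ValueError("p0 (hypothesized proportion) is required")
        # Normal approximation needs n*p0 >= 10 and n*(1 - p0) >= 10
        if n * p0 >= 10 and n * (1 - p0) >= 10:
            return "z"
        return "none"  # conditions fail, so the z-test is not reliable
    raise ValueError(f"unknown parameter: {parameter!r}")

print(choose_distribution("mean", n=12))                    # t
print(choose_distribution("mean", n=12, sigma_known=True))  # z
print(choose_distribution("proportion", n=50, p0=0.1))      # none (50 * 0.1 = 5)
```

Returning `"none"` for a failed proportion check is a design choice for this sketch: it signals that the normal approximation should not be used, without saying which alternative method to pick.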

Assumptions for statistical tests

Every hypothesis test rests on assumptions. If those assumptions are violated, your results may not be valid.

t-test assumptions:

  • The sample is randomly selected
  • The population is approximately normally distributed, or $n \geq 30$ so the Central Limit Theorem applies
  • The data are continuous and measured on an interval or ratio scale (e.g., temperature, weight)
  • The population standard deviation is unknown (you use $s$ instead of $\sigma$)

z-test assumptions:

  • The sample is randomly selected
  • The population standard deviation $\sigma$ is known
  • The data are continuous and measured on an interval or ratio scale (e.g., IQ scores, annual income)

Proportion test assumptions:

  • The sample is randomly selected
  • Observations are independent, meaning the outcome of one does not influence another (e.g., flipping a coin multiple times, or sampling less than 10% of the population)
  • The data are categorical with exactly two outcomes (pass/fail, defective/non-defective)
  • The sample size conditions are met: $n \cdot p \geq 10$ and $n \cdot (1 - p) \geq 10$
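
As a rough illustration of how the sample size conditions fit into an actual proportion test, the sketch below checks them before computing a two-sided z-test. The function name `proportion_z_test` and the data (58 successes in 100 trials, testing a hypothesized proportion of 0.5) are made up for the example. Random sampling and independence cannot be checked in code and are simply assumed.

```python
from math import sqrt
from statistics import NormalDist

def proportion_z_test(successes: int, n: int, p0: float):
    """Two-sided one-proportion z-test; returns (z, p_value)."""
    # Sample size conditions use the hypothesized proportion p0
    if n * p0 < 10 or n * (1 - p0) < 10:
        raise ValueError("need n*p0 >= 10 and n*(1 - p0) >= 10")
    p_hat = successes / n
    # The standard error is computed under H0, so it uses p0 rather than p_hat
    se = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / se
    # Two-sided p-value: area in both tails beyond |z|
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Made-up data: 58 successes in 100 trials, H0: p = 0.5
z, p_value = proportion_z_test(58, 100, 0.5)
print(round(z, 2), round(p_value, 4))
```

Here the z-score works out to $(0.58 - 0.5) / 0.05 = 1.6$. Calling the function with `n=20, p0=0.1` raises an error instead, since $20 \cdot 0.1 = 2 < 10$.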

Sample size impact on testing

The Central Limit Theorem (CLT) is the reason sample size matters so much. It states that as $n$ increases, the sampling distribution of the sample mean approaches a normal distribution regardless of the shape of the population distribution (provided the population has a finite variance). In practice, $n \geq 30$ is the usual rule of thumb for when the approximation is good enough.

This has a direct consequence: for large samples, you can use the z-distribution even when $\sigma$ is unknown, because $s$ becomes a reliable estimate of $\sigma$ and the t-distribution converges toward the z-distribution.
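
One way to see the CLT at work is a small simulation. The sketch below is an illustration under assumed settings: it draws samples from a strongly right-skewed Exponential(1) population (mean 1, standard deviation 1), and the helper names `skewness` and `sample_means`, the seed, and the number of repetitions are all arbitrary choices. As $n$ grows, the sample means should stay centered near 1, their spread should shrink roughly like $1/\sqrt{n}$, and their skewness should move toward 0, the value for a normal distribution.

```python
import random
from statistics import mean, pstdev

def skewness(xs):
    """Sample skewness: close to 0 for a symmetric, normal-looking shape."""
    m, s = mean(xs), pstdev(xs)
    return sum((x - m) ** 3 for x in xs) / (len(xs) * s ** 3)

def sample_means(n, rng, reps=3000):
    """Means of `reps` random samples of size n from an Exponential(1) population."""
    return [sum(rng.expovariate(1.0) for _ in range(n)) / n for _ in range(reps)]

rng = random.Random(0)  # fixed seed so the run is repeatable
results = {}
for n in (2, 30, 200):
    means = sample_means(n, rng)
    # Store (center, spread, skewness) of the simulated sampling distribution
    results[n] = (mean(means), pstdev(means), skewness(means))
    print(n, [round(v, 3) for v in results[n]])
```

The exact numbers depend on the seed, so only the trend across the three sample sizes matters.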

Beyond distribution choice, sample size affects your results in two other ways:

  • Statistical power increases with larger samples. Power is the probability of correctly rejecting a false null hypothesis, so a larger $n$ makes it easier to detect a real effect.
  • Small samples (like pilot studies or studies of rare events) may not provide enough evidence to draw valid conclusions, and they're more sensitive to violations of normality assumptions.
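
The link between sample size and power can be made concrete for one simple case. The sketch below assumes an upper-tailed z-test with known $\sigma$, where the null hypothesis says the mean is `mu0` and the true mean is actually `mu1`. In that setting the power is $1 - \Phi\left(z_\alpha - (\mu_1 - \mu_0)\sqrt{n}/\sigma\right)$, with $\Phi$ the standard normal CDF. The function name and the numbers (100, 105, 15) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def power_upper_tail_z(mu0, mu1, sigma, n, alpha=0.05):
    """Power of an upper-tailed z-test of H0: mu = mu0 when the true mean is mu1."""
    std_normal = NormalDist()
    # Critical value: reject H0 when the z-score exceeds this
    z_crit = std_normal.inv_cdf(1 - alpha)
    # How far the true mean shifts the z-score, in standard-error units
    shift = (mu1 - mu0) * sqrt(n) / sigma
    # Probability the z-score lands in the rejection region
    return 1 - std_normal.cdf(z_crit - shift)

# Made-up scenario: H0 mean 100, true mean 105, sigma 15
for n in (10, 30, 100):
    print(n, round(power_upper_tail_z(100, 105, 15, n), 3))
```

With the same effect size and significance level, the printed power rises as $n$ goes from 10 to 100. When `mu1` equals `mu0`, the function returns `alpha`, the Type I error rate.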

Components of hypothesis testing

These four terms come up in every hypothesis test, so make sure you can define each one clearly:

  • Null hypothesis ($H_0$): The default assumption about a population parameter. It typically states "no effect" or "no difference." This is what you test against.
  • Alternative hypothesis ($H_a$): The claim you're investigating. It contradicts $H_0$ and can be one-sided ($<$ or $>$) or two-sided ($\neq$).
  • Test statistic: A value calculated from your sample data that measures how far your sample result is from what $H_0$ predicts. For means, this is a t-score or z-score; for proportions, it's a z-score.
  • Sampling distribution: The probability distribution of a statistic (like the sample mean $\bar{x}$) across all possible samples of a given size. This is what you compare your test statistic against to find a p-value.
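
To show how the four pieces fit together, here is a minimal sketch of a one-sample z-test for a mean with known $\sigma$. The alternative hypothesis determines which tail of the sampling distribution (here the standard normal) is used for the p-value. The function name `z_test_mean`, the `alternative` labels, and the data are hypothetical. A t-test has the same structure but uses $s$ in place of $\sigma$ and a t-distribution with $n - 1$ degrees of freedom.

```python
from math import sqrt
from statistics import NormalDist

def z_test_mean(x_bar, mu0, sigma, n, alternative="two-sided"):
    """One-sample z-test for a mean with known sigma; returns (z, p_value)."""
    # Test statistic: distance of x_bar from mu0 in standard-error units
    z = (x_bar - mu0) / (sigma / sqrt(n))
    cdf = NormalDist().cdf  # sampling distribution of z under H0
    if alternative == "less":         # Ha: mu < mu0
        p_value = cdf(z)
    elif alternative == "greater":    # Ha: mu > mu0
        p_value = 1 - cdf(z)
    elif alternative == "two-sided":  # Ha: mu != mu0
        p_value = 2 * (1 - cdf(abs(z)))
    else:
        raise ValueError(f"unknown alternative: {alternative!r}")
    return z, p_value

# Made-up data: sample mean 103 from n = 36, testing H0: mu = 100 with sigma = 15
z, p_value = z_test_mean(x_bar=103, mu0=100, sigma=15, n=36)
print(round(z, 2), round(p_value, 4))
```

For these numbers the test statistic is $3 / (15 / 6) = 1.2$, and the two-sided p-value is twice the upper-tail p-value for the same data.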