Fiveable

🎲Intro to Statistics Unit 9 Review

9.3 Probability Distribution Needed for Hypothesis Testing

Written by the Fiveable Content Team • Last updated August 2025

Probability Distributions for Hypothesis Testing

Probability distributions for hypothesis tests

Each hypothesis test relies on a specific probability distribution to determine whether your sample data provides enough evidence against the null hypothesis. Choosing the wrong distribution will give you the wrong p-value, so matching your situation to the right distribution matters.

  • Normal distribution (z-distribution)
    • Used to test population means when the population standard deviation is known or the sample size is large ($n \geq 30$)
    • Also used to test population proportions when the sample is large enough to satisfy normal approximation conditions (covered below)
  • Student's t-distribution
    • Used to test population means when the population standard deviation is unknown and the sample size is small ($n < 30$). You estimate the population standard deviation using the sample standard deviation, which introduces extra uncertainty. The t-distribution accounts for that by having heavier tails than the normal distribution.
    • As sample size grows, the t-distribution looks more and more like the normal distribution.
  • Chi-square distribution
    • Used for goodness-of-fit tests, tests of independence, and tests of homogeneity
  • F-distribution
    • Used to compare two population variances (F-test), to compare means across three or more groups (ANOVA), and to assess overall significance in regression analysis
  • Binomial distribution
    • Used to test population proportions when the sample size is too small to meet the normal approximation conditions
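
The selection rules above for means and proportions can be condensed into a small helper. This is only a sketch of the rules as this guide states them: `choose_distribution` and its parameter names are hypothetical, and the chi-square and F cases are left out because they depend on the type of test rather than on sample size.

```python
def choose_distribution(parameter, n, sigma_known=False, p0=None):
    """Return the distribution this guide suggests for a one-sample test.

    parameter   -- "mean" or "proportion"
    n           -- sample size
    sigma_known -- is the population standard deviation known? (means only)
    p0          -- hypothesized proportion (proportions only)
    """
    if parameter == "mean":
        # Known sigma or a large sample: z. Otherwise: t.
        if sigma_known or n >= 30:
            return "normal"
        return "t"
    if parameter == "proportion":
        if p0 is None:
            raise ValueError("p0 is required for a proportion test")
        # Normal approximation conditions: np >= 10 and n(1 - p) >= 10.
        if n * p0 >= 10 and n * (1 - p0) >= 10:
            return "normal"
        return "binomial"
    raise ValueError(f"unsupported parameter: {parameter!r}")


print(choose_distribution("mean", n=12))                    # t
print(choose_distribution("mean", n=12, sigma_known=True))  # normal
print(choose_distribution("proportion", n=40, p0=0.5))      # normal
print(choose_distribution("proportion", n=15, p0=0.5))      # binomial
```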

Key assumptions of distribution tests

Every distribution comes with assumptions. If those assumptions aren't met, your test results can be unreliable.

  • Normal distribution assumptions
    • The data follows a normal distribution, or the sample size is large enough ($n \geq 30$) for the Central Limit Theorem to kick in
    • Observations are independent of each other
    • The population standard deviation is known
  • Student's t-distribution assumptions
    • The data follows a normal distribution, or the sample size is large enough ($n \geq 30$) for the Central Limit Theorem to apply
    • Observations are independent of each other
    • The population standard deviation is unknown (this is the key difference from the z-test)
  • Binomial distribution assumptions
    • Trials are independent of each other
    • Each trial has exactly two outcomes (success or failure)
    • The probability of success stays constant across all trials
    • The number of trials is fixed

Normal approximation in proportion tests

When testing a population proportion, you can use the normal distribution (z-test) instead of the binomial distribution, but only if the sample is large enough. Here's how to check:

  1. Independence: The sample must be a simple random sample from the population.
  2. Sample size: Both of these conditions must hold:
    • $np \geq 10$
    • $n(1-p) \geq 10$, where $n$ is the sample size and $p$ is the hypothesized population proportion.

For example, if you're testing whether a coin is fair ($p = 0.5$) with $n = 40$ flips, you'd check $np = 40 \times 0.5 = 20 \geq 10$ and $n(1-p) = 40 \times 0.5 = 20 \geq 10$. Both pass, so the normal approximation works here.
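
The same arithmetic is easy to script if you want to check several scenarios at once. In this sketch, `expected_counts` is a hypothetical helper name, and the second call uses a made-up hypothesized proportion of 0.1 just to show a failing case.

```python
def expected_counts(n, p0):
    """Return (np, n(1 - p)): expected successes and failures under H0."""
    return n * p0, n * (1 - p0)


for n, p0 in [(40, 0.5), (40, 0.1)]:
    successes, failures = expected_counts(n, p0)
    ok = successes >= 10 and failures >= 10
    print(f"n={n}, p={p0}: np={successes:.1f}, n(1-p)={failures:.1f}, ok={ok}")
```

With $p = 0.1$ the expected number of successes is only 4, so the first condition fails even though the sample size is the same.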

If both conditions are met, the sampling distribution of the sample proportion is approximately normal with:

  • Mean: $\mu_{\hat{p}} = p$
  • Standard deviation: $\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$
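
Those two values are all you need to turn a sample proportion into a z-score and a p-value. Here is a minimal sketch using Python's standard library. The function name `proportion_z_test` and the data (26 heads in 40 flips) are invented for illustration, and the test is assumed to be two-sided.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_test(successes, n, p0):
    """Two-sided z-test for a proportion under the normal approximation."""
    p_hat = successes / n
    sigma_p_hat = sqrt(p0 * (1 - p0) / n)  # standard deviation under H0
    z = (p_hat - p0) / sigma_p_hat         # the mean of p-hat under H0 is p0
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical data: 26 heads in 40 flips, testing H0: p = 0.5.
z, p_value = proportion_z_test(successes=26, n=40, p0=0.5)
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
```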

If either condition fails, you'd need to use the binomial distribution directly.
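
Using the binomial distribution directly means adding up exact binomial probabilities instead of using a z-score. The sketch below computes a one-sided (upper-tail) p-value for $H_a: p > p_0$. The function name `binomial_upper_tail` and the data (12 successes in 15 trials) are hypothetical, and two-sided exact tests need an extra convention that is not covered here.

```python
from math import comb


def binomial_upper_tail(k, n, p0):
    """P(X >= k) for X ~ Binomial(n, p0): exact p-value for H_a: p > p0."""
    return sum(comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(k, n + 1))


# Hypothetical data: n = 15 is too small for the normal approximation
# when p0 = 0.5, because np = 7.5 < 10.
p_value = binomial_upper_tail(k=12, n=15, p0=0.5)
print(f"exact one-sided p-value = {p_value:.4f}")
```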

Hypothesis Testing Framework

These terms come up in every hypothesis test, regardless of which distribution you use.

  • Null hypothesis ($H_0$): The default assumption about a population parameter (e.g., "the population mean equals 50"). You assume it's true unless the data convinces you otherwise.
  • Alternative hypothesis ($H_a$): The claim you're testing against the null. It can be one-sided ($>$ or $<$) or two-sided ($\neq$).
  • Significance level ($\alpha$): A threshold you set before collecting data, typically 0.05. If your p-value falls below $\alpha$, you reject the null hypothesis.
  • P-value: The probability of getting a test statistic at least as extreme as the one you observed, assuming the null hypothesis is true. A small p-value means your data is unlikely under $H_0$.
  • Type I error: Rejecting the null hypothesis when it's actually true. The probability of a Type I error equals $\alpha$.
  • Confidence interval: A range of values likely to contain the true population parameter. If a hypothesized value falls outside the confidence interval, that's consistent with rejecting $H_0$ at the corresponding significance level.
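
To see how these terms fit together, here is a rough sketch of a two-sided z-test for "the population mean equals 50". The function name `z_test_mean` and all the numbers (sample mean 52.5, known $\sigma = 8$, $n = 36$) are made up for illustration, and the population standard deviation is assumed to be known.

```python
from math import sqrt
from statistics import NormalDist


def z_test_mean(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-sided z-test for a mean, plus the matching confidence interval."""
    std_normal = NormalDist()
    se = sigma / sqrt(n)                        # standard error of the mean
    z = (sample_mean - mu0) / se                # test statistic
    p_value = 2 * (1 - std_normal.cdf(abs(z)))  # two-sided p-value
    z_crit = std_normal.inv_cdf(1 - alpha / 2)  # about 1.96 when alpha = 0.05
    ci = (sample_mean - z_crit * se, sample_mean + z_crit * se)
    return {"z": z, "p_value": p_value, "reject_h0": p_value < alpha, "ci": ci}


# Hypothetical data for H0: mu = 50 against Ha: mu != 50.
result = z_test_mean(sample_mean=52.5, mu0=50, sigma=8, n=36)
print(result)
```

With these numbers the p-value is above 0.05, so you fail to reject $H_0$, and the 95% confidence interval contains 50. The two conclusions agree, as the confidence interval definition above describes.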