
11.1 Facts About the Chi-Square Distribution

Written by the Fiveable Content Team • Last updated August 2025

Key Characteristics and Properties

The chi-square distribution is a probability distribution used to analyze categorical data. You'll rely on it for goodness-of-fit tests and for checking whether two categorical variables are related. It behaves differently from the normal distribution in some important ways, so understanding its shape and properties matters.

Shape and Behavior

The chi-square distribution is a continuous distribution defined by a single parameter: degrees of freedom (df), which must be a positive integer.

A few defining features:

  • It only takes non-negative values, ranging from 0 to positive infinity. You'll never get a negative chi-square value.
  • It's right-skewed (positively skewed). The tail stretches out to the right.
  • The skewness depends on df. When df = 1, the curve is heavily skewed. As df increases, the peak shifts rightward and the curve becomes more symmetric, gradually approaching the shape of a normal distribution.

Think of it this way: a chi-square distribution with 3 degrees of freedom looks very lopsided, but one with 50 degrees of freedom looks close to a bell curve.
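You can see the skew fading numerically with a small simulation sketch (standard-library Python only, helper names made up for illustration). It uses the fact that a chi-square distribution with k degrees of freedom is a Gamma(k/2) distribution with scale 2, which `random.gammavariate` can sample directly:

```python
import random
import statistics

def sample_chi_square(df, n, seed=0):
    """Draw n chi-square(df) values; chi-square(df) is Gamma(df/2, scale=2)."""
    rng = random.Random(seed)
    return [rng.gammavariate(df / 2, 2) for _ in range(n)]

def sample_skewness(xs):
    """Standardized third moment: mean of (x - mean)^3 divided by sd^3."""
    m = statistics.fmean(xs)
    sd = statistics.pstdev(xs)
    return sum((x - m) ** 3 for x in xs) / (len(xs) * sd ** 3)

skew_3 = sample_skewness(sample_chi_square(3, 50_000))
skew_50 = sample_skewness(sample_chi_square(50, 50_000))
print(skew_3, skew_50)  # skewness shrinks as df grows
```

With df = 3 the sample skewness comes out well above 1, while with df = 50 it drops toward 0, matching the "lopsided versus nearly bell-shaped" picture above.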

[Figure: characteristics of the chi-square distribution (source: Wikidoc)]

Mean, Variance, and Standard Deviation

These formulas are straightforward and worth memorizing:

  • Mean: μ = df
  • Variance: σ² = 2(df)
  • Standard deviation: σ = √(2(df))

So for a chi-square distribution with df = 10, the mean is 10, the variance is 20, and the standard deviation is √20 ≈ 4.47.

Notice that the mean equals the degrees of freedom. That's a quick way to check your work: the center of the distribution should sit right at df.
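The three formulas translate directly into code. A minimal sketch (the function name is just for illustration):

```python
import math

def chi_square_summary(df):
    """Mean, variance, and standard deviation of a chi-square(df) distribution."""
    mean = df                     # mean equals the degrees of freedom
    variance = 2 * df             # variance is twice the degrees of freedom
    std_dev = math.sqrt(2 * df)   # standard deviation is the square root of the variance
    return mean, variance, std_dev

print(chi_square_summary(10))  # (10, 20, 4.4721...)
```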

[Figure: characteristics of the chi-square distribution (source: Wikipedia, Pearson's chi-squared test)]

Relationship to the Normal Distribution

The chi-square distribution is built from the normal distribution. If you take independent standard normal random variables Z₁, Z₂, …, Zₙ and square each one, their sum follows a chi-square distribution:

Z₁² + Z₂² + ⋯ + Zₙ² ~ χ²(n)

This means a chi-square distribution with n degrees of freedom is literally the sum of n squared standard normal values.
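You can check this identity by simulation: sum squared standard normal draws and confirm that the sample mean lands near n and the sample variance near 2n. A sketch using only the Python standard library:

```python
import random
import statistics

rng = random.Random(42)
n = 5            # degrees of freedom
trials = 100_000

# Each trial: sum of n squared standard normal draws gives one chi-square(n) value
values = [sum(rng.gauss(0, 1) ** 2 for _ in range(n)) for _ in range(trials)]

print(statistics.fmean(values))      # close to n = 5
print(statistics.pvariance(values))  # close to 2n = 10
```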

A practical consequence: when df is large (roughly df > 30), the chi-square distribution can be approximated by a normal distribution with μ = df and σ = √(2(df)). This is why chi-square tables sometimes stop at moderate df values; beyond that, you can use the normal approximation.
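One quick way to probe the normal approximation: standardize chi-square draws using μ = df and σ = √(2(df)) and check that roughly 95% fall within ±1.96, as they would for a standard normal. A simulation sketch, assuming a fairly large df of 100:

```python
import math
import random

rng = random.Random(7)
df = 100
trials = 50_000

# Sample chi-square(df) as Gamma(df/2, scale 2)
draws = [rng.gammavariate(df / 2, 2) for _ in range(trials)]

# Standardize with mu = df and sigma = sqrt(2 * df), then check the 95% rule
mu, sigma = df, math.sqrt(2 * df)
inside = sum(1 for x in draws if abs((x - mu) / sigma) <= 1.96)
print(inside / trials)  # close to 0.95 when the normal approximation holds
```

With a small df (try df = 3), the fraction drifts away from 0.95 because the remaining right skew breaks the symmetric ±1.96 rule.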

Applications in Statistical Inference

You'll encounter the chi-square distribution in two main types of tests in this course:

  • Goodness-of-fit tests compare observed frequencies to expected frequencies. For example, you might test whether the distribution of colors in a bag of candy matches what the manufacturer claims.
  • Tests of independence use contingency tables to assess whether two categorical variables are related. For instance, you could test whether preferred study method (flashcards, re-reading, practice problems) is independent of grade level.

In both cases, you calculate a chi-square test statistic, then compare it to a critical value from the chi-square distribution to decide whether to reject the null hypothesis. Larger test statistic values fall further into the right tail, making rejection of the null hypothesis more likely.
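Here's the goodness-of-fit calculation done by hand, with made-up candy-color counts in the spirit of the example above. The statistic is Σ(O − E)²/E summed over categories, and 7.815 is the standard table value for df = 3 at α = 0.05:

```python
# Hypothetical counts for four candy colors in a sample of 100 candies;
# the manufacturer claims equal proportions, so each expected count is 25
observed = [20, 30, 25, 25]
expected = [25, 25, 25, 25]

# Chi-square goodness-of-fit statistic: sum of (O - E)^2 / E over categories
statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1   # 4 categories -> 3 degrees of freedom

critical = 7.815         # chi-square critical value for df = 3, alpha = 0.05
print(statistic, statistic > critical)  # 2.0 False -> fail to reject the null
```

Since 2.0 does not fall past the critical value in the right tail, the observed color counts are consistent with the manufacturer's claim.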