Key Characteristics of the Chi-Square Distribution
The chi-square distribution is a continuous probability distribution used to analyze categorical data. It shows up in two major hypothesis tests you'll encounter in this unit: the goodness-of-fit test and the test of independence. Getting a solid handle on its shape and properties now will make those tests much easier to work with.

Shape and Properties
The chi-square distribution is defined only for positive values, ranging from 0 to positive infinity. It can never be negative. Its shape depends entirely on a single parameter: degrees of freedom (df).
- With low df, the distribution is strongly skewed to the right (positively skewed)
- As df increases, the distribution becomes more symmetrical and bell-shaped
- The distribution is always positively skewed, but that skewness shrinks as df grows
- It's technically a special case of the gamma distribution
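The shrinking skewness is easy to check numerically. This is a small sketch (assuming SciPy is available) that compares the skewness of the distribution at a low and a high df; the chi-square skewness works out to √(8/df), so it falls toward 0 as df grows:

```python
# Sketch assuming scipy: compare chi-square skewness at low vs. high df.
# The skewness is sqrt(8/df), so it shrinks toward 0 (symmetry) as df grows.
from scipy.stats import chi2

skew_small = float(chi2.stats(2, moments="s"))    # df = 2: strong right skew
skew_large = float(chi2.stats(100, moments="s"))  # df = 100: nearly symmetric
print(skew_small, skew_large)
```

At df = 2 the skewness is exactly 2.0; at df = 100 it has dropped below 0.3, which is why the curve looks close to bell-shaped there.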

Mean, Variance, and Standard Deviation
The formulas here are unusually clean compared to most distributions:
- Mean: df
- Variance: 2df
- Standard deviation: √(2df)
So if df = 10, the mean is 10, the variance is 20, and the standard deviation is √20 ≈ 4.47. Notice that the mean equals the degrees of freedom, which means the center of the distribution shifts right as df increases.
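These formulas can be verified by simulation, using the fact that a chi-square variate with df degrees of freedom is the sum of df squared independent standard normals. A minimal sketch with df = 10:

```python
import random

# Minimal sketch: build chi-square samples (df = 10) as sums of 10 squared
# standard normals, then check that the sample mean is near df = 10 and the
# sample variance is near 2 * df = 20.
random.seed(0)
df, n = 10, 100_000
samples = [sum(random.gauss(0, 1) ** 2 for _ in range(df)) for _ in range(n)]

mean = sum(samples) / n
var = sum((x - mean) ** 2 for x in samples) / n
print(round(mean, 2), round(var, 2))  # close to 10 and 20
```

With 100,000 samples the estimates typically land within a few hundredths of the theoretical values.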

Chi-Square vs. Normal Distribution
As df increases, the chi-square distribution starts to resemble a normal distribution. A common rule of thumb is df > 90: beyond that point, a normal approximation works reasonably well.
When you use this approximation, the normal distribution has:
- Mean: df
- Variance: 2df
- Standard deviation: √(2df)
These are the same formulas from above. The only thing that changes is you're now treating the distribution as if it were normal, which lets you use z-scores and normal probability tables.
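Here is a small sketch of that approximation in action (assuming SciPy for the exact chi-square CDF). It compares P(X ≤ 110) for a chi-square distribution with df = 100 against a normal distribution with mean 100 and standard deviation √200:

```python
from math import sqrt
from statistics import NormalDist

from scipy.stats import chi2  # assumes scipy for the exact chi-square CDF

# For large df, the chi-square CDF is close to a normal CDF with
# mean = df and standard deviation = sqrt(2 * df).
df = 100
x = 110.0
exact = chi2.cdf(x, df)
approx = NormalDist(mu=df, sigma=sqrt(2 * df)).cdf(x)
print(round(exact, 3), round(approx, 3))
```

The two probabilities agree to within a couple of percentage points at df = 100, and the gap keeps shrinking as df grows.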
Applications in Statistical Analysis
The chi-square statistic is built on a specific idea: measuring how far observed data falls from what you'd expect. It sums the squared differences between observed and expected frequencies, each scaled by the expected frequency: χ² = Σ (O − E)² / E.
- Goodness-of-fit test: Checks whether observed data matches a proposed distribution. For example, do the colors in a bag of candy appear in the proportions the company claims?
- Test of independence: Examines whether two categorical variables are related. For example, is there a relationship between gender and preferred study method?
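The statistic itself is short to compute by hand. This sketch uses hypothetical candy-color counts (the numbers are made up for illustration) against the proportions a company might claim:

```python
# Illustrative sketch with hypothetical data: the chi-square statistic
# sums (observed - expected)^2 / expected across the categories.
observed = [52, 48, 38, 62]          # made-up counts from 200 candies
claimed = [0.25, 0.25, 0.20, 0.30]   # proportions the company claims
expected = [p * sum(observed) for p in claimed]  # [50, 50, 40, 60]

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(round(chi_sq, 3))  # ≈ 0.327
```

A small value like this means the observed counts sit close to the claimed proportions; in the goodness-of-fit test, this statistic is compared against a chi-square distribution with (number of categories − 1) degrees of freedom.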
The chi-square statistic was introduced by Karl Pearson, and these tests remain some of the most widely used tools for working with categorical data.