Chi-Square Test of Independence
Chi-Square Test for Categorical Relationships
The chi-square test of independence tells you whether two categorical variables are related or just appear that way by chance. Categorical variables have two or more categories with no inherent order, like gender, race, or political affiliation.
The hypotheses are straightforward:
- Null hypothesis (H₀): The two categorical variables are independent (no relationship).
- Alternative hypothesis (H₁): The two categorical variables are dependent (there is a relationship).
If the null hypothesis is true, the observed frequencies in your data should be close to what you'd expect by chance alone. The test works by measuring how far your observed data strays from those expected values.
Steps to conduct the test:
1. Build a contingency table with the observed frequencies for each combination of the two variables.
2. Calculate expected frequencies for every cell: E = (row total × column total) / grand total.
3. Calculate the chi-square test statistic by summing across all cells: χ² = Σ (O − E)² / E, where O is the observed frequency and E is the expected frequency.
4. Find the degrees of freedom: df = (rows − 1) × (columns − 1).
5. Compare your test statistic to the critical value from the chi-square distribution table at your chosen significance level (usually α = 0.05):
   - If χ² > critical value → reject H₀ and conclude there's a significant association.
   - If χ² < critical value → fail to reject H₀. There isn't enough evidence to claim an association.
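The steps above can be sketched in a few lines of NumPy/SciPy. The 2×3 table below is hypothetical, chosen only so the arithmetic is easy to follow:

```python
import numpy as np
from scipy import stats

# Hypothetical 2x3 contingency table of observed frequencies
observed = np.array([[20, 30, 25],
                     [30, 20, 25]])

# Step 2: expected frequency per cell =
# (row total * column total) / grand total
row_totals = observed.sum(axis=1, keepdims=True)
col_totals = observed.sum(axis=0, keepdims=True)
grand_total = observed.sum()
expected = row_totals * col_totals / grand_total

# Step 3: chi-square statistic, summed over all cells
chi2_stat = ((observed - expected) ** 2 / expected).sum()

# Step 4: degrees of freedom = (rows - 1) * (columns - 1)
df = (observed.shape[0] - 1) * (observed.shape[1] - 1)

# Step 5: critical value at alpha = 0.05
critical = stats.chi2.ppf(1 - 0.05, df)
reject_h0 = chi2_stat > critical

print(chi2_stat, df, round(critical, 3), reject_h0)

# Cross-check against SciPy's built-in test
chi2_sp, p, df_sp, expected_sp = stats.chi2_contingency(observed, correction=False)
assert np.isclose(chi2_stat, chi2_sp)
```

Here χ² = 4.0 with df = 2 falls below the critical value of about 5.99, so we fail to reject H₀ for this particular table.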

Interpretation of Chi-Square Results
The p-value is the probability of getting a chi-square statistic as extreme as (or more extreme than) what you observed, assuming H₀ is true.
- A small p-value (typically < 0.05) means strong evidence against H₀. You conclude the variables are associated.
- A large p-value (> 0.05) means weak evidence against H₀. You can't conclude the variables are associated.
Degrees of freedom affect which chi-square distribution you use and therefore which critical value you compare against. A 3×4 contingency table, for example, gives you df = (3 − 1)(4 − 1) = 6.
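For the 3×4 example, the critical value can be looked up with SciPy instead of a printed table:

```python
from scipy.stats import chi2

# A 3x4 table gives df = (3 - 1) * (4 - 1) = 6
df = (3 - 1) * (4 - 1)

# Critical value the test statistic must exceed at alpha = 0.05
critical_value = chi2.ppf(1 - 0.05, df)
print(df, round(critical_value, 3))  # 6 12.592
```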
When reporting results, always include:
- The chi-square test statistic (χ²)
- Degrees of freedom (df)
- The p-value
- Your decision: reject or fail to reject H₀
For example, you might write: "A chi-square test of independence showed a significant association between political affiliation and opinion on the policy, χ²(2, N = 150) = 9.21, p = .010."

Limitations of Chi-Square Tests
The test relies on several assumptions that must be met:
- The sample is randomly selected from the population.
- The sample size is large enough that every cell's expected frequency is at least 5.
- Both variables are categorical.
- Observations are independent of each other (one person's response doesn't influence another's).
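The expected-frequency assumption is easy to verify in practice, since SciPy's `chi2_contingency` returns the expected table alongside the statistic. A minimal sketch, using a hypothetical table:

```python
import numpy as np
from scipy import stats

# Hypothetical observed frequencies
observed = np.array([[20, 30, 25],
                     [30, 20, 25]])

# chi2_contingency returns the expected frequencies, so the
# "every expected cell is at least 5" assumption can be checked directly
_, _, _, expected = stats.chi2_contingency(observed, correction=False)
assumption_met = bool((expected >= 5).all())
print(assumption_met)
```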
Even when assumptions are met, the test has real limitations:
- No strength or direction. A significant result tells you the variables are related, but not how strongly or in what direction. Measures like Cramér's V or the phi coefficient can fill that gap by quantifying effect size.
- Sensitive to sample size. With a very large sample, even a trivially small association can produce a statistically significant result. Always consider practical significance alongside statistical significance.
- No control for confounding variables. If a third variable is driving the relationship between your two variables, the chi-square test won't catch that.
- Categories must be mutually exclusive and exhaustive. If a person could fall into more than one category, or if your categories don't cover all possibilities, the results can be misleading.
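To address the first limitation, Cramér's V can be computed directly from the chi-square statistic: V = sqrt(χ² / (n·k)), where n is the total sample size and k is min(rows, columns) − 1. A minimal sketch with a hypothetical table:

```python
import numpy as np
from scipy import stats

# Hypothetical 2x3 table of observed frequencies
observed = np.array([[20, 30, 25],
                     [30, 20, 25]])

chi2_stat, p, df, expected = stats.chi2_contingency(observed, correction=False)

# Cramer's V rescales chi-square into a 0..1 effect size
n = observed.sum()
k = min(observed.shape) - 1
cramers_v = np.sqrt(chi2_stat / (n * k))
print(round(float(cramers_v), 3))  # 0.163
```

A V near 0 indicates a weak association and a V near 1 a strong one, independent of sample size.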
Additional Considerations in Chi-Square Analysis
- Contingency analysis is the broader method of examining relationships between categorical variables using a contingency table. The chi-square test of independence is the main statistical tool within this framework.
- Statistical inference is the underlying principle at work here: you're using sample data to draw conclusions about a larger population.
- Post-hoc analysis comes into play after you get a significant chi-square result. The overall test tells you something is going on, but post-hoc tests (like examining standardized residuals for each cell) help you pinpoint which specific categories are driving the association.
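One common post-hoc approach is to inspect Pearson (standardized) residuals, (O − E) / √E, cell by cell; cells with an absolute residual above roughly 2 are the ones contributing most to a significant overall result. A minimal sketch with a hypothetical table:

```python
import numpy as np
from scipy import stats

# Hypothetical observed frequencies
observed = np.array([[20, 30, 25],
                     [30, 20, 25]])

chi2_stat, p, df, expected = stats.chi2_contingency(observed, correction=False)

# Pearson (standardized) residual per cell: (O - E) / sqrt(E)
residuals = (observed - expected) / np.sqrt(expected)
print(residuals)
```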