Chi-Square Test of Independence
Chi-Square Test for Categorical Relationships
The chi-square test of independence tells you whether two categorical variables are related or just appear that way by chance. Categorical variables have two or more categories with no inherent order, like gender, race, or political affiliation.
The hypotheses are straightforward:
- Null hypothesis (H₀): The two categorical variables are independent (no relationship).
- Alternative hypothesis (H₁): The two categorical variables are dependent (there is a relationship).
If the null hypothesis is true, the observed frequencies in your data should be close to what you'd expect by chance alone. The test works by measuring how far your observed data strays from those expected values.
Steps to conduct the test:
1. Build a contingency table with the observed frequencies for each combination of the two variables.
2. Calculate expected frequencies for every cell: E = (row total × column total) / grand total.
3. Calculate the chi-square test statistic by summing across all cells: χ² = Σ (O − E)² / E, where O is the observed frequency and E is the expected frequency.
4. Find the degrees of freedom: df = (rows − 1) × (columns − 1).
5. Compare your test statistic to the critical value from the chi-square distribution table at your chosen significance level (usually α = 0.05):
   - If χ² > critical value → reject H₀ and conclude there's a significant association.
   - If χ² < critical value → fail to reject H₀. There isn't enough evidence to claim an association.
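The steps above can be sketched in a few lines of NumPy/SciPy. The 2×3 table below is hypothetical, chosen only so the arithmetic is easy to follow:

```python
import numpy as np
from scipy import stats

# Hypothetical 2x3 contingency table of observed frequencies
observed = np.array([[20, 30, 25],
                     [30, 20, 25]])

# Step 2: expected frequency per cell =
# (row total * column total) / grand total
row_totals = observed.sum(axis=1, keepdims=True)
col_totals = observed.sum(axis=0, keepdims=True)
grand_total = observed.sum()
expected = row_totals * col_totals / grand_total

# Step 3: chi-square statistic, summed over all cells
chi2_stat = ((observed - expected) ** 2 / expected).sum()

# Step 4: degrees of freedom = (rows - 1) * (columns - 1)
df = (observed.shape[0] - 1) * (observed.shape[1] - 1)

# Step 5: critical value at alpha = 0.05
critical = stats.chi2.ppf(1 - 0.05, df)
reject_h0 = chi2_stat > critical

print(chi2_stat, df, round(critical, 3), reject_h0)

# Cross-check against SciPy's built-in test
chi2_sp, p, df_sp, expected_sp = stats.chi2_contingency(observed, correction=False)
assert np.isclose(chi2_stat, chi2_sp)
```

Here χ² = 4.0 with df = 2 falls below the critical value of about 5.99, so we fail to reject H₀ for this particular table.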

Interpretation of Chi-Square Results
The p-value is the probability of getting a chi-square statistic as extreme as (or more extreme than) what you observed, assuming H₀ is true.
- A small p-value (typically < 0.05) means strong evidence against H₀. You conclude the variables are associated.
- A large p-value (> 0.05) means weak evidence against H₀. You can't conclude the variables are associated.
Degrees of freedom affect which chi-square distribution you use and therefore which critical value you compare against. A 3×4 contingency table, for example, gives you df = (3 − 1)(4 − 1) = 6.
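For the 3×4 example, the critical value can be looked up with SciPy instead of a printed table:

```python
from scipy.stats import chi2

# A 3x4 table gives df = (3 - 1) * (4 - 1) = 6
df = (3 - 1) * (4 - 1)

# Critical value the test statistic must exceed at alpha = 0.05
critical_value = chi2.ppf(1 - 0.05, df)
print(df, round(critical_value, 3))  # 6 12.592
```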
When reporting results, always include:
- The chi-square test statistic (χ²)
- Degrees of freedom (df)
- The p-value
- Your decision: reject or fail to reject H₀
For example, you might write: "A chi-square test of independence showed a significant association between political affiliation and opinion on the policy, χ²(2, N = 150) = 9.21, p = .010."

Limitations of Chi-Square Tests
The test relies on several assumptions that must be met:
- The sample is randomly selected from the population.
- The sample size is large enough that every cell's expected frequency is at least 5.
- Both variables are categorical.
- Observations are independent of each other (one person's response doesn't influence another's).
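The expected-frequency assumption is easy to verify in practice, since SciPy's `chi2_contingency` returns the expected table alongside the statistic. A minimal sketch, using a hypothetical table:

```python
import numpy as np
from scipy import stats

# Hypothetical observed frequencies
observed = np.array([[20, 30, 25],
                     [30, 20, 25]])

# chi2_contingency returns the expected frequencies, so the
# "every expected cell is at least 5" assumption can be checked directly
_, _, _, expected = stats.chi2_contingency(observed, correction=False)
assumption_met = bool((expected >= 5).all())
print(assumption_met)
```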
Even when assumptions are met, the test has real limitations:
- No strength or direction. A significant result tells you the variables are related, but not how strongly or in what direction. Measures like Cramér's V or the phi coefficient can fill that gap by quantifying effect size.
- Sensitive to sample size. With a very large sample, even a trivially small association can produce a statistically significant result. Always consider practical significance alongside statistical significance.
- No control for confounding variables. If a third variable is driving the relationship between your two variables, the chi-square test won't catch that.
- Categories must be mutually exclusive and exhaustive. If a person could fall into more than one category, or if your categories don't cover all possibilities, the results can be misleading.
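To address the first limitation, Cramér's V can be computed directly from the chi-square statistic: V = sqrt(χ² / (n·k)), where n is the total sample size and k is min(rows, columns) − 1. A minimal sketch with a hypothetical table:

```python
import numpy as np
from scipy import stats

# Hypothetical 2x3 table of observed frequencies
observed = np.array([[20, 30, 25],
                     [30, 20, 25]])

chi2_stat, p, df, expected = stats.chi2_contingency(observed, correction=False)

# Cramer's V rescales chi-square into a 0..1 effect size
n = observed.sum()
k = min(observed.shape) - 1
cramers_v = np.sqrt(chi2_stat / (n * k))
print(round(float(cramers_v), 3))  # 0.163
```

A V near 0 indicates a weak association and a V near 1 a strong one, independent of sample size.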
Additional Considerations in Chi-Square Analysis
- Contingency analysis is the broader method of examining relationships between categorical variables using a contingency table. The chi-square test of independence is the main statistical tool within this framework.
- Statistical inference is the underlying principle at work here: you're using sample data to draw conclusions about a larger population.
- Post-hoc analysis comes into play after you get a significant chi-square result. The overall test tells you something is going on, but post-hoc tests (like examining standardized residuals for each cell) help you pinpoint which specific categories are driving the association.
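One common post-hoc approach is to inspect Pearson (standardized) residuals, (O − E) / √E, cell by cell; cells with an absolute residual above roughly 2 are the ones contributing most to a significant overall result. A minimal sketch with a hypothetical table:

```python
import numpy as np
from scipy import stats

# Hypothetical observed frequencies
observed = np.array([[20, 30, 25],
                     [30, 20, 25]])

chi2_stat, p, df, expected = stats.chi2_contingency(observed, correction=False)

# Pearson (standardized) residual per cell: (O - E) / sqrt(E)
residuals = (observed - expected) / np.sqrt(expected)
print(residuals)
```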