
📊 Advanced Quantitative Methods

Key Nonparametric Statistical Tests


Why This Matters

Nonparametric tests are your statistical safety net—they're what you reach for when your data refuses to cooperate with the neat assumptions of parametric methods. You're being tested on your ability to recognize when these tests are appropriate (ordinal data, non-normal distributions, small samples, violated assumptions) and which specific test matches your research design. The underlying principle is straightforward: instead of relying on population parameters like means and standard deviations, these tests work with ranks, signs, and frequencies.

Don't just memorize test names—know what each test is actually doing with your data. Can you identify whether you're comparing independent groups or related samples? Are you looking at two groups or three or more? Is your data categorical or ordinal? These distinctions determine which test you'll select, and exam questions will absolutely probe whether you understand the logic behind each choice.


Comparing Two Independent Groups

When you have two separate groups and want to know if they differ, but your data isn't normally distributed, you need tests that compare distributions through ranks rather than means.

Mann-Whitney U Test

  • Compares two independent groups—the nonparametric alternative to the independent samples t-test when normality assumptions fail
  • Ranks all observations together across both groups, then calculates the U statistic based on how the ranks distribute between groups
  • Robust to small samples and ordinal data; tests whether one group tends to have larger values than the other
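As a quick sketch of how this looks in practice (using SciPy, with made-up ordinal ratings for two independent groups):

```python
from scipy import stats

# Hypothetical ordinal ratings from two independent groups
group_a = [3, 5, 4, 4, 2, 5, 4, 3, 5, 4]
group_b = [2, 3, 1, 2, 3, 2, 4, 1, 2, 3]

# All observations are ranked together; U summarizes how the
# ranks split between the two groups
u_stat, p_value = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")
print(f"U = {u_stat}, p = {p_value:.4f}")
```

A small p-value here suggests one group tends to have larger values than the other, without assuming normality.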

Kolmogorov-Smirnov Test

  • Compares entire distributions—either a sample against a theoretical distribution or two samples against each other
  • Detects differences in shape, not just central tendency; examines the maximum distance between cumulative distribution functions
  • Commonly used for normality testing—helps you decide whether parametric methods are even appropriate in the first place
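A minimal sketch of the one-sample use case (SciPy, with a simulated clearly non-normal sample):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
skewed = rng.exponential(scale=1.0, size=200)  # clearly non-normal sample

# One-sample K-S against the standard normal CDF: a quick normality check.
# D is the maximum distance between the empirical and theoretical CDFs.
d_stat, p_value = stats.kstest(skewed, "norm")
print(f"D = {d_stat:.3f}, p = {p_value:.2e}")
```

A tiny p-value rejects normality, signaling that rank-based methods may be safer than parametric ones. `stats.ks_2samp` does the analogous two-sample comparison.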

Compare: Mann-Whitney U vs. Kolmogorov-Smirnov—both can compare two independent samples, but Mann-Whitney focuses on whether one group tends to score higher while K-S detects any distributional difference including shape and spread. If an FRQ asks about testing normality assumptions, K-S is your answer.


Comparing Two Related Samples

When your observations are paired—before/after measurements, matched subjects, or repeated measures on the same individuals—you need tests that account for this dependency structure.

Wilcoxon Signed-Rank Test

  • Tests paired differences—the nonparametric alternative to the paired t-test when difference scores aren't normally distributed
  • Ranks the absolute differences between pairs, then compares the sum of positive ranks to negative ranks
  • Considers magnitude of differences through ranking, making it more powerful than simpler alternatives when assumptions are met
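In code, this is one call (SciPy, with hypothetical before/after scores for the same 10 subjects):

```python
from scipy import stats

# Hypothetical paired scores for the same 10 subjects
before = [72, 65, 80, 55, 60, 70, 75, 68, 62, 77]
after  = [78, 70, 82, 60, 59, 76, 80, 74, 66, 85]

# Ranks the absolute paired differences, then compares the
# positive and negative rank sums
w_stat, p_value = stats.wilcoxon(before, after)
print(f"W = {w_stat}, p = {p_value:.4f}")
```

Because the test works on ranked difference scores, it tolerates the skew that would invalidate a paired t-test.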

Sign Test

  • Focuses only on direction—counts whether differences are positive or negative, ignoring magnitude entirely
  • Extremely simple and robust—useful when you can't even trust the ranking of differences, only their signs
  • Lower statistical power than Wilcoxon because it discards information about how large the differences are
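Since the sign test reduces to an exact binomial test on the direction of the differences, it can be sketched directly (hypothetical differences; zeros are dropped by convention):

```python
from scipy import stats

# Paired differences where only the direction is trusted (hypothetical)
diffs = [2, 5, -1, 3, 4, 6, -2, 7, 1, 3, 5, 2]
n_pos = sum(d > 0 for d in diffs)  # number of positive differences
n = sum(d != 0 for d in diffs)     # zeros are excluded

# Under H0, positive and negative signs are equally likely (p = 0.5)
result = stats.binomtest(n_pos, n, p=0.5)
print(f"{n_pos}/{n} positive, p = {result.pvalue:.4f}")
```

Note how the magnitudes (2 vs. 7) never enter the calculation—only their signs do, which is exactly why the test loses power relative to Wilcoxon.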

McNemar's Test

  • For paired nominal data—specifically designed for dichotomous outcomes in matched or before/after designs
  • Analyzes the discordant pairs—only cases where responses changed contribute to the test statistic
  • Common in clinical research—"Did treatment change the proportion who responded?" is a classic McNemar's question
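The exact version of McNemar's test is itself a binomial test on the discordant pairs, so it can be sketched without a dedicated function (counts below are hypothetical; statsmodels also provides a ready-made `mcnemar`):

```python
from scipy import stats

# Hypothetical paired 2x2 table of "improved" vs "not improved":
#                        after: improved   after: not improved
# before: improved             30                  5
# before: not improved         20                 45
b = 5    # improved -> not improved (discordant)
c = 20   # not improved -> improved (discordant)

# Under H0 the discordant changes split 50/50, so one discordant
# count follows a binomial(b + c, 0.5)
result = stats.binomtest(b, b + c, p=0.5)
print(f"discordant pairs = {b + c}, p = {result.pvalue:.4f}")
```

The 30 and 45 concordant pairs contribute nothing—only the 25 cases that changed drive the result.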

Compare: Wilcoxon Signed-Rank vs. Sign Test—both handle paired data, but Wilcoxon uses ranked magnitudes while Sign Test uses only direction. Wilcoxon is more powerful when you trust the ordering of differences; Sign Test is your fallback when data is truly minimal or ordinal categories are very coarse.


Comparing Three or More Groups

Extending beyond two-group comparisons requires tests that can handle multiple groups simultaneously while maintaining appropriate Type I error control.

Kruskal-Wallis Test

  • Extends Mann-Whitney to k groups—the nonparametric alternative to one-way ANOVA for independent samples
  • Ranks all observations together across all groups, then tests whether rank distributions differ significantly using the H statistic
  • Requires post-hoc testing if significant—tells you groups differ somewhere, but not which specific pairs
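A minimal SciPy sketch, using hypothetical 5-point satisfaction ratings from three independent groups:

```python
from scipy import stats

# Hypothetical 5-point satisfaction ratings from three independent stores
store_1 = [4, 5, 3, 4, 5, 4]
store_2 = [2, 3, 2, 1, 3, 2]
store_3 = [3, 4, 3, 3, 2, 4]

# All 18 ratings are ranked together; H measures how far each group's
# mean rank drifts from the overall mean rank
h_stat, p_value = stats.kruskal(store_1, store_2, store_3)
print(f"H = {h_stat:.2f}, p = {p_value:.4f}")
```

A significant result says only that at least one group differs; pairwise follow-ups (e.g., Dunn's test with a multiplicity correction) are needed to localize the difference.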

Friedman Test

  • For repeated measures across k conditions—the nonparametric alternative to repeated measures ANOVA
  • Ranks within each subject across conditions, then tests whether the average ranks differ across conditions
  • Handles blocked designs—each subject serves as their own control, with ranks assigned within each block
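In SciPy, each argument is one condition measured on the same subjects (data below is hypothetical):

```python
from scipy import stats

# Ratings from the same 8 subjects under three conditions (hypothetical)
cond_a = [7, 6, 8, 5, 7, 6, 8, 7]
cond_b = [5, 4, 6, 4, 5, 5, 6, 5]
cond_c = [6, 5, 7, 5, 6, 5, 7, 6]

# Ranks within each subject (block), then tests whether the mean
# ranks of the conditions differ
chi2, p_value = stats.friedmanchisquare(cond_a, cond_b, cond_c)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
```

Because ranking happens within each subject, between-subject differences in overall level cancel out—each subject really does serve as their own control.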

Compare: Kruskal-Wallis vs. Friedman—both compare three or more groups, but Kruskal-Wallis is for independent groups while Friedman is for related samples or repeated measures. This parallels the one-way ANOVA vs. repeated measures ANOVA distinction in parametric testing.


Measuring Association Between Variables

When you need to quantify relationships rather than test group differences, correlation coefficients adapted for ranks provide robust alternatives to Pearson's r.

Spearman's Rank Correlation Coefficient

  • Measures monotonic association—calculates Pearson's r on the ranks of the data, denoted r_s or ρ
  • Detects any consistent increase/decrease pattern—not limited to linear relationships like Pearson's correlation
  • Sensitive to ties—many tied values can distort the coefficient, requiring correction formulas
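A short sketch showing why "monotonic, not linear" matters (made-up data with a strongly increasing but nonlinear relationship):

```python
from scipy import stats

x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2, 1, 5, 3, 20, 25, 90, 100]  # increasing overall, far from linear

# Spearman's r_s is Pearson's r computed on the ranks of x and y
rho, p_value = stats.spearmanr(x, y)
print(f"r_s = {rho:.3f}, p = {p_value:.4f}")
```

Pearson's r on the raw values would be dragged around by the large y values; the rank-based coefficient sees only the consistent ordering.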

Kendall's Tau

  • Compares concordant and discordant pairs—counts whether pairs of observations are ranked in the same order on both variables
  • More robust to ties than Spearman's and often preferred for small samples or heavily tied data
  • Interpretable as a probability—τ represents the difference between the probability of concordance and discordance
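The pair-counting logic can be checked by hand on a tiny example (hypothetical ranks, no ties):

```python
from scipy import stats

x = [1, 2, 3, 4, 5, 6]
y = [1, 3, 2, 4, 6, 5]

# tau = (concordant pairs - discordant pairs) / total pairs.
# Here only the pairs (3,2) and (6,5) in y are out of order:
# 13 concordant, 2 discordant out of 15 pairs -> tau = 11/15
tau, p_value = stats.kendalltau(x, y)
print(f"tau = {tau:.4f}")
```

With no ties, SciPy's default tau-b coincides with this simple count-based definition.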

Compare: Spearman's r_s vs. Kendall's τ—both measure rank-based association, but Spearman applies Pearson's formula to ranks while Kendall counts concordant pairs. Kendall's is more robust with ties and small samples; Spearman's is more directly comparable to Pearson's r and often has slightly more power with larger samples.


Testing Categorical Associations

When both variables are categorical (nominal), you need tests designed for frequency data rather than ranked observations.

Chi-Square Test of Independence

  • Tests association in contingency tables—compares observed cell frequencies to expected frequencies under independence
  • Requires adequate expected frequencies—generally ≥5 per cell; small samples may need Fisher's exact test instead
  • Widely applicable—works for any size table, though interpretation becomes complex with many categories
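A sketch with a hypothetical 2×3 contingency table, including the expected-frequency check from the bullet above:

```python
import numpy as np
from scipy import stats

# Hypothetical 2x3 table: group (rows) x outcome category (columns)
observed = np.array([[30, 10, 20],
                     [10, 25, 15]])

# Compares observed counts to counts expected under independence;
# df = (rows - 1) * (cols - 1)
chi2, p_value, dof, expected = stats.chi2_contingency(observed)
print(f"chi2 = {chi2:.2f}, df = {dof}, p = {p_value:.4f}")
print("all expected counts >= 5:", bool((expected >= 5).all()))
```

If any expected count fell below 5, the chi-square approximation would be questionable and Fisher's exact test would be the safer choice.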

Compare: Chi-Square vs. McNemar's—both involve categorical data, but Chi-Square tests independence between two variables while McNemar's tests change in paired nominal data. Chi-Square is for independent observations; McNemar's is for matched or repeated measures designs.


Quick Reference Table

| Concept | Best Examples |
| --- | --- |
| Two independent groups | Mann-Whitney U, Kolmogorov-Smirnov |
| Two related samples (ordinal) | Wilcoxon Signed-Rank, Sign Test |
| Two related samples (nominal) | McNemar's Test |
| Three+ independent groups | Kruskal-Wallis |
| Three+ related samples | Friedman Test |
| Rank-based correlation | Spearman's r_s, Kendall's τ |
| Categorical association | Chi-Square Test of Independence |
| Distribution comparison/normality | Kolmogorov-Smirnov |

Self-Check Questions

  1. You have pre- and post-intervention scores from the same 15 participants, but the difference scores are heavily skewed. Which test should you use, and why would you choose it over the Sign Test?

  2. A researcher wants to compare customer satisfaction ratings (on a 5-point scale) across four different store locations with different customers at each location. Which test is appropriate, and what is its parametric equivalent?

  3. Both Spearman's r_s and Kendall's τ measure association between ranked variables. Under what conditions would you prefer Kendall's τ?

  4. Compare and contrast the Mann-Whitney U test and the Kruskal-Wallis test. What key design feature determines which one you should use?

  5. A clinical trial measures whether patients switched from "not improved" to "improved" (or vice versa) after treatment. The data is nominal and paired. Which test should you use, and what specifically does this test analyze?