
🤝 Collaborative Data Science

Statistical Tests for Data Analysis

Why This Matters

Statistical tests are the backbone of reproducible data science—they transform raw observations into defensible conclusions. In collaborative environments, your team needs shared language around when to use parametric vs. non-parametric tests, how to compare groups, and what assumptions must hold for results to be valid. Misapplying a test or violating its assumptions can invalidate entire analyses, waste computational resources, and erode trust in your findings.

You're being tested on more than memorizing formulas. Exams and real-world collaborations demand that you choose the right test for the data type and research question, verify assumptions before running analyses, and interpret outputs in context. Don't just memorize that a t-test compares means—know why you'd pick it over a Mann-Whitney U, and what breaks when normality fails.


Comparing Group Means (Parametric)

These tests assume your data follows a normal distribution and compare central tendencies across groups. The underlying principle is that if groups come from populations with identical means, observed differences should fall within predictable sampling variability.

T-Test

  • Compares means between two groups—determines whether observed differences are statistically significant or likely due to chance
  • Three variants serve different designs: independent (separate groups), paired (same subjects measured twice), and one-sample (group vs. known value)
  • Assumes normality and equal variances—violations push you toward non-parametric alternatives like Mann-Whitney U
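
To make this concrete, here is a minimal sketch of an independent two-sample t-test in Python with SciPy. The two groups are simulated and purely illustrative, and the library choice is an assumption, not part of the original material.

```python
# Minimal sketch: independent two-sample t-test on simulated data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
group_a = rng.normal(loc=5.0, scale=1.0, size=30)   # e.g., control group
group_b = rng.normal(loc=5.6, scale=1.0, size=30)   # e.g., treatment group

# Independent t-test (assumes normality and equal variances);
# pass equal_var=False for Welch's t-test if the variances look unequal.
t_stat, p_value = stats.ttest_ind(group_a, group_b)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
```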

ANOVA (Analysis of Variance)

  • Extends mean comparison to three or more groups—tests whether at least one group mean differs significantly from the others
  • One-way vs. two-way: one-way handles a single factor, two-way examines two factors plus their interaction
  • Uses the F-statistic internally—partitions total variance into between-group and within-group components
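
A minimal one-way ANOVA sketch, again with SciPy and simulated data for three hypothetical treatment groups:

```python
# Minimal sketch: one-way ANOVA across three simulated treatment groups.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
low = rng.normal(10.0, 2.0, size=25)
medium = rng.normal(11.0, 2.0, size=25)
high = rng.normal(13.0, 2.0, size=25)

# Tests whether at least one group mean differs; a significant F is
# typically followed by pairwise comparisons (e.g., corrected t-tests).
f_stat, p_value = stats.f_oneway(low, medium, high)
print(f"F = {f_stat:.3f}, p = {p_value:.4f}")
```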

F-Test

  • Compares variances between groups—assesses whether variability differs significantly across populations
  • Foundation of ANOVA calculations—the F-statistic is the ratio of between-group variance to within-group variance
  • Critical for model validation—used in regression to test whether predictors collectively explain significant variance
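
One way to see the F-statistic in isolation is a variance-ratio test computed by hand: form the ratio of two sample variances and look it up in the F distribution. The sketch below does exactly that on simulated, roughly normal samples; it is an illustration of the idea, not a prescribed workflow.

```python
# Minimal sketch: variance-ratio F-test computed by hand on simulated,
# roughly normal samples.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
sample_1 = rng.normal(0.0, 1.0, size=40)
sample_2 = rng.normal(0.0, 1.5, size=35)

# F = ratio of sample variances (ddof=1 gives the unbiased estimate)
f_stat = np.var(sample_1, ddof=1) / np.var(sample_2, ddof=1)
df1, df2 = len(sample_1) - 1, len(sample_2) - 1

# Two-sided p-value from the F distribution
tail = stats.f.sf(f_stat, df1, df2) if f_stat > 1 else stats.f.cdf(f_stat, df1, df2)
p_value = min(2 * tail, 1.0)
print(f"F = {f_stat:.3f}, p = {p_value:.4f}")
```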

Compare: T-test vs. ANOVA—both compare means assuming normality, but t-tests handle only two groups while ANOVA scales to three or more. If an FRQ asks you to compare multiple treatment conditions, ANOVA is your tool; reserve t-tests for pairwise follow-ups.


Comparing Groups (Non-Parametric)

When normality assumptions fail or you're working with ordinal data, these rank-based tests provide robust alternatives. They convert raw values to ranks, making them resistant to outliers and skewed distributions.

Mann-Whitney U Test

  • Non-parametric alternative to the independent t-test—compares two groups without assuming normal distributions
  • Ranks all observations together—then tests whether one group's ranks tend to be systematically higher or lower
  • Ideal for ordinal data or small samples—maintains validity when parametric assumptions are violated
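
A minimal Mann-Whitney U sketch with SciPy on right-skewed simulated data, the kind of situation where the t-test's normality assumption is shaky:

```python
# Minimal sketch: Mann-Whitney U test on right-skewed simulated data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
group_a = rng.exponential(scale=2.0, size=20)   # skewed, small sample
group_b = rng.exponential(scale=3.0, size=22)

u_stat, p_value = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")
print(f"U = {u_stat:.1f}, p = {p_value:.4f}")
```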

Kruskal-Wallis Test

  • Non-parametric alternative to one-way ANOVA—compares three or more independent groups using ranks
  • Tests whether samples share the same distribution—significant results indicate at least one group differs in central tendency
  • No normality requirement—use when your data is ordinal, heavily skewed, or has unequal variances across groups
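
And a matching Kruskal-Wallis sketch, extending the same rank-based idea to three simulated skewed groups:

```python
# Minimal sketch: Kruskal-Wallis test across three skewed simulated groups.
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
g1 = rng.exponential(scale=2.0, size=20)
g2 = rng.exponential(scale=2.5, size=20)
g3 = rng.exponential(scale=4.0, size=20)

h_stat, p_value = stats.kruskal(g1, g2, g3)
print(f"H = {h_stat:.3f}, p = {p_value:.4f}")
```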

Compare: Mann-Whitney U vs. Kruskal-Wallis—both are rank-based and assumption-light, but Mann-Whitney handles two groups while Kruskal-Wallis extends to three or more. Think of Kruskal-Wallis as "non-parametric ANOVA."


Modeling Relationships (Regression)

Regression methods quantify how predictor variables relate to outcomes, enabling prediction and inference. The core idea is fitting a mathematical function that minimizes the discrepancy between predicted and observed values.

Linear Regression

  • Models a continuous outcome as a linear function of one predictor—estimates slope (β₁) and intercept (β₀) in Y = β₀ + β₁X + ε
  • Four key assumptions: linearity, independence of errors, homoscedasticity (constant variance), and normality of residuals
  • Widely used for forecasting—the slope quantifies how much Y changes per unit increase in X
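
A minimal simple linear regression sketch with scipy.stats.linregress; the data are simulated with a known slope and intercept so the estimates have something to be compared against:

```python
# Minimal sketch: simple linear regression on simulated data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
x = rng.uniform(0, 10, size=50)
y = 2.0 + 0.8 * x + rng.normal(0, 1.0, size=50)   # true intercept 2.0, slope 0.8

result = stats.linregress(x, y)
print(f"intercept = {result.intercept:.3f}, slope = {result.slope:.3f}, "
      f"r^2 = {result.rvalue ** 2:.3f}, p = {result.pvalue:.4f}")
```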

Multiple Regression

  • Extends linear regression to multiple predictors—models the outcome as Y = β₀ + β₁X₁ + β₂X₂ + … + ε
  • Controls for confounding variables—isolates each predictor's unique contribution while holding others constant
  • Interpretation grows complex—coefficients represent partial effects, and multicollinearity between predictors can destabilize estimates
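
A minimal multiple regression sketch using statsmodels. The predictors (study_hours and sleep_hours) and the outcome are hypothetical, simulated names chosen only to illustrate partial coefficients; statsmodels itself is an assumed tooling choice.

```python
# Minimal sketch: multiple regression with two simulated predictors
# (hypothetical study_hours and sleep_hours predicting an exam score).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
n = 100
study_hours = rng.uniform(0, 10, size=n)
sleep_hours = rng.uniform(4, 9, size=n)
score = 50 + 3.0 * study_hours + 2.0 * sleep_hours + rng.normal(0, 5, size=n)

X = sm.add_constant(np.column_stack([study_hours, sleep_hours]))  # intercept column
model = sm.OLS(score, X).fit()
print(model.params)     # intercept plus partial (per-predictor) coefficients
print(model.summary())  # includes the overall F-test mentioned above
```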

Logistic Regression

  • Models binary outcomes—predicts the probability of an event (e.g., success/failure, yes/no) rather than a continuous value
  • Outputs odds ratios—a one-unit increase in a predictor multiplies the odds of the outcome by e^β
  • Assumes linearity in the logit—the log-odds of the outcome must be linearly related to predictors
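
A minimal logistic regression sketch with statsmodels on a simulated pass/fail outcome; exponentiating the fitted coefficients gives the odds ratios described above. The variable names and the assumed relationship are illustrative only.

```python
# Minimal sketch: logistic regression on a simulated pass/fail outcome;
# exponentiating the coefficients yields odds ratios.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
n = 200
hours = rng.uniform(0, 10, size=n)
true_log_odds = -3.0 + 0.6 * hours                    # assumed relationship
passed = rng.binomial(1, 1 / (1 + np.exp(-true_log_odds)))

X = sm.add_constant(hours)
model = sm.Logit(passed, X).fit(disp=False)
print(np.exp(model.params))   # odds ratio per one-unit increase in hours
```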

Compare: Linear vs. Logistic Regression—linear regression predicts continuous outcomes; logistic regression predicts probabilities for categorical outcomes. If your dependent variable is binary (pass/fail, churned/retained), logistic is required—linear regression can produce predicted values outside the 0–1 range, which cannot be valid probabilities.


Measuring Association

These methods assess whether and how strongly variables relate to each other, without necessarily implying one causes the other. Association tests help identify patterns worth investigating further.

Correlation Analysis

  • Quantifies strength and direction of linear relationships—Pearson's r ranges from −1 (perfect negative) to +1 (perfect positive)
  • Correlation does not imply causation—two variables can move together due to a shared confounder or pure coincidence
  • Essential for exploratory analysis—quickly identifies which variable pairs warrant deeper investigation
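
A minimal correlation sketch with SciPy on simulated data, computing Pearson's r alongside the rank-based Spearman's rho for comparison:

```python
# Minimal sketch: Pearson's r (and rank-based Spearman's rho) on simulated data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(8)
x = rng.normal(size=60)
y = 0.7 * x + rng.normal(scale=0.5, size=60)

r, p_r = stats.pearsonr(x, y)        # linear association
rho, p_rho = stats.spearmanr(x, y)   # monotonic (rank-based) association
print(f"Pearson r = {r:.3f} (p = {p_r:.4f}); Spearman rho = {rho:.3f} (p = {p_rho:.4f})")
```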

Chi-Square Test

  • Tests association between categorical variables—compares observed cell frequencies to expected frequencies under independence
  • Two main applications: test of independence (contingency tables) and goodness-of-fit (observed vs. theoretical distribution)
  • Requires adequate sample size—expected frequencies below 5 in any cell can invalidate results
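
A minimal chi-square test of independence sketch with SciPy on a hypothetical 2x2 contingency table; the counts are made up for illustration.

```python
# Minimal sketch: chi-square test of independence on a hypothetical 2x2 table
# (rows: treatment vs. control; columns: churned vs. retained).
import numpy as np
from scipy import stats

observed = np.array([[30, 70],
                     [45, 55]])

chi2, p_value, dof, expected = stats.chi2_contingency(observed)
print(f"chi2 = {chi2:.3f}, df = {dof}, p = {p_value:.4f}")
print("expected counts:\n", expected)   # check that no expected count falls below 5
```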

Compare: Correlation vs. Chi-Square—correlation measures relationships between continuous variables; chi-square tests associations between categorical variables. Choosing between them depends entirely on your data types, not your research question.


Quick Reference Table

Concept | Best Examples
Comparing two group means (parametric) | T-test
Comparing 3+ group means (parametric) | ANOVA, F-test
Comparing groups (non-parametric) | Mann-Whitney U, Kruskal-Wallis
Predicting continuous outcomes | Linear regression, Multiple regression
Predicting binary outcomes | Logistic regression
Measuring continuous variable association | Correlation analysis
Testing categorical variable association | Chi-square test
Assumption-free alternatives | Mann-Whitney U, Kruskal-Wallis

Self-Check Questions

  1. You have three treatment groups and your data is heavily right-skewed. Which test should you use instead of ANOVA, and why?

  2. Compare and contrast linear regression and logistic regression—when would using linear regression on a binary outcome produce misleading results?

  3. A colleague runs a chi-square test on two continuous variables. What's wrong with this approach, and which test should they use instead?

  4. Both the t-test and Mann-Whitney U compare two groups. What specific data conditions would make you choose Mann-Whitney U over a t-test?

  5. If an FRQ asks you to "control for confounding variables" when examining the effect of study hours on exam scores, which regression approach allows this, and how does it isolate each predictor's effect?