🫁Intro to Biostatistics

Common Statistical Tests

Why This Matters

Choosing the right statistical test is one of the most critical decisions you'll make in biostatistics—and it's exactly what exams love to test. You're not just being asked to memorize formulas; you're being evaluated on whether you understand when to use each test based on your data type, sample design, and research question. The tests in this guide fall into predictable patterns: comparing means vs. analyzing relationships, parametric vs. non-parametric approaches, and independent vs. paired designs.

Think of statistical tests as tools in a toolkit—each designed for a specific job. A t-test won't help you with categorical data any more than a hammer helps with screws. Master the decision logic behind test selection: What type of variable do I have? How many groups am I comparing? Does my data meet parametric assumptions? Don't just memorize that a Chi-square test exists—know that it's your go-to when both variables are categorical and you're testing independence.
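
The decision logic above can be sketched as a small lookup function. This is only an illustration of the questions in this guide, not a complete selection procedure; the function name, argument names, and labels such as "continuous" are hypothetical choices made for the example.

```python
def choose_test(outcome, groups=2, paired=False, normal=True):
    """Suggest a test from this guide for a group-comparison question.

    outcome: "continuous", "ordinal", or "categorical" (illustrative labels).
    groups:  number of groups being compared.
    paired:  True when the same subjects are measured more than once.
    normal:  True when the normality assumption looks reasonable.
    """
    if outcome == "categorical":
        return "Chi-square test"
    if outcome not in ("continuous", "ordinal"):
        raise ValueError(f"unknown outcome type: {outcome!r}")
    if groups < 2:
        raise ValueError("need at least two groups to compare")

    # Ordinal outcomes are handled with rank-based tests regardless of `normal`.
    parametric = outcome == "continuous" and normal

    if groups == 2:
        if parametric:
            return "Paired t-test" if paired else "Independent t-test"
        return "Wilcoxon signed-rank test" if paired else "Mann-Whitney U test"

    if paired:
        # Repeated measures on three or more groups are not covered here.
        raise ValueError("repeated measures on 3+ groups are outside this guide")
    return "One-way ANOVA" if parametric else "Kruskal-Wallis test"


# Blood pressure across four treatment arms, normality plausible
print(choose_test("continuous", groups=4))
```

Relationship questions (correlation, linear regression, logistic regression) are deliberately left out of this sketch because they depend on the research goal rather than on the number of groups.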


Comparing Means Between Groups

When your research question asks "Is there a difference between groups?" and your outcome is continuous, you need a test that compares means. The choice depends on how many groups you're comparing and whether your data meets normality assumptions.

t-test (Independent and Paired)

  • Compares means of exactly two groups—the fundamental test for determining if group differences are statistically significant or due to chance
  • Independent t-test applies when groups are separate (treatment vs. control), while paired t-test handles related measurements (before vs. after in the same subjects)
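  • Assumes approximately normal data (for the paired test, normally distributed differences); the standard independent t-test also assumes equal variances, and Welch's t-test is the usual choice when variances differ. Non-normal data push you toward non-parametric alternatives like Mann-Whitney U (independent) or Wilcoxon signed-rank (paired)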
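
To make the independent vs. paired distinction concrete, here is a minimal sketch that computes both t statistics with the Python standard library. The data are made up for illustration, the independent version uses the pooled-variance (equal variances) form, and no p-value is computed; in practice a statistics package would report that as well.

```python
import math
import statistics


def independent_t(a, b):
    """Pooled-variance (Student's) t statistic for two separate groups."""
    n1, n2 = len(a), len(b)
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled = ((n1 - 1) * statistics.variance(a)
              + (n2 - 1) * statistics.variance(b)) / (n1 + n2 - 2)
    se = math.sqrt(pooled * (1 / n1 + 1 / n2))
    return (statistics.mean(a) - statistics.mean(b)) / se


def paired_t(before, after):
    """t statistic for related measurements, based on per-subject differences."""
    diffs = [y - x for x, y in zip(before, after)]
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / se


# Made-up example data
treatment, control = [5, 7, 9], [2, 4, 6]
before, after = [10, 12, 14, 16], [11, 14, 15, 18]

print(round(independent_t(treatment, control), 3))  # 1.837, df = 4
print(round(paired_t(before, after), 3))            # 5.196, df = 3
```

The paired test works on the differences only, which is why it can detect a consistent within-subject shift that the independent test would miss.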

ANOVA (One-Way and Two-Way)

  • Extends mean comparison to three or more groups—use when t-tests would require multiple comparisons and inflate Type I error
  • One-way ANOVA tests one independent variable; two-way ANOVA examines two factors plus their interaction effect
  • F-statistic indicates overall significance, but post-hoc tests (e.g., Tukey's HSD) are required to identify which specific groups differ
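
As a rough sketch of where the F-statistic comes from, the following computes a one-way ANOVA F by hand as the ratio of between-group to within-group mean squares. The function name and data are illustrative assumptions, and the sketch stops at the statistic (no p-value, no post-hoc test).

```python
import statistics


def one_way_anova_f(*groups):
    """F = mean square between groups / mean square within groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = statistics.mean(all_values)
    k, n = len(groups), len(all_values)

    # Variation of group means around the grand mean
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2
                     for g in groups)
    # Variation of observations around their own group mean
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within


# Made-up data for three groups
f_stat = one_way_anova_f([1, 2, 3], [2, 3, 4], [5, 6, 7])
print(round(f_stat, 2))  # 13.0, with (2, 6) degrees of freedom
```

A large F only says that at least one group mean differs; it does not say which one, which is why post-hoc tests follow.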

F-test

  • Compares variances between groups—determines whether the spread of data differs significantly across populations
  • Underlies ANOVA calculations and tests the homogeneity of variances assumption required for parametric tests
  • Sensitive to non-normality—violations can lead to incorrect conclusions about variance equality

Compare: t-test vs. ANOVA—both compare means of continuous outcomes, but t-tests handle exactly two groups while ANOVA handles three or more. If an exam asks which test to use for comparing blood pressure across four treatment arms, ANOVA is your answer.


Non-Parametric Alternatives

When your data violates normality assumptions, uses ordinal scales, or involves small samples, non-parametric tests provide robust alternatives. These tests work with ranks rather than raw values, making them distribution-free.

Mann-Whitney U Test

  • Non-parametric alternative to the independent t-test—compares two unrelated groups when data are ordinal or non-normally distributed
  • Ranks all observations across both groups and compares the sum of ranks; produces a U statistic and p-value
  • Ideal for small samples or when you can't justify normality assumptions—common in pilot studies and clinical research
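
The ranking step can be sketched in a few lines. This is a minimal illustration with made-up data: it computes the U statistic only (no p-value), and it reports the smaller of the two possible U values, which is one common convention; some software reports the U for the first group instead.

```python
def average_ranks(values):
    """Rank from 1..n; tied values share the average of their ranks."""
    return [sum(v > w for w in values) + (sum(v == w for w in values) + 1) / 2
            for v in values]


def mann_whitney_u(a, b):
    """Smaller of U1 and U2, computed from the rank sum of group a."""
    ranks = average_ranks(list(a) + list(b))
    rank_sum_a = sum(ranks[:len(a)])
    u1 = rank_sum_a - len(a) * (len(a) + 1) / 2
    u2 = len(a) * len(b) - u1
    return min(u1, u2)


# Made-up data for two unrelated groups
print(mann_whitney_u([3, 5, 8], [4, 6, 7, 9]))  # 4.0
```

A U near zero means the two groups barely overlap in rank; a U near half of `len(a) * len(b)` means the ranks are thoroughly mixed.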

Wilcoxon Signed-Rank Test

  • Non-parametric alternative to the paired t-test—handles related samples when difference scores aren't normally distributed
  • Ranks absolute differences between pairs while preserving the sign (positive or negative) of each difference
  • W statistic reflects whether positive or negative ranks dominate; useful for before-after designs with non-normal data
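
The signed-rank idea can be sketched as follows. The data are made up, zero differences are dropped (a common convention, assumed here), and the function returns both rank sums rather than a p-value; many tables use the smaller of the two as W.

```python
def average_ranks(values):
    """Rank from 1..n; tied values share the average of their ranks."""
    return [sum(v > w for w in values) + (sum(v == w for w in values) + 1) / 2
            for v in values]


def wilcoxon_rank_sums(before, after):
    """Return (sum of positive ranks, sum of negative ranks)."""
    # Pairs with no change carry no sign information and are dropped.
    diffs = [y - x for x, y in zip(before, after) if y != x]
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus


# Made-up before/after measurements on the same five subjects
before = [10, 12, 9, 14, 11]
after = [12, 11, 13, 17, 11]
print(wilcoxon_rank_sums(before, after))  # (9.0, 1.0)
```

Here the positive ranks dominate, which is what you would expect if most subjects increased after the intervention.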

Kruskal-Wallis Test

  • Non-parametric alternative to one-way ANOVA—compares three or more independent groups without normality assumptions
  • Ranks all data points across groups and produces an H statistic to assess whether rank distributions differ
  • Follow-up pairwise comparisons needed to identify which groups differ—similar logic to ANOVA post-hoc testing
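
A minimal sketch of the H statistic, assuming no tie correction and using made-up data. It shows only how the pooled ranks feed into H; the p-value and the follow-up pairwise comparisons are left out.

```python
def average_ranks(values):
    """Rank from 1..n; tied values share the average of their ranks."""
    return [sum(v > w for w in values) + (sum(v == w for w in values) + 1) / 2
            for v in values]


def kruskal_wallis_h(*groups):
    """H = 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1), without tie correction."""
    pooled = [x for g in groups for x in g]
    ranks = average_ranks(pooled)
    n = len(pooled)

    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])  # R_i for this group
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12 / (n * (n + 1)) * total - 3 * (n + 1)


# Made-up data: three completely separated groups
print(round(kruskal_wallis_h([1, 2, 3], [4, 5, 6], [7, 8, 9]), 2))  # 7.2
```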

Compare: Mann-Whitney U vs. Wilcoxon signed-rank—both are non-parametric and rank-based, but Mann-Whitney handles independent groups while Wilcoxon handles paired/related samples. This mirrors the independent vs. paired t-test distinction. FRQ tip: If the question mentions "matched pairs" or "same subjects measured twice" with non-normal data, Wilcoxon is correct.


Analyzing Relationships Between Variables

These tests ask "How are variables related?" rather than "Are groups different?" The choice depends on whether you're measuring association, predicting outcomes, or modeling probabilities.

Correlation Analysis

  • Quantifies the strength and direction of linear relationships: Pearson's r ranges from -1 (perfect negative) to +1 (perfect positive)
  • Does not imply causation—this is perhaps the most tested concept in introductory biostatistics; correlation ≠ causation
  • Assumes linearity and homoscedasticity; visualize with scatter plots to verify assumptions before interpreting r
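
A short sketch of Pearson's r, with made-up data, helps show why the linearity assumption matters: a perfect but curved relationship can still give r = 0.

```python
import math


def pearson_r(x, y):
    """Pearson correlation: covariation of x and y scaled by their spreads."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Perfectly linear, increasing relationship
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))         # 1.0

# y = x^2: y is fully determined by x, yet there is no linear trend
print(pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4]))  # 0.0
```

The second example is the reason to look at a scatter plot first: r near zero rules out a linear relationship, not a relationship.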

Linear Regression

  • Models a continuous outcome as a function of one or more predictors—moves beyond correlation to prediction and explanation
  • Coefficients indicate the expected change in Y for a one-unit change in X; R² measures proportion of variance explained
  • Assumes linear relationship, independence of errors, homoscedasticity, and (for inference) approximately normal residuals; check residual plots to verify assumptions
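
For a single predictor, the slope, intercept, and R² can be computed directly from sums of squares. The sketch below uses made-up data and a hypothetical function name; it covers simple (one-predictor) least squares only.

```python
def fit_line(x, y):
    """Least-squares line y = intercept + slope * x, plus R^2."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)

    slope = sxy / sxx              # expected change in y per one-unit change in x
    intercept = my - slope * mx

    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    r_squared = 1 - ss_res / ss_tot  # proportion of variance explained
    return slope, intercept, r_squared


# Made-up data
slope, intercept, r2 = fit_line([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(slope, 2), round(intercept, 2), round(r2, 2))  # 0.6 2.2 0.6
```

Read the output as: each one-unit increase in X is associated with an expected 0.6-unit increase in Y, and the line explains 60% of the variance in Y.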

Logistic Regression

  • Models binary outcomes (yes/no, disease/no disease)—essential for predicting probabilities in clinical and epidemiological research
  • Odds ratios quantify how predictors affect the likelihood of the outcome; an OR > 1 indicates increased odds
  • No normality assumption for the outcome; model fit assessed via likelihood ratio test and pseudo-R² values
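
The link between a logistic regression coefficient and an odds ratio is OR = exp(coefficient). The sketch below assumes a model of the form log-odds = intercept + beta * x; the coefficient and intercept values are hypothetical, chosen only so that the odds ratio comes out to 2.3.

```python
import math


def odds_ratio(beta):
    """Odds ratio for a one-unit increase in the predictor."""
    return math.exp(beta)


def predicted_probability(intercept, beta, x):
    """Probability of the outcome from the logistic (inverse-logit) function."""
    return 1 / (1 + math.exp(-(intercept + beta * x)))


def odds(p):
    return p / (1 - p)


beta = math.log(2.3)   # hypothetical fitted coefficient
intercept = -1.0       # hypothetical intercept

print(round(odds_ratio(beta), 2))  # 2.3

# The odds multiply by the odds ratio for each one-unit step in x
p0 = predicted_probability(intercept, beta, 0)
p1 = predicted_probability(intercept, beta, 1)
print(round(odds(p1) / odds(p0), 2))  # 2.3
```

Note that the odds, not the probability, are multiplied by 2.3; the change in probability depends on where you start.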

Compare: Linear vs. logistic regression—both model relationships between predictors and outcomes, but linear regression requires a continuous dependent variable while logistic regression handles binary outcomes. If the outcome is "survived vs. died" or "positive vs. negative test," logistic regression is required.


Categorical Data Analysis

When both your variables are categorical (nominal or ordinal), you need tests designed for frequency data rather than means.

Chi-Square Test

  • Tests association between categorical variables—compares observed frequencies to expected frequencies under independence
  • Goodness-of-fit version tests whether observed proportions match a theoretical distribution; test of independence examines relationships in contingency tables
  • Requires expected cell frequencies ≥ 5; small samples may need Fisher's exact test instead. Produces a χ² statistic and p-value
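
The observed vs. expected comparison can be sketched for a contingency table as follows. The counts are made up, the function name is hypothetical, and the sketch omits the p-value and any continuity correction; it does show how expected counts are derived and how to check the "expected ≥ 5" rule.

```python
def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom, expected counts)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # Expected count under independence: row total * column total / grand total
    expected = [[r * c / n for c in col_totals] for r in row_totals]

    stat = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df, expected


# Made-up 2x2 table: rows = exposed / not exposed, columns = disease / no disease
observed = [[30, 10],
            [20, 40]]
stat, df, expected = chi_square_independence(observed)

print(round(stat, 2), df)                              # 16.67 1
print(min(e for row in expected for e in row) >= 5)    # True
```

If the final check printed False, that would be the cue to consider Fisher's exact test instead.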

Compare: Chi-square vs. correlation—both assess relationships, but Chi-square handles categorical-categorical associations while correlation handles continuous-continuous relationships. Exam trap: Don't use correlation for variables like "smoker/non-smoker" and "disease/no disease"—that's a Chi-square question.


Quick Reference Table

Concept | Best Examples
Comparing two group means (parametric) | Independent t-test, Paired t-test
Comparing three+ group means (parametric) | One-way ANOVA, Two-way ANOVA
Non-parametric two-group comparison | Mann-Whitney U (independent), Wilcoxon signed-rank (paired)
Non-parametric three+ group comparison | Kruskal-Wallis test
Variance comparison | F-test
Continuous outcome prediction | Linear regression, Correlation analysis
Binary outcome prediction | Logistic regression
Categorical variable association | Chi-square test

Self-Check Questions

  1. A researcher wants to compare pain scores (ordinal scale) between three treatment groups with small, non-normally distributed samples. Which test should they use, and why is ANOVA inappropriate?

  2. Compare and contrast the Mann-Whitney U test and the Wilcoxon signed-rank test. What study design features determine which one to use?

  3. You're analyzing whether smoking status (yes/no) is associated with lung cancer diagnosis (yes/no). Which test is appropriate, and what assumption must be checked before proceeding?

  4. A clinical trial measures blood glucose before and after a new medication in the same 50 patients. Data appear normally distributed. Which test should be used? What would change your answer to a non-parametric alternative?

  5. An FRQ presents regression output with an odds ratio of 2.3 for a predictor variable. What type of regression produced this output, and how would you interpret this odds ratio in context?