โค๏ธโ€๐ŸฉนIntro to Public Health

Epidemiology Study Designs

Why This Matters

Understanding epidemiology study designs is fundamental to everything you'll encounter in public health. These designs help researchers evaluate whether a new vaccine works, figure out why certain communities experience higher rates of chronic disease, and justify interventions that affect millions of people.

You're being tested on your ability to recognize which design answers which type of question, understand the hierarchy of evidence, and identify the strengths and limitations of each approach. When you see a study claiming a link between an exposure and a disease, you need to ask: What design did they use? Can it actually prove what they're claiming? Don't just memorize the names. Know what each design can and cannot tell us, and when you'd choose one over another.


Observational Designs: Watching Without Intervening

Observational studies examine naturally occurring exposures and outcomes without the researcher changing anything. The key limitation across all of these is that you're observing associations, not controlling variables, so confounding is always a concern.

Cohort Studies

A cohort study starts by identifying a group of people based on their exposure status (exposed vs. unexposed) and then follows them over time to see who develops the outcome.

  • Prospective cohorts follow participants forward in real time, which reduces recall bias but can require years of follow-up and significant funding.
  • Retrospective cohorts use existing records to reconstruct what happened, making them faster and cheaper, though they depend on the quality of those records.
  • Because you're tracking who develops disease in each group, you can calculate relative risk (RR) and incidence rates directly. This makes cohort studies ideal for studying relatively common diseases and for establishing that the exposure came before the outcome (temporal sequence).
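
To make the relative risk calculation concrete, here is a minimal sketch in Python. The function name and the cohort counts are hypothetical, chosen only for illustration, and "risk" here means cumulative incidence over the follow-up period.

```python
def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Return (risk in exposed, risk in unexposed, relative risk) for a cohort."""
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    return risk_exposed, risk_unexposed, risk_exposed / risk_unexposed


# Hypothetical cohort: 1,000 exposed and 1,000 unexposed people followed over time.
# 30 exposed and 10 unexposed participants develop the disease.
risk_exp, risk_unexp, rr = relative_risk(30, 1000, 10, 1000)
print(f"Risk in exposed:   {risk_exp:.3f}")
print(f"Risk in unexposed: {risk_unexp:.3f}")
print(f"Relative risk:     {rr:.2f}")
```

With these made-up numbers the exposed group develops the disease at about three times the rate of the unexposed group. The calculation only works because a cohort study knows how many people were at risk in each group.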

Case-Control Studies

Case-control studies work in the opposite direction. You start by identifying people who already have the disease (cases) and a comparable group who don't (controls), then look backward to compare their exposure histories.

  • This design is efficient for rare diseases because you don't need to follow thousands of people for years waiting for a handful of outcomes to develop.
  • You calculate odds ratios (OR), not relative risk. This distinction matters: the OR only approximates the RR when the disease is rare (the "rare disease assumption"). For common diseases, the OR can exaggerate the true association.
  • A major weakness is recall bias: cases may remember past exposures differently than controls because they've been searching for explanations for their illness.
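
The rare disease assumption is easier to see with numbers. The sketch below uses two hypothetical populations where the full 2x2 table is known, so both the OR and the RR can be computed side by side. In a real case-control study the RR is not available, because the investigator fixes how many cases and controls are sampled. All names and counts are illustrative assumptions.

```python
def odds_ratio(a, b, c, d):
    """OR from a 2x2 table.

    a: exposed with disease      b: exposed without disease
    c: unexposed with disease    d: unexposed without disease
    """
    return (a * d) / (b * c)


def risk_ratio(a, b, c, d):
    """RR from the same table; valid only when a+b and c+d are true group sizes."""
    return (a / (a + b)) / (c / (c + d))


# Hypothetical populations with the same true RR of 2.
rare_disease = (20, 9980, 10, 9990)    # risk is 0.2% vs 0.1%
common_disease = (600, 400, 300, 700)  # risk is 60% vs 30%

for label, table in [("Rare", rare_disease), ("Common", common_disease)]:
    print(f"{label}: OR = {odds_ratio(*table):.2f}, RR = {risk_ratio(*table):.2f}")
```

In the rare-disease table the OR is almost identical to the RR. In the common-disease table the OR is 3.5 while the RR is still 2, which shows how the OR overstates the association once the outcome is no longer rare.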

Cross-Sectional Studies

A cross-sectional study is a snapshot of a population at a single point in time, measuring exposure and outcome simultaneously.

  • Because everything is measured at once, you generate prevalence data, not incidence. You can't tell whether the exposure came before the outcome or vice versa, so causality cannot be established.
  • These studies are ideal for needs assessments and hypothesis generation. For example, a cross-sectional survey might reveal that a community has unexpectedly high diabetes prevalence, prompting a cohort study to investigate why.
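
As a quick illustration of what a cross-sectional survey produces, this sketch computes point prevalence from hypothetical survey counts. The numbers are invented for the example.

```python
def prevalence(existing_cases, people_surveyed):
    """Proportion of the surveyed population that has the condition right now."""
    return existing_cases / people_surveyed


# Hypothetical survey: 2,000 adults sampled, 260 currently have diabetes.
p = prevalence(260, 2000)
print(f"Point prevalence: {p:.1%}")
```

The result counts everyone who has the condition at the time of the survey, whether they developed it last month or ten years ago. That is why it cannot stand in for incidence.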

Compare: Cohort vs. Case-Control: both are observational and can assess exposure-outcome relationships, but cohort studies follow people forward (or reconstruct their history) from exposure status, while case-control studies work backward from disease status. If a question asks about studying a rare cancer, case-control is your answer. For tracking long-term outcomes of a common exposure, choose cohort.

Ecological Studies

Ecological studies analyze group-level data rather than individual-level data. They compare disease rates across countries, states, or time periods.

  • The central weakness is the ecological fallacy: an association observed at the population level may not hold true for individuals within those populations. For example, countries with higher fat consumption may have higher cancer rates, but that doesn't mean the individuals eating more fat are the ones getting cancer.
  • These studies are useful for hypothesis generation and for examining exposures that operate at the population level, like the effects of water fluoridation policies or air quality regulations.

Experimental Designs: Testing Interventions

Experimental studies involve the researcher actively manipulating a variable to test its effect. Randomization is the key feature that controls for confounding and allows causal inference.

Randomized Controlled Trials (RCTs)

RCTs are the gold standard for establishing causality. Participants are randomly assigned to either the intervention group or the control group, which makes the groups comparable at baseline and isolates the intervention's effect.

  • Blinding (also called masking) reduces bias. In a single-blind study, participants don't know which group they're in. In a double-blind study, neither participants nor the researchers assessing them know, which also protects against biased outcome assessment.
  • RCTs allow you to calculate effect size and number needed to treat (NNT), which tells you how many people need to receive the intervention for one additional person to benefit. These metrics help determine whether an intervention is worth implementing at scale.
  • The main limitations are cost, time, and ethics. You can't randomly assign people to harmful exposures (e.g., you can't make people smoke to study lung cancer), which is why observational designs remain essential.
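
NNT is usually computed as 1 divided by the absolute risk reduction (the control group's event rate minus the intervention group's event rate), rounded up to a whole person. Here is a minimal sketch with hypothetical trial counts. The function name is an assumption for illustration.

```python
import math
from fractions import Fraction


def number_needed_to_treat(control_events, control_n, treated_events, treated_n):
    """NNT = 1 / absolute risk reduction, rounded up to a whole person."""
    # Fractions keep the arithmetic exact, so rounding up is not thrown off
    # by floating-point error.
    arr = Fraction(control_events, control_n) - Fraction(treated_events, treated_n)
    if arr <= 0:
        raise ValueError("No risk reduction: NNT is undefined for this comparison.")
    return math.ceil(1 / arr)


# Hypothetical trial: 200 of 1,000 controls and 150 of 1,000 treated
# participants experience the outcome (20% vs 15%).
nnt = number_needed_to_treat(200, 1000, 150, 1000)
print(f"NNT: {nnt}")
```

With a 5 percentage point risk reduction, 20 people would need to receive the intervention to prevent one additional outcome. A smaller NNT means a more efficient intervention.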

Community and Field Trials

These are experimental designs that differ from a standard clinical RCT in who is randomized or where the intervention is tested.

  • Community trials randomize entire groups (towns, schools, workplaces) rather than individuals. This is necessary when interventions can't be delivered individually, like adding fluoride to a water supply or implementing a school-wide nutrition program.
  • Field trials test preventive interventions in healthy populations in real-world settings, such as vaccine trials.
  • Both allow causal inference, but their results may have limited generalizability if the study conditions don't reflect typical real-world settings.
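
Randomizing whole groups can be sketched in a few lines. This is a simplified illustration, assuming a hypothetical list of schools and a simple 1:1 split. Real community trials often add stratification or matching, which is not shown here.

```python
import random


def randomize_clusters(clusters, seed=None):
    """Randomly split whole clusters into intervention and control arms."""
    rng = random.Random(seed)  # a seed makes the allocation reproducible
    shuffled = list(clusters)  # copy so the caller's list is left unchanged
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical community trial of a school-wide nutrition program.
schools = ["School A", "School B", "School C", "School D", "School E", "School F"]
intervention, control = randomize_clusters(schools, seed=42)
print("Intervention:", intervention)
print("Control:     ", control)
```

Every student in a school receives whatever that school was assigned, so the school, not the student, is the unit of randomization.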

Compare: RCTs vs. Cohort Studies: both can follow participants over time, but RCTs randomize exposure while cohort studies observe natural exposure. RCTs establish causation; cohort studies establish strong associations with temporal sequence. This distinction is the foundation of the evidence hierarchy.


Descriptive and Preliminary Designs

These designs describe patterns, generate hypotheses, and identify emerging health concerns. They sit at the base of the evidence pyramid but are essential for recognizing new threats and guiding future research.

Case Reports and Case Series

  • A case report documents a single unusual patient presentation in detail. A case series compiles multiple similar cases.
  • These serve as the first alert system for new diseases and adverse effects. HIV/AIDS, vaping-related lung injury (EVALI), and numerous drug side effects were all first identified through case reports and case series.
  • There's no comparison group, which means you can't calculate risk or draw conclusions about causation. Their role is purely descriptive, but they're essential for raising the alarm and generating hypotheses that more rigorous studies can test.

Longitudinal Studies

"Longitudinal" describes any study that takes repeated measurements over time. This includes both observational designs (like cohort studies) and experimental designs (like clinical trials).

  • The defining advantage is the ability to establish temporal sequence between exposure and outcome, which strengthens causal arguments.
  • The main threat is attrition bias: participants drop out over time, and if those who leave differ systematically from those who stay (e.g., sicker people dropping out), the results can be skewed.

Compare: Cross-Sectional vs. Longitudinal: both can be observational, but cross-sectional captures one moment while longitudinal tracks changes over time. Cross-sectional gives you prevalence; longitudinal gives you incidence and temporal relationships. If asked about disease trends or progression, longitudinal is the answer.


Evidence Synthesis: Combining What We Know

These approaches aggregate findings from multiple studies to draw stronger conclusions. They represent the highest level of evidence when done rigorously.

Systematic Reviews

A systematic review is a comprehensive, structured synthesis of the literature on a specific research question. It follows an explicit, ideally pre-registered protocol to identify, evaluate, and summarize all relevant studies.

  • Using predetermined search strategies and inclusion/exclusion criteria reduces selection bias and makes the process transparent and reproducible.
  • The synthesis is qualitative, meaning it identifies patterns, gaps, and inconsistencies across the evidence base without necessarily combining numbers statistically.

Meta-Analyses

A meta-analysis takes the systematic review process a step further by statistically pooling data from multiple studies to calculate an overall effect estimate.

  • By aggregating sample sizes, meta-analyses increase statistical power, making it possible to detect effects that individual studies were too small to find.
  • The biggest threat is publication bias: if studies with negative or null results never get published, the pooled estimate will overstate the true effect. Researchers use tools like funnel plots to check for this.
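
One common way to pool results is the fixed-effect inverse-variance method, in which each study is weighted by 1 divided by its squared standard error. The sketch below assumes that method and uses three hypothetical studies reporting log relative risks. Random-effects models are a frequent alternative and are not shown.

```python
import math


def pool_fixed_effect(studies):
    """Inverse-variance pooled estimate and its standard error.

    studies: list of (effect_estimate, standard_error) pairs on the same scale.
    """
    weights = [1 / se ** 2 for _, se in studies]
    pooled = sum(w * est for w, (est, _) in zip(weights, studies)) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))
    return pooled, pooled_se


def ci95(estimate, se):
    """Approximate 95% confidence interval using the normal distribution."""
    return estimate - 1.96 * se, estimate + 1.96 * se


# Hypothetical studies: (log relative risk, standard error).
studies = [(-0.30, 0.20), (-0.20, 0.15), (-0.25, 0.18)]

for est, se in studies:
    low, high = ci95(est, se)
    print(f"Study:  {est:.2f} (95% CI {low:.2f} to {high:.2f})")

pooled, pooled_se = pool_fixed_effect(studies)
low, high = ci95(pooled, pooled_se)
print(f"Pooled: {pooled:.2f} (95% CI {low:.2f} to {high:.2f})")
```

In this made-up example each study's confidence interval includes zero, but the pooled interval does not. That is the gain in statistical power that pooling provides.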

Compare: Systematic Review vs. Meta-Analysis: a systematic review is the broader process of identifying and synthesizing literature. A meta-analysis is the statistical technique of combining data quantitatively. A meta-analysis should be built on a systematic review, but not all systematic reviews include a meta-analysis (sometimes the studies are too different to pool meaningfully).


Quick Reference Table

Concept | Best Examples
Establishing causality | RCTs, Experimental studies
Studying rare diseases | Case-control studies
Measuring incidence/relative risk | Cohort studies
Measuring prevalence | Cross-sectional studies
Generating hypotheses | Ecological studies, Case reports, Cross-sectional studies
Detecting emerging threats | Case reports, Case series
Highest level of evidence | Meta-analyses, Systematic reviews
Temporal sequence without intervention | Longitudinal studies, Cohort studies

Self-Check Questions

  1. A researcher wants to study risk factors for a rare childhood cancer. Which study design would be most efficient, and why can't they calculate relative risk directly?

  2. Compare and contrast prospective cohort studies and RCTs. What do they share, and what key feature distinguishes their ability to establish causation?

  3. A cross-sectional survey finds that people who exercise have lower rates of depression. Why can't we conclude that exercise prevents depression from this design alone?

  4. You're reviewing a meta-analysis that finds a strong protective effect of a supplement. What type of bias should you be concerned about, and how might it affect the findings?

  5. An outbreak of a mysterious respiratory illness is affecting healthcare workers. Which study design would you recommend as the first step, and which would you recommend to identify risk factors once you have enough cases?
