🦠Epidemiology

Epidemiological Study Designs

Why This Matters

Understanding study designs is foundational to everything you'll encounter in epidemiology—you can't interpret a research finding, evaluate an intervention, or critique a public health policy without knowing how the evidence was generated. Every study design comes with trade-offs between internal validity (how confident we can be about causation), external validity (how well findings generalize), and feasibility (time, cost, and ethical constraints). These trade-offs appear constantly on exams, especially when you're asked to recommend an appropriate design for a given research question.

You're being tested on your ability to match study designs to research scenarios, identify potential biases, and interpret the appropriate measures of association (relative risk, odds ratios) and of frequency (incidence, prevalence). Don't just memorize definitions: know what each design can and cannot tell you about causation, which biases threaten each design, and when one approach is preferred over another.


Experimental Designs: Testing Interventions

These designs involve researcher-controlled manipulation of exposures or interventions. Because investigators assign participants to groups, experimental designs offer the strongest evidence for causation—but they come with significant ethical and practical constraints.

Randomized Controlled Trials (RCTs)

  • Gold standard for causal inference: random assignment tends to balance both known and unknown confounders across groups, so any remaining imbalance is due to chance rather than systematic selection
  • Measures efficacy directly through comparison of outcomes between intervention and control groups, allowing calculation of absolute and relative risk reductions
  • Ethical limitations restrict use—cannot randomize participants to harmful exposures (smoking, toxins), making RCTs unsuitable for many etiologic questions
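
The risk reductions mentioned above can be worked through with a short sketch. The trial counts here are hypothetical, chosen only to make the arithmetic easy to follow, and the function name is illustrative.

```python
def risk_reductions(events_treated, n_treated, events_control, n_control):
    """Return the risk in each arm plus absolute and relative risk reduction."""
    risk_treated = events_treated / n_treated
    risk_control = events_control / n_control
    arr = risk_control - risk_treated   # absolute risk reduction
    rrr = arr / risk_control            # relative risk reduction
    return risk_treated, risk_control, arr, rrr


# Hypothetical trial: 30 of 1,000 events with the intervention, 50 of 1,000 in control.
risk_t, risk_c, arr, rrr = risk_reductions(30, 1000, 50, 1000)
print(f"Risk (intervention) = {risk_t:.3f}, risk (control) = {risk_c:.3f}")
print(f"ARR = {arr:.3f}, RRR = {rrr:.0%}")
```

In this made-up example the intervention lowers risk from 5% to 3%: an absolute reduction of 2 percentage points and a relative reduction of 40%. The same trial can sound very different depending on which of the two numbers is reported.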

Quasi-Experimental Studies

  • No randomization—uses pre-existing groups or natural experiments when random assignment is infeasible or unethical
  • Real-world applicability makes these designs essential for evaluating policy changes, community interventions, and program implementations
  • Confounding is the major threat—without randomization, systematic differences between groups may explain observed effects rather than the intervention itself

Compare: RCTs vs. Quasi-experimental studies. Both test interventions, but RCTs control confounding through randomization, while quasi-experiments must address it through design or analysis. If an exam asks which design provides stronger causal evidence, RCT wins; if it asks what's practical for evaluating a new public health policy, quasi-experimental is your answer.


Observational Analytic Designs: Following Exposure to Outcome

These designs observe natural variation in exposures without manipulation. The key distinction is directionality—do you start with exposure status and follow forward, or start with disease status and look backward?

Cohort Studies

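  • Follows exposed and unexposed groups forward from exposure to outcome: because exposure is recorded before the outcome occurs, the temporal sequence is clear, which is essential for causal inference. Cohorts may be prospective or retrospective (assembled from existing records)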
  • Calculates incidence and relative risk directly—the only observational design that can measure true disease rates in exposed vs. unexposed populations
  • Resource-intensive and inefficient for rare diseases—may require following thousands of participants for years to observe enough outcomes
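
As a minimal sketch of how incidence and relative risk come straight out of cohort data, consider the calculation below. The counts are invented for illustration, and it assumes everyone is followed for the same period, so simple cumulative incidence applies.

```python
def cohort_relative_risk(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """Cumulative incidence in each group and their ratio (relative risk)."""
    incidence_exposed = cases_exposed / n_exposed
    incidence_unexposed = cases_unexposed / n_unexposed
    relative_risk = incidence_exposed / incidence_unexposed
    return incidence_exposed, incidence_unexposed, relative_risk


# Hypothetical cohort: 40 cases among 2,000 exposed, 10 cases among 2,000 unexposed.
inc_exp, inc_unexp, rr = cohort_relative_risk(40, 2000, 10, 2000)
print(f"Incidence exposed = {inc_exp:.3f}, unexposed = {inc_unexp:.3f}, RR = {rr:.1f}")
```

Here the exposed group develops disease at four times the rate of the unexposed group. The calculation only works because the cohort design supplies the denominators: the number of people at risk in each exposure group.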

Case-Control Studies

  • Works backward from disease—compares exposure histories between cases (with disease) and controls (without disease)
  • Efficient for rare outcomes—can study diseases with incidence of 1 in 100,000 without needing massive sample sizes
  • Calculates odds ratios, not relative risk—and is vulnerable to recall bias (cases remember exposures differently) and selection bias (controls may not represent source population)
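
The odds ratio can be sketched the same way. Because investigators choose how many cases and controls to enroll, the data give no incidence, only the odds of exposure in each group. The counts below are hypothetical.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    odds_cases = exposed_cases / unexposed_cases
    odds_controls = exposed_controls / unexposed_controls
    return odds_cases / odds_controls


# Hypothetical study: 100 cases (60 exposed) and 100 controls (30 exposed).
or_estimate = odds_ratio(60, 40, 30, 70)
print(f"Odds ratio = {or_estimate:.2f}")
```

In this example the odds of exposure are 3.5 times higher among cases than among controls.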

Nested Case-Control Studies

  • Hybrid design embedded within a cohort—cases arise from the cohort, controls sampled from same population at risk
  • Reduces bias inherent to traditional case-control—exposure data collected prospectively before disease onset eliminates recall bias
  • Cost-efficient for expensive biomarker analyses—only need to process samples from cases and selected controls rather than entire cohort

Compare: Cohort vs. Case-control—cohort studies follow exposure forward and calculate relative risk; case-control studies work backward from disease and calculate odds ratios. For rare diseases, case-control is practical; for rare exposures, cohort is preferred. FRQs often ask you to justify design choice based on disease rarity.


Descriptive and Hypothesis-Generating Designs

These designs describe patterns and generate hypotheses but cannot establish causation. They're the starting point of epidemiologic investigation, not the endpoint.

Cross-Sectional Studies

  • Snapshot design measures exposure and outcome simultaneously—useful for estimating prevalence and identifying associations at a single point in time
  • Cannot establish temporality—because exposure and outcome are measured together, you cannot determine which came first
  • Efficient for planning and surveillance—commonly used in national health surveys (NHANES, BRFSS) to assess population health status
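
A small sketch of prevalence estimation from survey-style data follows. The records and field names (`smoker`, `asthma`) are made up for illustration and are not taken from NHANES or BRFSS.

```python
# Hypothetical survey: each record captures exposure and outcome at the same moment.
survey = (
    [{"smoker": True, "asthma": True}] * 2
    + [{"smoker": True, "asthma": False}] * 2
    + [{"smoker": False, "asthma": True}] * 1
    + [{"smoker": False, "asthma": False}] * 5
)


def prevalence(records, condition):
    """Proportion of records in which the condition is present."""
    return sum(1 for r in records if r[condition]) / len(records)


overall = prevalence(survey, "asthma")
among_smokers = prevalence([r for r in survey if r["smoker"]], "asthma")
among_nonsmokers = prevalence([r for r in survey if not r["smoker"]], "asthma")

print(f"Overall prevalence = {overall:.2f}")
print(f"Smokers = {among_smokers:.2f}, non-smokers = {among_nonsmokers:.2f}")
```

The higher prevalence among smokers in this toy data is an association only. Since both variables were recorded at one point in time, nothing here shows which came first.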

Ecological Studies

  • Unit of analysis is populations, not individuals—compares disease rates across countries, regions, or time periods using aggregate data
  • Generates hypotheses about environmental or policy factors—useful for identifying patterns that warrant individual-level investigation
  • Ecological fallacy is the critical limitation—associations observed at group level may not hold for individuals within those groups
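
The ecological fallacy is easier to see with numbers. The toy data below are deliberately constructed (not real) so that the association across region averages runs in the opposite direction to the association among individuals within each region.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical individual-level data: (exposure values, outcome values) per region.
regions = {
    "A": ([1, 2, 3], [3, 2, 1]),
    "B": ([4, 5, 6], [6, 5, 4]),
    "C": ([7, 8, 9], [9, 8, 7]),
}

# Ecological view: one average exposure and one average outcome per region.
mean_exposure = [sum(x) / len(x) for x, _ in regions.values()]
mean_outcome = [sum(y) / len(y) for _, y in regions.values()]
group_r = pearson(mean_exposure, mean_outcome)

# Individual view: correlation within each region.
within_r = {name: pearson(x, y) for name, (x, y) in regions.items()}

print(f"Group-level correlation: {group_r:+.2f}")
for name, r in within_r.items():
    print(f"Within region {name}: {r:+.2f}")
```

Across regions, higher average exposure goes with higher average outcome, yet within every region the individuals with higher exposure have lower outcomes. An analyst who only sees the aggregate data has no way to detect this.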

Case Series and Case Reports

  • Detailed documentation of individual cases—often the first signal of emerging diseases, adverse drug reactions, or unusual clinical presentations
  • No comparison group—purely descriptive, providing clinical detail but no measure of association or risk
  • Hypothesis-generating only—essential for identifying new conditions (early AIDS cases, vaping-related lung injury) but requires analytic studies to confirm patterns

Compare: Cross-sectional vs. Ecological studies—both are descriptive, but cross-sectional collects individual-level data while ecological uses population-level data. Cross-sectional can identify individual associations; ecological cannot make individual-level inferences due to ecological fallacy.


Longitudinal Designs: Tracking Change Over Time

These designs follow participants over extended periods to capture temporal relationships and disease progression. The defining feature is repeated observation of the same individuals.

Longitudinal Studies

  • Repeated measurements on same individuals—captures within-person change, disease natural history, and long-term exposure effects
  • Can be observational or experimental—the term describes the temporal structure, not the level of researcher control
  • Attrition threatens validity—loss to follow-up can introduce bias if dropouts differ systematically from those who remain

Compare: Longitudinal vs. Cross-sectional—longitudinal follows individuals over time and can establish temporal sequence; cross-sectional captures one moment and cannot. If an exam scenario asks about tracking disease progression or determining whether exposure precedes outcome, longitudinal is required.


Evidence Synthesis: Combining Studies

These methods don't generate new data but systematically aggregate existing evidence. They sit at the top of the evidence hierarchy when done properly.

Systematic Reviews and Meta-Analyses

  • Systematic reviews use structured protocols—predefined search strategies, inclusion criteria, and quality assessment minimize selection bias in evidence synthesis
  • Meta-analyses pool data statistically—calculate overall effect sizes with increased precision and power beyond any single study
  • Publication bias threatens validity—studies with positive results are more likely to be published, potentially skewing pooled estimates
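
To illustrate how pooling increases precision, here is a minimal sketch of one common approach: fixed-effect, inverse-variance weighting. The choice of a fixed-effect model is an assumption made for simplicity, and the log relative risks and standard errors are invented. Real meta-analyses often use random-effects models and dedicated software.

```python
from math import exp, sqrt


def fixed_effect_pool(estimates, standard_errors):
    """Inverse-variance weighted average of study estimates and its standard error."""
    weights = [1 / se ** 2 for se in standard_errors]
    pooled = sum(w * est for w, est in zip(weights, estimates)) / sum(weights)
    pooled_se = sqrt(1 / sum(weights))
    return pooled, pooled_se


# Hypothetical studies: log relative risks with their standard errors.
log_rr = [0.2, 0.4, 0.3]
se = [0.1, 0.2, 0.1]

pooled_log_rr, pooled_se = fixed_effect_pool(log_rr, se)
print(f"Pooled RR = {exp(pooled_log_rr):.2f}")
print(f"Pooled SE = {pooled_se:.3f} (smallest single-study SE = {min(se):.3f})")
```

More precise studies (smaller standard errors) receive more weight, and the pooled standard error comes out smaller than that of any single study. Note that this does nothing to correct for publication bias: if unpublished null results are missing from the input, the pooled estimate inherits the skew.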

Quick Reference Table

Concept | Best Examples
Strongest causal evidence | RCTs, Cohort studies
Efficient for rare diseases | Case-control, Nested case-control
Prevalence estimation | Cross-sectional studies
Hypothesis generation | Ecological studies, Case series
Policy/intervention evaluation | Quasi-experimental, RCTs
Long-term exposure effects | Longitudinal, Cohort studies
Evidence synthesis | Meta-analyses, Systematic reviews
First signal of emerging diseases | Case reports, Case series

Self-Check Questions

  1. A researcher wants to study risk factors for a rare childhood cancer. Which study design is most efficient, and what measure of association would be calculated?

  2. Compare cohort and case-control studies: What can cohort studies calculate that case-control studies cannot, and why?

  3. An FRQ describes a study comparing heart disease rates between countries with different dietary fat consumption. What type of study is this, and what major limitation threatens its conclusions?

  4. Why are RCTs considered the gold standard for causal inference, yet inappropriate for studying whether smoking causes lung cancer?

  5. A nested case-control study is conducted within an ongoing cohort. What advantages does this hybrid design offer over a traditional case-control study conducted in the general population?