Measures of Association in Epidemiology

Why This Matters

When you're analyzing epidemiological data, you need more than raw numbers. You need tools that reveal whether an exposure actually matters and how much it matters. Measures of association are the mathematical backbone of every study you'll encounter, from cohort studies to clinical trials. You're being tested on your ability to select the right measure for a given study design, interpret what the values mean, and explain the public health implications of your findings.

These measures fall into two fundamental categories: relative measures (ratios that compare groups) and absolute measures (differences that quantify actual impact). This distinction is crucial because a large relative risk might have minimal public health significance if the baseline risk is tiny, while a modest risk ratio could represent thousands of preventable cases. Don't just memorize formulas. Know when each measure is appropriate, what study designs require which measures, and how to translate numbers into actionable public health insights.


Relative Measures: Comparing Groups with Ratios

Relative measures express association as a ratio, telling you how many times more (or less) likely an outcome is in one group compared to another. They're intuitive for communicating the strength of an association but can overstate importance when baseline risks are low.

Risk Ratio (Relative Risk)

The risk ratio is the gold standard for cohort studies. It directly compares the probability of disease in exposed versus unexposed groups.

Formula: $RR = \frac{\text{Incidence in exposed}}{\text{Incidence in unexposed}}$

This requires knowing incidence in both groups, which means you need to follow people forward over time (prospective data).

Interpretation:

  • $RR > 1$: the exposure is associated with increased risk
  • $RR < 1$: the exposure is associated with decreased risk (protective)
  • $RR = 1$: no association between exposure and outcome

For example, if the risk of lung cancer is 20 per 1,000 among smokers and 1 per 1,000 among non-smokers, then $RR = 20$. Smokers are 20 times as likely to develop lung cancer.
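The smoking example can be checked with a few lines of Python. This is a minimal sketch: the function name `risk_ratio` and its argument names are illustrative choices, and the counts simply restate the 20 per 1,000 and 1 per 1,000 figures.

```python
def risk_ratio(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """Risk ratio: cumulative incidence in exposed / cumulative incidence in unexposed."""
    risk_exposed = cases_exposed / n_exposed
    risk_unexposed = cases_unexposed / n_unexposed
    return risk_exposed / risk_unexposed


# Smoking example from the text: 20 per 1,000 vs. 1 per 1,000.
rr = risk_ratio(20, 1000, 1, 1000)
print(round(rr, 2))  # 20.0
```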

Odds Ratio

The odds ratio is essential for case-control studies. It's the only valid ratio measure when you're sampling based on outcome (cases vs. controls) rather than following exposed and unexposed groups forward.

Formula from a 2×2 table: $OR = \frac{a \times d}{b \times c}$

where $a$ = exposed cases, $b$ = exposed non-cases, $c$ = unexposed cases, $d$ = unexposed non-cases.

The OR approximates the risk ratio when the outcome is rare (roughly <10% incidence). This is called the rare disease assumption. When the disease is common, the OR lies farther from 1 than the RR, so it exaggerates the association if you read it as a risk ratio.
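To see the rare disease assumption in action, the sketch below computes both measures from the same 2×2 layout. Both tables are hypothetical and treated as full cohort data so that the RR can be computed for comparison; the function names are illustrative.

```python
def odds_ratio(a, b, c, d):
    """Cross-product odds ratio; a, b, c, d follow the 2x2 layout in the text."""
    return (a * d) / (b * c)


def risk_ratio_2x2(a, b, c, d):
    """Risk ratio from the same table (only valid with cohort-style sampling)."""
    return (a / (a + b)) / (c / (c + d))


# Hypothetical rare outcome: 2% risk in exposed, 1% in unexposed.
rare = (20, 980, 10, 990)
# Hypothetical common outcome: 60% risk in exposed, 30% in unexposed.
common = (600, 400, 300, 700)

print(round(odds_ratio(*rare), 2), round(risk_ratio_2x2(*rare), 2))      # 2.02 2.0
print(round(odds_ratio(*common), 2), round(risk_ratio_2x2(*common), 2))  # 3.5 2.0
```

With the rare outcome the OR and RR nearly coincide; with the common outcome the OR (3.5) is well above the RR (2.0).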

Rate Ratio

The rate ratio accounts for variable follow-up time. Instead of comparing cumulative incidence (proportion who get sick), it compares incidence rates (events per person-time).

This is preferred in cohort studies where participants have unequal observation periods. Someone who drops out at year 2 of a 5-year study contributes only 2 person-years, not 5. The rate ratio handles this naturally.

Interpretation mirrors the risk ratio: a rate ratio > 1 means a higher event rate in the exposed group, and a rate ratio < 1 means a lower rate.
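A short sketch of the person-time calculation follows. The follow-up times and event counts are made up for illustration, and the function names are not from any particular library.

```python
def incidence_rate(events, person_time):
    """Events per unit of person-time (here, per person-year)."""
    return events / person_time


def rate_ratio(events_exp, pt_exp, events_unexp, pt_unexp):
    """Incidence rate in exposed divided by incidence rate in unexposed."""
    return incidence_rate(events_exp, pt_exp) / incidence_rate(events_unexp, pt_unexp)


# Person-time is the sum of each participant's own follow-up.
# Hypothetical 5-year study: one participant drops out at year 2.
followup_years = [5, 5, 2, 5]
print(sum(followup_years))  # 17 person-years, not 20

# Hypothetical cohort totals: 30 events over 1,500 person-years (exposed)
# vs. 20 events over 4,000 person-years (unexposed).
irr = rate_ratio(30, 1500, 20, 4000)
print(round(irr, 2))  # 4.0
```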

Hazard Ratio

The hazard ratio is the measure of choice for survival analysis. It compares the instantaneous risk (hazard) of an event at any given time point between groups.

It's derived from Cox proportional hazards models, which assume the ratio of hazards between groups remains constant over time. This is called the proportional hazards assumption, and if it's violated, the HR can be misleading.

An $HR = 2.0$ means the exposed group experiences the event at twice the rate of the unexposed group at any given moment during follow-up. You'll see this frequently in clinical trials reporting time-to-event outcomes like death or disease recurrence.
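One way to build intuition for a constant hazard ratio: under the proportional hazards assumption, the exposed group's survival probability at any time equals the unexposed group's survival probability raised to the power of the HR. The sketch below applies that identity only; it does not estimate an HR from data, and the function name and the 90% baseline survival are illustrative assumptions.

```python
def exposed_survival(unexposed_survival, hazard_ratio):
    """Survival in the exposed group implied by proportional hazards:
    S_exposed(t) = S_unexposed(t) ** HR.
    """
    return unexposed_survival ** hazard_ratio


# Hypothetical: 90% of the unexposed group is event-free at some time t, HR = 2.0.
s_exposed = exposed_survival(0.90, 2.0)
print(round(s_exposed, 2))  # 0.81
```

Note that doubling the hazard does not double the cumulative risk here: the proportion with the event goes from 10% to 19%.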

Compare: Risk Ratio vs. Odds Ratio: both express relative association, but RR requires prospective data (cohort studies) while OR works with case-control designs. On exams, if you see "case-control study," the answer is almost always odds ratio.


Absolute Measures: Quantifying Real-World Impact

Absolute measures express association as a difference, answering the question: how many additional cases (or prevented cases) result from the exposure? These are critical for public health planning because they reflect actual disease burden rather than just the ratio between groups.

Risk Difference

The risk difference is the simplest absolute measure. It subtracts the risk in the unexposed from the risk in the exposed:

$RD = R_{\text{exposed}} - R_{\text{unexposed}}$

It's directly interpretable: a risk difference of 0.05 means 5 additional cases per 100 people exposed. A positive value means excess risk from exposure; a negative value indicates a protective effect.

Incidence Rate Difference

This is the absolute difference between incidence rates (per person-time) rather than cumulative risks. It quantifies excess events per unit of person-time and is useful when comparing populations with different follow-up durations.

The incidence rate difference complements the rate ratio. The rate ratio tells you how many times greater the rate is; the incidence rate difference tells you how many more events actually occur per person-time unit.
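As a minimal sketch, the incidence rate difference can be computed from events and person-time in each group. The counts are hypothetical and the function name is illustrative.

```python
def incidence_rate_difference(events_exp, pt_exp, events_unexp, pt_unexp):
    """Excess events per unit of person-time in the exposed group."""
    return events_exp / pt_exp - events_unexp / pt_unexp


# Hypothetical: 30 events over 1,500 person-years (exposed)
# vs. 20 events over 4,000 person-years (unexposed).
ird = incidence_rate_difference(30, 1500, 20, 4000)

# Rates are usually reported per 1,000 (or 100,000) person-years.
print(round(ird * 1000, 1))  # 15.0 excess events per 1,000 person-years
```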

Compare: Risk Difference vs. Risk Ratio: a risk ratio of 2.0 sounds dramatic, but if baseline risk is 1 in 10,000, the risk difference is only 0.0001 (1 extra case per 10,000 people). Exam questions often ask you to explain why both measures matter for interpreting study results. The ratio tells you strength of association; the difference tells you public health impact.
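The contrast is easy to reproduce. The sketch below holds the risk ratio at 2.0 and varies only the baseline risk; the 1-in-10,000 baseline comes from the comparison above, while the 10% baseline is a hypothetical added for contrast.

```python
def risk_difference_from_rr(baseline_risk, rr):
    """Risk difference implied by a baseline (unexposed) risk and a risk ratio."""
    exposed_risk = baseline_risk * rr
    return exposed_risk - baseline_risk


rare_rd = risk_difference_from_rr(1 / 10_000, 2.0)
common_rd = risk_difference_from_rr(0.10, 2.0)

print(round(rare_rd * 10_000, 2))    # 1.0 extra case per 10,000 people
print(round(common_rd * 10_000, 2))  # 1000.0 extra cases per 10,000 people
```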


Attributable Measures: Assigning Responsibility to Exposures

Attributable measures answer a causal question: how much of the disease can we attribute to this exposure? They're essential for prioritizing interventions and estimating the potential impact of prevention programs.

Attributable Risk

Also called attributable risk percent or attributable fraction among the exposed, this tells you what proportion of disease in the exposed group is due to the exposure itself.

Formula: $AR\% = \frac{RR - 1}{RR} \times 100$

If $RR = 5$, then $AR\% = \frac{5-1}{5} \times 100 = 80\%$. That means 80% of cases among the exposed would theoretically disappear if the exposure were eliminated.

This measure assumes a causal relationship. It's only meaningful if the association is truly causal, not just a statistical correlation driven by confounding.
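The formula translates directly into code. In this sketch the function name is illustrative, and the result is only interpretable under the causal assumption just described.

```python
def attributable_risk_percent(rr):
    """Attributable fraction among the exposed, as a percentage: (RR - 1) / RR * 100."""
    return (rr - 1) / rr * 100


print(round(attributable_risk_percent(5), 1))   # 80.0, the worked example above
print(round(attributable_risk_percent(20), 1))  # 95.0, using the smoking RR of 20
```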

Population Attributable Risk

Population attributable risk (PAR) extends the concept to the entire population, accounting for how common the exposure is, not just how dangerous it is.

Formula: $PAR\% = \frac{P_e(RR - 1)}{1 + P_e(RR - 1)} \times 100$

where $P_e$ is the prevalence of exposure in the population.

This is where public health priorities get interesting. A moderately risky but very common exposure (like physical inactivity) may cause more total disease in the population than a highly risky but rare exposure (like occupational asbestos exposure). PAR captures that reality.
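A quick calculation shows the effect of exposure prevalence. The prevalence and RR values below are hypothetical placeholders chosen to illustrate the pattern, not estimates for inactivity or asbestos.

```python
def par_percent(exposure_prevalence, rr):
    """Population attributable risk percent: Pe(RR - 1) / (1 + Pe(RR - 1)) * 100."""
    excess = exposure_prevalence * (rr - 1)
    return excess / (1 + excess) * 100


# Hypothetical common, moderately risky exposure: 50% exposed, RR = 1.5.
print(round(par_percent(0.50, 1.5), 1))  # 20.0

# Hypothetical rare, highly risky exposure: 1% exposed, RR = 10.
print(round(par_percent(0.01, 10), 1))   # 8.3
```

Despite the much larger RR, the rare exposure accounts for a smaller share of disease in the population.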

Compare: Attributable Risk vs. Population Attributable Risk: AR tells you the impact among the exposed; PAR tells you the impact in the whole population. If a question asks about "public health significance" or "population-level impact," PAR is your answer.


Clinical Application Measures: Translating Research to Practice

These measures bridge the gap between statistical findings and clinical decision-making, expressing results in terms that directly inform treatment choices.

Number Needed to Treat (NNT)

NNT is the clinical translation of risk difference:

$NNT = \frac{1}{\text{Absolute Risk Reduction}}$

where Absolute Risk Reduction (ARR) is the risk difference between the control group and the treatment group.

Lower NNT is better. An NNT of 10 means you need to treat 10 patients to prevent one adverse event. An NNT of 100 means the intervention is far less efficient.

Interpretation is always context-dependent. An NNT of 50 might be perfectly acceptable for preventing death but unacceptable for preventing a mild headache. The severity of the outcome and the cost/risk of treatment both matter.
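The sketch below computes NNT from the risks in the control and treatment groups. The risks are hypothetical, chosen to reproduce the NNTs of 10 and 100 mentioned above; in practice, NNT is usually rounded up to a whole number of patients.

```python
def number_needed_to_treat(control_risk, treated_risk):
    """NNT = 1 / ARR, where ARR = control risk - treated risk."""
    arr = control_risk - treated_risk
    if arr <= 0:
        raise ValueError("No absolute risk reduction; NNT is undefined.")
    return 1 / arr


# Hypothetical high-risk population: risk falls from 20% to 10%.
nnt_high_risk = number_needed_to_treat(0.20, 0.10)
# Hypothetical low-risk population: risk falls from 2% to 1%.
nnt_low_risk = number_needed_to_treat(0.02, 0.01)

print(round(nnt_high_risk, 1))  # 10.0
print(round(nnt_low_risk, 1))   # 100.0
```

Both scenarios halve the risk, yet the NNT differs tenfold because the baseline risk differs.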

Correlation Coefficient

The correlation coefficient measures the strength and direction of a linear association between two continuous variables. It ranges from $-1$ (perfect negative correlation) to $+1$ (perfect positive correlation), with $r = 0$ indicating no linear relationship.

The most common version is the Pearson correlation coefficient ($r$). A key limitation: $r = 0$ only rules out a linear relationship. A strong nonlinear association (like a U-shaped curve) could still exist.
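The U-shaped case is easy to demonstrate. This sketch implements Pearson's $r$ from its definition using only the standard library; the data are toy values chosen so the arithmetic is exact.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


xs = [-3, -2, -1, 0, 1, 2, 3]
linear = [2 * x + 1 for x in xs]  # perfectly linear relationship
u_shaped = [x ** 2 for x in xs]   # perfectly determined by x, but not linear

print(pearson_r(xs, linear))    # 1.0
print(pearson_r(xs, u_shaped))  # 0.0
```

Here $y = x^2$ depends entirely on $x$, yet $r = 0$ because the relationship is symmetric rather than linear.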

Correlation does not imply causation. Ecological studies that rely on correlation are particularly vulnerable to confounding and the ecological fallacy (assuming group-level associations apply to individuals).

Compare: NNT vs. Risk Difference: they contain the same information expressed differently. Risk difference is how researchers report results; NNT is how clinicians explain treatment benefits to patients. If the ARR is 0.05, the NNT is $\frac{1}{0.05} = 20$.


Quick Reference Table

| Concept | Best Examples |
| --- | --- |
| Relative measures (ratios) | Risk Ratio, Odds Ratio, Rate Ratio, Hazard Ratio |
| Absolute measures (differences) | Risk Difference, Incidence Rate Difference |
| Attributable measures | Attributable Risk, Population Attributable Risk |
| Case-control study measures | Odds Ratio |
| Survival analysis measures | Hazard Ratio |
| Clinical decision measures | Number Needed to Treat |
| Continuous variable association | Correlation Coefficient |
| Public health impact measures | Population Attributable Risk, Risk Difference |

Self-Check Questions

  1. A case-control study examines the association between smoking and lung cancer. Which measure of association is appropriate, and why can't you use risk ratio in this design?

  2. Compare and contrast attributable risk and population attributable risk. If a rare exposure has a very high risk ratio, which measure would show greater public health significance, and why?

  3. Two interventions both have a risk ratio of 0.5 for preventing heart attacks. Intervention A has an NNT of 20; Intervention B has an NNT of 200. Explain why these NNTs differ despite identical risk ratios.

  4. When would you choose rate ratio over risk ratio? Identify the key study characteristic that makes person-time analysis necessary.

  5. A correlation coefficient of $r = 0.85$ is found between ice cream sales and drowning deaths. Explain why this finding does not support a causal relationship and identify the likely explanation for the association.