Why This Matters
Probability distributions are the foundation of everything you'll do in biostatistics—from designing clinical trials to interpreting p-values to modeling disease outbreaks. When you're tested on this material, you're not just being asked to recall formulas. You're being evaluated on whether you understand when to apply each distribution, what assumptions it requires, and how distributions relate to each other. The exam will present scenarios and expect you to identify the appropriate distribution based on the data type and underlying process.
Think of distributions as tools in a toolkit: the normal distribution handles continuous measurements, the binomial counts successes in fixed trials, and the Poisson tracks rare events over time. Each distribution encodes specific assumptions about how data behave—independence, fixed trials, constant rates, symmetry. Don't just memorize parameters—know what real-world process each distribution models and when one distribution approximates another.
Continuous Distributions for Measurement Data
These distributions model variables that can take any value within a range. They're essential for analyzing measurements like blood pressure, weight, reaction times, and biomarker concentrations.
Normal (Gaussian) Distribution
- Bell-shaped and symmetric around the mean (μ)—the most important distribution in statistics because of its mathematical properties and real-world prevalence
- Empirical Rule (68-95-99.7): approximately 68% of data falls within ±1σ, 95% within ±2σ, and 99.7% within ±3σ of the mean
- Central Limit Theorem: the sampling distribution of the sample mean approaches normality as n increases, even when the underlying population isn't normal
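The 68-95-99.7 figures can be verified directly from the normal CDF. A quick sketch using Python's standard-library `statistics.NormalDist` (the standard normal here is just an illustration; the percentages hold for any μ and σ):

```python
from statistics import NormalDist

z = NormalDist(mu=0, sigma=1)  # standard normal

def within(k: float) -> float:
    """Probability that a value falls within k standard deviations of the mean."""
    return z.cdf(k) - z.cdf(-k)

print(round(within(1), 4))  # 0.6827
print(round(within(2), 4))  # 0.9545
print(round(within(3), 4))  # 0.9973
```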
Student's t-Distribution
- Heavier tails than the normal distribution—accounts for extra uncertainty when estimating population parameters from small samples
- Degrees of freedom (df) control the shape; as df→∞, the t-distribution converges to the standard normal
- Primary use: hypothesis tests and confidence intervals for means when σ is unknown, especially in small samples (n<30)
Uniform Distribution
- All outcomes equally likely between minimum (a) and maximum (b) values—the "no information" distribution
- Mean is (a+b)/2 and variance is (b−a)²/12 for the continuous case
- Simulation workhorse: random number generators produce uniform variates that are transformed into other distributions
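The "simulation workhorse" bullet refers to inverse-transform sampling: a uniform variate U on (0, 1) is pushed through the inverse CDF of the target distribution. A minimal sketch, with an exponential target (the rate λ=2 and the sample size are arbitrary illustrative choices):

```python
import math
import random

random.seed(42)  # reproducible illustration

lam = 2.0   # assumed rate for the target exponential distribution
n = 100_000

# Inverse transform: if U ~ Uniform(0, 1), then -ln(1 - U)/lam ~ Exponential(lam)
samples = [-math.log(1 - random.random()) / lam for _ in range(n)]

print(round(sum(samples) / n, 3))  # close to the theoretical mean 1/lam = 0.5
```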
Compare: Normal vs. t-distribution—both are symmetric and bell-shaped, but the t-distribution has heavier tails to account for uncertainty in small samples. If an FRQ asks about confidence intervals with unknown population standard deviation, reach for the t-distribution.
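The heavier tails show up concretely in critical values: for a two-sided 95% interval the t multiplier starts well above 1.96 and shrinks toward it as df grows. A sketch assuming SciPy is available (the df values are arbitrary):

```python
from scipy.stats import norm, t  # assumes SciPy is installed

# Two-sided 95% critical values: t is wider for small df, converging to z
z_crit = norm.ppf(0.975)
for df in (5, 10, 30, 1000):
    print(df, round(t.ppf(0.975, df), 3))

print(round(z_crit, 3))  # 1.96
```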
Discrete Distributions for Counting Events
These distributions model outcomes you can count—number of successes, number of occurrences, binary outcomes. They're fundamental for categorical data analysis and clinical trial design.
Bernoulli Distribution
- Single trial with two outcomes: success (1) with probability p or failure (0) with probability 1−p
- Mean is p and variance is p(1−p)—maximum variance occurs when p=0.5
- Building block for the binomial distribution; understanding Bernoulli trials is essential for grasping more complex counting distributions
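The claim that variance p(1−p) peaks at p=0.5 is easy to check on a grid of p values (the grid itself is just an illustration):

```python
# Bernoulli variance p(1 - p) across a grid of p values; it peaks at p = 0.5
variances = {p / 10: round(p / 10 * (1 - p / 10), 2) for p in range(1, 10)}
print(variances)

best = max(variances, key=variances.get)
print(best)  # 0.5
```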
Binomial Distribution
- Counts successes in n independent Bernoulli trials—think drug response rates, disease prevalence in samples, or treatment outcomes
- Parameters: n (number of trials) and p (probability of success); mean is np, variance is np(1−p)
- Normal approximation works well when np≥10 and n(1−p)≥10, making large-sample calculations tractable
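These formulas can be exercised with nothing beyond `math.comb`. A sketch using hypothetical numbers (50 patients, 30% response rate — not from any real trial):

```python
from math import comb

def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 50, 0.3  # hypothetical trial: 50 patients, 30% response probability
print(n * p, n * p * (1 - p))  # mean 15.0, variance 10.5

# Sanity check: the pmf over all possible counts sums to 1
print(round(sum(binom_pmf(k, n, p) for k in range(n + 1)), 6))  # 1.0
```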
Poisson Distribution
- Models count of rare events in a fixed interval of time or space, given average rate λ
- Key property: mean and variance are both equal to λ—if observed variance greatly exceeds the mean, Poisson assumptions may be violated
- Applications: hospital admissions per day, mutations per genome region, adverse events in clinical trials
Compare: Binomial vs. Poisson—both count discrete events, but binomial requires a fixed number of trials while Poisson models events in continuous time/space. Poisson approximates binomial when n is large and p is small (λ=np). Use this approximation for rare disease incidence.
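The λ=np approximation above can be seen numerically. A sketch comparing the two pmfs in the rare-event regime (n=1000 and p=0.002 are illustrative choices):

```python
from math import comb, exp, factorial

def binom_pmf(k: int, n: int, p: float) -> float:
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k: int, lam: float) -> float:
    return exp(-lam) * lam**k / factorial(k)

# Rare-event regime: large n, small p
n, p = 1000, 0.002
lam = n * p  # 2.0

# The two pmfs agree to about three decimal places for small k
for k in range(5):
    print(k, round(binom_pmf(k, n, p), 4), round(poisson_pmf(k, lam), 4))
```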
Distributions for Time-to-Event and Waiting Times
These continuous distributions model how long until something happens—critical for survival analysis, reliability studies, and pharmacokinetics.
Exponential Distribution
- Models time between events in a Poisson process, with rate parameter λ (or equivalently, mean 1/λ)
- Memoryless property: P(T>s+t∣T>s)=P(T>t)—the system doesn't "age," making future predictions independent of elapsed time
- Survival analysis foundation: models time to death, equipment failure, or disease recurrence when hazard rate is constant
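The memoryless property follows from the exponential survival function P(T>t)=e^(−λt), and can be checked in a few lines (the rate and the times s, t are arbitrary):

```python
from math import exp, isclose

lam = 0.5  # assumed constant hazard rate

def survival(t: float) -> float:
    """P(T > t) for T ~ Exponential(lam)."""
    return exp(-lam * t)

s, t = 2.0, 3.0
# Memoryless: P(T > s + t | T > s) = P(T > s + t) / P(T > s) = P(T > t)
conditional = survival(s + t) / survival(s)
print(isclose(conditional, survival(t)))  # True
```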
Gamma Distribution
- Generalizes the exponential—models the waiting time for k events in a Poisson process (shape parameter k, scale parameter θ)
- Flexible shape: can be right-skewed (small k) or nearly symmetric (large k); when k=1, reduces to exponential
- Applications: total hospital length of stay, aggregate waiting times, and as a prior distribution in Bayesian analysis
Compare: Exponential vs. Gamma—exponential models time to one event; gamma models time to multiple events. If a problem asks about time until the third patient arrives, you need the gamma distribution with k=3.
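The "time until the third patient" reading can be simulated: summing k independent exponential waits gives a Gamma(k, 1/λ) variable, whose mean is k/λ. A hedged sketch (the rate λ=2 and sample count are arbitrary):

```python
import random

random.seed(0)  # reproducible illustration

lam, k, n = 2.0, 3, 100_000  # hypothetical arrival rate; k = 3 events

# Time until the k-th event = sum of k independent exponential waits,
# which follows a Gamma(shape=k, scale=1/lam) distribution
waits = [sum(random.expovariate(lam) for _ in range(k)) for _ in range(n)]

mean = sum(waits) / n
print(round(mean, 2))  # close to the theoretical mean k/lam = 1.5
```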
Distributions for Proportions and Hypothesis Testing
These distributions arise in specific statistical procedures—testing hypotheses, estimating proportions, and Bayesian inference.
Chi-Square Distribution
- Sum of squared standard normal variables—arises naturally when estimating variance or testing categorical data
- Degrees of freedom (df) determine shape; distribution is right-skewed but approaches normality as df increases
- Primary uses: goodness-of-fit tests, tests of independence in contingency tables, and confidence intervals for variance
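The "sum of squared standard normals" definition can be checked by simulation: the resulting draws should have mean ≈ df and variance ≈ 2·df. A sketch with arbitrary df and sample size:

```python
import random

random.seed(1)  # reproducible illustration

df, n = 4, 50_000  # degrees of freedom and number of simulated draws

# A chi-square(df) variable is the sum of df squared standard normals
draws = [sum(random.gauss(0, 1) ** 2 for _ in range(df)) for _ in range(n)]

mean = sum(draws) / n
var = sum((x - mean) ** 2 for x in draws) / n
print(round(mean, 1), round(var, 1))  # mean ≈ df, variance ≈ 2·df
```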
Beta Distribution
- Defined on the interval [0, 1]—perfect for modeling probabilities, proportions, and rates
- Shape parameters α and β control the distribution: symmetric when α=β, skewed otherwise; uniform when α=β=1
- Bayesian workhorse: serves as the conjugate prior for binomial data, making posterior calculations elegant
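Conjugacy means the posterior update is pure arithmetic: a Beta(α, β) prior plus s successes and f failures gives a Beta(α+s, β+f) posterior. A sketch with made-up numbers (the prior and the 12-of-50 outcome are hypothetical):

```python
# Beta-binomial conjugacy: prior Beta(a, b) + data (s successes, f failures)
# gives posterior Beta(a + s, b + f)
a, b = 2, 2      # assumed prior
s, f = 12, 38    # hypothetical trial: 12 responders out of 50 patients

post_a, post_b = a + s, b + f
post_mean = post_a / (post_a + post_b)
print(post_a, post_b, round(post_mean, 3))  # 14 40 0.259
```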
Compare: Chi-square vs. t-distribution—both depend on degrees of freedom and are used in hypothesis testing, but chi-square tests categorical relationships and variance while t-tests compare means. Know which test statistic follows which distribution.
Quick Reference Table
| Scenario | Distribution(s) |
|---|---|
| Continuous measurements (symmetric) | Normal, t-distribution |
| Counting successes in fixed trials | Bernoulli, Binomial |
| Rare events in time/space | Poisson |
| Time until event occurs | Exponential, Gamma |
| Modeling proportions [0, 1] | Beta, Uniform |
| Hypothesis testing (categorical) | Chi-square |
| Small-sample inference | t-distribution |
| Bayesian priors | Beta, Gamma |
Self-Check Questions
- A researcher is counting the number of patients who respond to a new drug out of 50 treated. Which distribution models this outcome, and what parameters define it?
- Compare and contrast the Poisson and exponential distributions. How are they mathematically related, and when would you use each?
- Why does the t-distribution have heavier tails than the normal distribution? Under what conditions do they become equivalent?
- You're modeling the proportion of time a patient spends in remission (a value between 0 and 1). Which distribution is most appropriate, and why?
- An FRQ presents hospital emergency room data showing the mean number of arrivals per hour equals 4, but the variance equals 12. Should you use a Poisson model? Explain your reasoning using the distribution's key property.