Hypergeometric Distribution
The hypergeometric distribution models the probability of drawing a specific number of "successes" from a population when you're sampling without replacement. It matters because many real-world sampling situations (quality control inspections, card games, ecological surveys) don't let you put items back after selecting them, so the probability shifts with every draw.

Characteristics of Hypergeometric Experiments
A hypergeometric experiment has a specific structure that distinguishes it from other probability setups:
- Fixed sample size: You decide in advance how many items you'll draw ($n$).
- Finite population split into two groups: The population of size $N$ contains $K$ items of interest ("successes") and $N - K$ other items ("failures"). For example, a jar holds 8 red marbles and 12 blue marbles.
- Sampling without replacement: Once an item is drawn, it's gone. This is the defining feature.
Because items aren't replaced, trials are not independent. Drawing a red marble from that jar means fewer red marbles remain, which changes the probability of drawing another red marble on the next trial. After every draw, the overall population shrinks by one, and so does whichever group the drawn item came from.
This contrasts sharply with binomial experiments, where the probability of success stays constant from trial to trial.
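To make the dependence concrete, here is a minimal sketch using the jar of 8 red and 12 blue marbles. The helper name `p_red_next` is hypothetical, chosen only for illustration; it simply divides the red marbles remaining by the total remaining.

```python
from fractions import Fraction

def p_red_next(red_left: int, blue_left: int) -> Fraction:
    """Probability that the next marble drawn is red, given what remains."""
    return Fraction(red_left, red_left + blue_left)

first = p_red_next(8, 12)        # before any draw
after_red = p_red_next(7, 12)    # one red marble has been removed
after_blue = p_red_next(8, 11)   # one blue marble has been removed

print(first, after_red, after_blue)  # 2/5 7/19 8/19
```

The chance of red on the second draw differs depending on the first draw's outcome, which is exactly what "not independent" means here.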
Hypergeometric Distribution Calculations

The Formula
$$P(X = k) = \frac{\binom{K}{k}\binom{N-K}{n-k}}{\binom{N}{n}}$$
where:
- $N$ = total population size
- $K$ = number of success items in the population
- $n$ = number of items drawn (sample size)
- $k$ = number of successes you want in your sample
What the Formula Is Doing
The formula counts favorable outcomes over total outcomes, just like classical probability, but uses combinations to do the counting.
- Numerator: $\binom{K}{k}$ counts the ways to choose $k$ successes from the $K$ available, and $\binom{N-K}{n-k}$ counts the ways to choose the remaining $n - k$ items from the non-success group. Multiply these together to get the number of favorable samples.
- Denominator: $\binom{N}{n}$ counts the total number of ways to choose any $n$ items from the population.
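The counting argument above can be sketched in a few lines of Python. The function name `hypergeom_pmf` and its argument names are assumptions made for this illustration: `N` is the population size, `K` the number of success items in the population, `n` the sample size, and `k` the number of successes wanted in the sample. The sketch assumes `0 <= K <= N` and `0 <= n <= N`.

```python
from math import comb

def hypergeom_pmf(k: int, N: int, K: int, n: int) -> float:
    """Favorable samples divided by total samples, counted with combinations."""
    # A sample cannot contain a negative number of successes or more than n.
    if k < 0 or k > n:
        return 0.0
    # Ways to pick k successes times ways to pick the other n - k items.
    # math.comb returns 0 when asked to choose more items than are available.
    favorable = comb(K, k) * comb(N - K, n - k)
    total = comb(N, n)  # ways to pick any n items from the population
    return favorable / total
```

For example, `hypergeom_pmf(2, N=20, K=8, n=5)` evaluates the formula for a jar of 8 red and 12 blue marbles with 5 draws and 2 reds wanted.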

Worked Example
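Suppose a jar contains 8 red marbles and 12 blue marbles ($N = 20$, $K = 8$). You draw 5 marbles without replacement ($n = 5$). What's the probability of getting exactly 2 red marbles?
- Identify the values: $N = 20$, $K = 8$, $n = 5$, $k = 2$.
- Count ways to choose 2 red from 8: $\binom{8}{2} = 28$.
- Count ways to choose 3 blue from 12: $\binom{12}{3} = 220$.
- Count total ways to choose 5 from 20: $\binom{20}{5} = 15504$.
- Compute: $P(X = 2) = \frac{28 \times 220}{15504} = \frac{6160}{15504} \approx 0.397$.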
So there's roughly a 39.7% chance of drawing exactly 2 red marbles.
To find the probability of a range of outcomes (say, 2 to 4 red marbles), calculate $P(X = 2)$, $P(X = 3)$, and $P(X = 4)$ separately, then add them.
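The marble example and the range calculation can be checked with a short sketch. The helper name `pmf` is hypothetical; it uses exact fractions so the counts stay visible, with the jar's values (20 marbles, 8 red, 5 drawn) as defaults.

```python
from fractions import Fraction
from math import comb

def pmf(k: int, N: int = 20, K: int = 8, n: int = 5) -> Fraction:
    """Exact probability of k red marbles among n drawn without replacement."""
    return Fraction(comb(K, k) * comb(N - K, n - k), comb(N, n))

exactly_two = pmf(2)                              # 6160 / 15504
two_to_four = sum(pmf(k) for k in range(2, 5))    # P(X=2) + P(X=3) + P(X=4)

print(round(float(exactly_two), 3))   # about 0.397
print(round(float(two_to_four), 3))   # about 0.69
```

Adding the three terms works because the outcomes "exactly 2", "exactly 3", and "exactly 4" cannot happen at the same time.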
Hypergeometric vs. Binomial Distributions
These two distributions look similar on the surface, but the sampling method creates a key difference.
| Feature | Binomial | Hypergeometric |
|---|---|---|
| Sampling | With replacement | Without replacement |
| Trials | Independent | Dependent |
| Probability of success | Constant across trials | Changes after each draw |
| Population | Can be infinite | Must be finite |
When does it matter? If the population is large relative to the sample size, removing one item barely changes the probabilities, so the binomial serves as a good approximation. A common rule of thumb: if the sample is less than about 5% of the population (some texts allow up to 10%), the binomial approximation works well. When the sample is a larger fraction of the population, you need the hypergeometric distribution for accurate results. Drawing 5 cards from a 52-card deck, for instance, samples about 9.6% of the deck, which sits at the edge of that rule.
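A rough numerical comparison illustrates the point. This sketch keeps the success proportion at 40% and the sample at 5 draws, then varies only the population size; the function names and the population of 2000 are assumptions picked for illustration.

```python
from math import comb

def hypergeom_pmf(k, N, K, n):
    """P(X = k) when drawing n items without replacement."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

def binom_pmf(k, n, p):
    """P(X = k) for n independent trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, k, p = 5, 2, 0.4
binom = binom_pmf(k, n, p)
small = hypergeom_pmf(k, N=20, K=8, n=n)       # sample is 25% of the population
large = hypergeom_pmf(k, N=2000, K=800, n=n)   # sample is 0.25% of the population

print(f"binomial:                  {binom:.4f}")
print(f"hypergeometric (N = 20):   {small:.4f}")
print(f"hypergeometric (N = 2000): {large:.4f}")
```

With the small population the two models disagree by several percentage points, while with the large population they nearly coincide.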
Additional Concepts
- Conditional probability is built into every hypergeometric calculation. Each draw's probability depends on what was drawn before, even though the formula handles this for you through the combinatorics.
- The hypergeometric distribution is discrete: $X$ can only take whole-number values (you can't draw 2.5 red marbles).
- The mean of a hypergeometric random variable is $E[X] = n \cdot \frac{K}{N}$, which matches the intuitive "expected proportion" of successes in your sample.
- When a population contains more than two groups, the model extends to the multivariate hypergeometric distribution, though that's beyond the scope of most honors-level courses.
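The discreteness and mean properties listed above can be checked numerically for the jar of 8 red and 12 blue marbles with 5 draws. This is a sketch with hypothetical names (`pmf`, `support`); it sums over every whole-number outcome and compares the weighted average with sample size times the population's success proportion.

```python
from fractions import Fraction
from math import comb

N, K, n = 20, 8, 5  # population size, successes in population, sample size

def pmf(k: int) -> Fraction:
    """Exact probability of k successes in the sample."""
    return Fraction(comb(K, k) * comb(N - K, n - k), comb(N, n))

# Possible success counts: whole numbers only, limited by both group sizes.
support = range(max(0, n - (N - K)), min(n, K) + 1)

total = sum(pmf(k) for k in support)       # probabilities should sum to 1
mean = sum(k * pmf(k) for k in support)    # weighted average of outcomes

print(total, mean, Fraction(n * K, N))
```

Here the computed mean equals 5 × 8/20 = 2 red marbles, in line with the "expected proportion" intuition.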