Geometric Distribution
Geometric distributions model the number of failures before a first success in repeated independent trials. They come up whenever you're asking "how many attempts until something happens?" Whether that's flipping a coin until you get heads, rolling a die until you land a six, or testing products until one fails inspection, the geometric distribution gives you the framework to calculate probabilities, expected values, and variability for these situations.

Characteristics of Geometric Experiments
A geometric experiment is built from Bernoulli trials, which are independent trials that each have exactly two outcomes: success or failure. What makes it geometric is that you keep running trials until you get your first success, then stop.
Four conditions must hold for a geometric distribution to apply:
- Independence: The outcome of one trial doesn't affect any other. A coin doesn't "remember" its last flip.
- Constant probability: The probability of success p (and failure 1 − p) stays the same on every trial. For a fair coin, p = 0.5 every time. For rolling a six on a fair die, p = 1/6 every time.
- Two outcomes per trial: Each trial results in either success or failure.
- The experiment ends at the first success: You stop as soon as success occurs.
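The four conditions above describe a procedure you can run directly. Here is a minimal sketch of one geometric experiment in Python; `run_geometric_experiment` is an illustrative name, not a standard library function:

```python
import random

def run_geometric_experiment(p, rng=None):
    """Run independent Bernoulli trials with success probability p,
    stopping at the first success. Returns the number of failures
    observed before that success."""
    rng = rng or random.Random()
    failures = 0
    while rng.random() >= p:  # a draw below p counts as a success
        failures += 1
    return failures

# One experiment: flip a fair coin until the first heads.
tails_before_heads = run_geometric_experiment(0.5)
```

Each call is one complete experiment; repeating it many times produces samples from the geometric distribution.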
The geometric distribution also has the memoryless property. This means that no matter how many failures you've already observed, the probability of success on the very next trial is still p. If you've flipped 10 tails in a row, the probability of heads on flip 11 is still 0.5. Past failures give you no information about when the next success will come.
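One way to check memorylessness numerically: P(X ≥ n) is just the probability that the first n trials all fail, so the conditional probability of more failures given past failures matches the unconditional one. A sketch (`p_at_least` is a hypothetical helper introduced here for illustration):

```python
def p_at_least(n, p):
    """P(X >= n): the first success requires at least n failures,
    i.e. the first n trials must all fail."""
    return (1 - p) ** n

p = 0.5
# Given 10 failures so far, the chance of at least one more failure
# equals the unconditional chance of at least one failure:
conditional = p_at_least(11, p) / p_at_least(10, p)  # 0.5, same as p_at_least(1, p)
```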

Probability Calculation in Geometric Distributions
The probability mass function (PMF) tells you the probability of observing exactly k failures before the first success:

P(X = k) = (1 − p)^k · p, for k = 0, 1, 2, ...

where X is the random variable counting failures before the first success, k is a specific number of failures (0, 1, 2, ...), and p is the probability of success on each trial.
How to use the PMF:
- Identify p (the probability of success) and k (the number of failures you're interested in).
- Plug into the formula.
- Compute.
Example: What's the probability of getting exactly 3 tails before your first heads on a fair coin?
Here p = 0.5 and k = 3:

P(X = 3) = (1 − 0.5)^3 × 0.5 = 0.5^4 = 0.0625

There's a 6.25% chance of exactly 3 tails before the first heads.
Watch the definition of X. Some textbooks define Y as the number of trials until the first success (so Y includes the success trial itself), which shifts the formula to P(Y = n) = (1 − p)^(n−1) · p for n = 1, 2, .... Check which version your course uses. This guide uses X = number of failures before the first success.
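The PMF is a one-line computation. A sketch in Python (`geometric_pmf` is an illustrative name, not a library function):

```python
def geometric_pmf(k, p):
    """P(X = k): exactly k failures before the first success."""
    return (1 - p) ** k * p

# Exactly 3 tails before the first heads on a fair coin:
prob = geometric_pmf(3, 0.5)  # 0.5**3 * 0.5 = 0.0625
```

Note the trials-based convention maps onto this one by k = n − 1: the probability of first success on trial 4 equals `geometric_pmf(3, p)`.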
The cumulative distribution function (CDF) gives the probability of observing at most k failures before the first success:

P(X ≤ k) = 1 − (1 − p)^(k+1)

This is useful when you need probabilities for ranges rather than a single value. For instance, the probability of getting your first heads within the first 4 flips (i.e., 0, 1, 2, or 3 failures) on a fair coin is P(X ≤ 3) = 1 − 0.5^4 = 0.9375.
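The CDF example above can be checked the same way; it also agrees with summing the PMF over 0 through k. A sketch (`geometric_cdf` is an illustrative name):

```python
def geometric_cdf(k, p):
    """P(X <= k): at most k failures before the first success."""
    return 1 - (1 - p) ** (k + 1)

# First heads within 4 flips of a fair coin (0-3 failures):
prob = geometric_cdf(3, 0.5)  # 1 - 0.5**4 = 0.9375
```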

Analysis of Geometric Probability Distributions
The mean (expected value) tells you the average number of failures you'd expect before the first success:

μ = (1 − p) / p

The variance measures how spread out the number of failures tends to be:

σ² = (1 − p) / p²

The standard deviation is the square root of the variance:

σ = √((1 − p) / p²)
Example: For a fair coin (p = 0.5):
- μ = (1 − 0.5) / 0.5 = 1 failure on average before the first heads
- σ² = (1 − 0.5) / 0.5² = 2
- σ = √2 ≈ 1.41
So on average you'd expect 1 tail before your first heads, with a standard deviation of about 1.41.
Example with a rarer event: For rolling a six on a fair die (p = 1/6):
- μ = (5/6) / (1/6) = 5 failures on average before the first six
- σ² = (5/6) / (1/6)² = 30
- σ = √30 ≈ 5.48
Notice how a smaller p leads to both a higher expected number of failures and much more variability. When success is rare, the distribution spreads out considerably.

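Both examples follow from the three formulas directly. A sketch computing them (`geometric_stats` is an illustrative name):

```python
import math

def geometric_stats(p):
    """Mean, variance, and standard deviation of the number of
    failures before the first success."""
    mean = (1 - p) / p
    variance = (1 - p) / p ** 2
    return mean, variance, math.sqrt(variance)

coin_mean, coin_var, coin_sd = geometric_stats(0.5)  # 1.0, 2.0, ~1.41
die_mean, die_var, die_sd = geometric_stats(1 / 6)   # ~5.0, ~30.0, ~5.48
```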
Related Distributions and Concepts
The negative binomial distribution generalizes the geometric distribution. Instead of counting failures before the first success, it counts failures before the r-th success. The geometric distribution is the special case where r = 1.
The probabilities in a geometric distribution form a geometric sequence (each term is a constant multiple of the previous one), which is where the distribution gets its name. This connection to geometric series is also why the probabilities sum to 1: the infinite sum Σ (1 − p)^k · p over k = 0, 1, 2, ... converges to 1 for any p with 0 < p ≤ 1.
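Both claims are easy to verify numerically: consecutive PMF values share the constant ratio (1 − p), and the partial sums approach 1. A sketch:

```python
p = 1 / 6
terms = [(1 - p) ** k * p for k in range(200)]

# Consecutive probabilities share the constant ratio (1 - p):
# a geometric sequence.
ratios = [terms[k + 1] / terms[k] for k in range(50)]

# The partial sum over many terms is already essentially 1.
total = sum(terms)
```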