Fiveable

🎲Intro to Statistics Unit 4 Review


4.4 Geometric Distribution

Written by the Fiveable Content Team โ€ข Last updated August 2025

Geometric Distribution

The geometric distribution models how many trials you need before getting your first success in a series of independent experiments. Whether it's coin flips, product inspections, or job interviews, any time you're asking "how many attempts until it finally works?", the geometric distribution is your tool.

Geometric Distribution Probability Calculations

The geometric distribution applies when you have Bernoulli trials: repeated independent trials where each trial has only two outcomes (success or failure) and the probability of success p stays the same every time.

The random variable X represents the number of trials until the first success. The probability mass function (PMF) is:

P(X = k) = (1 - p)^{k-1} \cdot p

where k = 1, 2, 3, \ldots

The logic behind this formula: to get your first success on trial k, you need k - 1 failures in a row (each with probability 1 - p), followed by one success (probability p).

To calculate a geometric probability:

  1. Identify the probability of success p for each trial.
  2. Determine k, the specific trial number you're calculating the probability for.
  3. Plug into the PMF and simplify.

Example: A weighted coin has a 0.4 probability of landing heads. What's the probability that the first heads occurs on the 3rd flip?

P(X = 3) = (1 - 0.4)^{3-1} \cdot 0.4 = (0.6)^2 \cdot 0.4 = 0.36 \cdot 0.4 = 0.144
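The worked example above can be checked with a short Python sketch (the `geometric_pmf` helper is written here for illustration, not taken from a library):

```python
def geometric_pmf(k: int, p: float) -> float:
    """P(X = k): probability that the first success occurs on trial k."""
    # k - 1 failures in a row (each with probability 1 - p),
    # followed by one success (probability p).
    return (1 - p) ** (k - 1) * p

# Weighted coin with P(heads) = 0.4: first heads on the 3rd flip.
prob = geometric_pmf(3, 0.4)
print(round(prob, 3))  # 0.144
```

The exponent k - 1 is where mistakes usually happen: the success trial itself is not a failure, so only the k - 1 trials before it contribute a factor of 1 - p.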


Interpreting Geometric Distribution Parameters

Mean (expected value):

E(X) = \frac{1}{p}

This tells you the average number of trials needed to get the first success. For example, since the probability of rolling a 6 on a fair die is \frac{1}{6}, you'd expect on average \frac{1}{1/6} = 6 rolls to get your first 6.

Notice the intuition: a lower probability of success means you'll need more trials on average.

Standard deviation:

\sigma = \sqrt{\frac{1 - p}{p^2}}

This measures how much variability there is in the number of trials needed. A higher standard deviation means the actual number of attempts could differ a lot from the mean.

When applying these to real-world problems, always interpret in context. If p = 0.2 for getting hired at each job interview, the mean is \frac{1}{0.2} = 5 interviews, and the standard deviation is \sqrt{\frac{0.8}{0.04}} = \sqrt{20} \approx 4.47 interviews. You'd say: "On average, a person needs 5 interviews to get hired, with a standard deviation of about 4.47 interviews, indicating quite a bit of variability."
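The two formulas can be sketched in Python using the interview example (the helper names here are my own, not from a statistics library):

```python
import math

def geometric_mean(p: float) -> float:
    # Expected number of trials until the first success: 1/p.
    return 1 / p

def geometric_sd(p: float) -> float:
    # Standard deviation of the number of trials: sqrt((1 - p) / p^2).
    return math.sqrt((1 - p) / p ** 2)

# Job-interview example: p = 0.2 chance of getting hired per interview.
print(geometric_mean(0.2))          # 5.0
print(round(geometric_sd(0.2), 2))  # 4.47
```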


Two Cases of Geometric Distributions

Your textbook or exam may define the geometric distribution in one of two ways. Read problems carefully to figure out which version is being used.

  • Case 1: X = number of trials until the first success (includes the success trial)
    • X can be 1, 2, 3, ...
    • PMF: P(X = k) = (1 - p)^{k-1} \cdot p, where k = 1, 2, 3, \ldots
    • Mean: E(X) = \frac{1}{p}
  • Case 2: Y = number of failures before the first success (does not include the success trial)
    • Y can be 0, 1, 2, ...
    • PMF: P(Y = k) = (1 - p)^{k} \cdot p, where k = 0, 1, 2, \ldots
    • Mean: E(Y) = \frac{1 - p}{p}

The key difference: Case 1 starts at k = 1 with exponent k - 1, while Case 2 starts at k = 0 with exponent k. The two are related by Y = X - 1. Most intro stats courses use Case 1.
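The relationship between the two cases can be verified numerically; this sketch defines one PMF per case (both helper names are illustrative):

```python
def pmf_trials(k: int, p: float) -> float:
    """Case 1: X = trial number of the first success, k = 1, 2, 3, ..."""
    return (1 - p) ** (k - 1) * p

def pmf_failures(k: int, p: float) -> float:
    """Case 2: Y = number of failures before the first success, k = 0, 1, 2, ..."""
    return (1 - p) ** k * p

p = 0.4
# Since Y = X - 1, "first success on trial 3" and
# "2 failures before the first success" are the same event.
print(pmf_trials(3, p) == pmf_failures(2, p))  # True
```

Reading a problem carefully to decide which case it uses matters more than the algebra: plugging the same k into the wrong PMF shifts every answer by one trial.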

The cumulative distribution function (CDF) gives the probability that the first success occurs on or before trial k:

P(X \leq k) = 1 - (1 - p)^k

This is useful when a problem asks something like "what's the probability of getting at least one success in 5 tries?" rather than asking about a specific trial number.
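A minimal sketch of the CDF shortcut, reusing the weighted coin from earlier (p = 0.4 is carried over from that example; the helper name is mine):

```python
def geometric_cdf(k: int, p: float) -> float:
    """P(X <= k): probability the first success occurs within k trials."""
    # Complement rule: the only way to NOT succeed within k trials
    # is k failures in a row, with probability (1 - p)^k.
    return 1 - (1 - p) ** k

# Probability of at least one heads (p = 0.4) within 5 flips.
print(round(geometric_cdf(5, 0.4), 4))  # 0.9222
```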

The negative binomial distribution extends the geometric distribution to model the number of trials needed to achieve a specified number of successes (not just the first). The geometric distribution is the special case where you need exactly one success.
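To see the special-case relationship concretely: the negative binomial PMF for the trial on which the r-th success occurs is \binom{k-1}{r-1} p^r (1-p)^{k-r}, and setting r = 1 collapses it to the geometric PMF. A sketch (the helper name is illustrative):

```python
from math import comb

def neg_binomial_pmf(k: int, r: int, p: float) -> float:
    """P(the r-th success occurs on trial k) = C(k-1, r-1) * p^r * (1-p)^(k-r)."""
    # Trial k must be a success, and exactly r - 1 of the
    # previous k - 1 trials must also be successes.
    return comb(k - 1, r - 1) * p ** r * (1 - p) ** (k - r)

# With r = 1 this reduces to the geometric PMF from the coin example.
print(round(neg_binomial_pmf(3, 1, 0.4), 3))  # 0.144
```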