Binomial Distribution
Binomial distributions model experiments where you run a fixed number of trials and each trial has exactly two possible outcomes. They show up constantly in probability problems: coin flips, defective products on an assembly line, free-throw shooting percentages, survey responses. Understanding this distribution gives you a reliable framework for calculating how likely a particular number of "successes" is across repeated trials.

Characteristics of binomial experiments
A binomial experiment has four specific conditions that must all be met. If even one is missing, you can't use the binomial model.
- Fixed number of trials (n): The number of trials is determined before the experiment begins and doesn't change. You flip a coin 20 times, not "until you get bored."
- Two outcomes per trial: Each trial results in either a success or a failure. These labels are flexible: "success" just means the outcome you're tracking, even if it's something undesirable like a defective part.
- Constant probability of success (p): The probability stays the same from one trial to the next. If the probability of success is p on the first trial, it is p on every trial.
- Independence between trials: The outcome of one trial doesn't affect the probability of success on any other trial. This is naturally satisfied with replacement sampling, or when the population is large enough relative to the sample that removing one item barely changes the probabilities (a common rule of thumb is the population should be at least 10 times the sample size).
A series of independent trials with a fixed probability of success is called Bernoulli trials. The binomial distribution counts the total number of successes across n Bernoulli trials.
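To make the setup concrete, here is a minimal Python sketch (the function name is ours, not from any library) that runs n Bernoulli trials and counts the successes, then repeats the whole experiment many times:

```python
import random

def binomial_sample(n, p, rng):
    """Run n independent Bernoulli trials with success probability p; count the successes."""
    return sum(1 for _ in range(n) if rng.random() < p)

rng = random.Random(42)  # fixed seed so the sketch is reproducible
counts = [binomial_sample(20, 0.5, rng) for _ in range(1000)]
average = sum(counts) / len(counts)  # should sit near np = 10
```

Each element of `counts` is one draw from a binomial distribution with n = 20 and p = 0.5, and the running average hovers near np.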

Formulas for binomial distribution statistics
Mean: μ = np
This gives the expected number of successes. If you flip a fair coin 50 times (n = 50, p = 0.5), you'd expect np = 25 heads.
Variance: σ² = np(1 − p)
Variance measures how spread out the distribution of successes is around the mean. Notice that 1 − p is just the probability of failure, sometimes written as q.
Standard deviation: σ = √(np(1 − p))
This is the square root of the variance. It tells you, roughly, how far the actual number of successes typically falls from the expected value, in the same units as the count of successes.
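The three formulas above can be sketched in a few lines of Python (a helper we wrote for illustration, not a library function), using the 50-flip fair-coin example:

```python
import math

def binomial_stats(n, p):
    """Return (mean, variance, standard deviation) of a binomial distribution."""
    mean = n * p                      # μ = np
    variance = n * p * (1 - p)        # σ² = np(1 − p)
    return mean, variance, math.sqrt(variance)

mean, var, sd = binomial_stats(50, 0.5)  # fair coin, 50 flips
```

For n = 50 and p = 0.5 this gives a mean of 25, a variance of 12.5, and a standard deviation of about 3.54.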
Interpretation of binomial distribution measures
Mean (μ): This is the long-run average number of successes you'd see if you repeated the entire experiment many times. If n = 100 and p = 0.4, then μ = np = 40. That doesn't mean you'll get exactly 40 successes every time, but 40 is the center of the distribution.
Standard deviation (σ): This tells you how much the actual count of successes tends to vary from the mean.
- Using the same example (n = 100, p = 0.4): σ = √(100 × 0.4 × 0.6) = √24 ≈ 4.9. So in a typical run, you'd expect the number of successes to land within roughly 5 of the mean.
- A smaller σ means results cluster tightly around μ. A larger σ means more spread.
- σ is largest when p = 0.5 (maximum uncertainty per trial) and shrinks as p moves toward 0 or 1.
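A quick way to see the last point is to tabulate σ for several values of p at fixed n (a small sketch; the helper name is ours):

```python
import math

def binomial_sd(n, p):
    """Standard deviation of a binomial count: sqrt(n * p * (1 - p))."""
    return math.sqrt(n * p * (1 - p))

# For fixed n, the spread peaks at p = 0.5 and is symmetric around it.
spread = {p: binomial_sd(100, p) for p in (0.1, 0.3, 0.5, 0.7, 0.9)}
```

With n = 100, σ reaches its maximum of 5 at p = 0.5, and the values at p = 0.1 and p = 0.9 match each other exactly.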
Probability functions and theorems
Probability mass function (PMF): This gives the probability of getting exactly k successes in n trials: P(X = k) = C(n, k) · p^k · (1 − p)^(n − k)
- C(n, k) is the binomial coefficient ("n choose k"), which counts the number of ways to arrange k successes among n trials.
- p^k is the probability of the k successes occurring.
- (1 − p)^(n − k) is the probability of the remaining n − k trials being failures.
For example, the probability of getting exactly 3 heads in 5 fair coin flips: P(X = 3) = C(5, 3) · (0.5)^3 · (0.5)^2 = 10 × 0.03125 = 0.3125.
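The PMF translates directly into Python using the standard library's `math.comb`; the function name here is our own:

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k): exactly k successes in n trials with success probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

prob = binomial_pmf(3, 5, 0.5)  # the 3-heads-in-5-flips example: 0.3125
```

As a sanity check, the PMF values for k = 0 through n always sum to 1.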
Cumulative distribution function (CDF): This gives the probability of getting k or fewer successes: P(X ≤ k) = P(X = 0) + P(X = 1) + ⋯ + P(X = k). You calculate it by summing the PMF values from 0 through k. Most calculators and tables handle this directly.
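The summing described above is short enough to write out directly (again a sketch with our own function name, not a library call):

```python
from math import comb

def binomial_cdf(k, n, p):
    """P(X <= k): sum the PMF over 0 through k."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

p_at_most_2 = binomial_cdf(2, 5, 0.5)  # P(at most 2 heads in 5 fair flips)
```

For 5 fair flips, P(X ≤ 2) = (1 + 5 + 10)/32 = 0.5, and summing all the way to k = n gives 1.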
Connection to other theorems:
- The law of large numbers says that as you repeat the experiment more and more times, the observed proportion of successes converges toward p.
- The central limit theorem tells you that when n is large enough (a common rule of thumb: both np ≥ 10 and n(1 − p) ≥ 10), the binomial distribution is well-approximated by a normal distribution with mean np and standard deviation √(np(1 − p)). This is useful because normal distribution calculations are often easier to perform.
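To illustrate the normal approximation, the sketch below (helper names are ours) compares an exact binomial CDF value against the matching normal CDF, computed with the standard library's `math.erf`; the `+ 0.5` is the usual continuity correction:

```python
import math
from math import comb

def binomial_cdf(k, n, p):
    """Exact P(X <= k) by summing the binomial PMF."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def normal_cdf(x, mu, sigma):
    """Normal CDF via the error function."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

n, p = 100, 0.4                         # np = 40 and n(1 - p) = 60, both well above 10
mu, sigma = n * p, math.sqrt(n * p * (1 - p))
exact = binomial_cdf(45, n, p)
approx = normal_cdf(45 + 0.5, mu, sigma)  # continuity correction: +0.5
```

With n this large, the two values agree to within a couple of hundredths, which is why the normal shortcut is so widely used.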