Discrete distributions like Bernoulli, binomial, and Poisson are foundational tools in actuarial math. They model events with countable outcomes, such as whether an insurance claim occurs, how many claims arise from a portfolio, or how often rare events happen in a given time window. Mastering these three distributions and their interrelationships is essential for probability exams and for real actuarial work in risk modeling.
Bernoulli distribution
The Bernoulli distribution models a single trial with exactly two outcomes: success ($X = 1$) or failure ($X = 0$). Think of it as the simplest possible random experiment. In actuarial contexts, this could be whether a single policyholder files a claim or not.
It also serves as the building block for the binomial distribution, since a binomial random variable is just a sum of independent Bernoulli trials.
Probability mass function
The PMF of a Bernoulli random variable is:

$$P(X = x) = p^x (1 - p)^{1 - x}, \quad x \in \{0, 1\}$$

- $p$ is the probability of success
- $1 - p$ (often written $q$) is the probability of failure

When $x = 1$, this simplifies to $P(X = 1) = p$. When $x = 0$, it gives $P(X = 0) = 1 - p$. The compact formula just combines both cases into one expression.
Mean and variance
- Mean: $E[X] = p$
- Variance: $\mathrm{Var}(X) = p(1 - p)$
- Standard deviation: $\sigma = \sqrt{p(1 - p)}$
Notice the variance is maximized when $p = 0.5$ and equals zero when $p = 0$ or $p = 1$. This makes intuitive sense: there's no uncertainty if the outcome is guaranteed.
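As a quick sanity check, the PMF and moment formulas above translate directly into plain Python. This is an illustrative sketch; the value $p = 0.3$ is arbitrary:

```python
# Bernoulli(p): pmf, mean, and variance from the closed-form expressions.
def bernoulli_pmf(x, p):
    """P(X = x) = p^x * (1 - p)^(1 - x) for x in {0, 1}."""
    return p**x * (1 - p)**(1 - x)

p = 0.3                  # hypothetical success probability
mean = p                 # E[X] = p
variance = p * (1 - p)   # Var(X) = p(1 - p)

# Variance p(1 - p) is largest at p = 0.5 and shrinks toward the endpoints:
variances = [q * (1 - q) for q in (0.1, 0.3, 0.5, 0.7, 0.9)]
```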
Applications of Bernoulli distribution
- Whether a single policyholder files a claim (claim/no claim)
- Whether a manufactured item passes inspection (defective/non-defective)
- Whether a single medical treatment succeeds or fails
- Any binary outcome that feeds into a larger binomial model
Binomial distribution
The binomial distribution counts the number of successes in $n$ independent Bernoulli trials, each with the same success probability $p$. For example, if you have 100 policyholders each with a 3% claim probability, the total number of claims follows a binomial distribution with $n = 100$ and $p = 0.03$.
Two conditions must hold for the binomial to apply: the trials must be independent, and $p$ must be constant across all trials.
Probability mass function
The PMF of a binomial random variable is:

$$P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k}, \quad k = 0, 1, \ldots, n$$

- $n$ is the number of trials
- $p$ is the probability of success on each trial
- $\binom{n}{k} = \frac{n!}{k!(n - k)!}$ is the binomial coefficient, counting the number of ways to arrange $k$ successes among $n$ trials

The logic: $p^k (1 - p)^{n - k}$ is the probability of one specific sequence with $k$ successes, and $\binom{n}{k}$ accounts for all possible orderings of those successes.
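The PMF can be computed directly with the standard library's `math.comb`, shown here for the 100-policyholder portfolio from above:

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) = C(n, k) * p^k * (1 - p)^(n - k)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

# 100 policyholders, each with a 3% claim probability:
p_two_claims = binomial_pmf(2, 100, 0.03)   # probability of exactly 2 claims
```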
Cumulative distribution function
The CDF of a binomial random variable is:

$$F(k) = P(X \le k) = \sum_{i=0}^{k} \binom{n}{i} p^i (1 - p)^{n - i}$$

This gives the probability of observing $k$ or fewer successes. To find the probability over a range, use:

$$P(a \le X \le b) = F(b) - F(a - 1)$$

Note the $F(a - 1)$, not $F(a)$. Since $X$ is discrete, you need to include the point $a$ itself.
Mean and variance
- Mean: $E[X] = np$
- Variance: $\mathrm{Var}(X) = np(1 - p)$
- Standard deviation: $\sigma = \sqrt{np(1 - p)}$

These follow directly from the fact that $X$ is a sum of $n$ independent Bernoulli variables, each with mean $p$ and variance $p(1 - p)$.
Moment generating function
The MGF of a binomial random variable is:

$$M_X(t) = \left(1 - p + p e^t\right)^n$$

The MGF uniquely determines the distribution. You can extract the $k$-th moment by computing the $k$-th derivative of $M_X(t)$ and evaluating at $t = 0$. The MGF is also useful for proving that sums of independent binomials (with the same $p$) remain binomial.
Properties of binomial distribution
- Additivity: If $X \sim \mathrm{Binomial}(n, p)$ and $Y \sim \mathrm{Binomial}(m, p)$ are independent, then $X + Y \sim \mathrm{Binomial}(n + m, p)$. The success probabilities must be equal for this to work.
- Normal approximation: As $n$ grows, the binomial approaches a normal distribution with mean $np$ and variance $np(1 - p)$. The standard rule of thumb is that the approximation is reasonable when both $np \ge 5$ and $n(1 - p) \ge 5$. A continuity correction (adjusting by $\pm 0.5$) improves accuracy.
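The normal approximation with continuity correction can be sketched using the error function from the standard library; the parameters $n = 100$, $p = 0.4$ are arbitrary values that satisfy the rule of thumb:

```python
import math

def normal_cdf(x, mu, sigma):
    """Normal CDF expressed through the error function."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

def binomial_pmf(k, n, p):
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 100, 0.4                       # np = 40, n(1-p) = 60: both large enough
mu, sigma = n * p, math.sqrt(n * p * (1 - p))

# P(36 <= X <= 45): exact sum vs normal approximation with +/- 0.5 correction.
exact = sum(binomial_pmf(k, n, p) for k in range(36, 46))
approx = normal_cdf(45.5, mu, sigma) - normal_cdf(35.5, mu, sigma)
```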

Binomial approximation to hypergeometric
When sampling without replacement from a finite population, the exact distribution is hypergeometric. But if the sample size $n$ is small relative to the population size $N$ (a common guideline is $n/N \le 0.05$), the binomial with $p = K/N$ provides a good approximation, where $K$ is the number of "successes" in the population.
The reasoning: when the population is large enough, removing one item barely changes the composition, so sampling without replacement behaves almost like sampling with replacement.
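A numeric sketch of how close the two distributions get when $n/N$ is small; the population figures here are invented for illustration:

```python
import math

def hypergeom_pmf(k, N, K, n):
    """Exact: k successes in a sample of n drawn without replacement
    from a population of N containing K successes."""
    return math.comb(K, k) * math.comb(N - K, n - k) / math.comb(N, n)

def binomial_pmf(k, n, p):
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

# Population of 10,000 with 300 "successes"; sample of 50 (n/N = 0.005).
N, K, n = 10_000, 300, 50
exact = hypergeom_pmf(2, N, K, n)
approx = binomial_pmf(2, n, K / N)   # binomial with p = K/N = 0.03
```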
Applications of binomial distribution
- Number of claims filed out of $n$ policies in a portfolio
- Number of defective items in a batch of $n$ products
- Number of successful treatments out of $n$ patients
- Number of wins in a fixed-length series of games
Poisson distribution
The Poisson distribution models the count of events occurring in a fixed interval of time or space, given a known average rate. Unlike the binomial, there's no fixed number of trials; the support is all non-negative integers $k = 0, 1, 2, \ldots$. It's characterized by a single parameter $\lambda$, the average number of events per interval.
In actuarial work, the Poisson is the go-to distribution for claim frequency modeling, especially when individual claim probabilities are small but the exposure is large.
Probability mass function
The PMF of a Poisson random variable is:

$$P(X = k) = \frac{e^{-\lambda} \lambda^k}{k!}, \quad k = 0, 1, 2, \ldots$$

- $\lambda$ is the average number of events per interval
- $e \approx 2.71828$ is Euler's number

For example, if an insurer expects $\lambda$ claims per month, the probability of exactly 5 claims is:

$$P(X = 5) = \frac{e^{-\lambda} \lambda^5}{5!}$$
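The PMF is a one-liner in Python; the rate of 3 claims per month below is a hypothetical value for illustration:

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) = e^(-lam) * lam^k / k!"""
    return math.exp(-lam) * lam**k / math.factorial(k)

# Hypothetical rate: 3 claims per month on average.
p_five = poisson_pmf(5, 3)   # probability of exactly 5 claims in a month
```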
Cumulative distribution function
The CDF of a Poisson random variable is:

$$F(k) = P(X \le k) = \sum_{i=0}^{k} \frac{e^{-\lambda} \lambda^i}{i!}$$

As with the binomial, for a range: $P(a \le X \le b) = F(b) - F(a - 1)$.
Mean and variance
- Mean: $E[X] = \lambda$
- Variance: $\mathrm{Var}(X) = \lambda$
- Standard deviation: $\sigma = \sqrt{\lambda}$
The fact that the mean equals the variance is a defining characteristic of the Poisson distribution. In practice, if you observe data where the sample variance is much larger or smaller than the sample mean, a Poisson model may not be appropriate. This is called overdispersion (variance > mean) or underdispersion (variance < mean).
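A simple dispersion check before committing to a Poisson model, on a hypothetical sample of weekly claim counts:

```python
# Compare sample mean and sample variance; a ratio near 1 is consistent
# with a Poisson model, well above 1 suggests overdispersion.
counts = [2, 0, 3, 1, 4, 2, 2, 1, 0, 5]   # hypothetical weekly claim counts
n = len(counts)
mean = sum(counts) / n
variance = sum((c - mean)**2 for c in counts) / (n - 1)   # sample variance
dispersion_ratio = variance / mean
```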
Moment generating function
The MGF of a Poisson random variable is:

$$M_X(t) = e^{\lambda\left(e^t - 1\right)}$$

As with the binomial, the $k$-th moment is found by taking the $k$-th derivative of $M_X(t)$ at $t = 0$. The MGF also makes it straightforward to prove the additivity property below.
Properties of Poisson distribution
- Additivity: If $X \sim \mathrm{Poisson}(\lambda_1)$ and $Y \sim \mathrm{Poisson}(\lambda_2)$ are independent, then $X + Y \sim \mathrm{Poisson}(\lambda_1 + \lambda_2)$. You can verify this by multiplying the MGFs.
- Limiting case of binomial: The Poisson distribution arises as the limit of $\mathrm{Binomial}(n, p)$ when $n \to \infty$, $p \to 0$, and $np = \lambda$ stays constant. This is why the Poisson works well for modeling many independent rare events.
Poisson approximation to binomial
When $n$ is large and $p$ is small, computing binomial probabilities directly can be cumbersome. The Poisson with $\lambda = np$ provides a convenient approximation.
Common rules of thumb for when the approximation is adequate:
- $n \ge 20$ and $p \le 0.05$, or
- $n \ge 100$ and $np \le 10$
Example: Suppose 1,000 policies each have a 0.2% chance of a catastrophic claim. The exact distribution is $\mathrm{Binomial}(1000, 0.002)$, but you can approximate it with $\mathrm{Poisson}(2)$ since $\lambda = np = 1000 \times 0.002 = 2$.
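The catastrophic-claim example above can be checked numerically:

```python
import math

def binomial_pmf(k, n, p):
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    return math.exp(-lam) * lam**k / math.factorial(k)

# 1,000 policies, each with a 0.2% catastrophic-claim probability.
n, p = 1000, 0.002
lam = n * p                         # lambda = np = 2.0
exact = binomial_pmf(3, n, p)       # exact Binomial(1000, 0.002) probability
approx = poisson_pmf(3, lam)        # Poisson(2) approximation
```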

Applications of Poisson distribution
- Number of insurance claims arriving per month or per year
- Number of accidents at an intersection over a given period
- Number of natural disasters in a region per decade
- Number of customer arrivals at a service counter per hour
Relationships between distributions
Binomial as sum of Bernoullis
If $X_1, X_2, \ldots, X_n$ are independent Bernoulli random variables each with success probability $p$, then:

$$X = \sum_{i=1}^{n} X_i \sim \mathrm{Binomial}(n, p)$$

This is the formal connection: the binomial distribution is a generalization of the Bernoulli to multiple trials. It also explains why the binomial mean is $np$ (sum of $n$ means of $p$) and the variance is $np(1 - p)$ (sum of variances, using independence).
Poisson as limit of binomial
If $\lambda = np$ is held fixed, then as $n \to \infty$:

$$\binom{n}{k} p^k (1 - p)^{n - k} \to \frac{e^{-\lambda} \lambda^k}{k!}$$

This is the Poisson limit theorem. The proof involves substituting $p = \lambda/n$ into the binomial PMF and taking the limit term by term. The key insight is that many independent trials, each with a tiny success probability, produce a count that's well-described by the Poisson.
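The convergence can be observed numerically: hold $\lambda = np$ fixed at an arbitrary value (2 here) and let $n$ grow, and the binomial PMF closes in on the Poisson PMF:

```python
import math

def binomial_pmf(k, n, p):
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    return math.exp(-lam) * lam**k / math.factorial(k)

# Keep lam = n*p fixed; the approximation error at k = 3 shrinks as n grows.
lam, k = 2.0, 3
errors = [abs(binomial_pmf(k, n, lam / n) - poisson_pmf(k, lam))
          for n in (10, 100, 1000, 10000)]
```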
Fitting discrete distributions
Method of moments
The method of moments estimates parameters by setting sample moments equal to theoretical moments and solving.
Steps:
- Compute the sample moments from your data (sample mean $\bar{x}$, sample variance $s^2$, etc.)
- Write out the theoretical moments as functions of the unknown parameter(s)
- Set sample moments equal to theoretical moments
- Solve for the parameter(s)
Bernoulli/Poisson examples: For a Bernoulli distribution, $E[X] = p$, so set $\bar{x} = p$, giving $\hat{p} = \bar{x}$. For a Poisson, $E[X] = \lambda$, so $\hat{\lambda} = \bar{x}$.
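Both estimates reduce to a sample mean, shown here on hypothetical data:

```python
# Method-of-moments estimates: for Bernoulli and Poisson, matching the first
# moment means the estimator is just the sample mean.
claims_01 = [0, 1, 0, 0, 1, 0, 0, 0, 1, 0]        # hypothetical 0/1 claim indicators
weekly_counts = [3, 1, 4, 2, 2, 0, 3, 1]          # hypothetical weekly claim counts

p_hat = sum(claims_01) / len(claims_01)           # E[X] = p   =>  p_hat = x-bar
lam_hat = sum(weekly_counts) / len(weekly_counts) # E[X] = lam =>  lam_hat = x-bar
```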
The method of moments is simple and intuitive, though it doesn't always produce the most efficient estimators.
Maximum likelihood estimation
MLE finds the parameter value that makes the observed data most probable.
Steps:
- Write the likelihood function as the joint PMF of the observed data, treated as a function of the parameter $\theta$: $L(\theta) = \prod_{i=1}^{n} P(X_i = x_i; \theta)$
- Take the natural log to get the log-likelihood $\ell(\theta) = \ln L(\theta)$
- Differentiate with respect to $\theta$ and set equal to zero
- Solve for $\hat{\theta}$
- Verify it's a maximum (second derivative test)
MLEs have strong asymptotic properties: they are consistent (converge to the true value), asymptotically normal, and asymptotically efficient (asymptotically achieve the Cramér-Rao lower bound on variance). For the Poisson, the MLE of $\lambda$ turns out to be $\hat{\lambda} = \bar{x}$, which happens to coincide with the method of moments estimator.
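A small numeric check that the sample mean really does maximize the Poisson log-likelihood, on hypothetical count data:

```python
import math

def poisson_loglik(lam, data):
    """Poisson log-likelihood: sum over observations of log pmf."""
    return sum(-lam + x * math.log(lam) - math.log(math.factorial(x))
               for x in data)

data = [3, 1, 4, 2, 2, 0, 3, 1]
lam_mle = sum(data) / len(data)   # closed-form MLE: the sample mean

# The log-likelihood at the MLE beats nearby candidate values:
candidates = [lam_mle - 0.5, lam_mle, lam_mle + 0.5]
best = max(candidates, key=lambda lam: poisson_loglik(lam, data))
```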
Discrete distribution examples
Modeling claim frequency
The Poisson distribution is the standard model for claim frequency in actuarial science. If an insurer observes an average of $\lambda$ claims per week from historical data, you can use $\mathrm{Poisson}(\lambda)$ to calculate the probability of any specific claim count in a future week.
The parameter $\lambda$ is typically estimated from historical data using either method of moments ($\hat{\lambda} = \bar{x}$) or MLE. Once estimated, the model supports tasks like setting reserves, pricing premiums, and stress-testing under high-claim scenarios.
Modeling rare events
The Poisson distribution is particularly well-suited for rare events because it naturally arises from many exposures each with a small probability. Examples include:
- Earthquakes in a region per year ($\lambda$ might be 0.3)
- Industrial accidents at a factory per quarter
- Catastrophic insurance losses per decade
The small $\lambda$ value concentrates most of the probability mass near zero, which matches the observed behavior of rare events.
Modeling success/failure experiments
The binomial distribution fits scenarios with a fixed number of independent trials, each having the same probability of success. For instance, if a batch of 50 items each has a 4% defect rate, the number of defectives follows $\mathrm{Binomial}(50, 0.04)$.
From this you can calculate quantities like:
- $P(X = 0)$: probability the entire batch is defect-free
- $P(X \ge 5)$: probability of 5 or more defectives (useful for quality control thresholds)
- $E[X] = np = 2$: the expected number of defectives
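All three quantities for the 50-item batch fall out of the PMF directly:

```python
import math

def binomial_pmf(k, n, p):
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 50, 0.04
p_defect_free = binomial_pmf(0, n, p)                              # P(X = 0)
p_five_or_more = 1 - sum(binomial_pmf(k, n, p) for k in range(5))  # P(X >= 5)
expected_defectives = n * p                                        # E[X] = np
```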