Probability axioms and properties form the mathematical foundation for everything in actuarial science. Before you can price insurance products, model mortality, or assess risk, you need a rigorous framework for quantifying uncertainty. This guide covers that framework, from the core axioms through conditional probability and independence.
Probability basics
Probability assigns a numerical value to how likely an event is to occur. In actuarial work, this translates directly into questions like "what's the chance a policyholder files a claim this year?" or "what's the likelihood a bond defaults?" Every model you'll build later rests on these fundamentals.
Sample space and events
The sample space (Ω) is the set of all possible outcomes of a random process. For example, rolling a standard die gives Ω = {1, 2, 3, 4, 5, 6}.
An event is any subset of the sample space. It can be:
- Simple: a single outcome, like rolling a 4
- Compound: multiple outcomes, like rolling an even number ({2, 4, 6})
Two special events always exist: the empty set ∅ (no outcomes, the "impossible event") and Ω itself (all outcomes, the "certain event"). Both qualify as events in the formal framework.
Probability of an event
The probability P(A) measures the likelihood of event A occurring, always as a value between 0 and 1. For a finite sample space with equally likely outcomes:
P(A) = (number of outcomes in A) / (total number of outcomes in Ω)
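The classical formula can be sketched directly in code. A minimal Python example using a fair six-sided die (the die and the "even number" event are illustrative, not from the text):

```python
from fractions import Fraction

# Classical probability: equally likely outcomes for a fair six-sided die.
sample_space = {1, 2, 3, 4, 5, 6}
event_even = {2, 4, 6}

# P(A) = (outcomes in A) / (outcomes in the sample space)
p_even = Fraction(len(event_even), len(sample_space))
print(p_even)  # 1/2
```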
Three approaches to assigning probabilities come up in practice:
- Classical: based on equally likely outcomes (coin flips, dice)
- Empirical (frequentist): based on observed relative frequencies from data (claim rates from historical records)
- Subjective: based on expert judgment when data is limited (emerging risk assessment)
Axioms of probability
These three axioms, formalized by Kolmogorov, are the bedrock of probability theory. Every result that follows in this guide (and in the rest of the course) derives from them.
Non-negativity
P(A) ≥ 0 for any event A
Probabilities can never be negative. A negative likelihood has no meaningful interpretation.
Probability of sample space
P(Ω) = 1
Since Ω contains every possible outcome, something in it must happen. The total probability is 1.
Countable additivity
For any countable sequence of mutually exclusive events A₁, A₂, A₃, … (meaning Aᵢ ∩ Aⱼ = ∅ for i ≠ j, so no two can occur simultaneously):
P(A₁ ∪ A₂ ∪ A₃ ∪ …) = P(A₁) + P(A₂) + P(A₃) + …
This is what lets you break a complex event into simpler, non-overlapping pieces and add their probabilities. Note that this axiom applies to countably infinite unions, not just finite ones, which is what gives probability theory its analytical power.
Consequences of axioms
Everything below follows logically from the three axioms. These aren't additional assumptions; they're provable results.

Probability of empty set
P(∅) = 0
The impossible event has probability zero. You can prove this by writing Ω = Ω ∪ ∅ ∪ ∅ ∪ … and applying countable additivity: 1 = P(Ω) = P(Ω) + P(∅) + P(∅) + …, which only works if P(∅) = 0.
Probability of complement
P(Aᶜ) = 1 − P(A)
Since A and Aᶜ are mutually exclusive and A ∪ Aᶜ = Ω, additivity gives P(A) + P(Aᶜ) = 1. This is extremely useful in practice. When computing P(A) directly is hard, it's often easier to compute P(Aᶜ) and subtract from 1. For example, P(at least one claim occurs) = 1 − P(no claims occur).
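The complement trick can be checked by brute-force enumeration. A Python sketch computing P(at least one six in four fair-die rolls) as 1 − P(no sixes) (the dice example is illustrative, not from the text):

```python
from itertools import product

# Enumerate all 6**4 equally likely outcomes of four fair-die rolls.
outcomes = list(product(range(1, 7), repeat=4))

# Complement rule: P(at least one six) = 1 - P(no sixes).
p_no_six = sum(1 for roll in outcomes if 6 not in roll) / len(outcomes)
p_at_least_one_six = 1 - p_no_six

print(round(p_at_least_one_six, 4))  # 0.5177
```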
Monotonicity of probability
If A ⊆ B, then P(A) ≤ P(B)
If every outcome in A is also in B, then A can't be more probable than B. You can see this by writing B = A ∪ (B ∩ Aᶜ), where the two pieces are disjoint, so P(B) = P(A) + P(B ∩ Aᶜ) ≥ P(A).
Properties of probability
Inclusion-exclusion for two events
For any two events A and B:
P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
Why subtract P(A ∩ B)? When you add P(A) and P(B), outcomes in both A and B get counted twice. Subtracting the intersection corrects for that double-counting.
Example: Suppose 30% of policyholders have auto insurance, 50% have home insurance, and 10% have both. The probability a random policyholder has at least one of these is 0.30 + 0.50 − 0.10 = 0.70.
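The worked example above, using the figures from the text, takes one line of Python:

```python
# Figures from the auto/home insurance example in the text.
p_auto, p_home, p_both = 0.30, 0.50, 0.10

# Inclusion-exclusion: P(A or B) = P(A) + P(B) - P(A and B).
p_at_least_one = p_auto + p_home - p_both
print(round(p_at_least_one, 2))  # 0.7
```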
Inclusion-exclusion for multiple events
The principle generalizes to n events A₁, A₂, …, Aₙ:
P(A₁ ∪ A₂ ∪ … ∪ Aₙ) = Σ P(Aᵢ) − Σ P(Aᵢ ∩ Aⱼ) + Σ P(Aᵢ ∩ Aⱼ ∩ Aₖ) − … + (−1)ⁿ⁺¹ P(A₁ ∩ A₂ ∩ … ∩ Aₙ)
where the sums run over single indices i, then over pairs i < j, then over triples i < j < k, and so on.
The pattern alternates: add single-event probabilities, subtract pairwise intersections, add triple intersections, and so on. Each term corrects the over- or under-counting from the previous step.
Bonferroni inequalities
When exact intersection probabilities are unknown or expensive to compute, Bonferroni inequalities give you bounds:
- Upper bound: P(A₁ ∪ … ∪ Aₙ) ≤ Σ P(Aᵢ)
- Lower bound: P(A₁ ∪ … ∪ Aₙ) ≥ Σ P(Aᵢ) − Σ P(Aᵢ ∩ Aⱼ), with the second sum over pairs i < j
The upper bound is also called Boole's inequality (or the union bound). These are truncations of the inclusion-exclusion formula: stopping after an odd number of terms gives an upper bound, and stopping after an even number gives a lower bound.
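As a quick numerical check, the bounds can be verified by enumeration. A Python sketch using three illustrative events on a fair die (the events are my own, not from the text):

```python
from fractions import Fraction
from itertools import combinations

# Three overlapping events on a fair six-sided die.
omega = {1, 2, 3, 4, 5, 6}
events = [{1, 2, 3}, {2, 3, 4}, {3, 4, 5}]

def prob(event):
    return Fraction(len(event), len(omega))

exact = prob(set().union(*events))               # exact P(A1 or A2 or A3)
upper = sum(prob(e) for e in events)             # Boole / union bound
lower = upper - sum(prob(a & b) for a, b in combinations(events, 2))

print(lower <= exact <= upper)  # True
print(exact)                    # 5/6
```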

Continuity of probability
If A₁ ⊆ A₂ ⊆ A₃ ⊆ … is an increasing sequence of events with A = ⋃ₙ Aₙ, then:
P(A) = lim P(Aₙ) as n → ∞
Similarly, for a decreasing sequence A₁ ⊇ A₂ ⊇ A₃ ⊇ … with A = ⋂ₙ Aₙ:
P(A) = lim P(Aₙ) as n → ∞
This continuity property follows from countable additivity and is what allows you to use limiting arguments in probability. It's essential when working with sequences of events in survival analysis or ruin theory.
Conditional probability
Conditional probability captures how the likelihood of an event changes when you learn that another event has occurred. In actuarial contexts, this shows up constantly: what's the probability of a large claim given that a claim has been filed? What's the probability of death within a year given that the insured is age 65?
Definition and formula
The conditional probability of A given B is:
P(A | B) = P(A ∩ B) / P(B), provided P(B) > 0
You're restricting your attention to the outcomes where B occurred, then asking what fraction of those also belong to A. The requirement P(B) > 0 is critical; conditioning on an impossible event is undefined.
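The definition can be sketched by counting outcomes. A minimal Python example on a fair die, with A = "roll a 2" and B = "roll an even number" (illustrative, not from the text):

```python
from fractions import Fraction

# Fair die: condition on rolling an even number.
omega = {1, 2, 3, 4, 5, 6}
A = {2}
B = {2, 4, 6}

# P(A | B) = P(A and B) / P(B).
p_B = Fraction(len(B), len(omega))
p_A_and_B = Fraction(len(A & B), len(omega))
p_A_given_B = p_A_and_B / p_B

print(p_A_given_B)  # 1/3
```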
Law of total probability
If B₁, B₂, …, Bₙ is a partition of Ω (the Bᵢ are mutually exclusive and their union is Ω), then for any event A:
P(A) = P(A | B₁)P(B₁) + P(A | B₂)P(B₂) + … + P(A | Bₙ)P(Bₙ)
This lets you compute P(A) by breaking the problem into cases.
Example: An insurer has three risk classes: low (60% of policyholders, 2% claim rate), medium (30%, 5% claim rate), and high (10%, 15% claim rate). The overall claim probability is:
P(claim) = (0.02)(0.60) + (0.05)(0.30) + (0.15)(0.10) = 0.012 + 0.015 + 0.015 = 0.042
So 4.2% of all policyholders file a claim.
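The risk-class calculation above translates directly into code, using the figures from the text:

```python
# Risk classes from the text: (P(class), P(claim | class)).
classes = {
    "low":    (0.60, 0.02),
    "medium": (0.30, 0.05),
    "high":   (0.10, 0.15),
}

# Law of total probability: sum P(claim | class) * P(class) over the partition.
p_claim = sum(p_cls * p_claim_given for p_cls, p_claim_given in classes.values())
print(round(p_claim, 3))  # 0.042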
Bayes' theorem
Bayes' theorem reverses the direction of conditioning. Given the same partition B₁, B₂, …, Bₙ:
P(Bⱼ | A) = P(A | Bⱼ)P(Bⱼ) / [P(A | B₁)P(B₁) + … + P(A | Bₙ)P(Bₙ)]
The denominator is just P(A) from the law of total probability.
Example (continuing from above): If a policyholder filed a claim, what's the probability they're high-risk?
P(high | claim) = P(claim | high)P(high) / P(claim) = (0.15)(0.10) / 0.042 = 0.015 / 0.042 ≈ 0.357
Even though only 10% of policyholders are high-risk, they make up about 35.7% of those who file claims. This kind of posterior updating is central to credibility theory and Bayesian methods in actuarial work.
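The posterior computation above can be sketched the same way, reusing the figures from the text:

```python
# Risk classes from the text: (P(class), P(claim | class)).
classes = {
    "low":    (0.60, 0.02),
    "medium": (0.30, 0.05),
    "high":   (0.10, 0.15),
}

# Denominator: P(claim) via the law of total probability.
p_claim = sum(p * q for p, q in classes.values())

# Bayes' theorem: P(high | claim) = P(claim | high) * P(high) / P(claim).
p_high, p_claim_given_high = classes["high"]
posterior_high = p_claim_given_high * p_high / p_claim

print(round(posterior_high, 3))  # 0.357
```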
Independence of events
Two events are independent if knowing one occurred tells you nothing about whether the other occurred. Independence simplifies calculations dramatically, but you should always verify it rather than assume it.
Definition of independence
Events A and B are independent if and only if:
P(A ∩ B) = P(A)P(B)
An equivalent condition: P(A | B) = P(A) (when P(B) > 0). The occurrence of B doesn't change the probability of A.
Be careful: independence is not the same as being mutually exclusive. If A and B are mutually exclusive and both have positive probability, they cannot be independent (knowing A occurred tells you B didn't).
Multiplication rule for independent events
For independent events A₁, A₂, …, Aₙ:
P(A₁ ∩ A₂ ∩ … ∩ Aₙ) = P(A₁)P(A₂)⋯P(Aₙ)
Example: If three independent policyholders each have a 0.03 probability of filing a claim, the probability all three file is (0.03)³ = 0.000027.
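The three-policyholder example above, using the figure from the text:

```python
# Each of three independent policyholders files a claim with probability 0.03.
p_claim = 0.03

# Multiplication rule for independent events: P(all three file) = 0.03**3.
p_all_three = p_claim ** 3
print(round(p_all_three, 6))  # 2.7e-05
```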
Pairwise vs mutual independence
This distinction matters and is a common exam topic.
- Pairwise independence: Every pair of events satisfies P(Aᵢ ∩ Aⱼ) = P(Aᵢ)P(Aⱼ)
- Mutual independence: Every subset of events satisfies the multiplication rule. For three events A, B, C, this means all of the following must hold: P(A ∩ B) = P(A)P(B), P(A ∩ C) = P(A)P(C), P(B ∩ C) = P(B)P(C), and P(A ∩ B ∩ C) = P(A)P(B)P(C).
Pairwise independence does not guarantee mutual independence. There are well-known counterexamples where three events are pairwise independent but the triple intersection condition fails. For actuarial modeling, mutual independence is typically the assumption you need.
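One such counterexample, the classic two-coin construction, can be checked by enumeration. A Python sketch (the events A, B, C here are my own illustration, not from the text):

```python
from fractions import Fraction
from itertools import product

# Two fair coin flips: 4 equally likely outcomes.
# A = first flip heads, B = second flip heads, C = the flips differ.
omega = list(product("HT", repeat=2))

A = {o for o in omega if o[0] == "H"}
B = {o for o in omega if o[1] == "H"}
C = {o for o in omega if o[0] != o[1]}

def prob(e):
    return Fraction(len(e), len(omega))

# Every pair satisfies the product rule...
print(prob(A & B) == prob(A) * prob(B))  # True
print(prob(A & C) == prob(A) * prob(C))  # True
print(prob(B & C) == prob(B) * prob(C))  # True
# ...but the triple intersection is empty, so mutual independence fails.
print(prob(A & B & C) == prob(A) * prob(B) * prob(C))  # False
```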