Probability Concepts
Probability measures how likely events are to occur, giving us a way to quantify uncertainty. It's the foundation of statistical analysis and shows up everywhere, from predicting weather to evaluating risk in finance. This section covers the core rules, counting methods, and probability distributions you'll need for the course.
Introduction to Probability
Probability is a number between 0 and 1 that describes how likely an event is to happen. A probability of 0 means the event is impossible, and a probability of 1 means it's certain.
The sample space is the set of all possible outcomes for a given experiment. For example, the sample space for rolling a standard die is {1, 2, 3, 4, 5, 6}.
Three axioms (fundamental rules) govern all of probability:
- The probability of any event is non-negative: P(A) >= 0
- The probability of the entire sample space is 1: P(S) = 1
- For mutually exclusive events (events that can't happen at the same time), the probability of one or the other occurring equals the sum of their individual probabilities: P(A or B) = P(A) + P(B)
Everything else in probability builds on these three rules.
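As a quick sketch, the three axioms can be checked directly on a toy model of a fair die (this is just an illustration, not library code):

```python
# A toy probability model for a fair six-sided die, used to check the axioms.
from fractions import Fraction

die = {face: Fraction(1, 6) for face in range(1, 7)}

# Axiom 1: every probability is non-negative.
assert all(p >= 0 for p in die.values())

# Axiom 2: the probabilities over the entire sample space sum to 1.
assert sum(die.values()) == 1

# Axiom 3: mutually exclusive events add. Rolling a 1 and rolling a 6
# can't happen on the same roll, so their probabilities simply sum.
p_one_or_six = die[1] + die[6]
print(p_one_or_six)  # 1/3
```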

Addition and Multiplication Rules
These rules let you calculate probabilities for combined events. The addition rules handle "or" situations, and the multiplication rules handle "and" situations.
Addition Rule (Mutually Exclusive Events)
When two events can't happen at the same time, you simply add their probabilities: P(A or B) = P(A) + P(B)
For example, the probability of rolling a 1 or a 6 on a fair die is 1/6 + 1/6 = 2/6 = 1/3.
Addition Rule (Non-Mutually Exclusive Events)
When two events can happen at the same time, you need to subtract the overlap to avoid counting it twice: P(A or B) = P(A) + P(B) - P(A and B)
For example, drawing a heart or a face card from a standard deck: 13/52 + 12/52 - 3/52 = 22/52 = 11/26. Some cards are both hearts and face cards (the jack, queen, and king of hearts), so you subtract those three out.
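The heart-or-face-card example can be verified by brute-force counting over a deck (a minimal sketch; the card representation here is just illustrative):

```python
# Verify the addition rule by enumerating a standard 52-card deck.
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = set(product(ranks, suits))  # 52 (rank, suit) pairs

hearts = {card for card in deck if card[1] == "hearts"}
faces = {card for card in deck if card[0] in {"J", "Q", "K"}}

# P(heart or face) = P(heart) + P(face) - P(heart and face)
p = (Fraction(len(hearts), 52) + Fraction(len(faces), 52)
     - Fraction(len(hearts & faces), 52))

# Sanity check: the formula agrees with counting the union directly.
assert p == Fraction(len(hearts | faces), 52)
print(p)  # 11/26
```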
Multiplication Rule (Independent Events)
When one event doesn't affect the other, multiply their probabilities: P(A and B) = P(A) * P(B)
Flipping heads on a coin and rolling a 6 on a die are independent. Neither outcome influences the other, so P(heads and 6) = 1/2 * 1/6 = 1/12.
Multiplication Rule (Dependent Events)
When one event changes the probability of the other, you use conditional probability: P(A and B) = P(A) * P(B | A)
P(B | A) is read as "the probability of B given A." For example, if you draw an ace from a standard deck (probability 4/52), the probability of drawing a second ace changes to 3/51 because there are now fewer aces and fewer cards.
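The two-aces calculation, using exact fractions so nothing rounds away:

```python
# Probability of drawing two aces in a row, without replacement.
from fractions import Fraction

p_first_ace = Fraction(4, 52)           # 4 aces among 52 cards
p_second_given_first = Fraction(3, 51)  # one ace and one card are gone
p_both_aces = p_first_ace * p_second_given_first
print(p_both_aces)  # 1/221
```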
Bayes' Theorem lets you reverse conditional probabilities. It calculates P(A | B) when you know P(B | A), which is useful for updating your beliefs when you get new information: P(A | B) = P(B | A) * P(A) / P(B). At the intro level, the key idea is that Bayes' theorem connects P(A | B) and P(B | A), which are not the same thing.
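A small numeric sketch of Bayes' theorem (the disease-testing numbers below are invented purely for illustration):

```python
# Bayes' theorem: P(disease | positive) from P(positive | disease).
# All three input probabilities are made-up illustrative values.
p_disease = 0.01            # P(A): prior probability of having the disease
p_pos_given_disease = 0.95  # P(B | A): test sensitivity
p_pos_given_healthy = 0.05  # false positive rate

# P(B) via the law of total probability.
p_positive = (p_pos_given_disease * p_disease
              + p_pos_given_healthy * (1 - p_disease))

p_disease_given_positive = p_pos_given_disease * p_disease / p_positive
print(round(p_disease_given_positive, 3))
```

Even with a 95% sensitive test, P(disease | positive) comes out far below 0.95, because the disease is rare — exactly the gap between P(A | B) and P(B | A) that the theorem quantifies.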

Permutations and Combinations
These are counting techniques. They help you figure out how many possible outcomes exist, which you then use to calculate probabilities.
Fundamental Counting Principle
If there are m ways to do one thing and n ways to do another, there are m * n ways to do both. This extends to any number of choices. If you have 4 shirts and 3 pairs of pants, that's 4 * 3 = 12 possible outfits.
Permutations (Order Matters)
A permutation counts the number of ways to arrange r objects chosen from n distinct objects, where the order of arrangement matters: P(n, r) = n! / (n - r)!
Arranging 3 books on a shelf from a collection of 5: P(5, 3) = 5!/2! = 5 * 4 * 3 = 60 different arrangements. Placing Book A first and Book B second is a different outcome than Book B first and Book A second.
Combinations (Order Doesn't Matter)
A combination counts the number of ways to select r objects from n objects when you don't care about the order: C(n, r) = n! / (r! (n - r)!)
Selecting 3 pizza toppings from 6 options: C(6, 3) = 6! / (3! 3!) = 20 possible selections. Choosing pepperoni, mushrooms, and olives is the same selection regardless of what order you pick them.
Permutations with Repetition
When repetition is allowed and order matters, each of the r positions can be filled by any of the n options: n^r
A 4-digit PIN using digits 0-9: 10^4 = 10,000 possible codes.
Combinations with Repetition
When repetition is allowed and order doesn't matter: C(n + r - 1, r)
Choosing 3 scoops of ice cream from 5 flavors (you can repeat flavors): C(5 + 3 - 1, 3) = C(7, 3) = 35 possible selections.
Quick decision guide: Ask yourself two questions. (1) Does order matter? If yes, use permutations. If no, use combinations. (2) Is repetition allowed? Together, the two answers pick out exactly one of the four formulas above.
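All four counting cases are one-liners with Python's standard library, which makes a handy way to check homework answers:

```python
# The four counting formulas, matched to the examples above.
import math

ordered_no_rep = math.perm(5, 3)              # 3 books from 5, order matters
unordered_no_rep = math.comb(6, 3)            # 3 toppings from 6, order irrelevant
ordered_with_rep = 10 ** 4                    # 4-digit PIN from 10 digits
unordered_with_rep = math.comb(5 + 3 - 1, 3)  # 3 scoops from 5 flavors, repeats OK

print(ordered_no_rep, unordered_no_rep, ordered_with_rep, unordered_with_rep)
# 60 20 10000 35
```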
Probability Distributions and Expected Values
A probability distribution describes all the possible values a random variable can take and how likely each value is.
- A discrete probability distribution assigns probabilities to specific, countable values. For example, the number of heads in 3 coin flips can only be 0, 1, 2, or 3.
- A continuous probability distribution assigns probabilities to intervals of values. For example, the time until a lightbulb burns out could be any positive number, so you calculate the probability of it lasting between 900 and 1000 hours rather than exactly 950 hours.
Expected Value
The expected value (also called the mean) tells you the long-run average outcome of a random variable. For a discrete random variable: E(X) = sum of x * P(x) over all possible values x
You multiply each possible value by its probability, then add them all up. For example, if a game pays $10 with probability 0.3 and $0 with probability 0.7, the expected value is 10 * 0.3 + 0 * 0.7 = $3.
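The game above translates directly into code (the outcome list is just the example's payout table):

```python
# Expected value of a discrete random variable: sum of value * probability.
outcomes = [(10, 0.3), (0, 0.7)]  # (payout in dollars, probability)

expected = sum(x * p for x, p in outcomes)
print(expected)  # 3.0
```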
Two useful properties:
- Linearity: E(aX + bY) = aE(X) + bE(Y), where a and b are constants. This always holds, even if X and Y aren't independent.
- The expected value of a constant is just that constant: E(c) = c.
Variance and Standard Deviation
Variance measures how spread out the distribution is around the mean: Var(X) = E[(X - mu)^2] = sum of (x - mu)^2 * P(x), where mu = E(X)
Standard deviation is the square root of variance: sigma = sqrt(Var(X)). It's in the same units as the original variable, which makes it easier to interpret than variance.
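Continuing the same $10-or-nothing game as a sketch, the variance and standard deviation follow the definitions directly:

```python
# Variance and standard deviation of the $10-with-probability-0.3 game.
import math

outcomes = [(10, 0.3), (0, 0.7)]  # (payout, probability)

mean = sum(x * p for x, p in outcomes)                    # E(X) = 3
variance = sum((x - mean) ** 2 * p for x, p in outcomes)  # 49*0.3 + 9*0.7 = 21
std_dev = math.sqrt(variance)                             # same units as the payout

print(round(variance, 3), round(std_dev, 3))
```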
Using Expected Values for Decisions
Expected value is a practical tool for comparing options:
- Compare the expected values of different choices to see which one is better on average. For instance, if Investment A has an expected return of $500 and Investment B has an expected return of $400, Investment A is better on average.
- Consider the spread too. Two options might have the same expected value but very different levels of risk. The probability distribution helps you assess that.
The law of large numbers ties this together: as you repeat an experiment more and more times, the sample mean gets closer and closer to the expected value. This is why expected value represents the long-run average, not a guarantee for any single trial.
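A quick simulation makes the law of large numbers visible for the $3 expected-value game (the seed and sample sizes are arbitrary choices for the demo):

```python
# Law of large numbers: the sample mean of repeated plays drifts toward E(X) = 3.
import random

random.seed(42)  # arbitrary seed so the demo is repeatable

for n in (100, 10_000, 1_000_000):
    # Each play pays $10 with probability 0.3, otherwise $0.
    total = sum(10 if random.random() < 0.3 else 0 for _ in range(n))
    print(n, total / n)
```

Any single play still pays either $10 or $0 — only the average over many plays settles near $3.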
Additional Probability Concepts
Odds express probability as a ratio of an event happening versus not happening. If the probability of rain is 0.25, the odds are 0.25 / 0.75 = 1/3, or 1 to 3 (often written 1:3). Odds and probability convey the same information in different formats.
Probability trees are diagrams that map out sequential events branch by branch. Each branch represents a possible outcome, and you multiply along the branches to find the probability of a specific sequence. They're especially helpful for visualizing dependent events, like drawing cards without replacement, where the probabilities change at each step.
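The multiply-along-branches rule can be sketched for a small tree: drawing two cards without replacement and asking for exactly one ace (branch probabilities follow the card counts at each step):

```python
# Probability tree for two card draws without replacement.
# Multiply along each branch, then add the branches that give exactly one ace.
from fractions import Fraction

p_ace_then_not = Fraction(4, 52) * Fraction(48, 51)  # ace first, non-ace second
p_not_then_ace = Fraction(48, 52) * Fraction(4, 51)  # non-ace first, ace second

p_exactly_one_ace = p_ace_then_not + p_not_then_ace
print(p_exactly_one_ace)  # 32/221
```

Note how the second-draw probabilities (48/51 and 4/51) differ from the first-draw ones — that changing denominator is exactly what the tree diagram tracks for dependent events.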