Why This Matters
Expected value is the backbone of probability theory—it's how we quantify what "should" happen on average when randomness is involved. Whether you're analyzing casino games, insurance policies, or scientific experiments, expected value gives you the single number that summarizes a random variable's long-run behavior. You're being tested on your ability to calculate expected values across different distribution types, apply key properties like linearity and conditional expectation, and recognize which formula fits which scenario.
Don't just memorize these formulas in isolation. The real exam skill is understanding when to use each one and why it works. Can you recognize a binomial setup and immediately write down E(X)=np? Do you know why linearity of expectation is so powerful? Focus on connecting each formula to the underlying principle it represents—that's what separates a 5 from a 3.
Core Definitions: Discrete vs. Continuous
The fundamental expected value formulas differ based on whether your random variable takes countable values or spans a continuous range. The discrete case uses summation; the continuous case uses integration.
Expected Value of a Discrete Random Variable
- E(X)=∑xᵢ⋅P(xᵢ)—multiply each possible value by its probability, then sum all products
- Weighted average interpretation—values with higher probabilities contribute more to the expected value
- Center of distribution—represents where the probability "mass" balances, useful for comparing random variables
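A minimal Python sketch of this weighted sum; the values and probabilities below are made up purely for illustration:

```python
# Discrete expected value: E(X) = sum of x_i * P(x_i).
# Hypothetical payout distribution for a simple game (dollars).
values = [0, 1, 5]        # possible values x_i
probs = [0.5, 0.3, 0.2]   # P(x_i), must sum to 1

expected_value = sum(x * p for x, p in zip(values, probs))
print(expected_value)     # 0*0.5 + 1*0.3 + 5*0.2 = 1.3
```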
Expected Value of a Continuous Random Variable
- E(X)=∫x⋅f(x)dx, integrated from −∞ to ∞—where f(x) is the probability density function (PDF)
- Balance point concept—if you cut out the PDF shape from cardboard, the expected value is where it would balance
- Analogous to discrete case—integration replaces summation when dealing with uncountably many possible values
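A quick numerical version of the same idea, assuming a made-up density f(x) = 2x on [0, 1] and using SciPy's quad routine for the integral:

```python
from scipy.integrate import quad

# Continuous expected value: E(X) = integral of x * f(x) dx over the support.
# Hypothetical density: f(x) = 2x on [0, 1], which integrates to 1.
def pdf(x):
    return 2 * x

expected_value, _abs_err = quad(lambda x: x * pdf(x), 0, 1)
print(expected_value)   # analytic answer: integral of 2x^2 from 0 to 1 = 2/3
```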
Compare: Discrete vs. Continuous EV—both calculate a weighted average, but discrete uses ∑ while continuous uses ∫. If an FRQ gives you a PDF, you're integrating; if it gives you a probability table, you're summing.
Key Properties: Simplifying Calculations
These properties are your computational shortcuts. Linearity especially shows up constantly because it works even when variables aren't independent.
Linearity of Expectation
- E(X+Y)=E(X)+E(Y)—always true, regardless of independence
- Extends to constants—E(aX+b)=aE(X)+b for any constants a and b
- Problem-solving power—break complex random variables into simpler pieces, find each EV, then add
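A short simulation sketch of why independence is not required; the die roll and the Y = 2X + 1 relationship are hypothetical choices:

```python
import numpy as np

rng = np.random.default_rng(0)

# Linearity holds even when the variables are strongly dependent.
# Hypothetical setup: X is a fair die roll and Y = 2X + 1 depends entirely on X.
x = rng.integers(1, 7, size=1_000_000)   # fair six-sided die (values 1 through 6)
y = 2 * x + 1

print(np.mean(x + y))              # about E(X) + E(Y) = 3.5 + 8 = 11.5
print(np.mean(x) + np.mean(y))     # same number: no independence assumption needed
```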
Expected Value of a Function of a Random Variable
- E(g(X))=∑g(xᵢ)P(xᵢ) for discrete; E(g(X))=∫g(x)f(x)dx for continuous
- LOTUS (Law of the Unconscious Statistician)—you don't need the distribution of g(X), just apply g inside the expectation
- Common application—finding E(X²) to calculate variance via Var(X)=E(X²)−[E(X)]²
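A small sketch of LOTUS and the variance shortcut, using an invented three-value distribution:

```python
# LOTUS: apply g to each value, keep the original probabilities.
# Hypothetical distribution; g(x) = x^2 gives E(X^2), then the variance shortcut.
values = [1, 2, 3]
probs = [0.2, 0.5, 0.3]

e_x = sum(x * p for x, p in zip(values, probs))        # E(X)   = 2.1
e_x2 = sum(x**2 * p for x, p in zip(values, probs))    # E(X^2) = 4.9
variance = e_x2 - e_x**2                               # Var(X) = 4.9 - 2.1^2 = 0.49
print(e_x, e_x2, variance)
```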
Compare: Linearity vs. LOTUS—linearity handles sums of random variables; LOTUS handles functions of a single random variable. Both simplify calculations, but they solve different problems.
Conditional Expectation: Updating with Information
When you know something about one variable, it changes what you expect from another. Conditional expectation formalizes how new information updates your predictions.
Conditional Expected Value
- E(X∣Y=y)=∑xᵢP(X=xᵢ∣Y=y) for discrete; uses the conditional PDF for continuous
- Updates the average—given that Y takes a specific value, what's the new "best guess" for X?
- Measures dependence—if E(X∣Y) varies with Y, then X and Y are dependent
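A sketch of the discrete conditional calculation, assuming a small made-up joint table:

```python
# Conditional expectation from a hypothetical joint distribution.
# joint[(x, y)] = P(X = x, Y = y)
joint = {(0, 0): 0.1, (1, 0): 0.3,
         (0, 1): 0.4, (1, 1): 0.2}

def e_x_given_y(y):
    p_y = sum(p for (_, yy), p in joint.items() if yy == y)             # P(Y = y)
    return sum(x * p / p_y for (x, yy), p in joint.items() if yy == y)  # weight by P(X=x | Y=y)

print(e_x_given_y(0))   # 0.3 / 0.4 = 0.75
print(e_x_given_y(1))   # 0.2 / 0.6 ≈ 0.333 -> varies with Y, so X and Y are dependent
```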
Law of Total Expectation
- E(X)=E(E(X∣Y))—the overall expected value equals the average of conditional expected values
- Partition strategy—break a problem into cases based on Y, solve each case, then recombine
- Also called iterated expectation—essential for multi-stage problems like "first flip a coin, then roll dice based on the result"
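A sketch of the coin-then-dice style of problem mentioned above, with a simulation check; the specific rules (heads means one die, tails means two) are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

# Law of Total Expectation: E(X) = E(E(X | Y)).
# Hypothetical two-stage experiment: flip a fair coin; heads -> roll one die,
# tails -> roll two dice. X is the total shown.
e_given_heads = 3.5    # E(X | heads), one fair die
e_given_tails = 7.0    # E(X | tails), sum of two fair dice
print(0.5 * e_given_heads + 0.5 * e_given_tails)   # exact answer: 5.25

# Simulation check of the same experiment
n = 1_000_000
heads = rng.random(n) < 0.5
one_die = rng.integers(1, 7, size=n)
two_dice = rng.integers(1, 7, size=n) + rng.integers(1, 7, size=n)
print(np.where(heads, one_die, two_dice).mean())   # approximately 5.25
```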
Compare: Conditional EV vs. Law of Total Expectation—conditional EV gives you the answer for a specific scenario; the law of total expectation averages across all scenarios. Use the law when you need the unconditional answer but the conditional calculation is easier.
Common Distributions: Formulas to Memorize
For common distributions, the expected value formulas are derived once and used forever. Know these cold—they're tested directly and save enormous time.
Expected Value of a Binomial Distribution
- E(X)=np—where n is the number of trials and p is the probability of success per trial
- Intuitive meaning—in 100 coin flips with p=0.5, expect 50 heads on average
- Derived from linearity—a binomial is the sum of n independent Bernoulli trials, each with E=p
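A quick check of E(X)=np by simulation, using the hypothetical values n = 100 and p = 0.5:

```python
import numpy as np

rng = np.random.default_rng(2)

# Binomial is a sum of n Bernoulli trials; by linearity, E(X) = n * p.
n, p = 100, 0.5
print(n * p)                                    # formula: 50.0

samples = rng.binomial(n, p, size=1_000_000)    # simulate many 100-flip experiments
print(samples.mean())                           # approximately 50
```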
Expected Value of a Poisson Distribution
- E(X)=λ—where λ is the average rate of occurrence
- Parameter does double duty—for a Poisson distribution, both the mean and the variance equal λ
- Models rare events—number of emails per hour, accidents per month, typos per page
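A simulation sketch of the mean-equals-variance fact, with an arbitrary rate of 4 events per interval:

```python
import numpy as np

rng = np.random.default_rng(3)

# Poisson: mean and variance both equal the rate lambda.
lam = 4.0                                   # hypothetical rate, e.g. 4 emails per hour
samples = rng.poisson(lam, size=1_000_000)

print(samples.mean())   # approximately 4.0, matching E(X) = lambda
print(samples.var())    # approximately 4.0, matching Var(X) = lambda
```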
Compare: Binomial vs. Poisson—binomial has a fixed number of trials; Poisson models events in a continuous interval. As n→∞ and p→0 with np=λ, binomial approaches Poisson.
Expected Value of an Exponential Distribution
- E(X)=1/λ—where λ is the rate parameter
- Waiting time interpretation—average time until the next event in a Poisson process
- Memoryless property connection—the exponential distribution "forgets" how long you've already waited
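A sketch confirming E(X)=1/λ; note that NumPy's exponential sampler takes the scale 1/λ rather than the rate, and the rate of 0.5 below is arbitrary:

```python
import numpy as np

rng = np.random.default_rng(4)

# Exponential: E(X) = 1 / lambda. NumPy parameterizes by scale = 1 / lambda.
lam = 0.5                                                 # hypothetical rate: 0.5 arrivals/minute
samples = rng.exponential(scale=1 / lam, size=1_000_000)

print(1 / lam)           # formula: 2.0 minutes of average waiting
print(samples.mean())    # approximately 2.0
```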
Expected Value of a Uniform Distribution
- E(X)=(a+b)/2—simply the midpoint of the interval [a,b]
- Symmetry argument—all values equally likely, so the average is the center
- Quick sanity check—if your uniform is on [0,10], expected value must be 5
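A last quick check of the midpoint formula on the [0,10] example from the sanity check above:

```python
import numpy as np

rng = np.random.default_rng(5)

# Uniform on [a, b]: E(X) = (a + b) / 2, the midpoint of the interval.
a, b = 0, 10
print((a + b) / 2)                              # formula: 5.0

samples = rng.uniform(a, b, size=1_000_000)
print(samples.mean())                           # approximately 5.0
```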
Compare: Exponential vs. Uniform—both are continuous, but exponential is skewed right (most values near zero) while uniform is symmetric. Their expected value formulas reflect this: exponential depends only on rate, uniform depends only on boundaries.
Quick Reference Table
| Concept | Formula / Key Fact |
| --- | --- |
| Discrete EV formula | E(X)=∑xᵢP(xᵢ) |
| Continuous EV formula | E(X)=∫xf(x)dx |
| Linearity property | E(X+Y)=E(X)+E(Y), E(aX+b)=aE(X)+b |
| Function of RV (LOTUS) | E(g(X)) using the original distribution |
| Conditional expectation | E(X∣Y), Law of Total Expectation |
| Binomial | E(X)=np |
| Poisson | E(X)=λ |
| Exponential | E(X)=1/λ |
| Uniform | E(X)=(a+b)/2 |
Self-Check Questions
- What property allows you to calculate E(X₁+X₂+⋯+Xₙ) without knowing whether the variables are independent?
- Compare the expected value formulas for Poisson and exponential distributions—how are their parameters related, and why does this make sense given the Poisson process interpretation?
- If X is uniform on [2,8] and Y is binomial with n=12 and p=0.5, which has the larger expected value? Show your work.
- Explain when you would use the Law of Total Expectation instead of calculating E(X) directly. Give an example scenario.
- You need to find E(X²) for a discrete random variable. Would you use linearity of expectation or LOTUS? Write the formula you'd apply.