Expected Value and Standard Deviation
Expected value and standard deviation let you summarize a discrete random variable with just two numbers: where the distribution centers and how spread out it is. These tools are essential for predicting outcomes and comparing different probability distributions.

Expected Value of Discrete Variables
The expected value of a discrete random variable X, written E(X) or μ, is the long-run average you'd observe if you repeated the random process many times. You calculate it by multiplying each possible value by its probability, then adding up all those products.
For a fair six-sided die:

E(X) = 1(1/6) + 2(1/6) + 3(1/6) + 4(1/6) + 5(1/6) + 6(1/6) = 21/6 = 3.5

Notice that 3.5 isn't even a value you can roll. The expected value doesn't have to be a possible outcome; it's the balancing point of the distribution. Think of it as where the probability distribution would balance if you placed it on a number line.
The probability mass function (PMF) is what gives you the probability for each possible value of a discrete random variable. It's the table or formula you plug into the expected value calculation.
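As a quick sketch of that calculation (the `expected_value` helper and `die_pmf` names are made up for illustration; the PMF is stored as a dict mapping values to probabilities):

```python
from fractions import Fraction

def expected_value(pmf):
    """E(X): sum of value * probability over a PMF given as a dict."""
    return sum(x * p for x, p in pmf.items())

# Fair six-sided die: each face has probability 1/6.
# Fraction keeps the arithmetic exact instead of floating point.
die_pmf = {x: Fraction(1, 6) for x in range(1, 7)}

print(expected_value(die_pmf))  # 7/2, i.e. 3.5
```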

Law of Large Numbers Interpretation
The law of large numbers says that as you repeat a random process more and more times, the average of your observed results gets closer and closer to the expected value.
- Flip a fair coin 10 times and you might get 70% heads. Flip it 10,000 times and the proportion of heads will be very close to 0.5.
- Roll a die 5 times and your average might be 4.2. Roll it 5,000 times and the average will hover near 3.5.
This is why E(X) is called the "expected" value: it's what you expect to see on average over many trials. It also explains why larger sample sizes give more reliable estimates of population parameters.
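A small simulation (the loop structure and seed are my own choices) illustrates this convergence for die rolls:

```python
import random

random.seed(42)  # fixed seed so the run is reproducible

# Average of n die rolls for increasing n: the averages
# settle near the expected value, 3.5, as n grows.
for n in (10, 100, 10_000, 100_000):
    rolls = [random.randint(1, 6) for _ in range(n)]
    print(n, sum(rolls) / n)
```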

Variance and Standard Deviation Computation
Expected value tells you the center, but two distributions can share the same mean and look completely different. Variance and standard deviation capture how spread out the values are around that center.
Variance is the average squared deviation from the mean:

Var(X) = Σ (x − μ)² · P(X = x)

Standard deviation is the square root of the variance, which brings the units back to the same scale as X:

σ = √Var(X)
Here's how to compute them step by step:
- Find the mean μ using the method above.
- Subtract the mean from each possible value to get the deviation: x − μ.
- Square each deviation: (x − μ)².
- Multiply each squared deviation by its probability: (x − μ)² · P(X = x).
- Sum all the products to get the variance.
- Take the square root of the variance to get the standard deviation.
Worked example: Suppose X has this distribution: P(X = 0) = 0.2, P(X = 1) = 0.5, P(X = 2) = 0.3.
First, find the mean:

μ = 0(0.2) + 1(0.5) + 2(0.3) = 1.1

Then compute the variance:

Var(X) = (0 − 1.1)²(0.2) + (1 − 1.1)²(0.5) + (2 − 1.1)²(0.3) = 0.242 + 0.005 + 0.243 = 0.49

so σ = √0.49 = 0.7.
A larger standard deviation means the values are more spread out from the mean; a smaller one means they cluster tightly around it.
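For instance, these two made-up distributions share a mean of 5 but have very different spreads:

```python
def sd(pmf):
    """Standard deviation of a PMF given as a value -> probability dict."""
    mu = sum(x * p for x, p in pmf.items())
    return sum((x - mu) ** 2 * p for x, p in pmf.items()) ** 0.5

tight = {4: 0.25, 5: 0.5, 6: 0.25}   # values cluster near the mean
spread = {1: 0.25, 5: 0.5, 9: 0.25}  # same mean, values far from it
print(sd(tight))   # ≈ 0.707
print(sd(spread))  # ≈ 2.828
```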
Population and Sample Statistics
- Population parameters (like μ and σ) describe the entire population. When you know the full probability distribution, you can calculate these exactly using the formulas above.
- Sample statistics (like x̄ and s) are calculated from a subset of data and serve as estimates of the population parameters.
In this unit, you're typically working with known probability distributions, so you're computing population parameters directly. Later in the course, you'll work more with sample statistics and the challenge of estimating parameters from incomplete data.
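Python's standard library makes this distinction explicit; with some made-up sample data:

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations

# Population standard deviation: divide the squared deviations by n.
print(statistics.pstdev(data))  # 2.0

# Sample standard deviation: divide by n - 1, giving a slightly
# larger value that better estimates an unknown population sigma.
print(statistics.stdev(data))
```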