4.2 Mean or Expected Value and Standard Deviation

Written by the Fiveable Content Team • Last updated August 2025

Expected Value and Standard Deviation

Expected value and standard deviation give you two essential pieces of information about any discrete random variable: where the center is and how spread out the values are. These tools let you summarize an entire probability distribution with just two numbers, which is the basis for making predictions and comparing outcomes.

Mean of Discrete Probability Distributions

The mean (or expected value) of a discrete random variable is the long-run average you'd expect after repeating an experiment many, many times. It's denoted by $\mu$ or $E(X)$.

You calculate it by multiplying each possible outcome by its probability, then adding all those products together:

$$E(X) = \mu = \sum_{i=1}^{n} x_i \cdot P(X = x_i)$$

  • $x_i$ = each possible value the random variable can take
  • $P(X = x_i)$ = the probability of that value occurring
  • $n$ = the total number of possible values

Example: Rolling a fair die. Each face (1 through 6) has probability $\frac{1}{6}$, so:

$$E(X) = 1\left(\tfrac{1}{6}\right) + 2\left(\tfrac{1}{6}\right) + 3\left(\tfrac{1}{6}\right) + 4\left(\tfrac{1}{6}\right) + 5\left(\tfrac{1}{6}\right) + 6\left(\tfrac{1}{6}\right) = 3.5$$

Notice that 3.5 isn't a value you can actually roll. That's fine. The expected value doesn't have to be a possible outcome; it represents the theoretical average over many rolls.
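This long-run average is easy to check numerically. Here is a minimal sketch in Python (the variable names are just for illustration):

```python
# Expected value of a discrete random variable:
# multiply each outcome by its probability, then sum the products.
values = [1, 2, 3, 4, 5, 6]   # faces of a fair die
probs = [1 / 6] * 6           # each face is equally likely

expected = sum(x * p for x, p in zip(values, probs))
print(expected)  # close to 3.5 (up to floating-point rounding)
```

Swapping in any other list of values and probabilities (as long as the probabilities sum to 1) computes the expected value of that distribution the same way.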


Standard Deviation of Discrete Distributions

The standard deviation ($\sigma$) tells you how far values typically fall from the mean. A small standard deviation means outcomes cluster tightly around $\mu$; a large one means they're more spread out.

To get there, you first calculate the variance ($\sigma^2$):

$$Var(X) = \sigma^2 = \sum_{i=1}^{n} (x_i - \mu)^2 \cdot P(X = x_i)$$

Then the standard deviation is the square root of the variance: $\sigma = \sqrt{\sigma^2}$.

Steps to calculate standard deviation:

  1. Compute the mean $\mu$ using the expected value formula.
  2. For each possible value $x_i$, subtract the mean: $(x_i - \mu)$.
  3. Square each of those differences: $(x_i - \mu)^2$.
  4. Multiply each squared difference by its probability: $(x_i - \mu)^2 \cdot P(X = x_i)$.
  5. Add up all those products. This sum is the variance.
  6. Take the square root of the variance to get the standard deviation.

Example: Suppose a random variable $X$ has this distribution:

| $x$ | 1 | 3 | 5 |
|---|---|---|---|
| $P(X = x)$ | 0.2 | 0.5 | 0.3 |

  • Mean: $\mu = 1(0.2) + 3(0.5) + 5(0.3) = 0.2 + 1.5 + 1.5 = 3.2$
  • Variance: $(1-3.2)^2(0.2) + (3-3.2)^2(0.5) + (5-3.2)^2(0.3) = (4.84)(0.2) + (0.04)(0.5) + (3.24)(0.3) = 0.968 + 0.02 + 0.972 = 1.96$
  • Standard deviation: $\sigma = \sqrt{1.96} = 1.4$

A common mistake is forgetting to weight by the probabilities. You're not just averaging the squared differences; you're taking a probability-weighted average.
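The step-by-step calculation above can be sketched in Python using the same distribution (variable names are illustrative):

```python
values = [1, 3, 5]
probs = [0.2, 0.5, 0.3]

# Step 1: the mean is the probability-weighted sum of the values.
mu = sum(x * p for x, p in zip(values, probs))

# Steps 2-5: weight each squared deviation from the mean by its
# probability, then add the products to get the variance.
variance = sum((x - mu) ** 2 * p for x, p in zip(values, probs))

# Step 6: the standard deviation is the square root of the variance.
sigma = variance ** 0.5

print(mu, variance, sigma)  # approximately 3.2, 1.96, 1.4
```

Note that the variance line multiplies each squared difference by its probability, which is exactly the weighting step that's easy to forget.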


Law of Large Numbers Interpretation

The Law of Large Numbers is what connects expected value to the real world. It says that as you repeat an experiment more and more times, the sample mean (the actual average of your results) gets closer and closer to the expected value $\mu$.

This also applies to relative frequencies. As the number of trials grows, the relative frequency of an event approaches its true probability:

$$\text{Relative frequency of } A = \frac{\text{Number of times } A \text{ occurs}}{\text{Total number of trials}}$$

Example: Flip a fair coin 10 times and you might get 7 heads (relative frequency = 0.70). Flip it 10,000 times and the relative frequency of heads will be very close to 0.50. The expected value doesn't guarantee any single result; it describes what happens in the long run.
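You can watch this convergence happen with a quick simulation (a sketch; the seed and trial counts here are arbitrary choices):

```python
import random

random.seed(1)  # fixed seed so the run is reproducible

def heads_frequency(n_flips):
    """Relative frequency of heads in n_flips fair coin tosses."""
    heads = sum(random.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips

# As the number of flips grows, the relative frequency of heads
# settles toward the true probability 0.5.
for n in (10, 1_000, 100_000):
    print(n, heads_frequency(n))
```

With only 10 flips the frequency can land far from 0.5, but by 100,000 flips it is reliably very close, which is the Law of Large Numbers at work.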

This is why expected value matters: it's not a prediction for one trial, but a reliable summary of what happens over many trials.