Fiveable

📊Honors Statistics Unit 2 Review

2.7 Measures of the Spread of the Data

Written by the Fiveable Content Team • Last updated August 2025

Measures of Spread

Measures of spread tell you how far data points sit from the center. Two datasets can share the same mean but look completely different if one is tightly clustered and the other is widely scattered. Standard deviation, variance, z-scores, and a few other tools let you quantify that variability, compare across datasets, and spot outliers.

Standard Deviation Calculation

Standard deviation measures the typical distance between each data point and the mean. It's the most commonly used measure of spread.

For a sample, the formula is:

s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}

For a population, the formula is:

\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}

Where:

  • x_i = each individual data point
  • \bar{x} (sample) or \mu (population) = the mean
  • n (sample) or N (population) = the number of data points

Notice the denominator difference: samples use n - 1 (called Bessel's correction), which adjusts for the fact that a sample tends to underestimate the true population variability. Populations use N because you have every value and no estimation is needed.

How to calculate standard deviation step by step:

  1. Find the mean (\bar{x}) of your dataset.

  2. Subtract the mean from each data point to get the deviations: x_i - \bar{x}.

  3. Square each deviation: (x_i - \bar{x})^2.

  4. Sum all the squared deviations.

  5. Divide by n - 1 (sample) or N (population). This result is the variance.

  6. Take the square root. This is the standard deviation.

A larger standard deviation means more spread. A smaller one means data clusters tightly around the mean. Keep in mind that outliers inflate standard deviation significantly, since squaring large deviations makes them even larger.
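
The six steps above can be sketched in Python. This is a minimal illustration (the function name and sample data are ours, not from the text), and the `sample` flag switches between the n - 1 and N denominators:

```python
import math

def std_dev(data, sample=True):
    """Standard deviation following the six steps above."""
    n = len(data)
    mean = sum(data) / n                           # step 1: find the mean
    deviations = [x - mean for x in data]          # step 2: deviations from the mean
    squared = [d ** 2 for d in deviations]         # step 3: square each deviation
    total = sum(squared)                           # step 4: sum the squared deviations
    variance = total / (n - 1 if sample else n)    # step 5: divide -> variance
    return math.sqrt(variance)                     # step 6: square root -> std dev

data = [4, 8, 6, 5, 3, 7]
print(round(std_dev(data), 3))                 # sample s -> 1.871
print(round(std_dev(data, sample=False), 3))   # population sigma -> 1.708
```

Note how the sample version (dividing by 5 instead of 6) comes out slightly larger, exactly the adjustment Bessel's correction is meant to provide.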

Variance (s^2 or \sigma^2) is the value you have at step 5, before taking the square root; equivalently, it is the square of the standard deviation. It's useful in many statistical formulas, but its units are squared (e.g., if your data is in inches, variance is in square inches), which makes it harder to interpret directly.

Coefficient of Variation (CV) standardizes the spread relative to the mean:

CV = \frac{s}{\bar{x}} \times 100\%

This is useful when you need to compare variability between datasets that have different units or very different means. For example, comparing the spread of exam scores (mean of 75) to the spread of heights in centimeters (mean of 170) only makes sense with a relative measure like CV.
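
A quick sketch of that comparison, using the means from the example above and standard deviations (s = 7.5 points, s = 8.5 cm) that we assume purely for illustration:

```python
def coefficient_of_variation(mean, s):
    """Spread as a percentage of the mean (unit-free)."""
    return s / mean * 100

# Assumed spreads, for illustration only: s = 7.5 points, s = 8.5 cm.
cv_scores = coefficient_of_variation(75, 7.5)
cv_heights = coefficient_of_variation(170, 8.5)
print(f"exam scores: {cv_scores:.1f}% vs heights: {cv_heights:.1f}%")
```

With these numbers the exam scores vary twice as much relative to their mean as the heights do, even though the raw spreads are similar in size.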

Z-Score Interpretation

A z-score tells you how many standard deviations a data point sits from the mean. It converts any value into a common scale, regardless of the original units.

z = \frac{x - \bar{x}}{s} \quad \text{(sample)} \qquad z = \frac{x - \mu}{\sigma} \quad \text{(population)}

  • A positive z-score means the value is above the mean.
  • A negative z-score means the value is below the mean.
  • A z-score of 0 means the value equals the mean.

The real power of z-scores is comparison across different scales. Suppose you scored 82 on a history exam (mean 75, s = 5) and 90 on a chemistry exam (mean 85, s = 10). Which performance was stronger relative to the class?

  • History: z = \frac{82 - 75}{5} = 1.4
  • Chemistry: z = \frac{90 - 85}{10} = 0.5

Your history score was 1.4 standard deviations above the class average, while your chemistry score was only 0.5 above. Relative to each class, the history performance was stronger, even though the raw chemistry score was higher.

Outlier detection: Data points with z-scores above +3 or below -3 are commonly flagged as potential outliers, since they fall extremely far from the center of the distribution.
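
The exam comparison and the ±3 outlier flag can be sketched in a few lines of Python (function names are ours, for illustration):

```python
def z_score(x, mean, s):
    """Number of standard deviations x sits from the mean."""
    return (x - mean) / s

# Exam comparison from the text:
history = z_score(82, 75, 5)      # 1.4 -> stronger relative performance
chemistry = z_score(90, 85, 10)   # 0.5

# Common outlier flag: |z| beyond 3 in either direction.
def is_outlier(x, mean, s, threshold=3):
    return abs(z_score(x, mean, s)) > threshold

print(history, chemistry)        # 1.4 0.5
print(is_outlier(120, 75, 5))    # z = 9.0 -> True
```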

Data Distribution Rules

Two rules help you predict how much data falls within a certain number of standard deviations from the mean. They differ in when you can use them.

Chebyshev's Rule applies to any dataset, regardless of shape. It guarantees that at least 1 - \frac{1}{k^2} of the data falls within k standard deviations of the mean, where k > 1.

| k (standard deviations) | Minimum % of data within k of the mean |
| --- | --- |
| 2 | 1 - \frac{1}{4} = 75\% |
| 3 | 1 - \frac{1}{9} \approx 88.9\% |
| 4 | 1 - \frac{1}{16} = 93.75\% |
These are minimum guarantees. The actual percentage is often higher, but Chebyshev's gives you a safe lower bound even for skewed or unusual distributions.

The Empirical Rule (68-95-99.7 Rule) is more precise but only applies to distributions that are approximately bell-shaped (normal).

  1. About 68% of data falls within 1 standard deviation of the mean.
  2. About 95% of data falls within 2 standard deviations of the mean.
  3. About 99.7% of data falls within 3 standard deviations of the mean.

When to use which: If you know (or can reasonably assume) your data is approximately normal, use the Empirical Rule for tighter estimates. If the distribution is unknown or clearly non-normal, fall back on Chebyshev's Rule. Chebyshev's is always valid; the Empirical Rule is only valid for bell-shaped data.
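
A small sketch contrasting the two rules. The Chebyshev bound is computed from the formula; the Empirical Rule percentages are the standard approximate constants, hard-coded here:

```python
def chebyshev_bound(k):
    """Minimum fraction of data within k standard deviations -- any distribution, k > 1."""
    if k <= 1:
        raise ValueError("Chebyshev's Rule requires k > 1")
    return 1 - 1 / k ** 2

# Empirical Rule constants (approximate, bell-shaped data only):
EMPIRICAL = {1: 0.68, 2: 0.95, 3: 0.997}

for k in (2, 3):
    print(f"k={k}: Chebyshev guarantees >= {chebyshev_bound(k):.1%}; "
          f"Empirical Rule estimates ~{EMPIRICAL[k]:.1%} for normal data")
```

For any given k, the Empirical Rule estimate always exceeds the Chebyshev minimum, which is exactly what "safe lower bound" versus "tighter estimate" means in practice.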

Additional Measures of Spread

  • Range: The difference between the maximum and minimum values. It's the simplest measure of spread but is highly sensitive to outliers, since a single extreme value changes it dramatically.
  • Interquartile Range (IQR): The difference between the third quartile (Q_3, 75th percentile) and the first quartile (Q_1, 25th percentile): IQR = Q_3 - Q_1. The IQR captures the middle 50% of the data and is resistant to outliers, making it a better choice than range for skewed distributions.
  • Mean Absolute Deviation (MAD): The average of the absolute differences between each data point and the mean: MAD = \frac{\sum |x_i - \bar{x}|}{n}. Because it uses absolute values instead of squaring, MAD is less sensitive to outliers than standard deviation. It's less common in formal statistics but gives an intuitive sense of "average distance from the mean."
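
All three measures can be computed with a short Python sketch. One caveat: quartiles can be defined several ways, and this sketch uses the common "median of each half" convention, so its IQR may differ slightly from other software:

```python
from statistics import median

def spread_measures(data):
    """Range, IQR, and MAD. Quartiles use the 'median of each half' convention."""
    xs = sorted(data)
    n = len(xs)
    data_range = xs[-1] - xs[0]
    half = n // 2
    q1 = median(xs[:half])              # lower half (excludes the middle value when n is odd)
    q3 = median(xs[half + n % 2:])      # upper half
    mean = sum(xs) / n
    mad = sum(abs(x - mean) for x in xs) / n
    return data_range, q3 - q1, mad

data = [2, 4, 4, 5, 7, 9, 100]          # 100 is an extreme outlier
rng, iqr, mad = spread_measures(data)
print(rng, iqr, round(mad, 2))          # range blows up (98); IQR stays small (5)
```

The single outlier drags the range to 98 while the IQR stays at 5, which is the resistance-to-outliers contrast described above.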