Measures of the Spread of Data
Standard deviation is a key measure of data spread. It tells you how far values typically stray from the average. Calculating it involves finding the mean, squaring deviations, and taking the square root of their average.
Z-scores and distribution rules help you interpret that spread. Z-scores express how many standard deviations a value sits from the mean. Chebyshev's Rule and the Empirical Rule describe how much data falls within certain ranges of standard deviations.
Standard Deviation Calculation
Standard deviation measures how spread out a dataset is relative to its mean. A small standard deviation means values cluster tightly around the mean; a large one means they're scattered more widely. It's denoted by σ for a population and s for a sample.
Here's how to calculate it:
1. Find the mean of the dataset.
2. Subtract the mean from each data point to get the deviations.
3. Square each deviation (this eliminates negatives so they don't cancel out).
4. Sum all the squared deviations.
5. Divide that sum by n (for a population) or by n − 1 (for a sample) to get the variance.
6. Take the square root of the variance to get the standard deviation.
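As a minimal sketch (with illustrative numbers), the six steps above look like this in Python:

```python
# Step-by-step sample standard deviation, following the list above.
data = [4, 8, 6, 5, 3, 7]                      # hypothetical dataset

mean = sum(data) / len(data)                   # step 1: mean (5.5)
deviations = [x - mean for x in data]          # step 2: deviations
squared = [d ** 2 for d in deviations]         # step 3: square each deviation
total = sum(squared)                           # step 4: sum of squares (17.5)
variance = total / (len(data) - 1)             # step 5: divide by n - 1 (sample)
std_dev = variance ** 0.5                      # step 6: square root

print(round(std_dev, 4))                       # 1.8708
```

For a population, step 5 would divide by `len(data)` instead.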
Why n − 1 for a sample? A sample tends to underestimate the true population spread. Dividing by n − 1 instead of n corrects for that bias. You'll sometimes hear this called Bessel's correction.
Population standard deviation:
σ = √[ Σ(xᵢ − μ)² / N ]
- σ: population standard deviation
- xᵢ: individual data points
- μ: population mean
- N: number of data points in the population
Sample standard deviation:
s = √[ Σ(xᵢ − x̄)² / (n − 1) ]
- s: sample standard deviation
- xᵢ: individual data points
- x̄: sample mean
- n: number of data points in the sample
When do you use which? If you have data for every member of the group you care about (e.g., weights of all apples harvested from an orchard), that's a population, so use σ. If you're working with a subset (e.g., heights of 30 students chosen from a university), that's a sample, so use s.
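Python's standard library makes this distinction explicit: `statistics.pstdev` divides by n (population) and `statistics.stdev` divides by n − 1 (sample). A short sketch with made-up data:

```python
import statistics

# Hypothetical population: weights (kg) of ALL apples from a small harvest
apples = [0.18, 0.22, 0.20, 0.25, 0.19, 0.21]
print(statistics.pstdev(apples))    # population SD: divides by n

# Hypothetical sample: heights (cm) of 5 students drawn from a university
heights = [162, 175, 168, 180, 171]
print(statistics.stdev(heights))    # sample SD: divides by n - 1
```

For the same dataset, the sample version is always slightly larger, reflecting Bessel's correction.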

Z-Score Interpretation
A z-score tells you how many standard deviations a data point is from the mean. It's calculated with:
z = (x − μ) / σ
- x: the data point
- μ: the population mean
- σ: the population standard deviation
A positive z-score means the value is above the mean; a negative z-score means it's below. For example, if the mean exam score is 75 with a standard deviation of 10, a student who scored 90 has a z-score of (90 − 75) / 10 = 1.5. That score sits 1.5 standard deviations above average.
The real power of z-scores is that they let you compare values from completely different datasets. Suppose you scored 85 on a math test (mean 78, SD 5) and 90 on an English test (mean 82, SD 10). Your math z-score is (85 − 78) / 5 = 1.4, and your English z-score is (90 − 82) / 10 = 0.8. Even though your English score was numerically higher, you performed relatively better in math.
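Both examples reduce to the same one-line formula, sketched here as a small helper:

```python
def z_score(x, mean, sd):
    """How many standard deviations x sits from the mean."""
    return (x - mean) / sd

print(z_score(90, 75, 10))   # exam example: 1.5
print(z_score(85, 78, 5))    # math test: 1.4
print(z_score(90, 82, 10))   # English test: 0.8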
Z-scores can also be used with a standard normal distribution table to find percentiles and probabilities, which you'll use more in later units.

Data Distribution Rules
Two rules describe how much data falls within a certain number of standard deviations from the mean. The key difference: one works for any dataset, and the other only works for normally distributed data.
Chebyshev's Rule (works for any dataset, regardless of shape):
- At least 75% of data falls within 2 standard deviations of the mean
- At least 89% of data falls within 3 standard deviations of the mean
- General formula: at least 1 − 1/k² of data falls within k standard deviations of the mean, for any k > 1
Chebyshev's gives you guaranteed minimums. The actual percentages are often higher, but you can always count on at least these amounts.
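The 75% and 89% figures fall straight out of the general formula; a quick sketch:

```python
def chebyshev_min_fraction(k):
    """Minimum fraction of ANY dataset within k standard deviations (k > 1)."""
    return 1 - 1 / k**2

print(chebyshev_min_fraction(2))   # 0.75  -> "at least 75%"
print(chebyshev_min_fraction(3))   # 0.888... -> "at least 89%"
```

Note these are guaranteed floors, not estimates: real datasets usually beat them.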
Empirical Rule (only for bell-shaped/normal distributions):
- About 68% of data falls within 1 standard deviation of the mean
- About 95% of data falls within 2 standard deviations
- About 99.7% of data falls within 3 standard deviations
This is sometimes called the 68-95-99.7 Rule. It's more precise than Chebyshev's, but you can only use it when the data is approximately normal. Heights of adult males follow a normal distribution, so the Empirical Rule applies there. Employee salaries at a company are typically skewed right, so you'd fall back on Chebyshev's Rule instead.
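You can see the 68-95-99.7 pattern emerge empirically by simulating normal data; this sketch uses arbitrary parameters (mean 100, SD 15) and a fixed seed:

```python
import random

random.seed(42)
data = [random.gauss(100, 15) for _ in range(100_000)]  # simulated normal data

mean = sum(data) / len(data)
sd = (sum((x - mean) ** 2 for x in data) / len(data)) ** 0.5

fracs = {}
for k in (1, 2, 3):
    # fraction of points within k standard deviations of the mean
    fracs[k] = sum(abs(x - mean) <= k * sd for x in data) / len(data)
    print(f"within {k} SD: {fracs[k]:.3f}")
```

The printed fractions land very close to 0.68, 0.95, and 0.997. Run the same check on right-skewed data (e.g., simulated salaries) and the 1-SD fraction drifts away from 68%, which is why the Empirical Rule doesn't apply there.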
Additional Measures of Spread
- Range: The difference between the maximum and minimum values. Simple but sensitive to outliers.
- Interquartile range (IQR): The difference between the third quartile (75th percentile) and the first quartile (25th percentile). Because it only looks at the middle 50% of data, it's much more resistant to outliers than the range.
- Coefficient of variation (CV): The ratio of the standard deviation to the mean, often expressed as a percentage. It's useful when you want to compare variability between datasets that have different units or very different scales.
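All three of these measures are easy to compute with the standard library; a sketch with an illustrative dataset containing one outlier:

```python
import statistics

data = [12, 15, 14, 10, 18, 16, 13, 45]       # 45 is an outlier

data_range = max(data) - min(data)             # range: blown up by the outlier
q1, _, q3 = statistics.quantiles(data, n=4)    # default "exclusive" quartiles
iqr = q3 - q1                                  # IQR: barely affected
cv = statistics.stdev(data) / statistics.mean(data) * 100  # CV as a percentage

print(data_range, iqr, round(cv, 1))
```

Note the range (35) is dominated by the single outlier, while the IQR (5.25) reflects only the middle 50% of the data.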