🎲Intro to Statistics

Measures of Dispersion

Why This Matters

When you describe a dataset, the mean or median only tells half the story—you also need to know how spread out the values are. Measures of dispersion answer the critical question: how much do individual data points vary from the center? This concept underpins everything from understanding sampling variability to interpreting confidence intervals and hypothesis tests. On exams, you'll need to calculate these measures, choose the right one for different situations, and explain what they reveal about your data.

The key insight here is that different measures of spread serve different purposes. Some are sensitive to outliers, others resist them. Some preserve original units, others standardize for comparison. Don't just memorize formulas—understand when to use each measure and what it tells you about the shape and variability of your distribution. That's what separates a correct answer from a complete one.

Simple Range-Based Measures

These measures use specific data points (like maximums, minimums, or quartiles) to capture spread. They're intuitive and easy to calculate, but they don't use every observation in the dataset.

Range

Calculated as maximum minus minimum—the simplest possible measure of spread, requiring only two values
Highly sensitive to outliers since it depends entirely on the most extreme observations in your dataset
Formula: $\text{Range} = \text{Maximum} - \text{Minimum}$, useful for quick assessments but rarely sufficient alone

Interquartile Range (IQR)

Measures the spread of the middle 50% of data—calculated as the difference between the third and first quartiles
Resistant to outliers because it ignores the extreme values in the upper and lower 25% of the distribution
Formula: $\text{IQR} = Q_3 - Q_1$, and it's the basis for the 1.5 × IQR rule, which flags values below $Q_1 - 1.5 \times \text{IQR}$ or above $Q_3 + 1.5 \times \text{IQR}$ as outliers
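To see how differently range and IQR react to one extreme value, here is a minimal sketch using Python's standard `statistics` module. The dataset and the helper name `spread_summary` are hypothetical, and the quartiles use the module's "inclusive" convention; textbooks and calculators may use a slightly different quartile method and give slightly different values.

```python
import statistics

def spread_summary(data):
    """Return the range, IQR, and values flagged by the 1.5 x IQR rule."""
    # Quartile conventions differ; "inclusive" is one common choice.
    q1, _median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return {
        "range": max(data) - min(data),
        "iqr": iqr,
        "outliers": [x for x in data if x < lower_fence or x > upper_fence],
    }

# Hypothetical data: the same values with and without one extreme observation.
without_extreme = [2, 4, 4, 5, 6, 7, 8, 9]
with_extreme = without_extreme + [40]

print(spread_summary(without_extreme))
print(spread_summary(with_extreme))
```

With these values the range jumps from 7 to 38 when the extreme value is added, while the IQR only moves from 3.25 to 4.0.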

Compare: Range vs. IQR—both measure spread using specific data points, but range uses extremes while IQR uses quartiles. If an FRQ gives you a dataset with obvious outliers and asks which measure better represents typical spread, IQR is your answer.

Deviation-Based Measures

These measures calculate how far each data point falls from the mean, then summarize those deviations. They use every observation, making them more informative but also more sensitive to extreme values.

Variance

Measures the average squared deviation from the mean—squaring eliminates negative values and emphasizes larger deviations
Population variance uses $N$ (the population size) in the denominator; sample variance uses $n-1$ (Bessel's correction) so that $s^2$ is an unbiased estimate of the population variance
Formulas: $\sigma^2 = \frac{\sum(x_i - \mu)^2}{N}$ for populations; $s^2 = \frac{\sum(x_i - \bar{x})^2}{n-1}$ for samples
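The two formulas differ only in the denominator, which a short calculation makes concrete. This is a sketch on hypothetical data; it computes both versions by hand and compares them with `statistics.pvariance` and `statistics.variance` from the standard library.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical values

mean = statistics.fmean(data)
squared_deviations = [(x - mean) ** 2 for x in data]

# Population variance: divide the sum of squared deviations by N.
population_variance = sum(squared_deviations) / len(data)

# Sample variance: divide by n - 1 (Bessel's correction).
sample_variance = sum(squared_deviations) / (len(data) - 1)

print(f"population variance: {population_variance:.3f}")
print(f"sample variance:     {sample_variance:.3f}")

# The standard library implements the same two definitions.
print(statistics.pvariance(data), statistics.variance(data))
```

Here the squared deviations sum to 32, so the population variance is 32/8 = 4 and the sample variance is 32/7, about 4.571. The sample version is always the larger of the two for the same data.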

Standard Deviation

The square root of variance—returns the measure of spread to the original units of your data
Interpretation: approximately 68% of data falls within one standard deviation of the mean in a normal distribution (empirical rule)
Formulas: $\sigma = \sqrt{\sigma^2}$ for populations; $s = \sqrt{s^2}$ for samples—know which symbol to use
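As a quick illustration, the sketch below takes the square root of each variance and then checks the 68% figure against a standard normal distribution using `statistics.NormalDist`. The data values are hypothetical.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical values, e.g. in centimeters

# Standard deviation is the square root of the matching variance.
sigma = math.sqrt(statistics.pvariance(data))  # population: sigma
s = math.sqrt(statistics.variance(data))       # sample: s

# Share of a normal distribution within one standard deviation of the mean.
standard_normal = statistics.NormalDist(mu=0, sigma=1)
share_within_one_sd = standard_normal.cdf(1) - standard_normal.cdf(-1)

print(f"sigma = {sigma:.3f}, s = {s:.3f}")
print(f"within one SD (normal): {share_within_one_sd:.1%}")
```

The 68% figure describes normal distributions only; a small or skewed dataset like the one above need not match it.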

Mean Absolute Deviation (MAD)

Averages the absolute deviations from the mean—uses $|x_i - \bar{x}|$ instead of squared differences
More intuitive interpretation than variance since it represents the typical distance from the mean in original units
Less commonly used on AP exams but important conceptually—it's less sensitive to outliers than standard deviation
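Written as a formula, the description above is $\text{MAD} = \frac{\sum |x_i - \bar{x}|}{n}$. The sketch below computes it on hypothetical data and prints the population standard deviation alongside for comparison; the function name `mean_absolute_deviation` is illustrative, not a library call.

```python
import statistics

def mean_absolute_deviation(data):
    """Average absolute distance of each value from the mean."""
    mean = statistics.fmean(data)
    return sum(abs(x - mean) for x in data) / len(data)

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical values

mad = mean_absolute_deviation(data)
sd = statistics.pstdev(data)

print(f"MAD = {mad:.2f}, population SD = {sd:.2f}")
```

For this dataset the MAD is 1.5 and the population standard deviation is 2.0. Squaring gives the largest deviations extra weight, which is why the standard deviation comes out larger.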

Compare: Variance vs. Standard Deviation—variance squares the units (making interpretation awkward), while standard deviation restores original units. Always report standard deviation when describing spread; use variance primarily in calculations and statistical formulas.

Standardized and Relative Measures

These measures allow you to compare variability across datasets with different scales or units—essential when asking "which dataset is relatively more spread out?"

Coefficient of Variation (CV)

Calculated as standard deviation divided by the mean, expressed as a percentage: $CV = \frac{s}{\bar{x}} \times 100\%$
Enables comparison across different scales—you can compare variability in heights (cm) versus weights (kg)
Only meaningful for ratio data with a true zero point; don't use CV when the mean can be zero or negative
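A small sketch of the heights versus weights comparison, using made-up measurements. The helper `coefficient_of_variation` is hypothetical and uses the sample standard deviation $s$; it refuses to compute a CV when the mean is zero or negative, in line with the caution above.

```python
import statistics

def coefficient_of_variation(data):
    """Sample standard deviation as a percentage of the mean."""
    mean = statistics.fmean(data)
    if mean <= 0:
        raise ValueError("CV is only meaningful when the mean is positive")
    return statistics.stdev(data) / mean * 100

# Hypothetical measurements in different units.
heights_cm = [160, 165, 170, 175, 180]
weights_kg = [55, 62, 70, 78, 85]

cv_heights = coefficient_of_variation(heights_cm)
cv_weights = coefficient_of_variation(weights_kg)

print(f"heights: CV = {cv_heights:.2f}%")
print(f"weights: CV = {cv_weights:.2f}%")
```

Because the units cancel, the two percentages can be compared directly: in this made-up sample the weights (about 17.17%) vary more relative to their mean than the heights (about 4.65%).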

Compare: Standard Deviation vs. Coefficient of Variation—standard deviation measures absolute spread in original units, while CV measures relative spread as a percentage of the mean. Use CV when comparing variability between datasets with different units or vastly different means.

Position-Based Measures

These measures describe where data points fall within the distribution, helping you understand both spread and relative standing.

Percentiles and Quartiles

Percentiles divide data into 100 equal parts: the $k$th percentile is the value below which $k\%$ of observations fall
Quartiles are special percentiles: $Q_1$ (25th percentile), $Q_2$ (median, 50th percentile), and $Q_3$ (75th percentile)
Five-number summary uses minimum, $Q_1$, median, $Q_3$, and maximum to describe distribution shape and spread
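The five-number summary and a simple percentile rank can be sketched with the standard library as follows. The scores are hypothetical, the helper names are illustrative, and quartiles again use the "inclusive" convention, which is only one of several in common use.

```python
import statistics

def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum)."""
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return min(data), q1, median, q3, max(data)

def percentile_rank(data, value):
    """Percentage of observations strictly below the given value."""
    return sum(x < value for x in data) / len(data) * 100

scores = [52, 60, 65, 70, 72, 75, 80, 85, 90, 98]  # hypothetical test scores

print(five_number_summary(scores))
print(percentile_rank(scores, 90))
```

With these scores the summary is (52, 66.25, 73.5, 83.75, 98), and 8 of the 10 scores fall below 90, so a score of 90 sits at the 80th percentile under this definition.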

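Compare: Percentiles vs. Z-scores. Both describe position, but percentiles tell you what percentage of data falls below a value, while z-scores tell you how many standard deviations a value is from the mean. Percentiles work for any distribution. A z-score can be computed whenever you know the mean and standard deviation, but translating it into a percentage of data (for example, with the empirical rule) requires the distribution to be approximately normal.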
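To make the contrast concrete, the sketch below computes a z-score as $z = \frac{x - \bar{x}}{s}$ and then converts it to a percentile under the assumption that the data are normally distributed. The numbers (a value of 130 with mean 100 and standard deviation 15) are hypothetical, and the conversion step is not valid for non-normal data.

```python
import statistics

def z_score(value, mean, sd):
    """Number of standard deviations that value lies from the mean."""
    return (value - mean) / sd

# Hypothetical value, mean, and standard deviation.
z = z_score(130, mean=100, sd=15)

# Convert to a percentile, assuming a normal distribution.
percentile_if_normal = statistics.NormalDist().cdf(z) * 100

print(f"z = {z:.1f}")
print(f"percentile under normality: {percentile_if_normal:.1f}")
```

A z-score of 2.0 corresponds to roughly the 97.7th percentile if the distribution is normal.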

Quick Reference Table

Concept	Best Examples
Simple spread (uses extremes)	Range
Resistant or less sensitive to outliers	IQR (resistant), MAD (less sensitive than standard deviation)
Uses all data points	Variance, Standard Deviation, MAD
Same units as original data	Standard Deviation, MAD, Range, IQR
Squared units	Variance
Comparing across different scales	Coefficient of Variation
Describes position in distribution	Percentiles, Quartiles
Used in the 1.5 × IQR outlier rule	IQR, $Q_1$, $Q_3$

Self-Check Questions

A dataset contains one extreme outlier. Which two measures of spread would be most affected, and which two would be most resistant?
You're comparing the variability of test scores (mean = 75, SD = 10) with the variability of reaction times in milliseconds (mean = 250, SD = 40). Which dataset shows greater relative variability, and what measure would you use to determine this?
Explain why sample variance divides by $n-1$ instead of $n$. What problem does this correction solve?
Compare and contrast standard deviation and IQR as measures of spread. Under what conditions would you choose one over the other?
If a value falls at the 90th percentile, what does this tell you? How would you describe this same value's position using quartiles?
