🔢Lower Division Math Foundations

Common Statistical Measures

Why This Matters

Statistics isn't just about crunching numbers—it's about extracting meaning from data. Every statistical measure you learn serves a specific purpose: some tell you where the "center" of your data lives, others reveal how spread out or clustered your values are, and still others help you understand relationships between variables. On exams, you're being tested on your ability to choose the right measure for the situation and interpret what that measure actually tells you about the underlying data.

The key concepts here fall into three categories: measures of central tendency (where's the middle?), measures of spread (how variable is the data?), and measures of position and relationship (where does a value rank, and how do variables relate?). Don't just memorize formulas—know when each measure is appropriate, what makes it sensitive to outliers, and how to interpret results in context.

Measures of Central Tendency

These measures answer the fundamental question: what's a typical value in this dataset? Each one captures "center" differently, and choosing the right one depends on your data's shape and what you're trying to communicate.

Mean

Sum divided by count: the arithmetic average, $\bar{x} = \frac{\sum x_i}{n}$, gives equal weight to every data point
Highly sensitive to outliers—a single extreme value can drag the mean far from where most data clusters
Best for symmetric distributions—when data is roughly balanced, the mean accurately represents the center

Median

The middle value when data is sorted: for odd $n$, it's the center value; for even $n$, average the two middle values
Resistant to outliers: the median depends on the order of the values, not their magnitude, so making an extreme value even more extreme leaves it unchanged
Preferred for skewed data—income, home prices, and other right-skewed distributions are better represented by median

Mode

Most frequently occurring value: the only central tendency measure that works for nominal (unordered) categorical data
Can be unimodal, bimodal, or multimodal—datasets may have one peak, two peaks, or several, revealing distribution shape
Essential for categorical analysis—when you need the most common category (favorite color, most popular product), mode is your tool
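
As a rough illustration of the points above, the sketch below finds the mode of a categorical list and of a bimodal numeric list using Python's standard `statistics` module (Python 3.8 or later is assumed for `multimode`). The survey responses and quiz scores are made-up values, not data from this guide.

```python
from statistics import mode, multimode

# Hypothetical survey responses (illustrative data only).
favorite_colors = ["blue", "red", "blue", "green", "red", "blue"]
most_common_color = mode(favorite_colors)

# Hypothetical quiz scores in which two values tie for most frequent.
quiz_scores = [70, 85, 85, 90, 90, 95]
all_modes = multimode(quiz_scores)

print(most_common_color)  # blue
print(all_modes)          # [85, 90]
```

Note that a mean or median of `favorite_colors` would be meaningless, which is why the mode is the measure to reach for with category labels.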

Compare: Mean vs. Median—both measure center, but mean uses all values while median uses only position. When an FRQ gives you a skewed distribution or mentions outliers, median is usually the better choice for representing "typical."
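
A small sketch of this comparison, assuming a made-up list of home prices (in thousands of dollars): adding one extreme value moves the mean a long way while the median barely shifts.

```python
from statistics import mean, median

# Hypothetical home prices in thousands of dollars (illustrative data only).
prices = [200, 220, 250, 270, 310]
prices_with_mansion = prices + [15000]  # add one extreme value

mean_before = mean(prices)
median_before = median(prices)
mean_after = mean(prices_with_mansion)
median_after = median(prices_with_mansion)

print(mean_before, median_before)          # 250 250
print(round(mean_after, 1), median_after)  # 2708.3 260.0
```

With the outlier included, the mean lands above every ordinary home in the list, while the median still sits among the typical prices.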

Measures of Spread

Knowing the center isn't enough—you need to understand how much variation exists around that center. These measures quantify dispersion, from simple to sophisticated.

Range

Maximum minus minimum—the simplest spread measure, calculated as $\text{Range} = x_{max} - x_{min}$
Uses only two data points—ignores everything between the extremes, missing the full picture of variability
Highly sensitive to outliers—one unusual value completely changes the range, limiting its usefulness

Variance

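Average squared deviation from the mean: for a population of size $N$ with mean $\mu$, $\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$; for a sample, $s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$ (dividing by $n - 1$ corrects the tendency of a sample to underestimate the population's spread)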
Squaring eliminates negative differences—ensures deviations don't cancel out, but produces units that are squared
Foundation for standard deviation—variance is mathematically useful but harder to interpret directly
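
The calculation can be sketched step by step as follows. The data values are invented for illustration, and the hand computation is checked against `pvariance` (divide by the count) and `variance` (divide by the count minus one) from Python's standard `statistics` module.

```python
from statistics import pvariance, variance

# Hypothetical data values (illustrative only).
data = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(data)
data_mean = sum(data) / n

# Sum of squared deviations from the mean.
squared_deviations = sum((x - data_mean) ** 2 for x in data)

population_variance = squared_deviations / n    # treat data as a whole population
sample_variance = squared_deviations / (n - 1)  # treat data as a sample

print(population_variance)        # 4.0
print(round(sample_variance, 3))  # 4.571

# The library functions should agree with the hand computation.
library_population = pvariance(data)
library_sample = variance(data)
```

The sample version is always a little larger than the population version for the same numbers, because it divides the same sum by a smaller count.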

Standard Deviation

Square root of variance: $\sigma = \sqrt{\sigma^2}$ returns the spread measure to the original units
Measures typical distance from the mean: tells you roughly how far a typical data point sits from the center
Key to the empirical rule: in approximately normal distributions, about 68% of data falls within $\pm 1\sigma$ of the mean, 95% within $\pm 2\sigma$, and 99.7% within $\pm 3\sigma$
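
The empirical-rule percentages can be checked numerically. This sketch assumes an exactly normal distribution and uses `NormalDist` from Python's standard `statistics` module (Python 3.8 or later); it also reports the share within three standard deviations for completeness.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

# Share of the distribution lying within k standard deviations of the mean.
share_within = {
    k: standard_normal.cdf(k) - standard_normal.cdf(-k)
    for k in (1, 2, 3)
}

for k, share in share_within.items():
    print(f"within {k} sd: {share:.4f}")
# within 1 sd: 0.6827
# within 2 sd: 0.9545
# within 3 sd: 0.9973
```

Real datasets are only approximately normal, so observed percentages will usually differ somewhat from these values.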

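Compare: Range vs. Standard Deviation. Range is quick but crude (two points only), while standard deviation incorporates every data point, so it is the more informative measure of overall spread. Neither is robust, though: both react strongly to outliers. If an exam asks for a "robust" or "resistant" measure of spread, the answer is the IQR.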
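
To see why range is the cruder of the two, consider this sketch with two invented datasets that share the same minimum and maximum. The helper `value_range` is a hypothetical name introduced here for illustration.

```python
from statistics import pstdev

# Two hypothetical datasets with the same extremes (illustrative only).
clustered = [0, 5, 5, 5, 10]   # most values sit at the center
split = [0, 0, 5, 10, 10]      # most values sit at the extremes


def value_range(values):
    """Return maximum minus minimum."""
    return max(values) - min(values)


range_clustered = value_range(clustered)
range_split = value_range(split)
sd_clustered = pstdev(clustered)  # population standard deviation
sd_split = pstdev(split)

print(range_clustered, range_split)                # 10 10
print(round(sd_clustered, 3), round(sd_split, 3))  # 3.162 4.472
```

The ranges are identical, but the standard deviation picks up that the second dataset's values sit farther from the mean on average.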

Interquartile Range

Difference between Q3 and Q1: calculated as $\text{IQR} = Q_3 - Q_1$, the width of the middle 50% of the data
Resistant to outliers: because it ignores the top and bottom 25% of the data, a few extreme values have little or no effect on it
Used to identify outliers—values below $Q_1 - 1.5 \times \text{IQR}$ or above $Q_3 + 1.5 \times \text{IQR}$ are flagged as outliers
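
The $1.5 \times \text{IQR}$ rule can be sketched as below. The data are invented, and the quartiles come from `statistics.quantiles` with `method="inclusive"` (Python 3.8 or later). Textbooks and calculators use several slightly different quartile conventions, so quartile values for other datasets may not match this function exactly.

```python
from statistics import quantiles

# Hypothetical measurements with one suspiciously large value (illustrative only).
data = [4, 7, 8, 9, 10, 11, 12, 13, 40]

# quantiles(..., n=4) returns the three quartile cut points Q1, Q2, Q3.
q1, q2, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Flag every value that falls outside the fences.
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(q1, q3, iqr)               # 8.0 12.0 4.0
print(lower_fence, upper_fence)  # 2.0 18.0
print(outliers)                  # [40]
```

The value 4 looks low next to its neighbors but stays inside the lower fence, so only 40 is flagged.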

Compare: Standard Deviation vs. IQR—both measure spread, but SD is sensitive to outliers while IQR is resistant. For skewed data or when outliers are present, IQR gives a more accurate picture of typical variability.

Measures of Position

These measures tell you where a specific value ranks within the larger dataset—essential for comparing individuals across different scales or distributions.

Percentiles

Divide data into 100 equal parts: the $p$th percentile is the value below which about $p$% of the data falls
Useful for standardized comparisons—test scores, growth charts, and rankings all use percentiles
The 50th percentile equals the median—this connection helps you interpret percentiles intuitively
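
One simple way to turn the "percent of values below" idea into a calculation is sketched here. The function name `percentile_rank` and the scores are assumptions made for illustration; other definitions exist (for example, counting values at or below the score).

```python
def percentile_rank(values, score):
    """Percent of values strictly below the given score."""
    below = sum(1 for v in values if v < score)
    return 100 * below / len(values)


# Hypothetical exam scores (illustrative only).
scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]

rank_of_90 = percentile_rank(scores, 90)
print(rank_of_90)  # 70.0
```

Seven of the ten scores fall below 90, so under this definition a score of 90 sits at the 70th percentile of this group.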

Quartiles

Divide data into four equal parts: the cut points are $Q_1$ (25th percentile), $Q_2$ (50th percentile, the median), and $Q_3$ (75th percentile)
Foundation for box plots: the five-number summary (min, $Q_1$, median, $Q_3$, max) creates this visualization
Quick distribution snapshot—quartiles reveal skewness and help identify potential outliers at a glance
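
A five-number summary can be assembled in a few lines. This sketch uses invented data and `statistics.quantiles` with `method="inclusive"` (Python 3.8 or later), which is one of several quartile conventions in use.

```python
from statistics import quantiles

# Hypothetical data values (illustrative only).
data = [1, 3, 5, 7, 9, 11, 13, 15, 17]

q1, q2, q3 = quantiles(data, n=4, method="inclusive")

# (min, Q1, median, Q3, max): the values a box plot draws.
five_number_summary = (min(data), q1, q2, q3, max(data))

print(five_number_summary)  # (1, 5.0, 9.0, 13.0, 17)
```

Here the gaps between consecutive summary values are all equal, which is what a symmetric distribution looks like in a box plot; unequal gaps would hint at skew.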

Compare: Percentiles vs. Quartiles. Quartiles are just specific percentiles (25th, 50th, 75th). Know that $Q_2$ is always the median, and IQR is always $Q_3 - Q_1$.

Measures of Relationship

When you have two variables, you need tools to understand how they move together—this is where correlation comes in.

Correlation Coefficient

Measures linear relationship strength and direction: denoted $r$, with values ranging from $-1$ to $+1$
Sign indicates direction: positive $r$ means the variables tend to increase together; negative $r$ means one tends to increase as the other decreases
Magnitude indicates strength: $|r|$ close to 1 means a strong linear relationship; close to 0 means a weak or no linear relationship
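
The guide does not give a formula for $r$. The sketch below assumes the usual Pearson definition, $r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \sum (y_i - \bar{y})^2}}$, and applies it to invented data. The function name `pearson_r` and all values are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / sqrt(sum_xx * sum_yy)


# Hypothetical paired data (illustrative only).
x = [1, 2, 3, 4, 5]
perfect_increase = [2, 4, 6, 8, 10]
perfect_decrease = [10, 8, 6, 4, 2]
noisy_increase = [2, 1, 4, 3, 5]

r_up = pearson_r(x, perfect_increase)
r_down = pearson_r(x, perfect_decrease)
r_noisy = pearson_r(x, noisy_increase)

print(round(r_up, 2), round(r_down, 2), round(r_noisy, 2))  # 1.0 -1.0 0.8
```

The sign follows the direction of the trend, and the magnitude drops below 1 as soon as the points stop lying exactly on a line.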

Compare: Correlation vs. Causation—a high $|r|$ value does NOT prove one variable causes changes in another. This is a classic exam trap: correlation describes association, not causation.

Quick Reference Table

Situation	Measure to Use
Central tendency (balanced data)	Mean
Central tendency (skewed/outliers)	Median
Central tendency (categorical)	Mode
Simple spread measure	Range
Spread in original units	Standard Deviation
Spread resistant to outliers	IQR
Position within distribution	Percentiles, Quartiles
Relationship between variables	Correlation Coefficient

Self-Check Questions

A dataset of home prices includes one mansion worth $15 million. Which measure of central tendency would best represent a "typical" home price, and why?
Compare variance and standard deviation: why do we bother calculating standard deviation when variance already measures spread?
A student scores in the 85th percentile on an exam. Another student's score equals $Q_3$. Which student performed better, and how do you know?
You're analyzing two datasets with identical means. Dataset A has $\sigma = 2$ and Dataset B has $\sigma = 10$. What does this tell you about how the data points are distributed differently?
Two variables have a correlation coefficient of $r = -0.92$. Describe the relationship and explain why this does NOT prove that changes in one variable cause changes in the other.
