AP Statistics

Measures of Central Tendency

Why This Matters

When you encounter a dataset on the AP Statistics exam, your first task is almost always to describe its center, but which measure of center you choose reveals how well you understand the data's shape and context. The College Board tests your ability to select the appropriate measure based on distribution characteristics like skewness, outliers, and data type, not just your ability to calculate a mean. This connects directly to Unit 1's emphasis on describing distributions and carries through to sampling distributions in Unit 5, where the sample mean \(\bar{x}\) becomes the foundation for inference.

Beyond calculation, you're being tested on statistical reasoning: When does the median tell a more honest story than the mean? How do outliers pull the mean but leave the median nearly unchanged? Why does the Central Limit Theorem focus on the mean rather than other measures? Don't just memorize formulas; know what concept each measure illustrates and when each is the most appropriate choice for summarizing data.


Measures That Use Every Data Point

These measures incorporate all values in the dataset, making them mathematically complete but also vulnerable to the influence of extreme values. The key principle: using all data maximizes information but can distort the "typical" value when distributions are skewed.

Mean (Arithmetic Average)

  • Calculated as \(\bar{x} = \frac{\sum x_i}{n}\): the sum of all values divided by the count, giving equal weight to every observation
  • Sensitive to outliers and skewness, which pull the mean toward the tail; this is a critical concept for choosing appropriate summary statistics
  • Foundation for inferential statistics: the sample mean \(\bar{x}\) estimates the population mean \(\mu\) and is central to the Central Limit Theorem and hypothesis testing
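
As a minimal sketch of how a single extreme value pulls the mean, the snippet below computes the mean of a small dataset with and without one outlier. The prices are hypothetical values invented for illustration, not data from any exam question.

```python
from statistics import mean

# Hypothetical home prices in thousands of dollars (illustrative data only).
prices = [210, 225, 240, 250, 275]

# Same neighborhood with one extreme value added.
prices_with_outlier = prices + [5000]

mean_before = mean(prices)               # 1200 / 5
mean_after = mean(prices_with_outlier)   # 6200 / 6

print(f"mean without outlier: {mean_before:.1f}")
print(f"mean with outlier:    {mean_after:.1f}")
```

With the outlier included, the mean ends up larger than every one of the five original prices, so it no longer describes a "typical" home in this made-up neighborhood.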

Weighted Mean

  • Assigns different weights to values based on importance, calculated as \(\bar{x}_w = \frac{\sum w_i x_i}{\sum w_i}\), where weights reflect relative significance
  • Essential for combining group means or calculating GPAs where courses have different credit hours
  • Appears in AP contexts when you need to find an overall mean from grouped data or combine statistics from subgroups

Compare: Mean vs. Weighted Mean. Both use all data points, but the weighted mean acknowledges that some observations matter more. On FRQs involving stratified samples or combining statistics from subgroups, the weighted mean is your tool.
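
One way to see the difference is to combine two subgroup means using group sizes as weights. The sketch below assumes a hypothetical helper named `weighted_mean` and invented section data; it is an illustration of the formula, not a prescribed exam method.

```python
def weighted_mean(values, weights):
    """Return sum(w_i * x_i) / sum(w_i) for paired values and weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight

# Hypothetical class sections: mean score and number of students in each.
section_means = [80.0, 90.0]
section_sizes = [30, 10]

overall_mean = weighted_mean(section_means, section_sizes)   # (30*80 + 10*90) / 40
naive_mean = sum(section_means) / len(section_means)         # ignores section sizes

print(overall_mean, naive_mean)
```

The unweighted average of the two section means overstates the overall mean here because it treats the small section as if it had as many students as the large one.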


Measures Resistant to Outliers

These measures remain stable even when extreme values are present, making them robust statistics. The key principle: resistance comes from ignoring or minimizing the influence of values far from the center.

Median

  • The middle value when data is ordered: for odd \(n\), it's the center value; for even \(n\), it's the mean of the two center values
  • Resistant to outliers and skewness, making it the preferred measure for income, home prices, or any right-skewed distribution
  • Compare to mean for shape detection: when mean > median, the distribution is typically right-skewed; when mean < median, it's typically left-skewed
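
The odd and even cases, and the mean-versus-median comparison, can be sketched as follows. The helper `skew_hint` is a hypothetical name, the datasets are invented, and the comparison is only a rule of thumb, so the labels it returns are deliberately hedged.

```python
from statistics import mean, median

def skew_hint(data):
    """Rule-of-thumb skew direction from comparing mean and median."""
    m, med = mean(data), median(data)
    if m > med:
        return "possibly right-skewed"
    if m < med:
        return "possibly left-skewed"
    return "no skew suggested"

odd_n = [3, 1, 7, 5, 9]   # ordered: 1 3 5 7 9, so the center value is 5
even_n = [3, 1, 7, 5]     # ordered: 1 3 5 7, so the two center values are 3 and 5

median_odd = median(odd_n)
median_even = median(even_n)

# Hypothetical incomes in thousands of dollars with one large value.
incomes = [30, 35, 40, 45, 50, 200]
income_hint = skew_hint(incomes)

print(median_odd, median_even, income_hint)
```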

Trimmed Mean

  • Removes a fixed percentage of extreme values from both ends before calculating the mean, typically 5% or 10% on each side
  • Balances information and resistance: uses more data than the median but isn't fooled by outliers like the regular mean
  • Useful for understanding robustness concepts, though less commonly tested than mean and median
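
A trimmed mean can be sketched in a few lines. The function name `trimmed_mean` is hypothetical, and rounding the number of dropped values down with `floor` is one assumed convention; other software may round differently.

```python
import math

def trimmed_mean(data, proportion):
    """Drop floor(proportion * n) values from each end, then average the rest."""
    if not data:
        raise ValueError("data must not be empty")
    if not 0 <= proportion < 0.5:
        raise ValueError("proportion must be in [0, 0.5)")
    ordered = sorted(data)
    k = math.floor(proportion * len(ordered))   # values removed per side
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

# Hypothetical scores with one very low and one very high value (n = 10).
scores = [2, 48, 50, 51, 52, 53, 54, 55, 57, 120]

plain_mean = sum(scores) / len(scores)
trimmed_10 = trimmed_mean(scores, 0.10)   # drops the 2 and the 120

print(plain_mean, trimmed_10)
```

In this invented dataset the 10% trimmed mean lands on the same value as the median, while the ordinary mean is pulled upward by the 120.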

Compare: Mean vs. Median. Both measure center, but the mean uses every value while the median depends only on the middle position of the ordered data. If an FRQ asks you to justify which measure better represents "typical," check for skewness or outliers; the median wins when either is present.


Measures for Categorical and Frequency Data

These measures apply when data involves categories or repeated values rather than continuous measurements. The key principle: some data types restrict which measures are mathematically meaningful.

Mode

  • The most frequently occurring value, and the only measure of center applicable to nominal (categorical) data
  • Describes distribution shape: unimodal (one peak), bimodal (two peaks), or multimodal (multiple peaks) distributions
  • Less useful for continuous data where values rarely repeat exactly, but essential for identifying clusters or popular categories
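
As a small illustration, Python's standard library `statistics.multimode` returns every most-frequent value, which works for categorical labels as well as numbers. Both datasets below are invented for the example.

```python
from statistics import multimode

# Hypothetical categorical responses: a mean or median is not defined here.
favorite_colors = ["blue", "red", "blue", "green", "red", "blue"]
color_modes = multimode(favorite_colors)

# Hypothetical numeric data with two equally common values.
quiz_scores = [1, 2, 2, 3, 4, 5, 5, 6]
score_modes = multimode(quiz_scores)

print(color_modes)
print(score_modes)
```

Note that two tied values in a short list are only a hint of bimodality; on the exam, shape is judged from the peaks of a graph, not from tied counts alone.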

Compare: Mode vs. Mean/Median. The mode is the only measure you can use for categorical data like "favorite color" or "political party." When describing a distribution's shape, use modal language (unimodal, bimodal) rather than mean or median language.


Specialized Means for Specific Contexts

These variations of the mean are designed for particular types of data or relationships. The key principle: the arithmetic mean isn't always the most meaningful average; context determines the appropriate calculation.

Geometric Mean

  • Calculated as \(\sqrt[n]{x_1 \cdot x_2 \cdots x_n}\): the \(n\)th root of the product of all values
  • Appropriate for multiplicative relationships like growth rates, investment returns, or ratios across time periods
  • Always less than or equal to the arithmetic mean for positive values; this relationship (the AM-GM inequality) is a mathematical property you should recognize
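
To see why the geometric mean suits growth rates, the sketch below averages three hypothetical yearly growth factors. Applying the geometric mean three times reproduces the total growth, which the arithmetic mean does not. The factors are invented for illustration.

```python
import math
from statistics import geometric_mean, mean

# Hypothetical yearly growth factors: +10%, -20%, +25%.
growth_factors = [1.10, 0.80, 1.25]

total_growth = math.prod(growth_factors)   # overall multiplier across 3 years
gm = geometric_mean(growth_factors)        # constant yearly factor with the same total
am = mean(growth_factors)

print(f"total growth:       {total_growth:.4f}")
print(f"geometric mean ^ 3: {gm ** 3:.4f}")
print(f"arithmetic mean ^ 3: {am ** 3:.4f}")
```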

Harmonic Mean

  • Calculated as \(\frac{n}{\sum \frac{1}{x_i}}\): the reciprocal of the mean of reciprocals
  • Best for averaging rates when the denominator varies, like speeds over equal distances or prices per unit
  • Emphasizes smaller values, producing an average no greater than the arithmetic or geometric mean of the same positive data
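
The equal-distance speed case can be checked directly: total distance divided by total time should match the harmonic mean of the speeds. The trip below is hypothetical, and the leg distance is an arbitrary assumed value.

```python
from statistics import harmonic_mean, mean

# Hypothetical round trip: the same distance driven at two different speeds (mph).
speeds = [30, 60]
leg_distance = 60   # miles per leg; any positive value gives the same average speed

total_time = sum(leg_distance / s for s in speeds)            # hours
true_average_speed = len(speeds) * leg_distance / total_time  # total miles / total hours

hm = harmonic_mean(speeds)
am = mean(speeds)

print(true_average_speed, hm, am)
```

The arithmetic mean of the two speeds overstates the trip's average speed because more time is spent on the slower leg.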

Compare: Arithmetic vs. Geometric vs. Harmonic Mean. For the same positive dataset, Harmonic ≤ Geometric ≤ Arithmetic (with equality only when all values are identical). The arithmetic mean works for additive relationships, geometric for multiplicative, and harmonic for rates with varying denominators.
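
A quick numerical check of this ordering, using a hypothetical helper `three_means` and two invented datasets (one with distinct values, one with identical values):

```python
from statistics import geometric_mean, harmonic_mean, mean

def three_means(data):
    """Return (harmonic, geometric, arithmetic) means of strictly positive data."""
    if any(x <= 0 for x in data):
        raise ValueError("all values must be positive")
    return harmonic_mean(data), geometric_mean(data), mean(data)

# Distinct values: the three means should differ, in increasing order.
hm, gm, am = three_means([2, 8])

# Identical values: the three means should coincide (up to floating-point rounding).
equal_hm, equal_gm, equal_am = three_means([5, 5, 5])

print(hm, gm, am)
print(equal_hm, equal_gm, equal_am)
```

This only spot-checks two examples; the general result follows from the AM-GM inequality and its harmonic counterpart.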


Quick Calculations with Limited Information

These measures can be computed with minimal data but sacrifice robustness for simplicity. The key principle: ease of calculation often comes at the cost of reliability.

Midrange

  • Calculated as \(\frac{\text{max} + \text{min}}{2}\): simply the average of the extreme values
  • Highly sensitive to outliers since it uses only the two most extreme data points
  • Rarely preferred in serious analysis but occasionally appears as a quick estimate or in discussions of range-based measures
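
The midrange's sensitivity is easy to demonstrate by adding one extreme value and comparing it with the median. The function name `midrange` and the data are assumptions made for this sketch.

```python
from statistics import median

def midrange(data):
    """Average of the smallest and largest values."""
    return (max(data) + min(data)) / 2

# Hypothetical measurements, then the same data with one extreme value added.
values = [10, 12, 13, 15, 20]
values_with_outlier = values + [100]

midrange_before = midrange(values)
midrange_after = midrange(values_with_outlier)
median_before = median(values)
median_after = median(values_with_outlier)

print(midrange_before, midrange_after)
print(median_before, median_after)
```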

Quick Reference Table

| Concept | Best Examples |
| --- | --- |
| Uses all data points | Mean, Weighted Mean |
| Resistant to outliers | Median, Trimmed Mean |
| Works with categorical data | Mode |
| Indicates skewness direction | Mean vs. Median comparison |
| Multiplicative relationships | Geometric Mean |
| Averaging rates | Harmonic Mean |
| Foundation for inference | Mean (connects to CLT and sampling distributions) |
| Shape description | Mode (unimodal, bimodal, multimodal) |

Self-Check Questions

  1. A dataset of home prices in a neighborhood includes one mansion worth $5 million among otherwise modest homes. Which measure of central tendency would best represent the "typical" home price, and why?

  2. Compare and contrast the mean and median: Under what conditions will they be approximately equal, and under what conditions will they differ substantially?

  3. A student's grades are: Quiz (weight 10%) = 85, Midterm (weight 30%) = 78, Final (weight 60%) = 92. Why is the weighted mean more appropriate than the arithmetic mean here, and what is the weighted average?

  4. You're told a distribution is right-skewed. Without seeing the data, what can you predict about the relationship between the mean and median? How would this influence your choice of summary statistic?

  5. FRQ-style: A researcher reports both the mean and median income for a sample of workers. The mean is $72,000 and the median is $54,000. Describe the likely shape of the income distribution and explain which measure better represents a "typical" worker's income. Justify your reasoning.