Summary statistics put a single number on the center, position, or spread of a quantitative variable. Center and position come from the mean, median, quartiles, and percentiles, while spread comes from the range, IQR, and standard deviation.
Why This Matters for the AP Statistics Exam
This topic gives you the numerical tools you will use across the entire course. You need to calculate and interpret these statistics, and you also need to explain why one measure fits the data better than another.
On multiple-choice questions, you will compute or compare these values and identify which measures change when outliers or units change. On free-response questions, you will be asked to describe distributions and justify your choice of summary statistics in context, so clear reasoning and correct units matter for full credit. These skills carry into later units when you compare distributions, build sampling distributions, and run inference.

Key Takeaways
- Mean and standard deviation are nonresistant, so outliers pull them. Median and IQR are resistant.
- Use mean and standard deviation for roughly symmetric data; use median and IQR for skewed data or data with outliers.
- The measures of center and position you calculate most are the mean, median, Q1, Q3, and percentiles.
- Range, IQR, and standard deviation all measure spread, but they react differently to extreme values.
- Two common outlier rules: more than 1.5 x IQR beyond Q1 or Q3, or more than 2 standard deviations from the mean.
- Changing units of measurement changes the values of your summary statistics, so always report units.
Core Definitions
A statistic is a numerical summary of sample data. A parameter is a numerical summary of a population. In later units you use statistics to make inferences about parameters, but here you are just describing the sample you have.
- The mean, median, quartiles, and percentiles measure center and position.
- The range, IQR, and standard deviation measure variability (spread).
Changing units of measurement changes the values of these statistics, so keep track of units throughout.
Measures of Center and Position
The Mean
The mean (average) is the sum of all values divided by how many there are. For a sample it is denoted $\bar{x}$:
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$
Here $x_i$ is the $i$th data value and $n$ is the number of data values. You read $\bar{x}$ as "x bar." The mean works well as a summary for symmetric distributions because it acts like the balancing point of the data.
The mean has drawbacks. It does not describe individual data points on its own, and it is nonresistant, meaning a single very large or very small value can pull it noticeably. If you report the mean for data with a strong outlier, you can give a misleading sense of the center.
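To see how far one extreme value can pull the mean, here is a minimal sketch. The quiz scores are made up for illustration and are not from any real data set.

```python
# Hypothetical quiz scores (points), invented for this illustration.
scores = [70, 72, 75, 78, 80]

# Mean = sum of the values divided by how many there are.
mean_without = sum(scores) / len(scores)

# Add one unusually low score and recompute the mean.
scores_with_outlier = scores + [15]
mean_with = sum(scores_with_outlier) / len(scores_with_outlier)

print("Mean without the low score:", mean_without)
print("Mean with the low score:   ", mean_with)
```

One added value moves the mean by ten points, even though five of the six scores are 70 or higher.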
The Median
The median is the middle value when the data are ordered. If the number of values is odd, it is the single middle value. If the number of values is even, the commonly used median is the average of the two middle values.
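The median is resistant to outliers, so it is a good summary of center for skewed distributions or distributions with extreme values. To locate the median's position in an ordered list of $n$ values, a common approach is to use position $(n+1)/2$. When $n$ is even, this position falls halfway between the two middle values, so you average them.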
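The position idea can be sketched in a few lines of Python, assuming the position is (n + 1) / 2 counted from 1 in the ordered list. The function name `median_by_position` and both data sets are hypothetical, chosen only for illustration, and the sketch assumes a non-empty list.

```python
def median_by_position(values):
    """Return the median using the (n + 1) / 2 position in the ordered list."""
    ordered = sorted(values)
    n = len(ordered)
    position = (n + 1) / 2               # 1-based position of the median
    if position.is_integer():            # odd n: one middle value
        return ordered[int(position) - 1]
    lower = ordered[int(position) - 1]   # value just below the ".5" position
    upper = ordered[int(position)]       # value just above the ".5" position
    return (lower + upper) / 2           # even n: average the two middle values

odd_data = [3, 9, 4, 7, 5]       # ordered: 3, 4, 5, 7, 9
even_data = [3, 9, 4, 7, 5, 10]  # ordered: 3, 4, 5, 7, 9, 10

print(median_by_position(odd_data))
print(median_by_position(even_data))
```

With five values the position is 3, so the median is the third ordered value. With six values the position is 3.5, so the median is the average of the third and fourth ordered values.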
Mean or Median?
Use this as a quick rule of thumb:
- For a symmetric, unimodal distribution, the mean is often the better measure of center because it uses every value.
- For skewed distributions or data with outliers, the median is often better because it resists extreme values.
In a right-skewed distribution the mean is usually greater than the median; in a left-skewed distribution the mean is usually less than the median. Reporting both, with units, gives a fuller picture and helps explain any gap between them.
Quartiles and Percentiles
The first quartile (Q1) is the median of the lower half of the ordered data, from the minimum to the position of the median. The third quartile (Q3) is the median of the upper half, from the position of the median to the maximum. Together, Q1 and Q3 are the boundaries of the middle 50% of the data.
The pth percentile is the value that has p% of the data less than or equal to it. For example, a value at the 90th percentile has 90% of the data at or below it.
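Quartile conventions differ on whether the median itself belongs to each half when the number of values is odd, so the sketch below makes that choice an explicit option. The function names and the `include_median` flag are hypothetical. The data are the nine values used in the outlier worked example later in this guide.

```python
from statistics import median

def quartiles(values, include_median=True):
    """Return (Q1, Q3) as medians of the lower and upper halves."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    if n % 2 == 0:
        # Even n: the halves split cleanly, no convention needed.
        lower, upper = ordered[:half], ordered[half:]
    elif include_median:
        # Odd n, median counted in both halves.
        lower, upper = ordered[:half + 1], ordered[half:]
    else:
        # Odd n, median left out of both halves.
        lower, upper = ordered[:half], ordered[half + 1:]
    return median(lower), median(upper)

def percent_at_or_below(values, x):
    """Percent of data values less than or equal to x."""
    return 100 * sum(1 for v in values if v <= x) / len(values)

data = [10, 15, 20, 25, 30, 35, 40, 45, 50]
print(quartiles(data, include_median=True))
print(quartiles(data, include_median=False))
print(percent_at_or_below(data, 30))
```

Including the median gives Q1 = 20 and Q3 = 40 for this data set, while excluding it gives Q1 = 17.5 and Q3 = 42.5. Check which convention your calculator or textbook uses before comparing answers.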
Measures of Spread
Range
The range is the difference between the maximum and minimum data values:
$$\text{range} = \text{maximum} - \text{minimum}$$
It is quick to find but is nonresistant, since it depends entirely on the two most extreme values.
Interquartile Range (IQR)
The interquartile range is the spread of the middle 50% of the data:
$$\text{IQR} = Q_3 - Q_1$$
The IQR is resistant to outliers because it ignores the extreme ends and focuses on the central half. It does not describe the entire spread, so it is often reported alongside other measures.
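A short comparison shows the difference in resistance between the range and the IQR. This sketch uses `statistics.quantiles` from the Python standard library with `method="inclusive"`, which is one of several quartile conventions and can differ from hand or calculator values on other data sets. The helper names and the second data set (the first with its maximum replaced by 500) are made up for illustration.

```python
from statistics import quantiles

def data_range(values):
    """Range = maximum - minimum."""
    return max(values) - min(values)

def iqr(values):
    """IQR = Q3 - Q1, using the library's 'inclusive' quartile method."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1

typical = [10, 15, 20, 25, 30, 35, 40, 45, 50]
with_outlier = [10, 15, 20, 25, 30, 35, 40, 45, 500]  # maximum replaced by 500

print("Range:", data_range(typical), "->", data_range(with_outlier))
print("IQR:  ", iqr(typical), "->", iqr(with_outlier))
```

The range jumps from 40 to 490 because it depends only on the extremes, while the IQR stays at 20 because the middle half of the data has not moved.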
Standard Deviation
The standard deviation measures how far values typically fall from the mean. For a sample it is denoted $s$:
$$s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2}$$
The square of the sample standard deviation, $s^2$, is the sample variance. You will usually use a calculator for this, but knowing the formula helps you understand what it measures.
Why divide by $n-1$ instead of $n$? A sample is only a subset of the population, and deviations measured from the sample mean $\bar{x}$ tend to be slightly smaller than deviations from the true population mean. Dividing by $n-1$ (the degrees of freedom) adjusts the estimate to better represent the population. Standard deviation is nonresistant, so outliers increase it.
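Working through the formula step by step can make it less mysterious. The sketch below computes the sample standard deviation by hand, dividing by n - 1, and compares it with `statistics.stdev` from the Python standard library, which uses the same sample formula. The function name `sample_sd` and the data values are hypothetical.

```python
from math import sqrt
from statistics import stdev

def sample_sd(values):
    """Sample standard deviation: sqrt of summed squared deviations over (n - 1)."""
    n = len(values)
    x_bar = sum(values) / n                            # sample mean
    squared_devs = [(x - x_bar) ** 2 for x in values]  # squared distance from the mean
    variance = sum(squared_devs) / (n - 1)             # sample variance, s^2
    return sqrt(variance)

data = [2, 4, 4, 4, 5, 5, 7, 9]  # mean is 5, squared deviations sum to 32

print("By hand:     ", round(sample_sd(data), 4))
print("Library call:", round(stdev(data), 4))
```

For this data set the sample variance is 32/7, so dividing by n - 1 = 7 rather than n = 8 gives a slightly larger spread than the population formula would.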
Standard Deviation or IQR?
- For a roughly symmetric, unimodal distribution, report the mean and standard deviation together to describe center and spread.
- For a skewed distribution or one with outliers, report the median and IQR together.
Reporting a measure of center with a matching measure of spread gives a more complete description than reporting only one number.
Identifying Outliers
You already know what outliers are. Two methods are commonly used in this course to decide whether a point counts as one.
Method 1: The 1.5 x IQR Rule
A value is an outlier if it is more than 1.5 x IQR above Q3 or more than 1.5 x IQR below Q1. The bounds are:
- Upper bound: $Q_3 + 1.5 \times \text{IQR}$
- Lower bound: $Q_1 - 1.5 \times \text{IQR}$
Worked example. Take the data set: 10, 15, 20, 25, 30, 35, 40, 45, 50.
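Here Q1 = 20, the median = 30, and Q3 = 40, so the IQR is $40 - 20 = 20$. These quartiles count the median in both halves; calculators that leave the median out of each half report Q1 = 17.5 and Q3 = 42.5 for the same data.
- Upper bound: $40 + 1.5 \times 20 = 70$
- Lower bound: $20 - 1.5 \times 20 = -10$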
Any value above 70 or below -10 is an outlier. A value of 100 would be an outlier; a value of 5 would not, since it falls inside the range from -10 to 70.
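The same check can be written as a small function. This is a minimal sketch that takes Q1 = 20 and Q3 = 40 from the worked example as given; the function names `iqr_fences` and `is_outlier` are hypothetical.

```python
def iqr_fences(q1, q3, k=1.5):
    """Return (lower bound, upper bound) for the k x IQR outlier rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def is_outlier(x, q1, q3):
    """True if x falls below the lower bound or above the upper bound."""
    lower, upper = iqr_fences(q1, q3)
    return x < lower or x > upper

lower, upper = iqr_fences(20, 40)
print("Bounds:", lower, upper)
print("Is 100 an outlier?", is_outlier(100, 20, 40))
print("Is 5 an outlier?  ", is_outlier(5, 20, 40))
```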
Method 2: The Standard Deviation Rule
A value is an outlier if it falls more than 2 standard deviations above or below the mean. This method assumes most values sit within two standard deviations of the mean.
Both methods help flag unusual values, but they can give different results. Consider the shape of the data and the goal of your analysis when choosing which to apply.
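Here is one way the two rules can disagree, sketched on a made-up data set of ten values. Because the count is even, the lower and upper halves split cleanly and the quartiles do not depend on a median convention. All variable names are hypothetical.

```python
from statistics import mean, median, stdev

data = sorted([10, 15, 20, 25, 30, 35, 40, 45, 50, 75])
half = len(data) // 2  # even count, so the halves split cleanly

# Method 1: the 1.5 x IQR rule.
q1, q3 = median(data[:half]), median(data[half:])
iqr = q3 - q1
iqr_flags = [x for x in data if x < q1 - 1.5 * iqr or x > q3 + 1.5 * iqr]

# Method 2: more than 2 standard deviations from the mean.
x_bar, s = mean(data), stdev(data)
sd_flags = [x for x in data if abs(x - x_bar) > 2 * s]

print("Flagged by 1.5 x IQR rule:", iqr_flags)
print("Flagged by 2 SD rule:     ", sd_flags)
```

In this data set the value 75 sits below the upper bound of 82.5 from the IQR rule, yet it is more than 2 standard deviations above the mean of 34.5, so only the second rule flags it.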
Resistant and Nonresistant Measures
The mean, standard deviation, and range are nonresistant (non-robust) because outliers influence them. The median and IQR are resistant (robust) because outliers do not greatly affect their values.
This is why the median and IQR are often preferred when a data set may contain outliers: they give a more stable picture of center and spread even when extreme values are present.
How to Use This on the AP Statistics Exam
MCQ
- Match the data shape to the right summary: symmetric leans toward mean and standard deviation, skewed leans toward median and IQR.
- Watch for questions that change units or add an outlier and ask which statistics change. Nonresistant measures (mean, standard deviation, range) react; resistant measures (median, IQR) mostly do not.
- Know how to read Q1, Q3, and percentiles from ordered data.
Free Response
- When you describe a distribution, address center, spread, and any unusual features, and always include units.
- If asked to choose a measure, justify it in context. State why you picked the mean or median and why the alternative is less appropriate.
- Show the setup for outlier checks, such as $Q_3 + 1.5 \times \text{IQR}$, so your reasoning is clear.
Common Traps
- Do not report only one number. Pair a center with a matching spread.
- Do not assume the mean and median are equal unless the distribution is roughly symmetric.
Common Misconceptions
- "The mean is always the best measure of center." It is best for symmetric data, but a strong outlier or skew makes the median more reliable.
- "The median is the average of the two middle numbers every time." That is only when the number of values is even. With an odd count, the median is the single middle value.
- "Standard deviation and IQR are interchangeable." They both measure spread, but standard deviation is nonresistant and IQR is resistant, so they respond differently to outliers.
- "A bigger range always means more typical spread." The range uses only the two most extreme values, so a single outlier can inflate it without telling you about the bulk of the data.
- "Changing units does not affect summary statistics." Converting units changes the values of your statistics, so always track and report units.
- "An outlier by the 1.5 x IQR rule must also be an outlier by the standard deviation rule." The two methods can disagree, so the same point may flag under one rule and not the other.
Vocabulary
The following words are mentioned explicitly in the College Board Course and Exam Description for this topic.

Term | Definition |
|---|---|
first quartile | The median of the lower half of an ordered data set, denoted as Q1, marking the boundary below which 25% of the data falls. |
interquartile range | A measure of variability calculated as the difference between the third quartile (Q3) and the first quartile (Q1), representing the spread of the middle 50% of data. |
mean | The average value of a data set, represented by x̄ for a sample and by μ for a population. |
measures of center | Numerical summaries that describe the central tendency of a data set, including the mean and median. |
measures of position | Numerical summaries that describe the location of data values within a distribution, including quartiles and percentiles. |
measures of variability | Statistical measures that describe how spread out or dispersed data values are in a distribution. |
median | The middle value when data are ordered; for an even number of data points, typically the average of the two middle values. |
nonresistant | A characteristic of a statistic that is significantly affected or influenced by outliers; also called non-robust. |
outlier | A data point that is unusually small or large relative to the rest of the data. |
percentile | A value such that p% of the data is less than or equal to it, used to describe the position of a data point within a distribution. |
Q1 | The first quartile; the value below which 25% of the data falls. |
Q3 | The third quartile; the value below which 75% of the data falls. |
quartile | A value that divides an ordered data set into four equal parts; Q1 and Q3 form the boundaries for the middle 50% of values. |
range | A measure of variability calculated as the difference between the maximum and minimum data values in a dataset. |
resistant | A characteristic of a statistic that is not greatly affected by outliers; also called robust. |
sample standard deviation | The standard deviation calculated for a sample, denoted by $s$, using the formula $s = \sqrt{\frac{1}{n-1}\sum (x_i - \bar{x})^2}$. |
sample variance | The square of the sample standard deviation, denoted by s², representing variability in squared units. |
standard deviation | A measure of how spread out data values are from the mean, represented by s for a sample and by σ for a population. |
statistic | Numerical summaries or measures calculated from sample data, such as mean, median, or standard deviation. |
third quartile | The median of the upper half of an ordered data set, denoted as Q3, marking the boundary below which 75% of the data falls. |
Frequently Asked Questions
What are summary statistics in AP Statistics?
Summary statistics are numerical descriptions of a quantitative variable. They describe center, position, and spread using values like mean, median, quartiles, percentiles, range, IQR, variance, and standard deviation. The goal is not just calculating them, but choosing and interpreting the right ones in context.
When should I use mean and standard deviation instead of median and IQR?
Use mean and standard deviation when a distribution is roughly symmetric and has no strong outliers. Use median and IQR when a distribution is skewed or has outliers because they are resistant. On AP Statistics free-response questions, justify the choice using the shape and unusual features of the distribution.
What is the difference between range, IQR, and standard deviation?
Range is maximum minus minimum, so it uses only the two most extreme values. IQR is Q3 minus Q1, so it measures the spread of the middle 50% and is resistant to outliers. Standard deviation measures typical distance from the mean and is nonresistant because extreme values can change it a lot.
How do percentiles and quartiles work?
A percentile tells what percent of data values are at or below a particular value. Quartiles split ordered data into four parts: Q1 is around the 25th percentile, the median is around the 50th percentile, and Q3 is around the 75th percentile. Together, Q1 and Q3 define the middle half of the data.
What outlier rules should I know for AP Statistics?
Two common rules are the 1.5 x IQR rule and the 2 standard deviations rule. For the IQR rule, values below Q1 - 1.5(IQR) or above Q3 + 1.5(IQR) are flagged as potential outliers. Always interpret an outlier in the context of the data set.
How do unit changes affect summary statistics?
Adding or subtracting a constant shifts measures of center and position, such as mean, median, quartiles, and percentiles, but does not change measures of spread like range, IQR, or standard deviation. Multiplying or dividing by a positive constant changes both center and spread by that factor. Units should be included in interpretations.
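A quick numerical check of both effects, using made-up temperatures. Adding 5 to every value is a pure shift, and converting Celsius to Fahrenheit (multiply by 1.8, then add 32) combines a scale change with a shift. The data and variable names are hypothetical.

```python
from statistics import mean, median, stdev

celsius = [18, 20, 21, 23, 28]                # hypothetical temperatures in °C
shifted = [c + 5 for c in celsius]            # add a constant to every value
fahrenheit = [1.8 * c + 32 for c in celsius]  # multiply by 1.8, then add 32

for label, values in [("Celsius", celsius),
                      ("Celsius + 5", shifted),
                      ("Fahrenheit", fahrenheit)]:
    print(label,
          "| mean:", round(mean(values), 2),
          "| median:", round(median(values), 2),
          "| SD:", round(stdev(values), 2))
```

The shift moves the mean and median by 5 but leaves the standard deviation alone. The Fahrenheit conversion moves the center through both steps, while the standard deviation is only multiplied by 1.8.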