Box Plots
Box plots give you a visual snapshot of how data is distributed using just five key numbers. They're one of the fastest ways to compare datasets side by side and spot differences in center, spread, and skewness at a glance.
Construction of Box Plots
A box plot is built from the five-number summary:
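- Minimum: the smallest data point (when outliers are plotted separately, the lower whisker stops at the smallest non-outlier value)
- First quartile (Q1): the median of the lower half of the data
- Median (Q2): the middle value when data is sorted
- Third quartile (Q3): the median of the upper half of the data
- Maximum: the largest data point (when outliers are plotted separately, the upper whisker stops at the largest non-outlier value)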
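The five-number summary can be sketched in a few lines of Python. This is a minimal illustration, not the only valid method: the function name `five_number_summary` and the sample values are made up for this example, and it assumes the "median of each half" convention that leaves the middle value out of both halves when the count is odd. Other textbooks and software use slightly different quartile rules, so results can differ a little. It reports the true smallest and largest values and leaves outlier handling to the whisker step.

```python
from statistics import median


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a non-empty list of numbers."""
    data = sorted(values)
    n = len(data)
    if n == 0:
        raise ValueError("need at least one value")
    half = n // 2
    lower = data[:half]            # values below the middle position
    upper = data[half + n % 2:]    # values above the middle position
    # With a single value both halves are empty, so fall back to that value.
    q1 = median(lower) if lower else data[0]
    q3 = median(upper) if upper else data[-1]
    return data[0], q1, median(data), q3, data[-1]


scores = [1, 3, 5, 7, 9, 11, 13]   # made-up sample
print(five_number_summary(scores))  # (1, 3, 7, 11, 13)
```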
To draw a box plot, follow these steps:
- Draw a number line that covers the full range of your data.
- Draw a box from Q1 to Q3. This box represents the interquartile range (IQR), which contains the middle 50% of the data.
- Draw a vertical line inside the box at the median (Q2).
- Calculate the IQR: IQR = Q3 − Q1.
- Determine the whisker boundaries. The whiskers extend from the box to the smallest and largest data points that fall within 1.5 × IQR of the quartiles. That means the lower whisker reaches down to the smallest value ≥ Q1 − 1.5 × IQR, and the upper whisker reaches up to the largest value ≤ Q3 + 1.5 × IQR.
- Plot any data points beyond the whiskers as individual dots or asterisks. These are outliers.
For example, if Q1 = 20 and Q3 = 40, then IQR = 40 − 20 = 20 and 1.5 × IQR = 30. Any value below 20 − 30 = −10 or above 40 + 30 = 70 would be marked as an outlier.
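The whisker and outlier rule can be checked with a short Python sketch. The function names `outlier_fences` and `whiskers_and_outliers` are hypothetical, the quartiles are passed in directly rather than computed, and the sample values are invented so that Q1 = 20 and Q3 = 40 under the median-of-halves convention. The multiplier `k=1.5` is the usual 1.5 × IQR rule.

```python
def outlier_fences(q1, q3, k=1.5):
    """Return the (lower, upper) cutoffs beyond which a value counts as an outlier."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def whiskers_and_outliers(values, q1, q3):
    """Return (lower whisker end, upper whisker end, sorted outliers)."""
    low_fence, high_fence = outlier_fences(q1, q3)
    inside = [v for v in values if low_fence <= v <= high_fence]
    outliers = sorted(v for v in values if v < low_fence or v > high_fence)
    # Whiskers end at the most extreme values that are still inside the fences.
    return min(inside), max(inside), outliers


print(outlier_fences(20, 40))  # (-10.0, 70.0)

# Made-up data whose quartiles are Q1 = 20 and Q3 = 40.
sample = [-15, 12, 20, 24, 27, 30, 33, 36, 40, 55, 82]
print(whiskers_and_outliers(sample, 20, 40))  # (12, 55, [-15, 82])
```

Note that the whiskers stop at 12 and 55, the most extreme actual data points inside the cutoffs, not at the cutoffs −10 and 70 themselves.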

Interpretation of Box Plot Distributions
Each part of a box plot tells you something specific about the data:
The box (IQR) shows where the middle 50% of values fall. A narrow box means data is tightly clustered around the median, while a wide box means values are more spread out.
The median line reveals both center and skewness. If the median sits roughly in the middle of the box, the distribution is approximately symmetric. If it's closer to Q1, the data is right-skewed (the longer tail stretches toward higher values). If it's closer to Q3, the data is left-skewed (the longer tail stretches toward lower values).
A common mix-up: when the median is closer to Q1, students sometimes think "left-skewed" because the median is on the left side of the box. But the tail extends to the right, so it's actually right-skewed (positive skew). Focus on which direction the longer tail points, not where the median sits.
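The median-position rule can be written as a rough Python heuristic. This is only a sketch: the function name `skew_from_median` and the `tolerance` of 0.1 are assumptions made for illustration, not a standard threshold, and the quartile values in the calls are invented.

```python
def skew_from_median(q1, med, q3, tolerance=0.1):
    """Guess the skew direction from where the median sits inside the box."""
    box = q3 - q1
    if box == 0:
        return "undetermined"            # the box has no width to compare against
    position = (med - q1) / box          # 0 means at Q1, 1 means at Q3
    if position < 0.5 - tolerance:
        return "right-skewed"            # median near Q1, longer tail toward high values
    if position > 0.5 + tolerance:
        return "left-skewed"             # median near Q3, longer tail toward low values
    return "approximately symmetric"


print(skew_from_median(20, 24, 40))  # right-skewed
print(skew_from_median(20, 30, 40))  # approximately symmetric
print(skew_from_median(20, 37, 40))  # left-skewed
```

The first call is the mix-up case: the median (24) sits near the left edge of the box, yet the label is right-skewed.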
The whiskers show the range of non-outlier data. Unequal whisker lengths are another indicator of skewness. A much longer upper whisker suggests right skew, and a much longer lower whisker suggests left skew.
Outliers are individual points plotted beyond the whiskers. They may represent unusual observations, data entry errors, or genuinely extreme values. Don't automatically throw them out, but do investigate them.

Comparison of Datasets Using Box Plots
When you place two or more box plots on the same scale, you can quickly compare:
- Central tendency: Compare median positions. A dataset whose median line sits at a larger value on the scale has a higher typical value.
- Spread: Compare box widths and whisker lengths. A wider box and longer whiskers mean more variability. For instance, if Class A's box spans from 70 to 85 and Class B's spans from 60 to 95, Class B has much more variation in scores.
- Skewness: Check where the median falls within each box and whether the whiskers are symmetric. One dataset might be roughly symmetric while another is clearly skewed.
- Outliers: Note which datasets have outliers and where they fall. Several high outliers point to a few unusually extreme values, which tend to pull the mean above the median.
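The same comparison can be made numerically. The sketch below is illustrative only: the helper names `quartiles` and `compare` are hypothetical, the scores are invented so that the boxes match the Class A (70 to 85) and Class B (60 to 95) example, and quartiles use the median-of-halves convention, which is one of several in use.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, median, Q3) using the median-of-halves convention (needs 2+ values)."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    return median(data[:half]), median(data), median(data[half + n % 2:])


def compare(groups):
    """Map each group name to its median (center) and IQR (spread)."""
    rows = {}
    for name, values in groups.items():
        q1, med, q3 = quartiles(values)
        rows[name] = {"median": med, "iqr": q3 - q1}
    return rows


# Made-up scores chosen to reproduce the boxes described above.
classes = {
    "Class A": [65, 70, 74, 78, 82, 85, 90],
    "Class B": [50, 60, 70, 80, 90, 95, 100],
}
for name, row in compare(classes).items():
    print(name, row)
# Class A {'median': 78, 'iqr': 15}
# Class B {'median': 80, 'iqr': 35}
```

Here the medians are close, but Class B's IQR is more than twice Class A's, which matches the visual impression of a much wider box.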
Box Plots as Descriptive Statistics
Box plots are a form of descriptive statistics because they summarize a dataset's key features visually rather than with a single number. They show center, spread, skewness, and outliers all in one compact graphic. Pairing box plots with other displays like histograms gives you a more complete picture: histograms show the detailed shape of the distribution, while box plots make side-by-side comparisons much easier to read.