Clusters in AP Statistics

In AP Statistics, clusters are distinct groups of data points that separate from the rest of the data in a scatterplot or other graph. They count as an 'unusual feature' when you describe a scatterplot (Topic 2.4) and often hint that a hidden categorical variable is splitting the data into groups.

Verified for the 2027 AP Statistics exam • Last updated June 2026

What are Clusters?

Clusters are clumps of data points that group together with visible gaps between them. On a scatterplot, instead of one smooth cloud of points, you see two or more separate blobs. Under learning objective 2.4.B, a complete description of a scatterplot covers form, direction, strength, and unusual features, and clusters fall under unusual features (along with outliers).

Here's the big idea behind clusters. They usually mean your data isn't one group but several groups stacked into the same plot. If you plot income versus years of education and see two clusters, you might be looking at blue-collar and white-collar workers mixed together. The cluster pattern is the data's way of telling you a categorical variable (job type, sex, school, treatment group) is lurking in the background. The right move is often to identify that grouping variable and analyze the groups separately, because a single correlation or regression line fit through two clusters can be seriously misleading.

Why Clusters matter in AP® Statistics

Clusters live in Unit 2: Exploring Two-Variable Data, specifically Topic 2.4, and support learning objective 2.4.B (describe the characteristics of a scatterplot). The essential knowledge spells out the checklist you'll use constantly: form, direction, strength, and unusual features. If you skip unusual features when an FRQ says 'describe the relationship,' you leave points on the table.

Clusters also matter beyond the description itself. They're a warning flag for the rest of Unit 2. A correlation coefficient computed across two clusters can suggest a strong linear relationship that exists between the groups but not within either one. Spotting clusters first protects you from misreading r and your regression line later. And the idea isn't trapped in Unit 2. Clusters show up in one-variable graphs too. The 2019 FRQ Q1 used a histogram of room sizes where recognizing distinct groupings in the distribution was part of the analysis.


How Clusters connect across the course

Outliers (Unit 2)

Outliers and clusters are the two 'unusual features' you check for in every scatterplot. An outlier is a lone point far from the pattern, while a cluster is a whole group of points sitting apart. Same checklist item, different scale.

Scatterplot (Unit 2)

Clusters only exist as a feature of a graph, usually a scatterplot. You can't see them in summary statistics, which is exactly why AP Stats hammers 'always plot your data first.'

Correlation Coefficient (Unit 2)

Two tight clusters at opposite corners of a plot can produce a high r even if neither cluster shows any relationship on its own. Clusters are one of the classic ways a correlation coefficient lies to you.
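To see how this can happen, here is a minimal sketch using made-up points (not data from any exam question). Each cluster is a small square of four points, so r within each cluster is 0, yet r for the combined data comes out very close to 1. The helper names pearson_r and r_for are illustrative, not part of any AP formula sheet.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def r_for(points):
    """Split (x, y) pairs into two lists and compute r."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return pearson_r(xs, ys)

# Hypothetical data: two tight, square-shaped clusters at opposite corners
cluster_a = [(1, 1), (1, 2), (2, 1), (2, 2)]
cluster_b = [(9, 9), (9, 10), (10, 9), (10, 10)]

r_a = r_for(cluster_a)                # no linear relationship inside cluster A
r_b = r_for(cluster_b)                # no linear relationship inside cluster B
r_all = r_for(cluster_a + cluster_b)  # both clusters pooled together

print("r within cluster A:", round(r_a, 3))
print("r within cluster B:", round(r_b, 3))
print("r for all points:  ", round(r_all, 3))  # about 0.985
```

The high pooled r reflects only the distance between the two groups. It says nothing about how x and y relate inside either group.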

Describing Distributions (Unit 1)

Clusters appear in histograms and dotplots too, where they create gaps and sometimes bimodal shapes. The 2019 FRQ on room sizes in a residence hall is a one-variable version of the same skill: noticing that the data forms separate groups.
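One rough way to make "gap" concrete in one-variable data is to sort the values and look for the largest jump between neighbors. The sketch below uses invented room sizes (not the data from the actual 2019 question), and the biggest-gap rule is only an illustration. On the exam you identify clusters by looking at the graph.

```python
# Hypothetical room sizes in square feet (invented for illustration)
sizes = [120, 125, 130, 130, 135, 140, 250, 255, 260, 270]

ordered = sorted(sizes)

# Differences between neighboring values in sorted order,
# stored as (gap size, value below the gap, value above the gap)
gaps = [(b - a, a, b) for a, b in zip(ordered, ordered[1:])]
largest_gap, low_edge, high_edge = max(gaps)

# Split the data at the largest gap into two candidate clusters
lower_cluster = [v for v in ordered if v <= low_edge]
upper_cluster = [v for v in ordered if v >= high_edge]

print("Largest gap:", largest_gap, "between", low_edge, "and", high_edge)
print("Lower cluster:", lower_cluster)
print("Upper cluster:", upper_cluster)
```

With these numbers the largest jump is from 140 to 250, which is the kind of empty stretch you would describe as a gap separating two clusters in a histogram or dotplot.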

Are Clusters on the AP® Statistics exam?

On multiple choice, cluster questions usually hand you a scenario ('a scatterplot of income vs. education shows two distinct clusters') and ask what it means or what to do next. The tested answer is almost always some version of 'a categorical variable is splitting the data into groups, so investigate or analyze the groups separately.' Distractors will tempt you to just compute one overall correlation or delete points, both wrong moves.

On FRQs, clusters show up when you're asked to describe a graph. The full scatterplot description is form, direction, strength, and unusual features, in context. If clusters or gaps are visible and you don't mention them, you typically can't earn full credit for the description. The 2019 FRQ Q1 histogram of room sizes is a good example of the exam rewarding you for noticing distinct groupings, not just citing center and spread.

Clusters vs Outliers

Both are 'unusual features,' but an outlier is a single point that doesn't fit the overall pattern, while a cluster is a separate group of points with a gap between it and the rest of the data. The follow-up question is different too. An outlier makes you ask 'is this point an error or a special case?' A cluster makes you ask 'is there a hidden categorical variable creating two populations in my data?'

Key things to remember about Clusters

Clusters are distinct groups of points separated by gaps in a scatterplot, histogram, or dotplot.
When you describe a scatterplot on the AP exam, cover form, direction, strength, and unusual features, and clusters belong in that last category.
Clusters usually signal a lurking categorical variable, like two job types or two age groups mixed into one dataset.
The best next step after spotting clusters is to identify the grouping variable and consider analyzing each cluster separately.
A correlation or regression line computed across multiple clusters can be misleading, showing a relationship between groups that doesn't exist within them.
Clusters are groups while outliers are individual points, so don't use the terms interchangeably in an FRQ response.

Frequently asked questions about Clusters

What are clusters in AP Stats?

Clusters are distinct groups of data points separated by gaps in a graph, most often a scatterplot. In Topic 2.4 they count as an 'unusual feature' you mention when describing the relationship between two quantitative variables.

Do clusters mean my data is bad?

No. Clusters aren't errors; they're information. They usually mean a categorical variable (like worker type or grade level) is splitting your data into subgroups, and the smart move is to identify that variable and analyze the groups separately.

What's the difference between a cluster and an outlier?

An outlier is one point that falls far from the overall pattern, while a cluster is a whole group of points sitting apart from the rest. Both are 'unusual features' in a scatterplot description, but they suggest different follow-up questions.

What should I do if a scatterplot shows two distinct clusters?

Look for the categorical variable creating the split and consider describing or modeling each group separately. Fitting one regression line or reporting one correlation across both clusters can exaggerate or hide the real relationship.
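As a sketch of how a pooled line can hide what is happening inside the groups, the example below fits a least-squares line to each of two made-up groups and then to all the points together. The data and the helper name least_squares are assumptions for illustration only. Within each group the slope is negative, but the single line through both clusters has a positive slope.

```python
def least_squares(points):
    """Return (slope, intercept) of the least-squares line for (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: two groups defined by some categorical variable
group_a = [(1, 3), (2, 2), (3, 1)]
group_b = [(7, 9), (8, 8), (9, 7)]

slope_a, intercept_a = least_squares(group_a)
slope_b, intercept_b = least_squares(group_b)
slope_all, intercept_all = least_squares(group_a + group_b)

print("Group A slope:", slope_a)               # negative within group A
print("Group B slope:", slope_b)               # negative within group B
print("Pooled slope: ", round(slope_all, 3))   # positive when groups are combined
```

Here the pooled line reverses the direction seen in each group, which is why describing or modeling the groups separately can be the more honest summary.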

Do I have to mention clusters when describing a scatterplot on the FRQ?

If clusters are visible, yes. A full-credit description addresses form, direction, strength, and unusual features in context, and ignoring an obvious gap or grouping is a common way to lose points.
