Scatterplot in AP Statistics

A scatterplot is a graph that displays bivariate quantitative data, plotting one point per individual with the explanatory variable on the x-axis and the response variable on the y-axis. On the AP exam, you describe it using form, direction, strength, and unusual features (DUFS).

Verified for the 2027 AP Statistics exam • Last updated June 2026

What is a Scatterplot?

A scatterplot is the standard graph for bivariate quantitative data, meaning two numeric measurements taken on the same individuals. Each dot represents one individual. Its x-coordinate is the value of the explanatory variable (the one used to predict) and its y-coordinate is the value of the response variable (the one being predicted). That setup comes straight from Topic 2.4 of the CED.

When the AP exam asks you to "describe the relationship" in a scatterplot, it wants four specific things: direction (positive or negative association), form (linear or nonlinear), strength (how tightly the points cluster around a pattern), and unusual features (outliers, clusters, gaps). A positive association means y tends to increase as x increases; negative means y tends to decrease. Everything else in Unit 2, from the correlation coefficient to the least-squares regression line to residual plots, is built on top of what you see in this graph. Read the scatterplot first, model second.

Why Scatterplot matters in AP® Statistics

Scatterplots live in Unit 2: Exploring Two-Variable Data, anchoring learning objectives 2.4.A (represent bivariate quantitative data using scatterplots) and 2.4.B (describe form, direction, strength, and unusual features). But they don't stay there. Topic 2.6 uses the scatterplot's pattern to justify a linear regression model, Topic 2.7 turns the scatterplot's leftover vertical distances into residual plots, and Topic 2.9 uses scatterplots to spot outliers, high-leverage points, and influential points. Then Unit 5 brings the scatterplot back for inference. One Unit 5 learning objective asks you to identify questions suggested by variation in scatterplots, specifically whether the scatter around a line looks random or non-random. That question is the entire motivation for the Unit 5 objectives on testing and interpreting the slope. If you can read a scatterplot well, you have a head start on roughly a quarter of the course.

How Scatterplot connects across the course

Residuals and Residual Plots (Unit 2)

A residual plot is what's left of a scatterplot after you subtract out the regression line. Each residual is y minus ŷ, and plotting those leftovers against x tells you whether a line was the right model. Random scatter in the residual plot means the linear form you saw in the scatterplot is legit (LO 2.7.B).
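The residual calculation above can be sketched in a few lines of Python. This is a minimal illustration, not an exam procedure: the five data points and the helper name `fit_lsrl` are invented for the example, and the line is fit with the usual least-squares formulas for slope and intercept.

```python
# Hypothetical data: five (x, y) pairs invented for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

def fit_lsrl(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx              # slope
    a = y_bar - b * x_bar      # y-intercept
    return a, b

a, b = fit_lsrl(xs, ys)

# Residual = actual y minus predicted y-hat, one per point.
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

for x, res in zip(xs, residuals):
    print(f"x = {x}: residual = {res:+.2f}")
```

Plotting these residuals against x gives the residual plot. For a least-squares line the residuals always sum to zero (up to rounding), so what matters is whether they show a pattern, not whether they balance out.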

Linear Regression and the LSRL (Unit 2)

The least-squares regression line ŷ = a + bx is fit directly to the scatterplot's points. The scatterplot also defines the safe zone for prediction. Plugging in an x-value beyond the range of the plotted data is extrapolation, and the CED warns those predictions get less reliable the further out you go (2.6.A).
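As a rough sketch of the extrapolation idea, the code below fits ŷ = a + bx to a small made-up data set and flags any prediction whose x-value falls outside the range of the plotted data. The data and the function names `fit_lsrl` and `predict` are assumptions made for illustration only.

```python
# Hypothetical data invented for illustration.
xs = [2, 4, 6, 8, 10]
ys = [5, 9, 12, 17, 22]

def fit_lsrl(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b

def predict(x_new, a, b, xs):
    """Return (y_hat, is_extrapolation) for a new x-value."""
    y_hat = a + b * x_new
    # Extrapolation: x_new lies outside the range of observed x-values.
    is_extrapolation = x_new < min(xs) or x_new > max(xs)
    return y_hat, is_extrapolation

a, b = fit_lsrl(xs, ys)

for x_new in (7, 20):
    y_hat, flagged = predict(x_new, a, b, xs)
    note = "extrapolation, treat with caution" if flagged else "within data range"
    print(f"x = {x_new}: y-hat = {y_hat:.1f} ({note})")
```

Here x = 7 sits inside the observed range of 2 to 10, while x = 20 is well beyond it, so only the second prediction is flagged.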

Outliers and Influential Points (Unit 2)

You spot regression outliers (big residuals) and high-leverage points (extreme x-values) by looking at the scatterplot. An influential point is one whose removal visibly changes the slope, y-intercept, or correlation, which is exactly the kind of before-and-after comparison exam questions love (LO 2.9.A).
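The before-and-after comparison can be illustrated with a small sketch. The data below are invented: five points with a positive trend plus one point with an extreme x-value that does not follow it. The helper `slope_and_r` is a hypothetical name for the standard slope and correlation formulas.

```python
import math

# Hypothetical data: the last point (15, 3) has an extreme x-value
# and does not follow the trend of the other five points.
xs = [1, 2, 3, 4, 5, 15]
ys = [2, 3, 5, 4, 6, 3]

def slope_and_r(xs, ys):
    """Return (slope, correlation) for paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sxx, sxy / math.sqrt(sxx * syy)

# Fit with every point, then again with the suspect point removed.
b_with, r_with = slope_and_r(xs, ys)
b_without, r_without = slope_and_r(xs[:-1], ys[:-1])

print(f"with point:    slope = {b_with:.3f}, r = {r_with:.3f}")
print(f"without point: slope = {b_without:.3f}, r = {r_without:.3f}")
```

With the extreme point included, the slope and r are both close to zero. Removing it gives a slope and r of about 0.9, so that one point is influential in this made-up data set.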

Regression Analysis (Unit 5)

Unit 5 asks the question a single scatterplot can't answer alone: is the pattern you see real, or just random sampling noise? Variation in points around a line might be random or non-random, and inference about the slope lets you justify a claim about the true population relationship. Both ideas appear in the Unit 5 learning objectives.

Correlation Coefficient (Unit 2)

The correlation r puts a number on the direction and strength you see in a linear scatterplot. But r only makes sense if the form is actually linear, so always look at the plot before trusting the number. A curved pattern can hide behind a high r.
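To see how a curved pattern can hide behind a high r, here is a minimal sketch that computes r for points lying exactly on the curve y = x². The data set is constructed for illustration, and `correlation` is a hand-written helper using the standard formula rather than any particular library call.

```python
import math

# Constructed data: points lying exactly on the curve y = x^2.
xs = list(range(1, 11))
ys = [x ** 2 for x in xs]

def correlation(xs, ys):
    """Return the correlation coefficient r for paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

r = correlation(xs, ys)
print(f"r = {r:.3f}")
```

The result is roughly 0.97 even though the relationship is perfectly quadratic, not linear. The number alone would suggest a strong linear association. Only the plot reveals the curve.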

Is Scatterplot on the AP® Statistics exam?

Scatterplots show up almost every year on the FRQ section, usually as Question 1. The 2017 exam gave a scatterplot of wolf length versus weight, 2018 used checkout times versus number of customers in line, and 2022 (bullfrog length versus mass) made you compare a scatterplot of original data to one of transformed data. Typical tasks include describing the association (use direction, form, strength, and unusual features, in context, every time), interpreting slope or predicted values from the plotted line, and judging whether a linear model is appropriate. Multiple-choice questions push the same skills with a twist. You might see a plot where one point's removal drops r from 0.85 to 0.45 (that's an influential point), or a fitted line where points fall below it at low and high x-values but above it in the middle (that's a curved pattern, so the linear model misses the form). The single biggest scoring habit is writing your description in context with variable names, not just "strong positive linear."

Scatterplot vs Residual plot

A scatterplot shows the raw data, y against x, and is where you first describe the association. A residual plot shows the leftovers after fitting a model, residual (y - ŷ) against x or against ŷ. They answer different questions. The scatterplot asks "what does the relationship look like?" while the residual plot asks "was my model the right choice?" A curved pattern can be subtle in a scatterplot but scream at you in the residual plot, which is exactly why the CED makes residual plots the official tool for checking linearity (2.7.B).
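The sketch below shows, under made-up data, how residuals expose curvature that a line glosses over. The points follow an increasing, concave-down curve, chosen only for illustration. A least-squares line is fit by the usual formulas, and the sign of each residual is printed in order of x.

```python
# Constructed data: an increasing, concave-down curve (y = 20x - x^2).
xs = list(range(1, 9))
ys = [20 * x - x ** 2 for x in xs]

def fit_lsrl(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b

a, b = fit_lsrl(xs, ys)
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

# A run of same-sign residuals at each end with the opposite sign in the
# middle is the curved pattern a residual plot makes obvious.
signs = "".join("+" if res > 0 else "-" for res in residuals)
print(signs)
```

The residuals are negative at the low and high x-values and positive in the middle, so the points sit below the line at both ends and above it in between. Random scatter would show no such run of signs.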

Key things to remember about Scatterplot

A scatterplot displays bivariate quantitative data with one point per individual, the explanatory variable on the x-axis, and the response variable on the y-axis.
Describe every scatterplot using four things: direction (positive or negative), form (linear or nonlinear), strength, and unusual features, always in context.
The scatterplot comes before the math. Correlation and the LSRL only make sense after you've confirmed the form looks roughly linear.
Predicting with an x-value outside the range of the scatterplot's data is extrapolation, and those predictions get less reliable the further you go.
Points that don't follow the trend (outliers), points with extreme x-values (high leverage), and points that change the line when removed (influential) are all spotted on the scatterplot.
In Unit 5, scatter around a line raises the inference question of whether the pattern is random or real, which inference about the slope helps answer.

Frequently asked questions about Scatterplot

What is a scatterplot in AP Stats?

A scatterplot is a graph of bivariate quantitative data where each point represents one individual, with the explanatory variable on the x-axis and the response variable on the y-axis. It's the foundation of Unit 2 and reappears in Unit 5 for regression analysis.

Does a strong pattern in a scatterplot prove the variables are related in the population?

No. A clear pattern in a sample scatterplot could still be random variation, which is the whole point of regression inference in Unit 5. You need inference about the slope to justify a claim about the population relationship.

What's the difference between a scatterplot and a residual plot?

A scatterplot graphs the raw data (y versus x), while a residual plot graphs the differences between actual and predicted values (y - ŷ) against x or ŷ. You use the scatterplot to describe the association and the residual plot to check whether a linear model fits.

How do you describe a scatterplot on the AP exam?

Hit four components: direction, form, strength, and unusual features, and name the actual variables in context. For example, "there is a strong, positive, linear association between wolf length and wolf weight, with no obvious outliers."

Can a scatterplot show categorical data?

No. Scatterplots require two quantitative variables. Categorical data gets displayed with bar graphs, mosaic plots, or two-way tables instead, which is a distinction multiple-choice questions like to test.
