Linear Relationship

A linear relationship is an association between two quantitative variables whose pattern follows a straight line on a scatterplot, so equal changes in x are associated with roughly equal changes in y. Its direction and strength are quantified by the correlation coefficient, r, which runs from -1 to 1.

Verified for the 2027 AP Statistics exam. Last updated June 2026.

What is a Linear Relationship?

A linear relationship means that when you plot two quantitative variables on a scatterplot, the points cluster around a straight line. Equal step-ups in x come with roughly equal step-ups (or step-downs) in y. That "straight line" shape is the whole game in Topic 2.5, because the correlation coefficient r only measures linear association. The Course and Exam Description (CED) is explicit about this: r gives the direction (positive or negative) and quantifies the strength of the linear association between two quantitative variables, and nothing else.

Here's the part that trips people up. A linear relationship is a shape you see; correlation is a number that measures that shape. r = 0.95 doesn't prove the relationship is linear, and the CED says exactly that: a correlation close to 1 or -1 does not necessarily mean a linear model is appropriate. A curved pattern can still produce a high r. So you always check the scatterplot (and later, the residual plot) before trusting r. Think of r as a ruler that only knows how to measure straightness. If the data isn't straight, the ruler gives you a number anyway, and that number lies.
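To see how a curved pattern can still earn a high r, here is a minimal Python sketch. The data (x from 1 to 10, with y = x squared) are made up for illustration, and `pearson_r` is a hypothetical helper name, not anything from the CED.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Correlation coefficient r, computed from deviations about the means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Illustrative data: a perfectly curved (quadratic) pattern, not a line.
xs = list(range(1, 11))
ys = [x ** 2 for x in xs]

r = pearson_r(xs, ys)
print(f"r = {r:.3f}")  # close to 1 even though the pattern is curved
```

The number alone would suggest a nearly perfect line. Only the scatterplot shows the bend.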

Why Linear Relationship matters in AP Statistics

Linear relationships live in Unit 2 (Exploring Two-Variable Data), specifically Topic 2.5 on correlation. Two learning objectives hang directly on this idea. AP Stats 2.5.A asks you to determine the correlation for a linear relationship, and AP Stats 2.5.B asks you to interpret it. Everything downstream in Unit 2 depends on it too. Least-squares regression lines, residuals, and predictions all assume the relationship is linear in the first place. If you can't recognize when a relationship is linear (and when it isn't), every interpretation you write afterward is built on sand. This idea even resurfaces in Unit 9, where inference for slopes requires a linear relationship as a condition.

How Linear Relationship connects across the course

Correlation Coefficient (Unit 2)

r is the number that summarizes a linear relationship. It's unit-free, always between -1 and 1, and r = 0 means no linear association (the variables could still be related in a curved way). Linear relationship is the pattern; r is its scorecard.

Scatter Plot (Unit 2)

The scatterplot is where you actually see whether a relationship is linear. The AP exam loves data that looks strong but curved, like the bullfrog length-vs-mass data on the 2022 FRQ, precisely because r alone can't catch it.

Regression Line (Unit 2)

Once you've confirmed a relationship is linear, the least-squares regression line is the straight-line model you fit to it. Using a regression line on nonlinear data gives you predictions that systematically miss, which is what residual plots expose.
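Here is a rough sketch of what "systematically miss" looks like. It fits a least-squares line to made-up curved data (y = x squared for x from 1 to 10) and lists the residuals. The function name `least_squares` and the data are assumptions chosen for illustration.

```python
def least_squares(xs, ys):
    """Return (slope, intercept) of the least-squares regression line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Illustrative curved data: y = x^2
xs = list(range(1, 11))
ys = [x ** 2 for x in xs]

slope, intercept = least_squares(xs, ys)

# Residual = actual y minus predicted y
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

for x, e in zip(xs, residuals):
    print(f"x = {x:2d}  residual = {e:6.1f}")
```

The residuals come out positive at both ends and negative in the middle. That U-shape in a residual plot is the signal that a straight line is the wrong model for these data.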

Z-scores (Unit 1)

The formula for r is essentially the average product of paired z-scores (dividing by n - 1 rather than n): r = (1/(n-1)) Σ(z_x · z_y). That's why r is unit-free. Standardizing strips away the units, so r measures pure linear pattern, not scale.
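The z-score formula can be sketched directly in Python. The height and weight values below are hypothetical, and `z_scores` and `r_from_z` are illustrative helper names. The second calculation converts heights from inches to centimeters to check that changing units leaves r alone.

```python
from statistics import mean, stdev

def z_scores(values):
    """Standardize each value using the sample mean and sample SD (n - 1)."""
    m = mean(values)
    s = stdev(values)
    return [(v - m) / s for v in values]

def r_from_z(xs, ys):
    """r = (1 / (n - 1)) * sum of products of paired z-scores."""
    n = len(xs)
    products = [zx * zy for zx, zy in zip(z_scores(xs), z_scores(ys))]
    return sum(products) / (n - 1)

# Hypothetical data for illustration only
heights_in = [60, 62, 65, 68, 70, 72]          # inches
weights_lb = [115, 120, 135, 150, 160, 175]    # pounds

# Same heights, different units
heights_cm = [h * 2.54 for h in heights_in]

r_in = r_from_z(heights_in, weights_lb)
r_cm = r_from_z(heights_cm, weights_lb)

print(f"r using inches:      {r_in:.4f}")
print(f"r using centimeters: {r_cm:.4f}")
```

Both lines print the same value (up to floating-point rounding), because the z-scores are identical no matter which units the heights start in.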

Is Linear Relationship on the AP Statistics exam?

Multiple-choice questions usually hand you an r value and ask which interpretation is correct. The traps are predictable. They'll tempt you with causal language (a question about r = -0.85 between study time and anxiety wants you to say "strong negative linear association," not "studying reduces anxiety"), or claim r = 0 means "no relationship" when it only means no linear relationship. You may also see influential points, like a question where removing one point drops r from 0.72 to 0.45, which tests whether you know single points can inflate apparent linear strength.
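To get a feel for how much one point can move r, here is a small sketch with invented data: eight points with only a weak pattern, plus one far-off point at (20, 15). The numbers are hypothetical and are not the ones from any exam question.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Correlation coefficient r, computed from deviations about the means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical cluster with only a weak linear pattern
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [3, 1, 4, 2, 5, 3, 4, 2]

# Same data plus one point far from the rest in both x and y
xs_with = xs + [20]
ys_with = ys + [15]

r_without = pearson_r(xs, ys)
r_with = pearson_r(xs_with, ys_with)

print(f"r without the extra point: {r_without:.2f}")
print(f"r with the extra point:    {r_with:.2f}")
```

With the extra point included, r looks strong. Take it out and the linear association is weak, which is the kind of shift these multiple-choice questions are built around.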

On FRQs, two-variable data shows up constantly. The 2018 FRQ had you describe the relationship between checkout-line customers and checkout time, and the 2022 FRQ used bullfrog length and mass. When asked to describe a relationship, hit all four pieces: direction, form (this is where you say "linear" or "nonlinear"), strength, and unusual points, all in context. Never say "correlation" to describe a curved pattern, and never let "strong" slide into "causes."

Linear Relationship vs Correlation

A linear relationship is the straight-line pattern in the data; correlation is the statistic (r) that measures how strong that pattern is and which direction it runs. The order matters. You confirm linearity from the scatterplot first, then use r to quantify it. Going the other way fails, because a high r can come from clearly curved data, and the CED flags this exact trap.

Key things to remember about Linear Relationship

  • A linear relationship means the points on a scatterplot follow a straight-line pattern, so equal changes in x go with roughly equal changes in y.

  • The correlation coefficient r measures only the direction and strength of a linear association, and it's always between -1 and 1.

  • An r close to 1 or -1 does not prove the relationship is linear; you must check the scatterplot because curved data can still produce a high r.

  • r = 0 means no linear association, but the variables can still have a strong nonlinear relationship.

  • Correlation does not imply causation, so even a strong linear relationship never proves that one variable causes changes in the other.

  • When an FRQ asks you to describe a relationship, address direction, form, strength, and unusual points in context.

Frequently asked questions about Linear Relationship

What is a linear relationship in AP Stats?

It's an association between two quantitative variables where the scatterplot pattern follows a straight line. In Topic 2.5, the correlation coefficient r quantifies the direction and strength of that linear pattern, with values from -1 to 1.

Does a high correlation mean the relationship is linear?

No. The CED states directly that r close to 1 or -1 does not necessarily mean a linear model is appropriate. Curved data can produce a high r, so you always confirm form by looking at the scatterplot.

Does r = 0 mean there's no relationship between the variables?

No, it only means there's no linear relationship. A perfect parabola, for example, can have r = 0 even though x and y are clearly related. This distinction is a favorite multiple-choice trap.
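A quick way to check the parabola claim is to compute r for y = x squared with x values placed symmetrically around 0. This is a minimal sketch; the data and the helper name `pearson_r` are chosen for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Correlation coefficient r, computed from deviations about the means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# x values symmetric about 0, y determined exactly by x
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r = pearson_r(xs, ys)
print(f"r = {r:.2f}")  # no linear association, yet y depends completely on x
```

The symmetry is what makes this work: the downhill left half and the uphill right half cancel out, so r lands at 0. A parabola sampled on only one side of its vertex would give an r well away from 0.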

What's the difference between a linear relationship and correlation?

The linear relationship is the straight-line pattern you observe in the data; correlation (r) is the number that measures how strong that pattern is and which direction it runs. As a summary, r is meaningless if the relationship isn't actually linear.

If two variables have a strong linear relationship, does one cause the other?

No. Correlation does not imply causation, no matter how strong the linear pattern is. A correlation of r = -0.85 between study time and anxiety describes an association, not proof that studying lowers anxiety. Only a well-designed experiment can support a causal claim.