2.6 Linear Regression Models




written by

Peter Cao

What is an LSRL?

The simplest form of regression is linear regression, in which we find a linear equation of the form ŷ = a + bx, where a is the y-intercept and b is the slope. Many different lines can be drawn through a scatterplot, but only one of them minimizes the sum of the squared residuals (the vertical distances between the observed and predicted y-values). This best-fitting line is called the least squares regression line (LSRL). We will discuss it, and how to find its formula, in a couple of sections.
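To make "least squares" concrete, here is a minimal Python sketch. It assumes the standard closed-form formulas for the slope and intercept (the guide covers the formula later), and the data points are made up purely for illustration. The names `fit_lsrl` and `sum_squared_residuals` are hypothetical helpers, not part of any library.

```python
from statistics import mean

def fit_lsrl(xs, ys):
    """Return (a, b) for the least squares line y-hat = a + b*x."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Slope: sum of cross-deviations divided by sum of squared x-deviations.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    # The LSRL passes through (x_bar, y_bar), which fixes the intercept.
    a = y_bar - b * x_bar
    return a, b

def sum_squared_residuals(xs, ys, a, b):
    """Add up the squared vertical gaps between observed and predicted y."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data, invented for this illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = fit_lsrl(xs, ys)
best = sum_squared_residuals(xs, ys, a, b)
# Any other line, such as one with a shifted intercept, fits worse.
other = sum_squared_residuals(xs, ys, a + 0.5, b)

print(f"y-hat = {a:.2f} + {b:.2f}x")
print("LSRL has the smaller sum of squared residuals:", best < other)
```

Shifting the intercept is just one way to build a competing line; changing the slope instead would also give a larger sum of squared residuals.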

It is also important to note that ŷ represents the predicted value of the response variable, while x represents the value of the explanatory variable. The x-values are given in the data set, so they are not predicted, but the ŷ-value that comes from a least squares regression line is always a prediction.


Given a linear regression equation, we can predict values of the response variable for values of the explanatory variable that are not in the data set. These predictions are most trustworthy when x stays within the range of explanatory values used to build the model. Predicting for an x-value outside that range is called extrapolation, and the farther outside the range you go, the less reliable your predictions will be.
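One way to keep extrapolation in check is to store the range of x-values alongside the equation and flag any prediction made outside it. The sketch below is only an illustration: `predict` is a hypothetical helper, and the line ŷ = 2.0 + 0.5x with x-values from 1 to 5 is an assumed example, not data from this guide.

```python
def predict(a, b, x, x_min, x_max):
    """Return (y_hat, is_extrapolation) for the line y-hat = a + b*x.

    x_min and x_max are the smallest and largest explanatory values
    in the data set the line was built from.
    """
    y_hat = a + b * x
    is_extrapolation = not (x_min <= x <= x_max)
    return y_hat, is_extrapolation

# Assumed example line and data range (hypothetical values).
a, b = 2.0, 0.5
x_min, x_max = 1, 5

for x in (4, 12):
    y_hat, flagged = predict(a, b, x, x_min, x_max)
    note = "extrapolation, use with caution" if flagged else "within the data range"
    print(f"x = {x}: y-hat = {y_hat} ({note})")
```

The function still returns a number for x = 12, which mirrors the real situation: the equation will always produce a prediction, so it is up to you to check whether x is in range.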


Image courtesy of: statsforstem.org


In a recent model built using data from 19-24 year olds, an individual's comfort level with technology (on a scale of 1-10) can be predicted using the least squares regression line ŷ = 0.32x + 0.67, where ŷ is the predicted comfort level and x is the individual's age.

Predict the comfort level of a 45-year-old and explain why this prediction does not make sense.


ŷ = 0.32(45) + 0.67 = 15.07 comfort level

This answer does not make sense because comfort levels are measured on a scale of 1 to 10, so a predicted value of 15.07 is impossible. We got this result because we used a model built from data on 19-24 year olds to make a prediction about a 45-year-old. We were extrapolating to someone far outside the range of ages in our data set, which is not a good idea. 🙅‍♂️💡
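The arithmetic in this example can be checked with a few lines of Python. The equation, the 19-24 age range, and the 1-10 scale come from the problem above; the function name `comfort_level` and the extra ages 19 and 24 are assumptions added only for comparison.

```python
def comfort_level(age):
    """Predicted comfort level from the example line y-hat = 0.32x + 0.67."""
    return 0.32 * age + 0.67

# Ages 19 and 24 are the ends of the data range; 45 is the age in the question.
for age in (19, 24, 45):
    y_hat = comfort_level(age)
    in_data_range = 19 <= age <= 24   # ages the model was built from
    on_scale = 1 <= y_hat <= 10       # comfort is measured from 1 to 10
    print(f"age {age}: y-hat = {y_hat:.2f}, "
          f"in data range: {in_data_range}, on 1-10 scale: {on_scale}")
```

Only the age outside the data range produces a prediction that falls off the 1-10 scale.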

🎥Watch: AP Stats - Least Squares Regression Lines

*ap® and advanced placement® are registered trademarks of the college board, which was not involved in the production of, and does not endorse, this product.

© fiveable 2020 | all rights reserved.