2.4 Representing the Relationship Between Two Quantitative Variables

written by

Peter Cao



We can also compare two sets of quantitative data measured on the same individuals, called a bivariate quantitative data set. Usually one is the independent, or explanatory, variable and the other is the dependent, or response, variable. That is, the explanatory variable helps explain or predict changes in the response variable.

What is a Scatterplot?❓

We can organize this data into a scatterplot, a graph that shows each pair of values as a single point. The explanatory (independent) variable goes on the horizontal axis, also called the x-axis, and the response (dependent) variable goes on the vertical axis, also called the y-axis. Here are two examples:

https://firebasestorage.googleapis.com/v0/b/fiveable-92889.appspot.com/o/images%2FScreen%20Shot%202020-04-24%20at%205.26-EpW5XXXnlbId.png?alt=media&token=49a659e7-ea13-407a-8bd8-2a240aa0ce46

Graph 1

https://firebasestorage.googleapis.com/v0/b/fiveable-92889.appspot.com/o/images%2FScreen%20Shot%202020-04-24%20at%205.27-7rf54CmPjaVh.png?alt=media&token=d45df93d-8079-46de-a9a8-eb678c04f61f

Graph 2

Both images courtesy of: Starnes, Daren S. and Tabor, Josh. The Practice of Statistics—For the AP Exam, 5th Edition. Cengage Publishing.
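To see how a scatterplot is built from paired data, here is a rough Python sketch that draws one with text characters. The data (hours studied and exam scores) and the function name `text_scatterplot` are made up for illustration; they are not the data behind the textbook graphs above.

```python
def text_scatterplot(x, y, width=20, height=8):
    """Draw paired data as a rough text scatterplot.

    x is the explanatory variable (horizontal axis) and
    y is the response variable (vertical axis).
    Assumes x and y each contain at least two different values.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of observations")
    x_min, x_max = min(x), max(x)
    y_min, y_max = min(y), max(y)
    grid = [[" "] * width for _ in range(height)]
    for xi, yi in zip(x, y):
        # Scale each value to a column/row position on the grid.
        col = round((xi - x_min) / (x_max - x_min) * (width - 1))
        row = round((yi - y_min) / (y_max - y_min) * (height - 1))
        grid[height - 1 - row][col] = "*"  # grid row 0 is the top of the plot
    body = "\n".join("|" + "".join(r) for r in grid)
    return body + "\n+" + "-" * width


# Hypothetical data: hours studied (explanatory) and exam score (response).
hours = [1, 2, 3, 4, 5, 6]
scores = [52, 60, 63, 71, 78, 85]

plot = text_scatterplot(hours, scores)
print(plot)
```

Each `*` is one (x, y) pair, which is all a scatterplot really is. On the exam you would sketch this by hand or read it from a calculator.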

Describing Scatterplots🗒️

When given a scatterplot, we are often asked to describe it. In AP Statistics, there are four things graders look for when you are asked to describe a scatterplot, or to describe the correlation in a scatterplot.

Form

The form of a scatterplot is its general shape. This is usually either linear or curved. In the scatterplots above, Graph 1 is best described as curved, while Graph 2 is clearly linear.

Direction

The direction of a scatterplot is the general trend you see when reading from left to right. Graph 1 is decreasing, since the values of the response variable tend to go down from left to right, while Graph 2 is increasing, since the values of the response variable tend to go up. When the form is linear, we can describe the direction as a positive or negative correlation, which matches the sign of the slope of the line that would fit the data. If the slope appears to be positive, the correlation is positive; if the slope appears to be negative, the correlation is negative.
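The link between slope and direction can be checked with a short Python sketch. It uses the standard least-squares slope formula; the function names and the two small data sets are hypothetical and only for illustration.

```python
def least_squares_slope(x, y):
    """Slope of the least-squares line for paired data."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


def direction(x, y):
    """Label the direction from the sign of the fitted slope."""
    slope = least_squares_slope(x, y)
    if slope > 0:
        return "positive"
    if slope < 0:
        return "negative"
    return "no linear direction"


# Hypothetical data, not taken from Graph 1 or Graph 2.
x = [1, 2, 3, 4, 5]
y_up = [2, 4, 5, 4, 6]    # tends to rise from left to right
y_down = [6, 4, 5, 4, 2]  # tends to fall from left to right

print(direction(x, y_up))    # positive
print(direction(x, y_down))  # negative
```

Notice that individual points can move against the trend (the fourth point in `y_up` drops) without changing the overall direction.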

Strength

The strength of a scatterplot describes how closely the points fit a certain model, and it can be strong, moderate, or weak. How we figure this out numerically is covered in the next section on correlation and the correlation coefficient. Here, Graph 1 shows a moderate correlation while Graph 2 shows a strong correlation.
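As a preview of that numerical approach, here is a minimal Python sketch that computes the correlation coefficient r for a linear pattern and turns it into a strength label. The cutoffs of 0.8 and 0.5 are illustrative assumptions, not official AP Statistics rules, and the data is hypothetical.

```python
import math


def correlation(x, y):
    """Correlation coefficient r for paired data (always between -1 and 1)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


def strength_label(r, strong=0.8, moderate=0.5):
    """Turn r into a word. The cutoffs are assumptions for illustration only."""
    size = abs(r)  # strength ignores direction, so drop the sign
    if size >= strong:
        return "strong"
    if size >= moderate:
        return "moderate"
    return "weak"


# Hypothetical data, not taken from Graph 1 or Graph 2.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 6]

r = correlation(x, y)
print(round(r, 3), strength_label(r))
```

The closer r is to 1 or -1, the more tightly the points hug a line. Keep in mind that r only measures linear strength, so it is not a good summary of a curved pattern.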

Unusual Features

Lastly, we have to discuss unusual features on a scatterplot. The two types you should know are clusters and outliers, which are similar to their single-variable counterparts.

A cluster is a group of points clumped together on a scatterplot. Graph 1 has two clusters, one on the top left and the other on the top right. On the other hand, the points in Graph 2 are spread more evenly.

An outlier is a point that falls outside the overall pattern, so there is a large discrepancy between its predicted response variable (y) value and its actual response variable (y) value.
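That discrepancy between actual and predicted y is easy to compute once a line has been fit. The Python sketch below fits a least-squares line to hypothetical data, computes actual minus predicted for every point, and reports the point with the largest gap. The function names are made up for this example, and "largest gap" is only a simple way to spot a candidate outlier, not a formal rule.

```python
def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def gaps_from_line(x, y):
    """Actual y minus predicted y for every point."""
    slope, intercept = fit_line(x, y)
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Hypothetical data: the fourth point (x = 4, y = 20) sits far above the rest.
x = [1, 2, 3, 4, 5, 6]
y = [2, 4, 6, 20, 10, 12]

gaps = gaps_from_line(x, y)
# Index of the point with the largest gap in absolute value.
suspect = max(range(len(gaps)), key=lambda i: abs(gaps[i]))
print(suspect, round(gaps[suspect], 2))
```

Here the point at index 3 has a gap of about 9.83, several times larger than any other point, which is why it stands out as an outlier.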

Example

Describe the scatterplot in the context of the problem.

https://firebasestorage.googleapis.com/v0/b/fiveable-92889.appspot.com/o/images%2FScreen%20Shot%202020-04-24%20at%205.28-bah8R1OswVWL.png?alt=media&token=828ab002-6dd2-4405-a244-ef64a8c903f9

Courtesy of Starnes, Daren S. and Tabor, Josh. The Practice of Statistics—For the AP Exam, 5th Edition. Cengage Publishing.

The scatterplot above appears to follow a linear pattern. It shows a negative correlation, since the Gesell score tends to decrease as the age at first word increases. The correlation appears to be moderate, since some points follow the pattern closely while others stray from it. The data appears to have one cluster, with an outlier at Child 19, because the predicted Gesell score for Child 19 (the value on the line) is far from the actual Gesell score (the value at the point). Child 18 is also an influential, high-leverage point, because it heavily influences the negative correlation of the data set.

Notice that this response is IN CONTEXT of the problem. This is a great way to maximize your credit on the AP Statistics exam.
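One way to check whether a point is influential, like Child 18 in the example, is to fit the line with and without that point and compare. The Python sketch below does this with hypothetical data; it does not use the actual Gesell data, and the function names are made up for illustration.

```python
def least_squares_slope(x, y):
    """Slope of the least-squares line for paired data."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


def slope_without(x, y, i):
    """Refit the line after leaving out the point at index i."""
    return least_squares_slope(x[:i] + x[i + 1:], y[:i] + y[i + 1:])


# Hypothetical data: the last point has an x-value far from the others,
# which makes it a high-leverage point.
x = [1, 2, 3, 4, 5, 20]
y = [10, 9, 11, 10, 9, 2]

slope_all = least_squares_slope(x, y)
slope_dropped = slope_without(x, y, 5)
print(round(slope_all, 3), round(slope_dropped, 3))
```

With the last point included the slope is about -0.44, and without it the slope is about -0.10. A single point changing the line that much is what makes it influential.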

🎥Watch: AP Stats - Scatterplots and Association


*ap® and advanced placement® are registered trademarks of the college board, which was not involved in the production of, and does not endorse, this product.

© fiveable 2020 | all rights reserved.