2.5 Correlation

written by

Peter Cao


What is Correlation?

Correlation describes how two quantitative variables are related to each other, and it is represented numerically by the correlation coefficient, which in stats we denote as r. The correlation coefficient measures the strength and direction of the linear relationship between the two variables, that is, how close the points are to forming a line. Its sign matches the direction of the scatterplot: positive for an upward trend and negative for a downward trend. The coefficient takes a value between -1 and 1, where r = -1 means that the points fall exactly on a decreasing line, while r = 1 means that the points fall exactly on an increasing line. A correlation coefficient of 0 means that there is no linear relationship between the variables, although a nonlinear pattern may still be present.
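
To make the ends of this scale concrete, here is a minimal Python sketch. The helper name pearson_r and all of the data values are made up for illustration, and the helper uses the z-score formula described later in this guide.

```python
from statistics import mean, stdev

def pearson_r(xs, ys):
    """Hypothetical helper: sum of z-score products divided by n - 1."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations
    total = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
                for x, y in zip(xs, ys))
    return total / (len(xs) - 1)

xs = [1, 2, 3, 4, 5]
increasing = [2 * x + 1 for x in xs]   # points exactly on an increasing line
decreasing = [10 - 3 * x for x in xs]  # points exactly on a decreasing line
no_trend = [2, 1, 4, 1, 2]             # symmetric pattern with no linear trend

r_up = pearson_r(xs, increasing)
r_down = pearson_r(xs, decreasing)
r_none = pearson_r(xs, no_trend)

print(round(r_up, 4), round(r_down, 4), round(r_none, 4))
```

Up to floating-point rounding, r_up is 1, r_down is -1, and r_none is 0.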

Examples

Here are some scatterplots and their values of r:

https://firebasestorage.googleapis.com/v0/b/fiveable-92889.appspot.com/o/images%2FScreen%20Shot%202020-04-24%20at%207.51-KGen8QXyUFXW.png?alt=media&token=e90556cd-49ec-4c6d-ac07-1ffe900b1dd0

image courtesy of: math.nayland.school.nz

Also, there are a few things to keep in mind about correlation. 

  • Even if r has a high magnitude, the relationship may not be linear, but instead it may be curved. We will discuss this more in later sections.

  • A high magnitude of correlation does not imply causation.

  • The correlation coefficient is not resistant to outliers, which makes sense, given that the formula that we shall learn uses the mean and standard deviation, which by themselves are not resistant.
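
The first and third cautions can be checked numerically. The sketch below is only an illustration with invented data: pearson_r is a hypothetical helper (not a library function), the curved data follow y = x², and the outlier is a single extra point added to an otherwise perfect line.

```python
from statistics import mean, stdev

def pearson_r(xs, ys):
    """Hypothetical helper: sum of z-score products divided by n - 1."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    total = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
                for x, y in zip(xs, ys))
    return total / (len(xs) - 1)

xs = list(range(1, 11))  # 1, 2, ..., 10

# Caution 1: a clearly curved pattern (a parabola) can still give a large r.
curved = [x ** 2 for x in xs]
r_curved = pearson_r(xs, curved)

# Caution 3: one outlier can change r dramatically.
on_line = [2 * x for x in xs]                   # perfectly linear, r = 1
r_line = pearson_r(xs, on_line)
r_outlier = pearson_r(xs + [11], on_line + [0])  # add the point (11, 0)

print(round(r_curved, 4), round(r_line, 4), round(r_outlier, 4))
```

With these made-up numbers, r_curved is about 0.97 even though the points lie on a parabola, and the single outlier pulls r from 1 down to 0.5.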

Calculating the Correlation Coefficient

To find the value of r, we have this formula that is found on the formula sheet:

https://firebasestorage.googleapis.com/v0/b/fiveable-92889.appspot.com/o/images%2FScreen%20Shot%202020-04-24%20at%207.53-wJpfoxjCtbL2.png?alt=media&token=f5b31b81-9120-44ba-96fc-5f2848adc13a
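
In case the image does not load, the correlation formula can be written out as follows. This assumes the image shows the standard form from the AP Statistics formula sheet, where x̄ and ȳ are the sample means, s_x and s_y are the sample standard deviations, and n is the number of data points.

```latex
r = \frac{1}{n-1} \sum \left( \frac{x_i - \bar{x}}{s_x} \right) \left( \frac{y_i - \bar{y}}{s_y} \right)
```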

Although this may seem like a complicated formula, it’s not that hard to understand (though it is tedious to compute). To find r, first find the mean and standard deviation of both the x and y variables. Then, for each data point, multiply the z-scores of its x and y values. Finally, add up all of these products and divide by the number of data points minus 1.
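
The steps above can be sketched in Python as follows. The data set is invented purely for illustration, and the variable names are arbitrary.

```python
from statistics import mean, stdev

# Hypothetical data set of five (x, y) points.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# Step 1: mean and sample standard deviation of each variable.
x_bar, s_x = mean(xs), stdev(xs)
y_bar, s_y = mean(ys), stdev(ys)

# Step 2: z-scores for every x and y value.
z_x = [(x - x_bar) / s_x for x in xs]
z_y = [(y - y_bar) / s_y for y in ys]

# Step 3: multiply the two z-scores for each point.
products = [zx * zy for zx, zy in zip(z_x, z_y)]

# Step 4: add the products and divide by n - 1.
r = sum(products) / (len(xs) - 1)

print(round(r, 4))
```

For this made-up data set, r comes out to roughly 0.77, a moderately strong positive linear relationship.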

You will seldom need to do this by hand, since most graphing calculators can find r easily. On the TI-84, the most common graphing calculator used in AP Stats, enter your data into L1 and L2, then go to STAT > CALC > LinReg, as shown below:

https://firebasestorage.googleapis.com/v0/b/fiveable-92889.appspot.com/o/images%2FScreen%20Shot%202020-04-25%20at%2011.11-F3qYYS7Efe7p.png?alt=media&token=b023f6cf-2396-4032-b8a8-b19dadbac5c5

To be sure that the calculator displays the r-value, verify that "Stat Diagnostics" is turned on in the MODE menu.

🎥Watch: AP Stats - Scatterplots and Association

*ap® and advanced placement® are registered trademarks of the college board, which was not involved in the production of, and does not endorse, this product.

© fiveable 2020 | all rights reserved.