# 2.5 Correlation

#correlation

#r-value

#s-value

Written by Peter Cao

## What is Correlation?

Two variables are correlated when they are related to each other. The strength of a linear relationship is represented numerically by the correlation coefficient, which in stats we denote as r. The correlation coefficient shows the degree to which there is a linear correlation between the two variables, that is, how close the points are to forming a line. Its sign matches the direction of the scatterplot: positive for an upward trend and negative for a downward trend. The coefficient takes a value between -1 and 1, where r = -1 means that the points fall exactly on a decreasing line and r = 1 means that the points fall exactly on an increasing line. A correlation coefficient of 0 means that there is no linear correlation between the two variables, although they may still be related in a nonlinear way.
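To make the three landmark values concrete, here is a small Python sketch. The helper `r_value` and the data sets are made up for illustration; the helper uses the z-score method covered in the calculation section of this guide.

```python
from statistics import mean, stdev

def r_value(xs, ys):
    """Illustrative helper: correlation coefficient from z-score products."""
    mx, my = mean(xs), mean(ys)
    sx, sy = stdev(xs), stdev(ys)
    total = sum(((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys))
    return total / (len(xs) - 1)

xs = [1, 2, 3, 4, 5]

# Points exactly on an increasing line: r is 1 (up to rounding error).
r_up = r_value(xs, [2 * x + 1 for x in xs])

# Points exactly on a decreasing line: r is -1 (up to rounding error).
r_down = r_value(xs, [10 - 3 * x for x in xs])

# A symmetric U-shape: y depends entirely on x, yet r is essentially 0,
# because r only measures how close the points are to a straight line.
sym = [-2, -1, 0, 1, 2]
r_curve = r_value(sym, [x ** 2 for x in sym])

print(round(r_up, 3), round(r_down, 3), round(r_curve, 3))
```

The last case is worth remembering: a value of r near 0 rules out a linear pattern, not every kind of pattern.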

### Examples

Here are some scatterplots and their values of r:

[Image: scatterplots labeled with their values of r. Image courtesy of math.nayland.school.nz]

Also, there are a few things to keep in mind about correlation.

• Even if r has a high magnitude, the relationship may not be linear, but instead it may be curved. We will discuss this more in later sections.

• A high magnitude of correlation does not imply causation.

• The correlation coefficient is not resistant to outliers. This makes sense, given that the formula we will learn uses the mean and standard deviation, which are themselves not resistant.
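The point about outliers can be checked with a quick Python sketch. The data values and the helper `r_value` are hypothetical, chosen only to show how much one unusual point can move r.

```python
from statistics import mean, stdev

def r_value(xs, ys):
    """Illustrative helper: correlation coefficient from z-score products."""
    mx, my = mean(xs), mean(ys)
    sx, sy = stdev(xs), stdev(ys)
    total = sum(((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys))
    return total / (len(xs) - 1)

# Five made-up points with a fairly strong positive trend.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r_clean = r_value(xs, ys)

# The same data plus one outlier far from the pattern, at (10, 0).
r_with_outlier = r_value(xs + [10], ys + [0])

print(f"without outlier: r = {r_clean:.3f}")
print(f"with outlier:    r = {r_with_outlier:.3f}")
```

In this made-up data set, a single added point is enough to turn a clearly positive r into a negative one, which is why it pays to look at the scatterplot before trusting the number.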

## Calculating the Correlation Coefficient

To find the value of r, we use this formula, which is found on the formula sheet:

$$r = \frac{1}{n-1}\sum\left(\frac{x_i - \bar{x}}{s_x}\right)\left(\frac{y_i - \bar{y}}{s_y}\right)$$

Although this may seem like a complicated formula, it’s not that hard to understand (though it is tedious to compute). To find r, first find the mean and standard deviation of both the x and y variables. Then, for each data point, multiply the z-scores of its x and y values. Finally, add up all the individual products and divide by the number of data points minus 1.
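The steps described above can be sketched in Python as follows. This is a minimal illustration, not a replacement for your calculator: the function name `correlation` and the `hours` and `scores` data are made up, and `stdev` from the standard library is the sample standard deviation (dividing by n - 1).

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Compute r by averaging z-score products with an n - 1 divisor."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 points")

    # Step 1: mean and standard deviation of each variable.
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)

    # Step 2: for each data point, multiply its x and y z-scores.
    products = [
        ((x - x_bar) / s_x) * ((y - y_bar) / s_y)
        for x, y in zip(xs, ys)
    ]

    # Step 3: add the products and divide by the number of points minus 1.
    return sum(products) / (len(xs) - 1)

# Hypothetical data: hours studied and quiz scores for five students.
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]
r = correlation(hours, scores)
print(round(r, 3))  # 0.775
```

Each list comprehension entry is one student's contribution to r: points where x and y are both above or both below their means add a positive product, and points where they disagree add a negative one.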

You will seldom need to do this by hand, since most graphing calculators can easily find r. On the TI-84, the most common graphing calculator used in AP Stats, enter your data into L1 and L2, then go to Stats>Calc>LinReg. To be sure that you get the r-value, verify that "Stats Diagnostics" is on via MODE.

🎥Watch: AP Stats - Scatterplots and Association