Testing the Significance of the Correlation Coefficient
Correlation coefficients tell you the strength and direction of a linear relationship between two variables, but how do you know if that relationship is real? A significance test answers that question by determining whether the observed correlation is statistically meaningful or could have occurred by random chance alone.
Two common approaches exist for testing significance: the p-value method and the critical value method. Both rely on the same assumptions and use the t-distribution, so they'll lead you to the same conclusion.

Significance of Correlation Coefficients (P-Value Method)
The p-value method tests whether the sample correlation coefficient (r) provides enough evidence to conclude that a true linear relationship exists in the population.
You start by setting up your hypotheses:
- Null hypothesis (H₀): No significant linear relationship exists between the two variables (ρ = 0)
- Alternative hypothesis (H₁): A significant linear relationship exists between the two variables (ρ ≠ 0)
Here, ρ (rho) is the population correlation coefficient. The null hypothesis claims the population correlation is zero, meaning any pattern you see in your sample is just noise.
The p-value represents the probability of obtaining a sample correlation as extreme as (or more extreme than) the one you observed, assuming H₀ is true. A small p-value means your result would be very unlikely under the null hypothesis.
- If the p-value is less than your chosen significance level (α, typically 0.05), reject H₀. You have evidence of a statistically significant correlation. For example, you'd likely find a significant correlation between height and weight.
- If the p-value is greater than or equal to α, fail to reject H₀. There isn't enough evidence to conclude a significant linear relationship exists.
- Rejecting H₀ when it's actually true is a Type I error. Your significance level α controls the probability of making this mistake.
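As a sketch of the p-value method in code, SciPy's `pearsonr` returns both the sample correlation and the two-tailed p-value for H₀: ρ = 0 (the data below are invented for illustration, and this assumes SciPy is available):

```python
from scipy import stats

# Hypothetical data: eight paired observations (invented for illustration)
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]

# pearsonr returns the sample correlation r and the two-tailed p-value
# for the test of H0: rho = 0
r, p_value = stats.pearsonr(x, y)

alpha = 0.05
if p_value < alpha:
    decision = "reject H0: the linear relationship is statistically significant"
else:
    decision = "fail to reject H0: insufficient evidence of a linear relationship"
```

Because the invented data are almost perfectly linear, the p-value here is tiny and H₀ is rejected.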

Critical Value Method for Correlation
The critical value method reaches the same conclusion as the p-value method but works by comparing a test statistic to a threshold from the t-distribution.
Steps:
- Calculate the test statistic using:
  t = r√(n − 2) / √(1 − r²)
  where r is the sample correlation coefficient and n is the sample size.
- Find the degrees of freedom: df = n − 2
- Look up the critical value (t_c) from a t-distribution table using your degrees of freedom and significance level. For a two-tailed test, use α/2 for each tail.
- Compare and decide:
  - If |t| > t_c, reject H₀. The correlation is statistically significant. (For instance, a study might find a significant correlation between age and blood pressure.)
  - If |t| ≤ t_c, fail to reject H₀. There isn't sufficient evidence of a significant correlation. (You'd expect no significant correlation between shoe size and IQ.)
Failing to reject H₀ when it's actually false is a Type II error, meaning you missed a real relationship.
Quick example: Suppose you have r = 0.6 with n = 20. Then df = 18 and t = 0.6√18 / √(1 − 0.36) ≈ 3.18. With α = 0.05 (two-tailed), the critical value for 18 degrees of freedom is about 2.101. Since 3.18 > 2.101, you'd reject H₀ and conclude the correlation is significant.
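A minimal sketch of the critical-value method in Python, using r = 0.6 and n = 20 as assumed inputs (the 2.101 threshold comes from a t-table for 18 degrees of freedom at α = 0.05, two-tailed):

```python
import math

r, n = 0.6, 20          # assumed sample correlation and sample size
df = n - 2              # degrees of freedom
t = r * math.sqrt(df) / math.sqrt(1 - r**2)   # test statistic, about 3.18
t_crit = 2.101          # t critical value for df = 18, alpha = 0.05 two-tailed

significant = abs(t) > t_crit   # True here, so reject H0
```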

Assumptions in Correlation Testing
For the significance test to produce reliable results, four assumptions need to hold:
- Linearity: The relationship between the two variables should be linear. Check this with a scatterplot. If the data curves or follows some other pattern (like an exponential trend), the correlation coefficient won't capture the relationship well, and the test results can be misleading.
- Independence: Each observation should be independent of the others. Data points shouldn't influence one another. For example, test scores from students in separate classrooms (with no interaction) would satisfy this assumption, while repeated measurements on the same person over time might not.
- Normality: Both variables should be approximately normally distributed. You can check this with histograms, Q-Q plots, or formal tests like the Shapiro-Wilk test. With large samples, this assumption becomes less critical due to the Central Limit Theorem, but with small samples it matters a lot.
- Homoscedasticity: The spread of the data around the regression line should be roughly constant across all values of the independent variable. Check this with a residual plot (residuals vs. fitted values). If the residuals fan out in a funnel shape, homoscedasticity is violated, and your significance test may be unreliable.
If any of these assumptions are seriously violated, consider transforming the data or using non-parametric alternatives like Spearman's rank correlation.
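Spearman's rank correlation can be sketched in pure Python. The version below uses the shortcut formula, which is valid only when there are no tied values (tied data would need average ranks):

```python
def ranks(values):
    # Ordinal ranks 1..n (assumes no ties; tied data need average ranks)
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0] * len(values)
    for position, i in enumerate(order, start=1):
        out[i] = position
    return out

def spearman_rho(x, y):
    # Shortcut formula: rho_s = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1)),
    # where d_i is the difference between the ranks of x_i and y_i
    n = len(x)
    d = [rx - ry for rx, ry in zip(ranks(x), ranks(y))]
    return 1 - 6 * sum(di * di for di in d) / (n * (n * n - 1))
```

Because it works on ranks, Spearman's coefficient captures any monotone relationship: `spearman_rho([1, 2, 3, 4, 5], [1, 8, 27, 64, 125])` is exactly 1.0 even though the relationship is cubic rather than linear.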
Additional Considerations
- Effect size: The correlation coefficient doubles as a measure of effect size. Values near |r| = 0.1 are typically considered small, near 0.3 moderate, and near 0.5 or beyond large. A statistically significant correlation can still be weak in practical terms, especially with large samples.
- Statistical power: This is the probability of correctly detecting a significant correlation when one truly exists (i.e., avoiding a Type II error). Power increases with larger sample sizes and stronger true correlations. With a very small sample, you might fail to detect even a moderately strong relationship.
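Power can be estimated by simulation. The sketch below draws correlated normal pairs, runs the t-test on each simulated sample, and counts rejections; the threshold of 2.0 is a rough stand-in for the two-tailed critical value at α = 0.05 with moderate degrees of freedom, not an exact table value:

```python
import math
import random

def simulated_power(rho, n, trials=2000, t_crit=2.0, seed=0):
    # Monte Carlo estimate of power for the two-tailed test of H0: rho = 0.
    # t_crit = 2.0 roughly approximates the alpha = 0.05 cutoff at moderate df.
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        xs, ys = [], []
        for _ in range(n):
            z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
            xs.append(z1)
            ys.append(rho * z1 + math.sqrt(1 - rho * rho) * z2)  # corr = rho
        # Sample correlation of the simulated pairs
        mx, my = sum(xs) / n, sum(ys) / n
        sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
        sxx = sum((a - mx) ** 2 for a in xs)
        syy = sum((b - my) ** 2 for b in ys)
        r = sxy / math.sqrt(sxx * syy)
        t = r * math.sqrt(n - 2) / math.sqrt(1 - r * r)
        if abs(t) > t_crit:
            rejections += 1
    return rejections / trials
```

With a true correlation of 0.5, the estimated power climbs sharply as the sample size grows from 10 to 30 observations, illustrating the point above.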
- Confidence intervals: A confidence interval for ρ gives you a range of plausible values for the true population correlation. This is often more informative than a simple "significant or not" decision, because it shows both the direction and the precision of your estimate.
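One standard way to build such an interval uses the Fisher z-transformation: atanh(r) is approximately normal with standard error 1/√(n − 3). A sketch (1.96 is the normal critical value for a 95% interval):

```python
import math

def pearson_ci(r, n, z_crit=1.96):
    # Fisher z-transformation: atanh(r) ~ Normal(atanh(rho), 1 / sqrt(n - 3)),
    # so build the interval on the z scale and map back with tanh.
    z = math.atanh(r)
    se = 1 / math.sqrt(n - 3)
    lo = math.tanh(z - z_crit * se)
    hi = math.tanh(z + z_crit * se)
    return lo, hi
```

For r = 0.6 with n = 20 this gives roughly (0.21, 0.82): the interval excludes zero, agreeing with the significance test, but it is wide, reflecting the small sample.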