Fiveable

💯Math for Non-Math Majors Unit 8 Review

8.8 Scatter Plots, Correlation, and Regression Lines

Written by the Fiveable Content Team • Last updated August 2025

Scatter plots help us visualize relationships between two variables. By plotting points on a graph, we can see patterns and trends in data. This visual representation is crucial for understanding how different factors might be connected.

Correlation coefficients and regression lines take scatter plots a step further. These tools let us quantify relationships and make predictions based on data. Understanding these concepts helps us interpret real-world information and make informed decisions.

Scatter Plots and Correlation

Creation of scatter plots

  • Visualize relationships between two quantitative variables by plotting data points on a coordinate plane
    • Each point represents a single observation (student's height and weight)
    • Independent variable plotted on x-axis (hours studied)
    • Dependent variable plotted on y-axis (exam score)
  • Construct scatter plots by hand or using technology (Excel, graphing calculator)
  • Choose appropriate scales for x and y axes to accurately represent data
  • Label axes clearly with variable names and units (time in minutes, distance in kilometers)

Interpretation of correlation coefficients

  • Correlation coefficient ($r$) quantifies the strength and direction of the linear relationship between two variables
    • $r$ ranges from $-1$ to $1$
      • $r = 1$: perfect positive linear relationship (every point lies exactly on an upward-sloping line)
      • $r = -1$: perfect negative linear relationship (every point lies exactly on a downward-sloping line)
      • $r = 0$: no linear relationship (shoe size and IQ)
    • Stronger linear relationships indicated by $|r|$ values closer to 1 ($|r| = 0.9$ is strong, $|r| = 0.2$ is weak)
  • Positive $r$: variables tend to increase together (hours of exercise and cardiovascular health, income and education level)
  • Negative $r$: one variable tends to decrease as the other increases (product price and demand, car's value and age)
  • Coefficient of determination ($r^2$) is the proportion of variation in the dependent variable explained by its linear relationship with the independent variable
    • $r^2$ ranges from 0 to 1
    • $r^2 = 0.81$: 81% of variation in test scores explained by study time
  • Correlation does not imply causation; other factors may influence the relationship
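
As a rough illustration of how $r$ and $r^2$ are computed, the sketch below applies the standard Pearson formula to a small data set. The data (hours studied and exam scores for five students) and the function name `pearson_r` are made up for this example and do not come from the guide.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient r for paired data."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    # Note: r is undefined (division by zero) if either variable is constant
    return sxy / sqrt(sxx * syy)

# Hypothetical data: hours studied (x) and exam score (y)
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 77]

r = pearson_r(hours, scores)
r_squared = r ** 2
print(f"r = {r:.3f}, r^2 = {r_squared:.3f}")  # r = 0.981, r^2 = 0.963
```

For this invented data set, $r$ is close to 1, so the linear relationship is strong and positive, and about 96% of the variation in scores is explained by hours studied. That still says nothing about whether studying causes the higher scores.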

Regression Lines

Regression lines for predictions

  • Regression line (least-squares line) best fits data points in scatter plot
    • Minimizes sum of squared vertical distances between points and line
  • Equation of regression line: $\hat{y} = mx + b$
    • $\hat{y}$: predicted value of dependent variable
    • $m$: slope, change in $\hat{y}$ per one-unit increase in $x$
    • $b$: y-intercept, $\hat{y}$ value when $x = 0$
  • Make predictions by substituting an $x$ value into the equation and evaluating $\hat{y}$
    • Predict test score ($\hat{y}$) for 5 hours of studying ($x$) using the line $\hat{y} = 5x + 10$: $\hat{y} = 5(5) + 10 = 35$
  • Interpret slope in context of problem
    • Slope of 1.5 with sales (in thousands of dollars) as the dependent variable and advertising expenditure (in thousands of dollars) as the independent variable: each \$1,000 increase in advertising is associated with a \$1,500 increase in predicted sales
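
The least-squares slope and intercept can be computed directly from the data. The sketch below uses the standard formulas (slope = sum of products of deviations divided by sum of squared $x$ deviations, intercept = mean of $y$ minus slope times mean of $x$). The data set and the function name `least_squares_line` are hypothetical and chosen only for illustration.

```python
def least_squares_line(xs, ys):
    """Return (m, b) for the least-squares line y_hat = m*x + b."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx            # slope
    b = mean_y - m * mean_x  # intercept: the line passes through (mean_x, mean_y)
    return m, b

# Hypothetical data: hours studied (x) and exam score (y)
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 77]

m, b = least_squares_line(hours, scores)
print(f"y_hat = {m:.1f}x + {b:.1f}")  # y_hat = 6.0x + 46.0

# Predict the score for 3.5 hours of studying (inside the observed range)
predicted = m * 3.5 + b
print(predicted)  # 67.0
```

Here the slope of 6 means that each additional hour of studying is associated with a 6-point increase in the predicted score for this made-up data.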

Analyzing Regression Models

  • Interpolation: Making predictions within the range of observed data
  • Extrapolation: Making predictions outside the range of observed data (less reliable)
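  • Residual: Observed value minus predicted value ($y - \hat{y}$); small residuals with no visible pattern suggest the line fits the data well
  • Variance: Measure of spread in the data; the more the points scatter around the regression line, the less reliable its predictions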
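
These ideas can be checked numerically. The sketch below assumes a line $\hat{y} = 6x + 46$ has already been fitted to a small hypothetical data set, computes the residuals, and flags whether a prediction would be an interpolation or an extrapolation. The helper name `is_interpolation` and all numbers are invented for this example.

```python
# Hypothetical data and an assumed fitted line y_hat = 6x + 46
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 77]
m, b = 6.0, 46.0

# Residual = observed value minus predicted value
predictions = [m * x + b for x in hours]
residuals = [y - y_hat for y, y_hat in zip(scores, predictions)]
print(residuals)  # [0.0, 2.0, -3.0, 0.0, 1.0]

def is_interpolation(x, observed_xs):
    """True if x lies within the range of the observed x values."""
    return min(observed_xs) <= x <= max(observed_xs)

print(is_interpolation(3.5, hours))  # True: inside the range 1 to 5
print(is_interpolation(12, hours))   # False: extrapolation, less reliable
```

For a least-squares line the residuals sum to zero, as they do here, so their sum alone says little about fit; the size and pattern of the individual residuals are what matter.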