Fiveable

📈Intro to Probability for Business Unit 11 Review


11.2 Simple Linear Regression Model


Written by the Fiveable Content Team • Last updated September 2025

Linear regression is a powerful statistical tool for understanding relationships between variables. It helps us predict one variable based on another, using a simple equation that captures their connection. This method is crucial for business decisions, from sales forecasting to understanding customer behavior.

The key components of linear regression include the slope, y-intercept, and error term. By interpreting these elements and assessing the model's fit through R-squared values, we can gauge how well our predictions match reality and make informed business choices.

Components and Interpretation of Simple Linear Regression

Components of linear regression

  • Simple linear regression model: $y = \beta_0 + \beta_1x + \epsilon$
    • $y$: dependent variable (response variable), the quantity being predicted or explained
    • $x$: independent variable (explanatory variable), used to predict or explain changes in $y$
    • $\beta_0$: y-intercept, the expected value of $y$ when $x$ equals zero
    • $\beta_1$: slope, the expected change in $y$ for a one-unit increase in $x$
    • $\epsilon$: random error term, which accounts for variability in $y$ not explained by the linear relationship with $x$
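To make the role of each component concrete, here is a minimal Python sketch that generates one observation from the model. The function name `simulate_y` and the parameter values (intercept 100, slope 50, noise standard deviation 10) are hypothetical, chosen only for illustration.

```python
import random

def simulate_y(x, beta0=100.0, beta1=50.0, noise_sd=10.0, rng=None):
    """Draw one observation y = beta0 + beta1*x + epsilon (illustrative values)."""
    if rng is None:
        rng = random.Random(0)  # fixed seed so the sketch is repeatable
    epsilon = rng.gauss(0.0, noise_sd)  # random error term
    return beta0 + beta1 * x + epsilon

# With the noise switched off, y falls exactly on the line beta0 + beta1*x.
print(simulate_y(2, noise_sd=0.0))  # 200.0
```

With a positive `noise_sd`, repeated draws at the same $x$ scatter around the line, which is the variability the error term represents.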

Interpretation of slope vs y-intercept

  • Slope ($\beta_1$): change in the dependent variable ($y$) for a one-unit increase in the independent variable ($x$)
    • Interpretation depends on the context and the units of both variables
      • Example: if sales ($y$) are measured in dollars and advertising expenditure ($x$) in thousands of dollars, a slope of 50 means each additional \$1,000 of advertising is associated with a \$50 increase in predicted sales
  • Y-intercept ($\beta_0$): value of the dependent variable ($y$) when the independent variable ($x$) equals zero
    • Interpretation depends on the context and on whether $x = 0$ is meaningful
      • Example: if $x$ is the number of employees, $\beta_0$ may have no practical interpretation, because a company cannot have zero employees
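The slope interpretation can be checked numerically. The sketch below assumes sales are measured in dollars and advertising in thousands of dollars. The intercept of 100 and the function name `predicted_sales` are hypothetical additions for illustration.

```python
def predicted_sales(advertising_thousands, intercept=100.0, slope=50.0):
    """Predicted sales in dollars; advertising is in thousands of dollars (assumed units)."""
    return intercept + slope * advertising_thousands

# Raising x by one unit changes the prediction by exactly the slope.
change = predicted_sales(3) - predicted_sales(2)
print(change)  # 50.0

# The intercept is the prediction at x = 0, which may or may not be meaningful in context.
print(predicted_sales(0))  # 100.0
```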

Equation and Prediction in Simple Linear Regression

Equation of regression models

  • Least squares method estimates the slope and y-intercept from the data points by minimizing the sum of squared errors; the estimates are written $\hat{\beta}_1$ and $\hat{\beta}_0$ to distinguish them from the unknown parameters $\beta_1$ and $\beta_0$
    • Calculate slope: $\hat{\beta}_1 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2}$
      • $x_i$ and $y_i$: individual data points
      • $\bar{x}$ and $\bar{y}$: means of $x$ and $y$
      • $n$: number of data points
    • Calculate y-intercept: $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1\bar{x}$
  • Substitute the estimated slope and y-intercept into the model equation, dropping the error term: $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1x$
    • $\hat{y}$: predicted value of the dependent variable
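The least squares formulas above translate directly into code. The following is a minimal sketch in plain Python. The function name `fit_simple_linear_regression` and the five data points are made up for illustration, and the points were chosen to lie exactly on a line so the result is easy to verify by hand.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (intercept, slope) estimated by least squares."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Numerator and denominator of the slope formula
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Hypothetical data lying exactly on y = 100 + 50x
xs = [1, 2, 3, 4, 5]
ys = [150, 200, 250, 300, 350]
intercept, slope = fit_simple_linear_regression(xs, ys)
print(intercept, slope)  # 100.0 50.0
```

The function divides by the sum of squared deviations of $x$, so it fails if every $x$ value is identical; a production version would need to guard against that case.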

Predictions from regression equations

  • Use the estimated regression equation $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1x$, where $\hat{\beta}_0$ and $\hat{\beta}_1$ are the least squares estimates, to predict the dependent variable for a given value of the independent variable ($x$)
    1. Substitute the given value of $x$ into the equation
    2. Calculate the predicted value $\hat{y}$
      • Example: with estimated regression equation $\hat{y} = 100 + 50x$ and $x = 2$, the predicted value is $\hat{y} = 100 + 50(2) = 200$

Goodness of Fit in Simple Linear Regression

Goodness of fit assessment

  • Assess goodness of fit using the coefficient of determination (R-squared)
    • R-squared: proportion of the variance in the dependent variable ($y$) that is predictable from the independent variable ($x$)
    • Formula: $R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}$, where $SST = SSR + SSE$
      • $SSR = \sum_{i=1}^{n}(\hat{y}_i - \bar{y})^2$: sum of squares regression (explained variation)
      • $SSE = \sum_{i=1}^{n}(y_i - \hat{y}_i)^2$: sum of squares error (unexplained variation)
      • $SST = \sum_{i=1}^{n}(y_i - \bar{y})^2$: total sum of squares (total variation)
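As a rough illustration of the R-squared formula, the sketch below fits a least squares line to a small made-up data set and computes the three sums of squares. The function name `sums_of_squares` and the data are hypothetical.

```python
def sums_of_squares(xs, ys):
    """Fit a least squares line and return (SSR, SSE, SST)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    intercept = y_bar - slope * x_bar
    y_hat = [intercept + slope * x for x in xs]

    ssr = sum((yh - y_bar) ** 2 for yh in y_hat)            # explained variation
    sse = sum((y - yh) ** 2 for y, yh in zip(ys, y_hat))    # unexplained variation
    sst = sum((y - y_bar) ** 2 for y in ys)                 # total variation
    return ssr, sse, sst

# Hypothetical data that does not lie exactly on a line
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
ssr, sse, sst = sums_of_squares(xs, ys)

# Both forms of the formula agree, up to floating-point rounding.
print(ssr / sst, 1 - sse / sst)  # both approximately 0.6
```

For this data the fitted line is $\hat{y} = 2.2 + 0.6x$, with $SSR = 3.6$, $SSE = 2.4$, and $SST = 6$, so about 60% of the variation in $y$ is explained.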

Meaning of R-squared values

  • R-squared ranges from 0 to 1; values closer to 1 indicate a better fit, values closer to 0 a poorer fit
    • R-squared of 0: none of the variance in $y$ is explained by $x$
    • R-squared of 1: all of the variance in $y$ is explained by $x$
    • R-squared of 0.75: 75% of the variance in the dependent variable is explained by the independent variable, and 25% is unexplained
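The two extreme values can be reproduced with small made-up data sets. In this sketch the function name `r_squared` and both data sets are hypothetical: one lies exactly on a line, and the other was constructed so that $y$ has no linear relationship with $x$.

```python
def r_squared(xs, ys):
    """R-squared of the least squares line, computed as 1 - SSE/SST."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    intercept = y_bar - slope * x_bar
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    sst = sum((y - y_bar) ** 2 for y in ys)
    return 1 - sse / sst

xs = [1, 2, 3, 4, 5]

# Points exactly on a line: all variation in y is explained by x.
print(r_squared(xs, [150, 200, 250, 300, 350]))  # 1.0

# Fitted slope is zero: none of the variation in y is explained by x.
print(r_squared(xs, [3, 1, 2, 1, 3]))  # 0.0
```

Real business data almost always falls between these extremes.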