Linear regression is a powerful statistical tool for understanding relationships between variables. It helps us predict one variable based on another, using a simple equation that captures their connection. This method is crucial for business decisions, from sales forecasting to understanding customer behavior.
The key components of linear regression include the slope, y-intercept, and error term. By interpreting these elements and assessing the model's fit through R-squared values, we can gauge how well our predictions match reality and make informed business choices.
Components and Interpretation of Simple Linear Regression
Components of linear regression
- Simple linear regression model expressed as y = β₀ + β₁x + ε
- y: dependent variable (response variable) being predicted or explained
- x: independent variable (explanatory variable) used to predict or explain changes in y
- β₀: y-intercept, value of y when x equals zero
- β₁: slope, change in y for a one-unit increase in x
- ε: random error term, accounts for variability in y not explained by linear relationship with x
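The components above can be illustrated by simulating data directly from the model y = β₀ + β₁x + ε. A minimal sketch, with parameter values chosen purely for illustration (they are not from the notes):

```python
import random

# Illustrative parameters (assumed for this sketch)
beta0 = 2.0   # y-intercept: value of y when x = 0
beta1 = 0.5   # slope: change in y per one-unit increase in x

random.seed(42)
xs = [float(i) for i in range(10)]                      # independent variable x
eps = [random.gauss(0, 0.1) for _ in xs]                # random error term ε
ys = [beta0 + beta1 * x + e for x, e in zip(xs, eps)]   # dependent variable y

print(ys[:3])  # values scatter closely around the line y = 2.0 + 0.5x
```

Because the error term has small variance here, each simulated y stays close to the deterministic part β₀ + β₁x.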
Interpretation of slope vs y-intercept
- Slope (β₁): change in dependent variable (y) for a one-unit increase in independent variable (x)
- Interpretation depends on context and units of variables
- Sales (y) and advertising expenditure (x): slope of 50 means a one-unit increase in advertising expenditure is associated with a 50-unit increase in sales
- Y-intercept (β₀): value of dependent variable (y) when independent variable (x) equals zero
- Interpretation depends on context and whether x = 0 is meaningful
- Number of employees (x): y-intercept might not have practical interpretation, as company cannot have zero employees
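The slope interpretation can be checked numerically. This sketch assumes a hypothetical fitted equation: the intercept of 1000 is made up, while the slope of 50 matches the sales-and-advertising example above:

```python
b0, b1 = 1000.0, 50.0   # b0 is hypothetical; b1 = 50 as in the sales example

def predicted_sales(advertising):
    """Predicted sales from the estimated equation ŷ = b0 + b1·x."""
    return b0 + b1 * advertising

# A one-unit increase in advertising raises predicted sales by exactly the slope
increase = predicted_sales(11) - predicted_sales(10)
print(increase)  # 50.0
```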

Equation and Prediction in Simple Linear Regression
Equation of regression models
- Least squares method estimates slope (b₁) and y-intercept (b₀) from data points
- Calculate slope: b₁ = Σ(xᵢ − x̄)(yᵢ − ȳ) / Σ(xᵢ − x̄)²
- xᵢ and yᵢ: individual data points
- x̄ and ȳ: means of x and y
- n: number of data points
- Calculate y-intercept: b₀ = ȳ − b₁x̄
- Substitute estimated slope and y-intercept into simple linear regression model equation: ŷ = b₀ + b₁x
- ŷ: predicted value of dependent variable
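The least squares formulas above translate directly into code. A minimal sketch on made-up data points that lie exactly on the line y = 3 + 2x:

```python
def fit_simple_linear_regression(xs, ys):
    """Estimate intercept b0 and slope b1 by least squares."""
    n = len(xs)
    x_bar = sum(xs) / n   # x̄, mean of x
    y_bar = sum(ys) / n   # ȳ, mean of y
    # b1 = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)²
    b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
          / sum((x - x_bar) ** 2 for x in xs))
    # b0 = ȳ − b1·x̄
    b0 = y_bar - b1 * x_bar
    return b0, b1

# Made-up points lying exactly on y = 3 + 2x
xs = [1, 2, 3, 4, 5]
ys = [5, 7, 9, 11, 13]
b0, b1 = fit_simple_linear_regression(xs, ys)
print(b0, b1)  # 3.0 2.0
```

Since the points are perfectly linear, the estimates recover the intercept 3 and slope 2 exactly.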

Predictions from regression equations
- Use estimated simple linear regression model equation to predict value of dependent variable (y) for given value of independent variable (x)
- Substitute given value of x into equation ŷ = b₀ + b₁x
- Calculate predicted value ŷ
- Example: given an estimated regression equation and a specific value of x, substitute x into the equation to obtain the predicted value of y
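The substitution step above is a one-line computation. This sketch assumes hypothetical estimated coefficients b₀ = 3 and b₁ = 2 for illustration:

```python
# Hypothetical estimated coefficients (assumed for this sketch)
b0, b1 = 3.0, 2.0

def predict(x):
    """ŷ = b0 + b1·x: substitute x into the estimated equation."""
    return b0 + b1 * x

print(predict(4))  # 11.0
```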
Goodness of Fit in Simple Linear Regression
Goodness of fit assessment
- Assess goodness of fit using coefficient of determination (R-squared)
- R-squared: proportion of variance in dependent variable (y) predictable from independent variable (x)
- Formula: R² = SSR / SST = 1 − SSE / SST
- SSR: sum of squares regression (explained variation)
- SSE: sum of squares error (unexplained variation)
- SST: total sum of squares (total variation)
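The R² computation can be sketched from its definition. The data and fitted coefficients below are made up for illustration; the predictions assume a fit of ŷ = 3 + 2x:

```python
def r_squared(ys, y_hat):
    """Coefficient of determination: R² = 1 − SSE / SST."""
    y_bar = sum(ys) / len(ys)
    sst = sum((y - y_bar) ** 2 for y in ys)               # total variation
    sse = sum((y - yh) ** 2 for y, yh in zip(ys, y_hat))  # unexplained variation
    return 1 - sse / sst

# Made-up observations scattered around y = 3 + 2x
xs = [1, 2, 3, 4, 5]
ys = [5.1, 6.8, 9.2, 10.9, 13.0]
y_hat = [3.0 + 2.0 * x for x in xs]   # predictions from the assumed fit

print(r_squared(ys, y_hat))  # close to 1 for this nearly-linear data
```

Perfect predictions give SSE = 0 and therefore R² = 1; predicting the mean ȳ for every point gives SSE = SST and R² = 0.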
Meaning of R-squared values
- R-squared ranges from 0 to 1; higher values indicate better fit, lower values indicate poorer fit
- R-squared of 0: none of variance in y explained by x
- R-squared of 1: all of variance in y explained by x
- R-squared of 0.75 means 75% of variance in dependent variable explained by independent variable, 25% unexplained