Fiveable

🎲Data, Inference, and Decisions Unit 7 Review

7.4 Coefficient of determination and model evaluation

Written by the Fiveable Content Team • Last updated August 2025

Linear regression models help us understand relationships between variables. The coefficient of determination (R²) tells us how well our model fits the data. It measures the proportion of variance explained by the model.

While R² is useful, it has limitations. We'll explore adjusted R² and residual analysis to get a more complete picture of model fit. These tools help us evaluate and improve our regression models.

Coefficient of Determination for Model Fit

Understanding R-squared

  • Coefficient of determination (R²) measures proportion of variance in dependent variable predictable from independent variable(s) in regression model
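  • R² ranges from 0 to 1 for least-squares model with intercept, evaluated on data used to fit it
    • 0 indicates model explains no variability in data
    • 1 indicates perfect prediction
    • Can fall below 0 on new data or without intercept, meaning model predicts worse than mean of response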
  • Calculate R² using formula $R^2 = 1 - \frac{\text{Sum of Squared Residuals}}{\text{Total Sum of Squares}}$
  • Interpret R² as percentage of variation in response variable explained by model
  • Higher R² value suggests better model fit to data
    • Does not imply causation or model appropriateness
  • Use R² to compare explanatory power of different regression models with same response variable on same dataset
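
The formula in the list above can be checked by hand. The following is a minimal sketch using only the Python standard library; the function name `r_squared` and the five data points are hypothetical and chosen only for illustration.

```python
def r_squared(y_true, y_pred):
    """R² = 1 - (sum of squared residuals) / (total sum of squares)."""
    mean_y = sum(y_true) / len(y_true)
    # Sum of squared residuals: variation the model fails to explain
    ssr = sum((y - yhat) ** 2 for y, yhat in zip(y_true, y_pred))
    # Total sum of squares: variation of the response around its mean
    sst = sum((y - mean_y) ** 2 for y in y_true)
    return 1 - ssr / sst

# Hypothetical observed values and predictions from a fitted line
y_obs = [2.0, 4.0, 5.0, 4.0, 5.0]
y_hat = [2.8, 3.4, 4.0, 4.6, 5.2]

r2 = r_squared(y_obs, y_hat)
print(round(r2, 3))  # 0.6
```

Here the residuals sum to 2.4 in squared terms and the total sum of squares is 6, so the model accounts for 60% of the variation in the response.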

Applications and Considerations

  • R² never decreases as more predictors added to multiple regression model, even irrelevant ones
    • Increase may not reflect actual improvement in predictive power
  • Utilize R² for various purposes
    • Assessing model performance (weather prediction models)
    • Comparing different statistical models (economic forecasting)
    • Evaluating goodness of fit in scientific research (drug efficacy studies)
  • Consider R² alongside other model evaluation metrics
    • Mean squared error (MSE)
    • Root mean squared error (RMSE)
    • Akaike Information Criterion (AIC)
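
As a rough illustration of the companion metrics listed above, the sketch below computes MSE, RMSE, and a simplified AIC with the standard library. The AIC form used here, n·ln(SSR/n) + 2·(number of parameters), is an assumption: it is a common simplification for least-squares models with Gaussian errors that drops additive constants, so only differences between models fitted to the same data are meaningful. All function names and data are hypothetical.

```python
import math

def sum_squared_residuals(y_true, y_pred):
    return sum((y - yhat) ** 2 for y, yhat in zip(y_true, y_pred))

def mse(y_true, y_pred):
    """Mean squared error: average squared residual."""
    return sum_squared_residuals(y_true, y_pred) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error: MSE in the units of the response."""
    return math.sqrt(mse(y_true, y_pred))

def gaussian_aic(y_true, y_pred, n_params):
    """Assumed simplified AIC: n * ln(SSR / n) + 2 * n_params (constants dropped)."""
    n = len(y_true)
    return n * math.log(sum_squared_residuals(y_true, y_pred) / n) + 2 * n_params

# Hypothetical observed values and predictions from a line (slope + intercept)
y_obs = [2.0, 4.0, 5.0, 4.0, 5.0]
y_hat = [2.8, 3.4, 4.0, 4.6, 5.2]

print("MSE :", round(mse(y_obs, y_hat), 3))
print("RMSE:", round(rmse(y_obs, y_hat), 3))
print("AIC :", round(gaussian_aic(y_obs, y_hat, n_params=2), 3))
```

Lower values are better for all three. Unlike R², the AIC term `2 * n_params` charges a cost for every extra parameter.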

Limitations of R-squared

Potential Misinterpretations

  • Adding more variables to model artificially inflates R²
    • Does not guarantee improved predictive power
  • R² fails to indicate
    • Whether coefficient estimates are biased
    • Whether correct regression model was used
    • Cause-and-effect relationships between variables
  • R² is sensitive to outliers and influential points
    • Single extreme observation can significantly raise or lower its value
  • R² limitations in various scenarios
    • Can look high because of overfitting when sample size is small
    • Misleading in non-linear relationships (linear fit can score low even when variables are strongly related)
    • Inadequate for comparing models with different dependent variables
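
The point about non-linear relationships can be made concrete. In this sketch (standard library only, with hypothetical helper names `fit_line` and `r_squared`), y is determined exactly by x through y = x², yet a straight-line fit reports an R² of zero because the best line is flat.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for simple linear regression."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def r_squared(y_true, y_pred):
    mean_y = sum(y_true) / len(y_true)
    ssr = sum((y - yhat) ** 2 for y, yhat in zip(y_true, y_pred))
    sst = sum((y - mean_y) ** 2 for y in y_true)
    return 1 - ssr / sst

# Hypothetical data: a perfect quadratic relationship, symmetric around x = 0
x = [-3, -2, -1, 0, 1, 2, 3]
y = [xi ** 2 for xi in x]

slope, intercept = fit_line(x, y)
y_hat = [intercept + slope * xi for xi in x]
r2_linear = r_squared(y, y_hat)

print("slope:", slope)       # 0.0, the fitted line is horizontal
print("R^2  :", r2_linear)   # 0.0, despite y being fully determined by x
```

A low R² here says the linear model is the wrong shape, not that x and y are unrelated.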

Adjusted R-squared

  • Adjusted R² modifies R² to account for number of predictors in model
  • Calculate adjusted R² using formula $\text{Adjusted } R^2 = 1 - \frac{(1 - R^2)(n - 1)}{n - k - 1}$
    • n represents sample size
    • k represents number of predictors
  • Adjusted R² penalizes addition of unnecessary predictors
    • Provides more accurate measure of model fit in multiple regression
  • Adjusted R² can decrease when irrelevant predictors added to model
    • Useful for comparing models with different numbers of predictors
  • Examples of adjusted R² applications
    • Selecting optimal number of variables in stepwise regression
    • Evaluating feature importance in machine learning models
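
A small sketch of the adjusted R² formula, with n as sample size and k as number of predictors as defined above. The function name and the R² values plugged in are hypothetical; they are chosen to show a larger model with a slightly higher R² ending up with a lower adjusted R².

```python
def adjusted_r_squared(r2, n, k):
    """Adjusted R² = 1 - (1 - R²)(n - 1) / (n - k - 1)."""
    if n - k - 1 <= 0:
        raise ValueError("need n > k + 1 for adjusted R² to be defined")
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Hypothetical comparison on the same 30 observations:
# adding three predictors raises R² only from 0.80 to 0.81
small_model = adjusted_r_squared(0.80, n=30, k=3)
large_model = adjusted_r_squared(0.81, n=30, k=6)

print(round(small_model, 4))  # 0.7769
print(round(large_model, 4))  # 0.7604
```

Raw R² favors the six-predictor model, while adjusted R² favors the three-predictor one because the extra predictors do not explain enough to offset the penalty.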

Residual Analysis for Model Fit

Residual Plot Interpretation

  • Residual analysis examines differences between observed and predicted values (residuals) to assess model adequacy
  • Residual plot displays residuals on y-axis against fitted values or predictor variables on x-axis
  • Well-fitting model characteristics
    • Residuals randomly scattered around zero
    • No discernible pattern in residual plot
  • Assess homoscedasticity by checking for constant variance in residuals across predictor variable levels
  • Evaluate normality of residuals using
    • Q-Q plot (should approximate straight line)
    • Histogram of residuals (should resemble normal distribution)
  • Identify outliers and influential points through residual analysis
    • Use measures like standardized residuals, Cook's distance, or leverage statistics
  • Detect non-linearity in variable relationships
    • Indicates linear model may be inappropriate
  • Examples of residual patterns and interpretations
    • Funnel shape: heteroscedasticity
    • U-shape: non-linear relationship
    • Clustering: presence of subgroups in data
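
To illustrate the outlier check described above, here is a minimal sketch that fits a line, computes residuals, and flags large standardized residuals. It makes simplifying assumptions: residuals are scaled by the residual standard error sqrt(SSR / (n - 2)) without any leverage adjustment, and the cutoff of 2 is a common rule of thumb rather than a fixed standard. The data and names are hypothetical.

```python
import math

def fit_line(x, y):
    """Least-squares slope and intercept for simple linear regression."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical data: roughly y = 2x + 1, with one unusual value at x = 8
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [3.1, 4.9, 7.2, 9.0, 10.8, 13.1, 15.0, 25.0, 19.1, 20.9]

slope, intercept = fit_line(x, y)
fitted = [intercept + slope * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

# Residual standard error for a model with two estimated coefficients
resid_se = math.sqrt(sum(e ** 2 for e in residuals) / (len(x) - 2))
standardized = [e / resid_se for e in residuals]

# Flag observations whose standardized residual exceeds the assumed cutoff
flagged = [i for i, z in enumerate(standardized) if abs(z) > 2]
print("flagged indices:", flagged)  # [7], the point at x = 8
```

Plotting `residuals` against `fitted` with any plotting library gives the residual plot described in the list; the flagged point would sit far above the zero line.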

Advanced Residual Analysis Techniques

  • Durbin-Watson test detects autocorrelation in residuals
    • Autocorrelated residuals violate independence assumption in linear regression
  • Partial residual plots assess individual predictor effects while controlling for other variables
  • Standardized residual plots help identify outliers and influential observations
  • LOESS smoothing on residual plots reveals subtle patterns and trends
  • Residual analysis applications in various fields
    • Financial modeling (stock price prediction)
    • Environmental science (pollution level forecasting)
    • Medical research (drug effectiveness studies)
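
The Durbin-Watson statistic is simple enough to compute directly. This sketch uses the usual definition, the sum of squared successive differences of the residuals divided by the sum of squared residuals. The statistic lies between 0 and 4: values near 2 suggest no first-order autocorrelation, values toward 0 suggest positive autocorrelation, and values toward 4 suggest negative autocorrelation. The residual sequences are hypothetical, and a formal test would also need critical values, which are not shown here.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences over sum of squared residuals."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

# Hypothetical residuals, listed in time order
runs_of_same_sign = [1, 1, 1, -1, -1, -1]   # neighbors tend to agree
alternating_sign = [1, -1, 1, -1, 1, -1]    # neighbors tend to disagree

dw_positive = durbin_watson(runs_of_same_sign)
dw_negative = durbin_watson(alternating_sign)

print(round(dw_positive, 3))  # 0.667, well below 2
print(round(dw_negative, 3))  # 3.333, well above 2
```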