When you build a forecasting model, the real question isn't just "does it work?"—it's "how wrong is it, and in what ways?" That's where accuracy metrics come in. You're being tested on your ability to select the right metric for different forecasting scenarios, interpret what the numbers actually mean, and explain why one model outperforms another. These metrics connect directly to core concepts like model selection, overfitting, scale sensitivity, and error distribution.
The key insight is that no single metric tells the whole story. Some metrics punish large errors harshly, others express accuracy as percentages for easy interpretation, and still others compare your model against a baseline. Don't just memorize formulas—know what each metric reveals about your forecast's strengths and weaknesses, and when you'd choose one over another.
Scale-dependent metrics such as MAE, MSE, and RMSE measure forecast error in the original units of your data, making them intuitive to interpret. They capture the typical size of your prediction mistakes without worrying about direction, only magnitude.
Compare: MAE vs. RMSE—both measure error in original units, but RMSE punishes large errors more severely. If your RMSE is much higher than your MAE, you have some big misses hiding in your data. Use MAE when all errors matter equally; use RMSE when large errors are especially problematic.
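A minimal sketch of that contrast with NumPy, assuming made-up actual and forecast values chosen so one large miss stands out; the numbers are illustrative, not data from this guide.

```python
import numpy as np

actual = np.array([100.0, 120.0, 130.0, 110.0, 150.0])
forecast = np.array([98.0, 125.0, 128.0, 140.0, 149.0])

errors = actual - forecast

mae = np.mean(np.abs(errors))         # average error magnitude, in original units
rmse = np.sqrt(np.mean(errors ** 2))  # squares errors first, so big misses dominate

print(f"MAE  = {mae:.2f}")   # the 30-unit miss counts the same as the small ones
print(f"RMSE = {rmse:.2f}")  # noticeably larger than MAE because of that one big miss
```

With these values MAE is 8.0 while RMSE is about 13.7, exactly the RMSE-much-higher-than-MAE gap that signals a few large errors hiding in the data.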
Percentage-based metrics such as MAPE and SMAPE express error as a percentage of the actual values, making them useful for comparing accuracy across datasets with different scales. The tradeoff: they can behave strangely when actual values are near zero.
Compare: MAPE vs. SMAPE—both give percentage-based accuracy, but SMAPE handles near-zero values more gracefully. Choose MAPE for intuitive reporting when values are safely above zero; switch to SMAPE when your data includes small or intermittent values.
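A minimal sketch of that comparison, assuming the common textbook definitions (MAPE divides by |actual|; SMAPE divides by the average of |actual| and |forecast|, a formulation that varies slightly between sources). The single near-zero actual value is deliberately included to show the distortion.

```python
import numpy as np

actual = np.array([200.0, 150.0, 2.0, 180.0])
forecast = np.array([190.0, 160.0, 5.0, 175.0])

abs_err = np.abs(actual - forecast)

mape = np.mean(abs_err / np.abs(actual)) * 100
smape = np.mean(abs_err / ((np.abs(actual) + np.abs(forecast)) / 2)) * 100

print(f"MAPE  = {mape:.1f}%")   # the near-zero actual (2.0) inflates this badly
print(f"SMAPE = {smape:.1f}%")  # bounded per point, so the small value distorts it less
```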
Relative metrics such as MASE and Theil's U compare your model's performance against a benchmark (usually a naive forecast), answering the question: "Is my model actually adding value, or could I have just used yesterday's value?"
Compare: MASE vs. Theil's U—both benchmark against naive forecasts, but MASE uses absolute errors while Theil's U uses squared errors. MASE is more robust to outliers; Theil's U provides richer diagnostic decomposition. If an exam question asks about relative forecast improvement, either works—but specify which baseline you're using.
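A minimal sketch of both relative metrics, assuming a one-step naive benchmark (each value predicted by the previous actual) and computing everything on the same sample. Definitions vary between texts; the Theil's U shown here is the U2-style ratio of root mean squared errors, and values below 1 for either metric mean the model beats the naive forecast.

```python
import numpy as np

actual = np.array([112.0, 118.0, 132.0, 129.0, 121.0, 135.0])
forecast = np.array([110.0, 120.0, 128.0, 131.0, 124.0, 133.0])

# Naive benchmark: each value predicted by the previous actual
naive_errors = np.diff(actual)          # y_t - y_{t-1}
model_errors = (actual - forecast)[1:]  # aligned with the naive errors

# MASE: ratio of mean absolute errors (robust to outliers)
mase = np.mean(np.abs(model_errors)) / np.mean(np.abs(naive_errors))

# Theil's U (U2-style): ratio of root mean squared errors
theils_u = np.sqrt(np.mean(model_errors ** 2) / np.mean(naive_errors ** 2))

print(f"MASE      = {mase:.2f}")   # < 1: model beats the naive forecast on absolute error
print(f"Theil's U = {theils_u:.2f}")  # < 1: model beats the naive forecast on squared error
```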
Goodness-of-fit and model-selection metrics such as R², Adjusted R², and AIC evaluate how well your model explains variation in the data and help you choose between competing models. They're essential for avoiding overfitting, where a model memorizes your training data but fails on new observations.
Compare: R² vs. Adjusted R². Because R² never decreases when you add predictors, it rewards complexity blindly, while Adjusted R² asks whether added predictors actually improve the model. Always report Adjusted R² when comparing models with different numbers of predictors; plain R² is fine for single-model interpretation.
Compare: Adjusted R² vs. AIC—both penalize complexity, but Adjusted R² is easier to interpret (it's still a proportion of variance explained) while AIC is more theoretically grounded for maximum likelihood models. Use Adjusted R² for quick comparisons; use AIC when doing formal model selection in time series or regression contexts.
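A minimal sketch of all three on a synthetic regression, assuming ordinary least squares and the Gaussian-likelihood AIC form n·ln(SSR/n) + 2k with constants dropped. The second model adds a pure-noise predictor, so R² creeps up while Adjusted R² and AIC should flag that the extra complexity buys nothing.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
x1 = rng.normal(size=n)
x_noise = rng.normal(size=n)                  # an irrelevant predictor
y = 3.0 * x1 + rng.normal(scale=1.0, size=n)  # true relationship uses only x1

def fit_metrics(X, y):
    """OLS fit via least squares; return R², Adjusted R², and AIC."""
    X = np.column_stack([np.ones(len(y)), X])   # add intercept column
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    ssr = np.sum(resid ** 2)
    sst = np.sum((y - y.mean()) ** 2)
    k = X.shape[1]                              # parameters, including intercept
    r2 = 1 - ssr / sst
    adj_r2 = 1 - (1 - r2) * (len(y) - 1) / (len(y) - k)
    aic = len(y) * np.log(ssr / len(y)) + 2 * k
    return r2, adj_r2, aic

simple = fit_metrics(x1.reshape(-1, 1), y)
bloated = fit_metrics(np.column_stack([x1, x_noise]), y)

print("simple  (x1):       R²=%.3f  adjR²=%.3f  AIC=%.1f" % simple)
print("bloated (x1+noise): R²=%.3f  adjR²=%.3f  AIC=%.1f" % bloated)
```

Lower AIC is better, and a drop in Adjusted R² despite a higher R² is the classic sign that an added predictor is not pulling its weight.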
| Concept | Best Examples |
|---|---|
| Error in original units | MAE, RMSE |
| Penalizes large errors | MSE, RMSE |
| Percentage-based accuracy | MAPE, SMAPE |
| Comparison to naive baseline | MASE, Theil's U |
| Variance explained | R², Adjusted R² |
| Model selection with complexity penalty | AIC, Adjusted R² |
| Robust to near-zero values | SMAPE, MASE |
| Diagnoses error sources | Theil's U |
Your RMSE is significantly higher than your MAE for the same forecast. What does this tell you about your error distribution, and which metric would you report to a risk-averse stakeholder?
You're comparing forecast accuracy for two product lines—one with average sales of 10,000 units and another with average sales of 50 units. Which metric would give you a fair comparison, and what pitfall should you watch for?
A colleague's model has R² = 0.92, but when you calculate MASE, it's 1.15. How do you explain this apparent contradiction, and which metric should guide your model selection?
Compare and contrast AIC and Adjusted R² as tools for preventing overfitting. When would you prefer one over the other?
You need to evaluate whether a new forecasting model is worth implementing over your current simple moving average. Which two metrics would best support your recommendation, and what threshold values would indicate success?