Fiveable

Intro to Time Series Unit 8 Review


8.4 Combining forecasts


Written by the Fiveable Content Team • Last updated August 2025

Combining Forecasts

Combining forecasts takes predictions from multiple models and merges them into a single, more reliable forecast. The core idea is straightforward: different models capture different patterns in your data, so blending them tends to outperform any single model on its own. This section covers why combination works, how to do it, and how to evaluate the result.

Rationale for Forecast Combination

No single time series model is perfect. An ARIMA model might capture trend and autocorrelation well but miss nonlinear patterns. Exponential smoothing might adapt quickly to level shifts but struggle with seasonal complexity. When you combine forecasts, you're hedging your bets across these strengths and weaknesses.

  • Accuracy gains: Combined forecasts consistently outperform individual models in empirical studies, often by a meaningful margin.
  • Reduced model risk: If one model is misspecified or fits noise rather than signal, the combination dilutes that error with better-performing models.
  • Robustness: The combined forecast tends to be less sensitive to outliers, sudden spikes, or unusual patterns that might throw off a single model.

Simple Averaging of Forecasts

The most basic combination method is to take the arithmetic mean of all your model forecasts:

\text{Combined Forecast} = \frac{1}{n} \sum_{i=1}^{n} \text{Forecast}_i

where n is the number of models (e.g., ARIMA, exponential smoothing, a regression model).

This assigns equal weight to every model, assuming each contributes equally. It sounds almost too simple, but research has shown that simple averaging is surprisingly hard to beat, especially when you don't have a long track record to judge which model is best. It's also trivial to implement and easy to explain.
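As a minimal sketch, the simple average is just the mean of the individual predictions. The forecast values below are hypothetical, standing in for the output of three fitted models:

```python
import numpy as np

# Hypothetical one-step-ahead forecasts from three models
# (e.g., ARIMA, exponential smoothing, a regression model).
forecasts = np.array([102.5, 98.0, 100.3])

# Equal-weight combination: the arithmetic mean.
combined = forecasts.mean()
print(combined)  # 100.26666666666667
```

Every model gets weight 1/n, so no historical performance data is needed, which is exactly why this method works well when track records are short.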


Weighted Averaging Techniques

When you have enough historical performance data, you can give better-performing models more influence in the combination. The general formula is:

\text{Combined Forecast} = \sum_{i=1}^{n} w_i \times \text{Forecast}_i

where the weights w_i sum to 1. The question is how to set those weights.

Inverse MSE Weighting

This approach rewards models that have produced smaller errors in the past:

  1. Calculate \text{MSE}_i for each model i on a holdout or validation set.
  2. Compute each weight as w_i = \frac{1/\text{MSE}_i}{\sum_{j=1}^{n} 1/\text{MSE}_j}
  3. Models with lower MSE get higher weights. For example, if Model A has an MSE of 4 and Model B has an MSE of 16, Model A receives 4 times the weight of Model B.
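The steps above can be sketched with the MSE values from the Model A / Model B example (the forecast values are hypothetical, added only to show the weights being applied):

```python
import numpy as np

# Validation-set MSEs from the example: Model A = 4, Model B = 16.
mse = np.array([4.0, 16.0])

# Inverse-MSE weights: w_i = (1/MSE_i) / sum_j (1/MSE_j)
inv = 1.0 / mse
weights = inv / inv.sum()
print(weights)  # [0.8 0.2] -> Model A gets 4x the weight of Model B

# Apply the weights to hypothetical forecasts from the two models.
forecasts = np.array([110.0, 120.0])
combined = weights @ forecasts
print(combined)  # 112.0
```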

AIC Weighting

This method uses the Akaike Information Criterion, which balances fit and model complexity (parsimony):

  1. Calculate \text{AIC}_i for each model i.
  2. Compute each weight as w_i = \frac{\exp(-0.5 \times \text{AIC}_i)}{\sum_{j=1}^{n} \exp(-0.5 \times \text{AIC}_j)}
  3. Models with lower AIC (better fit relative to their complexity) receive higher weights.
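The same recipe in code, using hypothetical AIC values for three fitted models. Subtracting the minimum AIC before exponentiating is a standard numerical-stability trick that leaves the weights unchanged, since the common factor cancels in the ratio:

```python
import numpy as np

# Hypothetical AIC values for three fitted models.
aic = np.array([210.0, 212.0, 220.0])

# Shift by the minimum AIC for numerical stability;
# exp(-0.5 * AIC_i) / sum_j exp(-0.5 * AIC_j) is unchanged.
delta = aic - aic.min()
w = np.exp(-0.5 * delta)
weights = w / w.sum()
print(weights.round(3))  # [0.727 0.268 0.005]
```

Note how sharply the weights fall off: an AIC difference of 10 leaves the worst model with almost no influence.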

AIC weighting is useful because it penalizes overly complex models, not just raw forecast error. However, both weighting schemes assume that past performance is a reasonable guide to future accuracy.

Accuracy of Combined vs. Individual Forecasts

After building a combined forecast, you need to check whether it actually improves on the individual models. Here's how to evaluate:

  • Calculate standard error metrics for both the combined forecast and each individual model: Mean Absolute Error (MAE), Mean Squared Error (MSE), and Mean Absolute Percentage Error (MAPE).
  • Compare across models. If the combined forecast has a lower MAE or MAPE than every individual model, the combination is adding value. If it doesn't, your weighting scheme may need adjustment, or one dominant model may be sufficient.
  • Test across different periods. Check performance on training, validation, and test sets separately. A combined forecast that only wins on the training set but not the test set may be overfitting the weights.
  • Consider the cost. Combining forecasts means maintaining and running multiple models, which takes more computational time and effort. If the accuracy gain is marginal, a single well-chosen model might be the more practical choice.
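A minimal evaluation sketch, using hypothetical test-set values: the two models err in opposite directions, so the simple average cancels much of the error and beats both individual models on MAE.

```python
import numpy as np

# Hypothetical test-set actuals and forecasts from two models.
actual = np.array([100.0, 105.0, 98.0, 110.0])
f1 = np.array([101.0, 103.0, 99.0, 108.0])  # e.g., ARIMA
f2 = np.array([98.0, 107.0, 96.0, 112.0])   # e.g., exponential smoothing

combined = 0.5 * (f1 + f2)  # simple average

def mae(a, f):
    """Mean absolute error."""
    return np.mean(np.abs(a - f))

for name, f in [("Model 1", f1), ("Model 2", f2), ("Combined", combined)]:
    print(name, mae(actual, f))
# Model 1 -> 1.5, Model 2 -> 2.0, Combined -> 0.25
```

The combination helps most when the individual models' errors are negatively correlated, as in this toy example; if the models make nearly identical errors, averaging gains little.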

The general finding in the forecasting literature is that combinations almost always help, and simple averages are a strong default when you're unsure how to weight.