โณIntro to Time Series

Key Forecasting Methods


Why This Matters

Forecasting is the heart of time series analysis. It's why we study patterns, stationarity, and autocorrelation in the first place. You're being tested not just on how each method works, but on when to apply it. The key decision points involve understanding your data's characteristics: Does it have a trend? Seasonality? Is it stationary? Each forecasting method makes different assumptions, and matching the right tool to the right data structure is what separates a mediocre forecast from an accurate one.

These methods build on each other conceptually. Simple approaches like moving averages lay the groundwork for understanding how we weight past observations, while ARIMA models combine multiple techniques into a flexible framework. When you encounter exam questions, don't just memorize formulas. Know what data characteristics each method handles best and what assumptions it requires.


Smoothing-Based Methods

These methods reduce noise in your data by averaging or weighting past observations. The core principle is that random fluctuations cancel out when you combine multiple observations, revealing the underlying signal.

Moving Average (MA)

  • Averages a fixed window of past observations. The window size k determines how much smoothing occurs. Larger windows produce smoother output but react more slowly to real changes in the data.
  • Simple MA uses equal weights; weighted MA assigns greater importance to recent observations, improving responsiveness.
  • Best for stationary data without strong trends. The method lags behind trend changes because it treats all windowed observations equally.
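The lag-versus-smoothness trade-off is easy to see in code. A minimal plain-Python sketch (no libraries assumed) of both the simple and a linearly weighted moving average; the 1..k weight scheme here is just one illustrative choice:

```python
def moving_average(series, k):
    """Simple moving average: equal weights over the last k observations."""
    return [sum(series[i - k + 1:i + 1]) / k for i in range(k - 1, len(series))]

def weighted_moving_average(series, k):
    """Weighted moving average: linearly increasing weights 1..k,
    so the most recent observation in each window counts most."""
    weights = list(range(1, k + 1))
    total = sum(weights)
    out = []
    for i in range(k - 1, len(series)):
        window = series[i - k + 1:i + 1]
        out.append(sum(w * x for w, x in zip(weights, window)) / total)
    return out

data = [3, 4, 5, 4, 6, 7, 6, 8]
print(moving_average(data, 3))           # smoother, lags behind changes
print(weighted_moving_average(data, 3))  # reacts faster to recent values
```

Note that both outputs are shorter than the input: the first k − 1 periods have no full window, which is one reason moving averages lag at the start of a series.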

Exponential Smoothing

  • Applies exponentially decreasing weights to all past observations using smoothing parameter α. Recent data matters most, but older data still contributes.
  • Simple exponential smoothing works for data with no trend or seasonality; double (Holt's) adds a trend component; triple (Holt-Winters) adds seasonality.
  • More adaptive than moving average because it responds quickly to changes while maintaining memory of the entire series history.
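The recursive update is a single line. A minimal sketch of simple exponential smoothing, initializing the level at the first observation (a common but not universal choice):

```python
def simple_exp_smoothing(series, alpha):
    """Simple exponential smoothing: level_t = alpha*Y_t + (1 - alpha)*level_{t-1}.
    Returns the smoothed level at each step; the final value serves as the
    one-step-ahead forecast for all future periods."""
    level = series[0]        # initialize at the first observation
    smoothed = [level]
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
        smoothed.append(level)
    return smoothed

print(simple_exp_smoothing([10, 12, 11, 15, 14], 0.5))
# [10, 11.0, 11.0, 13.0, 13.5]
```

A higher α discounts the past faster, which is exactly why exponential smoothing with large α reacts quickly to sudden level shifts.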

Compare: Moving Average vs. Exponential Smoothing: both smooth out noise, but MA uses a fixed window while exponential smoothing uses all past data with decaying weights. If asked which responds faster to sudden changes, exponential smoothing wins when α is high.


Autoregressive Frameworks

These methods model the relationship between current values and past values directly. The underlying assumption is that time series exhibit persistence: what happened recently influences what happens next.

Autoregressive (AR) Models

  • Predicts future values as a linear combination of p past values. The equation is Y_t = c + φ_1 Y_{t-1} + φ_2 Y_{t-2} + ... + φ_p Y_{t-p} + ε_t
  • Model order p is identified using the PACF (partial autocorrelation function). Significant spikes at specific lags indicate how many past values to include.
  • Requires stationarity. If your data has trends or changing variance, you must transform it first (through differencing, log transforms, etc.).
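Fitting an AR(1) reduces to ordinary least squares on (Y_{t-1}, Y_t) pairs. A hand-rolled sketch for illustration; in practice a library routine (e.g. statsmodels' AutoReg) handles higher orders and diagnostics:

```python
def fit_ar1(series):
    """Fit Y_t = c + phi * Y_{t-1} + e_t by ordinary least squares
    on lagged pairs. Returns (c, phi)."""
    x = series[:-1]   # lagged values Y_{t-1}
    y = series[1:]    # current values Y_t
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    phi = sxy / sxx
    c = my - phi * mx
    return c, phi

# A persistent series: each value stays close to the previous one.
series = [1.0, 1.2, 1.1, 1.3, 1.25, 1.4, 1.35, 1.5]
c, phi = fit_ar1(series)
next_forecast = c + phi * series[-1]   # one-step-ahead forecast
```

The estimated φ between 0 and 1 reflects the persistence assumption: a stationary AR(1) requires |φ| < 1, so a fitted φ at or above 1 is itself a warning that the series needs differencing first.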

Moving Average (MA) Error Models

Don't confuse this with the smoothing-based moving average from earlier. In the ARIMA context, the MA component models the current value as depending on past forecast errors, not past observations. The equation is Y_t = c + ε_t + θ_1 ε_{t-1} + ... + θ_q ε_{t-q}. The model order q is identified using the ACF (autocorrelation function), where significant spikes indicate how many past error terms to include.
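To see why this model is about errors rather than observations, it helps to generate one. A sketch of an MA(1) process with an illustrative θ = 0.5 and a fixed shock sequence standing in for white noise:

```python
def simulate_ma1(errors, c=0.0, theta=0.5):
    """Generate an MA(1) series: Y_t = c + e_t + theta * e_{t-1}.
    `errors` would normally be white noise; a fixed list keeps this reproducible."""
    series = []
    prev_e = 0.0                  # assume e_0 = 0 before the sample starts
    for e in errors:
        series.append(c + e + theta * prev_e)
        prev_e = e
    return series

shocks = [1.0, -0.5, 0.25, 0.0, 0.75]
print(simulate_ma1(shocks))
# [1.0, 0.0, 0.0, 0.125, 0.75]
```

Each output value mixes the current shock with half of the previous one, so a shock's influence dies out after exactly one period: that finite memory is the signature an ACF plot picks up when identifying q.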

Autoregressive Integrated Moving Average (ARIMA)

  • Combines AR and MA components with differencing. The (p, d, q) parameters specify autoregressive order, differencing order, and moving average order respectively.
  • Differencing (d) transforms non-stationary data into stationary data by computing changes between consecutive observations. First-order differencing (d = 1) means you compute Y_t - Y_{t-1}.
  • The workhorse model for non-seasonal trending data. It's flexible enough to capture many real-world patterns when properly specified.
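The d step is simple enough to write by hand. A sketch showing that first-order differencing turns a linear trend into a constant (stationary) series:

```python
def difference(series, d=1):
    """Apply d rounds of first-order differencing: Y_t - Y_{t-1}.
    Each round shortens the series by one observation."""
    for _ in range(d):
        series = [series[i] - series[i - 1] for i in range(1, len(series))]
    return series

trending = [2, 4, 6, 8, 10]       # linear trend: clearly non-stationary
print(difference(trending))       # [2, 2, 2, 2] -> trend removed
print(difference(trending, d=2))  # [0, 0, 0]   -> over-differenced here
```

The second call illustrates why d is usually kept small: differencing more than needed throws away information and inflates forecast variance.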

How to choose p, d, and q:

  1. Plot the series and check for trends or non-constant mean. If present, difference the data (start with d = 1) and re-check.
  2. Once the differenced series looks stationary, examine the ACF and PACF plots.
  3. Significant PACF spikes suggest the AR order p. Significant ACF spikes suggest the MA order q.
  4. Fit the model and check that residuals resemble white noise (no remaining autocorrelation).
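Step 2's ACF can be computed directly. A minimal sketch of the sample autocorrelation (the PACF is more involved, so in practice you would lean on library plots such as statsmodels' plot_acf and plot_pacf):

```python
def acf(series, max_lag):
    """Sample autocorrelation at lags 1..max_lag:
    covariance with the lagged series divided by the overall variance."""
    n = len(series)
    mean = sum(series) / n
    var = sum((y - mean) ** 2 for y in series)
    out = []
    for k in range(1, max_lag + 1):
        cov = sum((series[t] - mean) * (series[t - k] - mean) for t in range(k, n))
        out.append(cov / var)
    return out

# An alternating series is strongly negatively correlated with itself at lag 1
# and positively correlated at lag 2.
data = [1, -1, 1, -1, 1, -1, 1, -1]
print(acf(data, 2))  # [-0.875, 0.75]
```

Reading such values off a correlogram, and checking which lags exceed the significance bands, is exactly the judgment step 3 asks for.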

Seasonal ARIMA (SARIMA)

  • Extends ARIMA with seasonal parameters (P, D, Q)_s to capture patterns that repeat at fixed intervals (monthly, quarterly, yearly).
  • Seasonal differencing removes seasonal patterns just as regular differencing removes trends. For monthly data with yearly seasonality, seasonal differencing computes Y_t - Y_{t-12}.
  • Full notation is ARIMA(p, d, q)(P, D, Q)_s. Exam questions often test whether you can identify which parameters address which data features: lowercase (p, d, q) handle short-term dynamics, uppercase (P, D, Q) handle the seasonal structure, and s is the number of periods per season.
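Seasonal differencing is the same one-liner as regular differencing, just at lag s. A sketch using s = 4 (quarterly) to keep the example short:

```python
def seasonal_difference(series, s):
    """Seasonal differencing: Y_t - Y_{t-s}, removing a pattern
    that repeats every s periods."""
    return [series[i] - series[i - s] for i in range(s, len(series))]

# Two "years" of quarterly data with the same seasonal shape, shifted up by 5.
year1 = [10, 12, 15, 11]
year2 = [15, 17, 20, 16]
print(seasonal_difference(year1 + year2, 4))  # [5, 5, 5, 5]
```

The constant output shows the seasonal pattern cancelled exactly, leaving only the year-over-year change; that is the stationarity check that tells you D = 1 was enough.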

Compare: ARIMA vs. SARIMA: both handle trends through differencing, but only SARIMA explicitly models repeating seasonal patterns. On an FRQ asking you to forecast monthly retail sales with clear December spikes, SARIMA is your answer.


Trend and Seasonal Decomposition

These approaches break complex time series into interpretable components. The principle is that observed data equals the combination of systematic patterns (trend, seasonality) plus random noise.

Trend Analysis

  • Identifies long-term directional movement in the data. The trend can be linear, exponential, or polynomial depending on the pattern.
  • Fitted using regression with time as the independent variable: Y_t = β_0 + β_1 t + ε_t for linear trends.
  • Critical first step in understanding your data. It determines whether differencing or detrending is needed before applying other methods.
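The linear-trend fit has a closed form. A plain-Python OLS sketch with t = 0, 1, 2, ... as the predictor:

```python
def fit_linear_trend(series):
    """OLS fit of Y_t = b0 + b1 * t with t = 0, 1, 2, ...
    Returns (b0, b1): intercept and slope of the trend line."""
    n = len(series)
    t = list(range(n))
    mt, my = sum(t) / n, sum(series) / n
    b1 = (sum((ti - mt) * (yi - my) for ti, yi in zip(t, series))
          / sum((ti - mt) ** 2 for ti in t))
    b0 = my - b1 * mt
    return b0, b1

b0, b1 = fit_linear_trend([3, 5, 7, 9, 11])  # a perfect line: slope 2, intercept 3
forecast_next = b0 + b1 * 5                  # extrapolate one step past the sample
```

Subtracting the fitted line from the series (detrending) is one of the two standard ways to reach stationarity; differencing, covered above, is the other.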

Decomposition Methods

  • Separates the series into trend (T_t), seasonal (S_t), and residual (R_t) components, making each pattern visible and analyzable on its own.
  • Additive decomposition assumes Y_t = T_t + S_t + R_t; multiplicative assumes Y_t = T_t × S_t × R_t.
  • Choose additive when seasonal swings stay roughly constant in size over time. Choose multiplicative when seasonal variation grows (or shrinks) with the level of the series.
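A pared-down sketch of classical additive decomposition, assuming an odd seasonal period so a single centered moving average suffices for the trend (real work would use a library routine such as statsmodels' seasonal_decompose):

```python
def additive_decompose(series, s):
    """Classical additive decomposition sketch (odd s assumed):
    trend via a centered moving average of window s, seasonal effects as
    per-period averages of the detrended values, residual as the remainder.
    Returns (trend list with None at the edges, seasonal dict, residual list)."""
    n = len(series)
    half = s // 2
    trend = [None] * n
    for i in range(half, n - half):
        trend[i] = sum(series[i - half:i + half + 1]) / s
    valid = [i for i in range(n) if trend[i] is not None]
    buckets = {}
    for i in valid:
        buckets.setdefault(i % s, []).append(series[i] - trend[i])
    seasonal = {off: sum(v) / len(v) for off, v in buckets.items()}
    residual = [series[i] - trend[i] - seasonal[i % s] for i in valid]
    return trend, seasonal, residual

# Flat level 10 plus a repeating pattern (-1, 0, +1) of period 3, no noise.
trend, seasonal, residual = additive_decompose([9, 10, 11] * 3, 3)
```

On this noise-free toy series the residuals come out as zeros, confirming that trend plus seasonal reconstructs the data exactly; on real data the residual is where the noise (and any model misfit) lands.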

Compare: Additive vs. Multiplicative Decomposition: both extract the same components, but the relationship differs. If December sales are always $10,000 above average regardless of overall sales level, use additive. If December is always 20% above average, use multiplicative.


Trend-Seasonal Forecasting Methods

These methods explicitly model both trend and seasonality for direct forecasting. They extend smoothing concepts to handle the complexity of real-world business and economic data.

Holt-Winters Method

  • Triple exponential smoothing with three parameters: α for level, β for trend, and γ for seasonality.
  • Additive version for constant seasonal swings; multiplicative version for proportional seasonal effects. The choice here mirrors the additive vs. multiplicative decomposition logic.
  • Practical advantage: it updates all three components as new data arrives, making it ideal for rolling forecasts in business applications.
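Full Holt-Winters needs three coupled update equations; the sketch below shows the level and trend half (Holt's double smoothing), with the γ seasonal update omitted for brevity:

```python
def holt_linear(series, alpha, beta):
    """Holt's double exponential smoothing:
      level_t = alpha*Y_t + (1-alpha)*(level + trend)
      trend_t = beta*(level_t - level) + (1-beta)*trend
    Returns (level, trend); the h-step forecast is level + h * trend."""
    level, trend = series[0], series[1] - series[0]  # common initialization
    for y in series[2:]:
        prev_level = level
        level = alpha * y + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return level, trend

level, trend = holt_linear([2, 4, 6, 8, 10], alpha=0.5, beta=0.5)
forecast_h2 = level + 2 * trend   # two-steps-ahead forecast
```

Holt-Winters adds a third recursion that updates a seasonal index for the current period, and the forecast adds (additive) or multiplies by (multiplicative) the matching seasonal index.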

Prophet (Facebook's Forecasting Tool)

  • Additive regression model with components for trend, seasonality, and holiday effects: y(t) = g(t) + s(t) + h(t) + ε_t
  • Handles missing data and outliers gracefully, making it robust to the messiness of real-world datasets.
  • Automatic changepoint detection identifies where trends shift, reducing the need for manual intervention in model specification.

Compare: Holt-Winters vs. Prophet: both handle trend and seasonality, but Holt-Winters requires cleaner data and manual parameter selection, while Prophet automates much of the process and handles irregularities. For exam purposes, know Holt-Winters mechanics; for applied projects, Prophet often wins.


Regression-Based Approaches

These methods frame forecasting as a relationship between variables. The core idea is that time series values depend on predictable factors that can be modeled explicitly.

Regression Analysis

  • Models Y as a function of predictor variables. For time series, time itself or lagged values can serve as predictors.
  • Linear form: Y_t = β_0 + β_1 X_{1t} + β_2 X_{2t} + ε_t. Predictors can include a time trend, seasonal dummy variables, or external variables like temperature or ad spending.
  • Watch for autocorrelated residuals. Standard regression assumes independent errors, which time series data often violates. If the Durbin-Watson test or residual ACF shows autocorrelation, you'll need to correct for it (e.g., by adding lagged terms or switching to an ARIMA-based approach).
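The Durbin-Watson statistic itself is a short computation on the residuals. A sketch with two toy residual patterns to show how the statistic moves:

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences
    over sum of squared residuals. Values near 2 suggest no first-order
    autocorrelation; well below 2 suggests positive autocorrelation,
    well above 2 suggests negative autocorrelation."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

print(durbin_watson([1, -1, 1, -1, 1, -1]))  # alternating signs: above 2
print(durbin_watson([1, 1, 1, -1, -1, -1]))  # long runs of one sign: below 2
```

The second pattern, long runs of same-signed residuals, is exactly what a regression with an omitted trend or seasonal term produces, and it is the cue to add lagged terms or switch to an ARIMA-based approach.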

Compare: Regression vs. ARIMA: regression explicitly models relationships with external variables, while ARIMA models the internal dynamics of the series itself. Use regression when you have meaningful outside predictors; use ARIMA when the series' own history is your best information.


Quick Reference Table

Concept                              | Best Examples
Smoothing without trend/seasonality  | Moving Average, Simple Exponential Smoothing
Smoothing with trend                 | Double Exponential Smoothing (Holt's Method)
Smoothing with trend + seasonality   | Holt-Winters (Triple Exponential Smoothing)
Autoregressive modeling              | AR Models, ARIMA
Seasonal pattern modeling            | SARIMA, Holt-Winters, Prophet
Decomposition                        | Additive Decomposition, Multiplicative Decomposition
External variable forecasting        | Regression Analysis, Prophet (with regressors)
Handling messy real-world data       | Prophet

Self-Check Questions

  1. Which two methods both use weighted combinations of past observations but differ in how they assign weights? When would you prefer one over the other?

  2. You're given a time series with an upward trend and seasonal spikes every 12 months. Which methods from this guide could handle both features, and what parameters would you need to specify?

  3. Compare ARIMA and regression for time series forecasting. What assumption does regression make that ARIMA doesn't, and how might this cause problems with time series data?

  4. If decomposition reveals that seasonal fluctuations grow proportionally larger as the series level increases, which decomposition type should you use? Which Holt-Winters variant matches this pattern?

  5. An FRQ presents quarterly GDP data with a clear trend and asks you to specify an appropriate SARIMA model. What would the seasonal period s be, and how would you determine whether seasonal differencing (D) is needed?
