Fiveable

🔮Forecasting Unit 5 Review

5.4 Autoregressive Integrated Moving Average (ARIMA) Models

Written by the Fiveable Content Team • Last updated August 2025

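ARIMA models combine autoregressive (AR), integrated (I, meaning differencing), and moving average (MA) components to forecast time series data. The AR and MA terms capture short-run dependence on recent values and recent forecast errors, while differencing handles trends and other non-stationary behavior, which makes the models versatile across forecasting tasks.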

The model structure is denoted as ARIMA(p,d,q), where p, d, and q represent the orders of autoregressive, differencing, and moving average terms. This framework allows for flexible modeling of complex time series patterns.

ARIMA Model Structure

General Notation and Assumptions

  • ARIMA models are denoted as ARIMA(p,d,q), where:
    • p represents the order of the autoregressive term
    • d represents the degree of differencing
    • q represents the order of the moving average term
  • ARIMA models assume that future values of a time series depend on:
    • Past values of the series (autoregressive component)
    • Past forecast errors (moving average component)
  • Differencing is applied first, and the AR and MA terms are then fit to the differenced series, so the model can handle both trending behavior and short-run dependence in the data

Autoregressive and Moving Average Components

  • The autoregressive component (AR) models the relationship between:
    • An observation
    • A certain number of lagged observations
  • The moving average component (MA) models the relationship between:
    • An observation
    • A certain number of lagged forecast errors (random shocks); despite the name, this is not a rolling average of past observations
  • The orders p and q determine the number of lag terms included in the AR and MA components, respectively
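
Written out, the two components combine into a single equation. The form below is one common convention (sign conventions for the MA terms vary between textbooks and software), where y'_t denotes the series after differencing d times, c is a constant, and ε_t is a white-noise error:

```latex
y'_t = c + \phi_1 y'_{t-1} + \cdots + \phi_p y'_{t-p}
         + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \cdots + \theta_q \varepsilon_{t-q}
```

The φ coefficients weight the p lagged observations and the θ coefficients weight the q lagged errors.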

ARIMA Model Components

Autoregressive Component

  • The autoregressive (AR) component captures the linear dependence between:
    • An observation
    • A certain number of lagged observations
  • The order p determines the number of lag terms included in the AR component
    • Example: In an ARIMA(1,0,0) model, the current observation depends on the immediately preceding observation
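
As a minimal sketch of the ARIMA(1,0,0) case, the function below iterates the AR(1) recursion forward from the last observed value. The function name `ar1_forecast` and the parameter values are illustrative assumptions, not estimates from any real data set:

```python
def ar1_forecast(last_value, phi, c=0.0, steps=3):
    """Iterate y_hat[t+1] = c + phi * y_hat[t], starting from the last observation."""
    forecasts = []
    current = last_value
    for _ in range(steps):
        current = c + phi * current
        forecasts.append(current)
    return forecasts


# Assumed coefficient phi = 0.5 and no constant: forecasts decay toward zero.
print(ar1_forecast(8.0, phi=0.5))  # [4.0, 2.0, 1.0]
```

With |phi| < 1 the forecasts converge to the long-run mean c / (1 - phi), which is zero here.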

Moving Average Component

  • The moving average (MA) component captures the linear dependence between:
    • An observation
    • A certain number of lagged forecast errors
  • The order q determines the number of lag terms included in the MA component
    • Example: In an ARIMA(0,0,1) model, the current observation depends on the immediately preceding forecast error
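
The ARIMA(0,0,1) case can be sketched the same way. The function below builds an MA(1) series from a given list of shocks; the name `ma1_series`, the shock values, and the assumption that the shock before the sample is zero are all illustrative:

```python
def ma1_series(shocks, theta, mu=0.0):
    """Build y_t = mu + e_t + theta * e_{t-1} from a list of shocks e_t."""
    series = []
    previous_shock = 0.0  # assumption: no shock before the sample starts
    for shock in shocks:
        series.append(mu + shock + theta * previous_shock)
        previous_shock = shock
    return series


print(ma1_series([1.0, 2.0, 3.0], theta=0.5))  # [1.0, 2.5, 4.0]
```

Each value depends only on the current shock and the one immediately before it, so a shock's influence disappears after one period.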

Differencing Component

  • The differencing component (I) is used to remove non-stationarity in the data by computing differences between consecutive observations
  • The order d determines the number of times the differencing operation is applied
    • Example: First-order differencing (d=1) computes the difference between each observation and its preceding observation
  • Ordinary (lag-1) differencing helps to eliminate trends; removing seasonality requires seasonal differencing at the seasonal lag. The resulting stationary series is suitable for ARIMA modeling
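
A short sketch of the differencing operation, assuming the series is a plain Python list. The helper name `difference` is illustrative:

```python
def difference(series, d=1):
    """Apply lag-1 differencing d times; each pass shortens the series by one."""
    result = list(series)
    for _ in range(d):
        result = [later - earlier for earlier, later in zip(result, result[1:])]
    return result


squares = [1, 4, 9, 16, 25]
print(difference(squares, d=1))  # [3, 5, 7, 9]
print(difference(squares, d=2))  # [2, 2, 2]
```

Here one pass leaves a linear trend in the differences, and a second pass reduces the series to a constant, which is why a quadratic trend calls for d=2.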

Seasonal ARIMA Models

  • ARIMA models can incorporate seasonal components, denoted as SARIMA(p,d,q)(P,D,Q)m, where:
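    • P, D, and Q represent the seasonal autoregressive order, the degree of seasonal differencing, and the seasonal moving average order, respectively
    • m represents the number of observations per seasonal cycle (for example, m = 12 for monthly data with a yearly pattern)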
  • Seasonal ARIMA models capture both non-seasonal and seasonal patterns in the data
    • Example: A SARIMA(1,1,1)(1,1,1)12 model for monthly data with a yearly seasonality
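
The seasonal differencing step (D=1) subtracts the value from one full cycle earlier. A minimal sketch, assuming made-up quarterly data with m = 4 and an illustrative helper name `seasonal_difference`:

```python
def seasonal_difference(series, m):
    """Subtract the observation from m periods earlier: y_t - y_{t-m}."""
    return [series[t] - series[t - m] for t in range(m, len(series))]


# Two years of hypothetical quarterly data with a repeating within-year pattern.
quarterly = [10, 20, 30, 40, 12, 22, 32, 42]
print(seasonal_difference(quarterly, m=4))  # [2, 2, 2, 2]
```

The repeating within-year pattern cancels, leaving only the year-over-year change.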

ARIMA Models for Forecasting

Model Development Process

  • The development of an ARIMA model involves an iterative process:
    • Model identification: Determine the appropriate orders (p,d,q) based on data characteristics, such as ACF and PACF plots
    • Parameter estimation: Fit the identified model to the data using maximum likelihood estimation or other optimization techniques
    • Diagnostic checking: Assess the adequacy of the fitted model by examining residuals for independence, normality, and homoscedasticity
    • Forecasting: Use the fitted model to generate future predictions and prediction intervals
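
For the identification step, the sample autocorrelation function (ACF) can be computed directly. The sketch below uses plain Python to show the calculation; the name `sample_acf` is illustrative, and in practice a library routine (such as the `acf` function in statsmodels) would normally be used instead:

```python
def sample_acf(series, max_lag):
    """Return sample autocorrelations r_0..r_max_lag using the usual biased estimator."""
    n = len(series)
    mean = sum(series) / n
    deviations = [x - mean for x in series]
    denominator = sum(dev * dev for dev in deviations)
    acf = []
    for lag in range(max_lag + 1):
        numerator = sum(deviations[t] * deviations[t - lag] for t in range(lag, n))
        acf.append(numerator / denominator)
    return acf


print([round(r, 2) for r in sample_acf([1, 2, 3, 4, 5], max_lag=2)])  # [1.0, 0.4, -0.1]
```

As a rough guide, an ACF that decays very slowly suggests differencing is needed, while an ACF that cuts off sharply after lag q is consistent with an MA(q) model.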

Interpreting ARIMA Models

  • Interpreting ARIMA models requires understanding:
    • The significance and magnitude of the estimated coefficients
    • The impact of differencing and seasonal components on the forecasted values
  • The coefficients of the AR and MA terms indicate the strength and direction of the relationship between the current observation and the lagged observations or forecast errors
    • Example: A positive AR coefficient means that a higher lagged observation is associated with a higher expected current observation, holding the other terms fixed

Forecasting with ARIMA Models

  • Forecasting with ARIMA models involves using the fitted model to generate future predictions
  • Prediction intervals are used to quantify the uncertainty associated with the forecasts
    • Example: A 95% prediction interval indicates the range within which the actual future value is expected to fall with a 95% probability
  • The accuracy of ARIMA forecasts depends on the quality of the model fit and the stability of the underlying data generating process
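
To illustrate how prediction intervals widen with the horizon, the sketch below computes an approximate 95% interval for an AR(1) model. It assumes the parameters are known exactly and the errors are Gaussian, so it ignores estimation uncertainty; the function name and the input values are illustrative:

```python
import math


def ar1_prediction_interval(last_value, phi, sigma, h, c=0.0, z=1.96):
    """Return (lower, point, upper) for the h-step-ahead AR(1) forecast."""
    point = last_value
    for _ in range(h):
        point = c + phi * point
    # h-step forecast error variance: sigma^2 * (1 + phi^2 + ... + phi^(2(h-1)))
    variance = sigma ** 2 * sum(phi ** (2 * j) for j in range(h))
    half_width = z * math.sqrt(variance)
    return point - half_width, point, point + half_width


for horizon in (1, 2, 3):
    print(horizon, ar1_prediction_interval(8.0, phi=0.5, sigma=1.0, h=horizon))
```

The interval grows with each step because the errors from every intermediate period accumulate in the forecast.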

Differencing Order for ARIMA Models

Purpose of Differencing

  • Differencing is a technique used to remove non-stationarity in a time series by computing differences between consecutive observations
  • The goal of differencing is to obtain a stationary series suitable for ARIMA modeling
    • Example: If a time series exhibits a linear trend, first-order differencing can remove the trend and make the series stationary
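
The linear-trend example can be checked algebraically. Assuming the series is a straight line plus a stationary error ε_t (the symbols a and b for the intercept and slope are introduced here for illustration):

```latex
y_t = a + b\,t + \varepsilon_t
\quad\Longrightarrow\quad
y_t - y_{t-1} = b + \varepsilon_t - \varepsilon_{t-1}
```

The differenced series no longer depends on t: its mean is the constant slope b, so the trend has been removed.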

Determining the Appropriate Order of Differencing

  • The appropriate order of differencing (d) can be determined by examining:
    • The plot of the original time series data
    • The ACF plot for signs of non-stationarity (trends or seasonal patterns)
  • Statistical tests, such as the Augmented Dickey-Fuller (ADF) test and the Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test, can assess the stationarity of a time series
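    • The ADF test takes the presence of a unit root (non-stationarity) as its null hypothesis, so a small p-value is evidence of stationarity
    • The KPSS test takes stationarity as its null hypothesis, so a small p-value is evidence of non-stationarity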
  • The order of differencing is typically limited to 0, 1, or 2 to avoid over-differencing and the loss of important information in the data
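
Because the two tests have opposite null hypotheses, their p-values are read in opposite directions. The helper below is a hypothetical sketch of one simple way to combine them; it assumes the p-values have already been obtained elsewhere (for example from the `adfuller` and `kpss` functions in statsmodels), and the 0.05 threshold is a conventional choice rather than a rule:

```python
def read_stationarity_tests(adf_pvalue, kpss_pvalue, alpha=0.05):
    """Combine ADF (null: unit root) and KPSS (null: stationary) p-values."""
    adf_rejects_unit_root = adf_pvalue < alpha
    kpss_rejects_stationarity = kpss_pvalue < alpha
    if adf_rejects_unit_root and not kpss_rejects_stationarity:
        return "stationary"
    if not adf_rejects_unit_root and kpss_rejects_stationarity:
        return "non-stationary"
    return "inconclusive"  # the tests disagree or neither is decisive


print(read_stationarity_tests(adf_pvalue=0.01, kpss_pvalue=0.10))  # stationary
print(read_stationarity_tests(adf_pvalue=0.60, kpss_pvalue=0.01))  # non-stationary
```

When the result is "non-stationary", a common next step is to difference once and run both tests again on the differenced series.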

Limitations of Higher-Order Differencing

  • Higher orders of differencing (d > 2) are rarely needed in practice and usually indicate over-differencing, with a loss of important information in the data
  • Over-differencing can introduce unnecessary complexity and instability in the ARIMA model
    • Example: If a time series is already stationary, differencing it further introduces artificial negative autocorrelation and inflates the variance
  • It is essential to balance the need for achieving stationarity with the preservation of meaningful information in the data when determining the appropriate order of differencing
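
The cost of over-differencing can be seen in the simplest case. Assume the series is already white noise ε_t with variance σ², and difference it anyway to get w_t:

```latex
w_t = \varepsilon_t - \varepsilon_{t-1}, \qquad
\operatorname{Var}(w_t) = 2\sigma^2, \qquad
\operatorname{Cov}(w_t, w_{t-1}) = -\sigma^2, \qquad
\rho_w(1) = -\tfrac{1}{2}
```

The variance doubles and a lag-1 autocorrelation of -0.5 appears that was not in the original data. A sample lag-1 autocorrelation near -0.5 after differencing is therefore a common warning sign of over-differencing.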