Time series analysis is the backbone of business forecasting—and you'll be tested on your ability to select the right method for different data patterns. Whether you're predicting quarterly sales, modeling stock prices, or forecasting demand, these techniques transform historical data into actionable insights. The key isn't just knowing what each method does, but understanding when to apply it based on data characteristics like stationarity, seasonality, and trend.
Exam questions will push you to connect method selection to data properties. You're being tested on concepts like autocorrelation structure, differencing requirements, model assumptions, and forecast accuracy evaluation. Don't just memorize acronyms—know what problem each method solves and how to diagnose which approach fits your data. Master the diagnostic tools (ACF, PACF, unit root tests), understand the model-building logic, and you'll crush both multiple choice and FRQ scenarios.
Foundational Diagnostic Tools
Before selecting any forecasting model, you need to understand your data's structure. These tools help you identify patterns and verify assumptions—they're the first step in any time series workflow.
Stationarity and Unit Root Tests
Stationarity—the assumption that statistical properties (mean, variance) remain constant over time—is required for most classical time series models
Augmented Dickey-Fuller (ADF) test is the standard unit root test; the null hypothesis is a unit root (non-stationarity), so a low p-value rejects it and suggests the series is stationary
Non-stationary data requires transformation through differencing or detrending before modeling with AR/MA methods (see the sketch after this list)
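Here is a minimal sketch of that workflow using statsmodels' `adfuller`; the `sales` series is synthetic placeholder data, so swap in your own.

```python
import pandas as pd
from statsmodels.tsa.stattools import adfuller

# Placeholder monthly series -- replace with your own data
sales = pd.Series([112, 118, 132, 129, 121, 135, 148, 148, 136, 119, 104, 118,
                   115, 126, 141, 135, 125, 149, 170, 170, 158, 133, 114, 140])

adf_stat, p_value, *rest = adfuller(sales)
print(f"ADF statistic: {adf_stat:.3f}, p-value: {p_value:.3f}")

# Low p-value (< 0.05): reject the unit-root null, treat as stationary.
# Otherwise difference once and re-test before fitting AR/MA terms.
if p_value >= 0.05:
    p_diff = adfuller(sales.diff().dropna())[1]
    print(f"p-value after first differencing: {p_diff:.3f}")
```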
Autocorrelation and Partial Autocorrelation Functions
ACF (Autocorrelation Function) measures correlation between observations at different lags—use it to identify MA order (q)
PACF (Partial Autocorrelation Function) isolates direct relationships by removing intermediate lag effects—use it to identify AR order (p)
Pattern recognition is key: ACF cuts off sharply for MA processes, PACF cuts off sharply for AR processes, and both tail off gradually for mixed ARMA processes (see the plot sketch below)
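A quick way to eyeball these patterns is statsmodels' plotting helpers; the AR(2) series below is simulated purely for illustration, so the expected pattern is known in advance.

```python
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

# Simulate an AR(2) process so we know what the plots should show
rng = np.random.default_rng(0)
e = rng.normal(size=500)
series = np.zeros(500)
for t in range(2, 500):
    series[t] = 0.6 * series[t - 1] - 0.3 * series[t - 2] + e[t]

fig, axes = plt.subplots(2, 1, figsize=(8, 6))
plot_acf(series, lags=24, ax=axes[0])   # AR process: ACF tails off gradually
plot_pacf(series, lags=24, ax=axes[1])  # AR(2): PACF cuts off after lag 2
plt.tight_layout()
plt.show()
```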
Trend Analysis and Decomposition
Decomposition separates a time series into three components: trend, seasonal, and residual (irregular)
Additive vs. multiplicative decomposition—choose additive when seasonal variation is constant; multiplicative when it scales with the level
Residual analysis after decomposition reveals whether patterns remain unexplained, guiding further modeling decisions (illustrated below)
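As one illustration, `seasonal_decompose` from statsmodels splits a series into those three components; the monthly data here is synthetic, with a constant seasonal swing to justify the additive form.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

# Synthetic monthly series: linear trend + fixed seasonal swing + noise
rng = np.random.default_rng(1)
idx = pd.date_range("2018-01", periods=60, freq="MS")
values = (np.linspace(100, 160, 60)
          + 10 * np.sin(2 * np.pi * np.arange(60) / 12)
          + rng.normal(0, 3, 60))
series = pd.Series(values, index=idx)

# Constant seasonal swing -> additive; use model="multiplicative" if the
# swing grows with the level of the series
result = seasonal_decompose(series, model="additive", period=12)
result.plot()

# Residuals should look like noise; leftover structure means more modeling
print(result.resid.dropna().describe())
```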
Compare: ACF vs. PACF—both measure correlation with past values, but PACF removes indirect effects. On an FRQ asking you to identify model order, remember: ACF → MA order, PACF → AR order.
Classical Univariate Models
These are the workhorses of time series forecasting—models that predict a single variable using its own historical values. The key distinction is what information each model uses: past values, past errors, or both.
Autoregressive (AR) Models
AR models predict future values as a linear combination of past observations—the model "regresses" on itself
Order (p) specifies how many lagged values are included; identified by examining where PACF cuts off
Stationarity requirement—AR models assume the series is stationary; apply differencing first if needed (see the code sketch below)
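A minimal AR(2) fit with statsmodels' `AutoReg`, on simulated data where the true order is known; with real data you would pick the lag count from the PACF.

```python
import numpy as np
from statsmodels.tsa.ar_model import AutoReg

# Simulated stationary AR(2) series stands in for your (differenced) data
rng = np.random.default_rng(2)
e = rng.normal(size=300)
series = np.zeros(300)
for t in range(2, 300):
    series[t] = 0.6 * series[t - 1] - 0.3 * series[t - 2] + e[t]

# lags=2 mirrors a PACF that cuts off after lag 2
res = AutoReg(series, lags=2).fit()
print(res.params)                       # intercept plus two lag coefficients
print(res.predict(start=300, end=304))  # five out-of-sample forecasts
```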
Moving Average (MA) Models
MA models use past forecast errors (not past values) to predict future observations—they smooth out noise
Order (q) indicates how many lagged error terms are included; identified by examining where ACF cuts off
Invertibility condition ensures the model can be expressed as an infinite AR process—check this assumption (example below)
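One common route in statsmodels is to fit an MA(q) as ARIMA(0, 0, q); the simulated MA(2) data below is illustrative only.

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

# Simulate an MA(2): y_t = e_t + 0.5*e_{t-1} + 0.3*e_{t-2}
rng = np.random.default_rng(3)
e = rng.normal(size=302)
series = e[2:] + 0.5 * e[1:-1] + 0.3 * e[:-2]

# An ACF cutting off after lag 2 would point to MA(2) = ARIMA(0, 0, 2)
res = ARIMA(series, order=(0, 0, 2)).fit()
print(res.params)  # estimated MA coefficients should land near 0.5 and 0.3
```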
Autoregressive Integrated Moving Average (ARIMA) Models
ARIMA(p, d, q) combines AR and MA components with differencing (d) to handle non-stationary data
The "I" (Integrated) refers to differencing—taking d differences to achieve stationarity before fitting AR and MA terms
Box-Jenkins methodology is the systematic approach: identify orders using ACF/PACF, estimate parameters, check residuals (sketched below)
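A compressed Box-Jenkins pass might look like the following; the (1, 1, 1) order is an assumption standing in for what you would read off the ACF/PACF of the differenced series.

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.stats.diagnostic import acorr_ljungbox

# Random walk with drift stands in for a real non-stationary series
rng = np.random.default_rng(4)
y = np.cumsum(0.5 + rng.normal(size=200))

# Identify (assumed here) and estimate
res = ARIMA(y, order=(1, 1, 1)).fit()

# Residual check: large Ljung-Box p-values mean the residuals show no
# leftover autocorrelation, so the fit is adequate
print(acorr_ljungbox(res.resid, lags=[10]))

# Forecast with the accepted model
print(res.forecast(steps=6))
```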
Compare: AR vs. MA models—AR uses past values, MA uses past errors. If an FRQ gives you ACF and PACF plots, look for the sharp cutoff to determine which component dominates.
Handling Seasonality
Many business time series exhibit repeating patterns—monthly sales spikes, quarterly earnings cycles, holiday effects. These models explicitly capture periodic behavior that simpler methods miss.
Seasonal ARIMA (SARIMA) Models
SARIMA extends ARIMA with seasonal terms: ARIMA(p,d,q)(P,D,Q)_s, where s is the seasonal period
Seasonal differencing (D) removes seasonal patterns; seasonal AR (P) and MA (Q) capture remaining seasonal autocorrelation
Model selection requires examining ACF/PACF at seasonal lags (e.g., lags 12, 24, 36 for monthly data); see the sketch below
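In statsmodels, SARIMA is fit through the `SARIMAX` class; the orders below are illustrative rather than tuned, and the monthly series is synthetic.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.statespace.sarimax import SARIMAX

# Synthetic monthly series with a trend and a 12-month cycle
rng = np.random.default_rng(5)
idx = pd.date_range("2017-01", periods=72, freq="MS")
y = pd.Series(np.linspace(100, 170, 72)
              + 12 * np.sin(2 * np.pi * np.arange(72) / 12)
              + rng.normal(0, 3, 72), index=idx)

# ARIMA(1,1,1)(1,1,1)_12 -- in practice, read P and Q off the ACF/PACF
# at the seasonal lags (12, 24, 36)
res = SARIMAX(y, order=(1, 1, 1), seasonal_order=(1, 1, 1, 12)).fit(disp=False)
print(res.forecast(steps=12))  # one full year ahead
```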
Exponential Smoothing Methods
Weighted averaging applies exponentially decreasing weights to past observations—recent data matters more
Three variants match different data patterns: Simple (level only), Holt's (level + trend), Holt-Winters (level + trend + seasonality)
Smoothing parameters (α, β, γ) control how quickly the model adapts; optimized to minimize forecast error (see the example below)
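Here is a Holt-Winters sketch with statsmodels; the synthetic data has an additive seasonal swing, and the smoothing parameters are left for the optimizer to pick.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.holtwinters import ExponentialSmoothing

# Synthetic monthly data: trend plus a constant (additive) seasonal swing
rng = np.random.default_rng(6)
idx = pd.date_range("2017-01", periods=72, freq="MS")
y = pd.Series(np.linspace(100, 170, 72)
              + 12 * np.sin(2 * np.pi * np.arange(72) / 12)
              + rng.normal(0, 3, 72), index=idx)

# Holt-Winters = level + trend + seasonality; alpha/beta/gamma are
# chosen automatically to minimize in-sample error
fit = ExponentialSmoothing(y, trend="add", seasonal="add",
                           seasonal_periods=12).fit()
print(fit.params["smoothing_level"])  # the optimized alpha
print(fit.forecast(12))               # next twelve months
```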
Prophet Model
Facebook's Prophet handles strong seasonality, missing values, and holiday effects with minimal tuning
Additive components model trend, weekly/yearly seasonality, and user-specified events separately—highly interpretable
Business-friendly design makes it accessible for analysts without deep statistical training; ideal for quick, robust forecasts (sketch below)
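Assuming the `prophet` package is installed, a minimal run looks like this; Prophet expects a two-column frame named `ds` and `y`, and the daily data here is simulated.

```python
import numpy as np
import pandas as pd
from prophet import Prophet  # pip install prophet

# Simulated daily data with a weekly cycle; Prophet wants columns 'ds' and 'y'
rng = np.random.default_rng(7)
ds = pd.date_range("2019-01-01", periods=730, freq="D")
y = (50 + 0.02 * np.arange(730)
     + 5 * np.sin(2 * np.pi * np.arange(730) / 7)
     + rng.normal(0, 1, 730))
df = pd.DataFrame({"ds": ds, "y": y})

m = Prophet()  # weekly/yearly seasonality handled automatically
m.fit(df)
future = m.make_future_dataframe(periods=30)  # extend 30 days ahead
forecast = m.predict(future)
print(forecast[["ds", "yhat", "yhat_lower", "yhat_upper"]].tail())
```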
Compare: SARIMA vs. Exponential Smoothing—SARIMA requires stationarity and careful order selection; exponential smoothing is more intuitive and adapts automatically. For exam scenarios with limited diagnostic information, exponential smoothing is often the practical choice.
Multivariate and Advanced Methods
When multiple variables influence each other, or when you need to model complex dependencies, these advanced techniques provide more flexibility. They capture relationships that univariate models cannot.
Vector Autoregression (VAR) Models
VAR models analyze multiple time series simultaneously—each variable is predicted by its own lags and lags of other variables
Granger causality tests within VAR frameworks determine whether one variable's past values help predict another (predictive content, not true causation)
Impulse response functions show how shocks to one variable propagate through the system over time (both are sketched below)
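A small VAR sketch with statsmodels, on two simulated series where the second genuinely feeds into the first; the variable names `x` and `y` are arbitrary.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.api import VAR

# Simulate two stationary series where y's past feeds into x
rng = np.random.default_rng(8)
e = rng.normal(size=(300, 2))
data = np.zeros((300, 2))
for t in range(1, 300):
    data[t, 0] = 0.5 * data[t - 1, 0] + 0.2 * data[t - 1, 1] + e[t, 0]
    data[t, 1] = 0.3 * data[t - 1, 1] + e[t, 1]
df = pd.DataFrame(data, columns=["x", "y"])

res = VAR(df).fit(maxlags=4, ic="aic")           # lag order chosen by AIC
print(res.test_causality("x", ["y"]).summary())  # does y Granger-cause x?
res.irf(10).plot()                               # shock propagation, 10 steps
```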
State Space Models and Kalman Filtering
State Space framework separates observed data from unobserved (latent) states—flexible for complex dynamics
Kalman filter recursively estimates hidden states in real-time; optimal for linear, Gaussian systems
Missing data handling is a major advantage—the framework naturally accommodates gaps in observations (demonstrated below)
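One concrete example is a local level model via statsmodels' `UnobservedComponents`, which runs the Kalman filter under the hood; the gap in the simulated data is deliberate.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.statespace.structural import UnobservedComponents

# Noisy observations of a hidden random-walk level, with a gap
rng = np.random.default_rng(9)
level = np.cumsum(rng.normal(0, 1, 200))
y = pd.Series(level + rng.normal(0, 2, 200))
y.iloc[60:70] = np.nan  # ten missing observations

# "Local level" model: observed y = latent level + noise; the Kalman
# filter/smoother estimates the level right through the gap
res = UnobservedComponents(y, level="local level").fit(disp=False)
print(res.smoothed_state[0, 55:75])  # smoothed level across the missing span
```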
Spectral Analysis
Frequency domain approach decomposes time series into cyclical components using Fourier transforms
Periodogram identifies dominant frequencies—useful for detecting hidden cycles not obvious in time plots
Complementary to time domain methods; best for understanding what cycles exist before modeling how they evolve (see the periodogram sketch below)
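SciPy's `periodogram` makes the idea concrete; the 12-sample cycle below is planted on purpose so the dominant frequency is known.

```python
import numpy as np
from scipy.signal import periodogram

# A 12-sample cycle buried in noise -- hard to spot in a time plot
rng = np.random.default_rng(10)
n = 600
x = np.sin(2 * np.pi * np.arange(n) / 12) + rng.normal(0, 1, n)

freqs, power = periodogram(x)           # frequencies in cycles per sample
peak = freqs[np.argmax(power[1:]) + 1]  # skip the zero frequency
print(f"Dominant period: {1 / peak:.1f} samples")  # should be close to 12
```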
Compare: VAR vs. univariate ARIMA—VAR captures cross-variable dynamics but requires more data and is prone to overfitting. Use ARIMA for single-variable forecasts; use VAR when understanding variable interactions is the goal.
Machine Learning Approaches
When traditional statistical assumptions break down or datasets are massive, machine learning methods can capture complex, nonlinear patterns. They trade interpretability for flexibility.
Long Short-Term Memory (LSTM) Networks
LSTMs are recurrent neural networks designed to learn long-range dependencies—they "remember" relevant past information
Gating mechanisms (input, forget, output gates) control information flow, solving the vanishing gradient problem
Data hungry—LSTMs excel with large datasets but may overfit small samples; require careful hyperparameter tuning (a bare-bones sketch follows)
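A bare-bones Keras sketch, assuming TensorFlow is available: sliding windows of 12 past values predict the next one. The window size, layer width, and epoch count are all arbitrary choices here, not recommendations.

```python
import numpy as np
import tensorflow as tf

# Sliding-window setup: predict the next value from the previous 12
rng = np.random.default_rng(11)
series = np.sin(np.arange(1000) / 10) + rng.normal(0, 0.1, 1000)
window = 12
X = np.stack([series[i:i + window] for i in range(len(series) - window)])
y = series[window:]
X = X[..., np.newaxis]  # LSTM expects (samples, timesteps, features)

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(window, 1)),
    tf.keras.layers.LSTM(32),   # gates control what past info is kept
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=5, batch_size=32, verbose=0)
print(model.predict(X[-1:], verbose=0))  # one-step-ahead forecast
```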
Model Validation and Selection
Choosing the right model requires rigorous testing. These techniques ensure your forecasts generalize to new data rather than just fitting historical patterns.
Time Series Cross-Validation
Temporal ordering must be preserved—you cannot randomly shuffle observations like in standard cross-validation
Rolling window uses fixed-size training sets that move forward; expanding window grows the training set over time
Out-of-sample testing is essential—never evaluate forecast accuracy on the same data used to fit the model (see the split sketch below)
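scikit-learn's `TimeSeriesSplit` implements exactly this ordering discipline; the toy array just makes the index ranges easy to read.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

y = np.arange(20)  # stand-in series; only the ordering matters here

# Expanding window: training set grows, test fold always comes later in time
tscv = TimeSeriesSplit(n_splits=4)
for train_idx, test_idx in tscv.split(y):
    print(f"train {train_idx[0]}-{train_idx[-1]},"
          f" test {test_idx[0]}-{test_idx[-1]}")

# Rolling (fixed-size) window instead: cap the training set size
rolling = TimeSeriesSplit(n_splits=4, max_train_size=8)
```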
Forecasting Techniques and Accuracy Metrics
MAE (Mean Absolute Error) measures average absolute deviation—interpretable in original units
RMSE (Root Mean Squared Error) penalizes large errors more heavily; use when big misses are especially costly: $\text{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}$
Ensemble methods combine multiple forecasts—often outperform any single model by averaging out individual errors (both ideas appear in the sketch below)
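Both metrics and a simple two-model ensemble fit in a few lines of numpy; the forecast numbers are invented for illustration.

```python
import numpy as np

def mae(y, yhat):
    return np.mean(np.abs(y - yhat))

def rmse(y, yhat):
    return np.sqrt(np.mean((y - yhat) ** 2))  # matches the formula above

actual  = np.array([100.0, 110.0, 105.0, 120.0, 116.0])
model_a = np.array([ 98.0, 112.0, 109.0, 118.0, 113.0])
model_b = np.array([103.0, 107.0, 103.0, 124.0, 118.0])
ensemble = (model_a + model_b) / 2  # simple average of the two forecasts

for name, pred in [("A", model_a), ("B", model_b), ("A+B avg", ensemble)]:
    print(f"{name}: MAE={mae(actual, pred):.2f}, RMSE={rmse(actual, pred):.2f}")
```

With these invented numbers the simple average beats either model on both metrics, which is the usual ensemble story: individual errors partially cancel.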
Compare: MAE vs. RMSE—both measure forecast accuracy, but RMSE penalizes large errors disproportionately. Choose MAE for robust evaluation; choose RMSE when outlier errors have serious business consequences.
Quick Reference Table
| Concept | Best Examples |
| --- | --- |
| Diagnosing stationarity | ADF test, KPSS test, visual inspection |
| Identifying model order | ACF, PACF, information criteria (AIC/BIC) |
| Univariate forecasting (no seasonality) | AR, MA, ARIMA |
| Seasonal forecasting | SARIMA, Holt-Winters, Prophet |
| Multivariate analysis | VAR, State Space models |
| Nonlinear/complex patterns | LSTM, other neural networks |
| Model validation | Rolling-window CV, expanding-window CV |
| Accuracy evaluation | MAE, RMSE, MAPE |
Self-Check Questions
You examine a time series and find the ACF decays slowly while the PACF cuts off after lag 2. What type of model is suggested, and what order would you initially try?
Compare and contrast ARIMA and exponential smoothing: What assumptions differ, and in what business scenario would you prefer one over the other?
A colleague fits an ARIMA model to raw sales data without checking stationarity. The ADF test shows a unit root. What should they do before proceeding, and why?
Which two methods would you consider for forecasting monthly retail sales with strong holiday effects and occasional missing data? Justify your choices based on method strengths.
You're evaluating two competing forecasting models. Model A has lower RMSE but higher MAE than Model B. What does this tell you about the error distributions, and which would you choose if large forecast errors are costly?