Vector Autoregression (VAR) Models
Vector Autoregression (VAR) models extend the idea of autoregressive models from a single time series to multiple time series at once. Instead of modeling one variable in isolation, a VAR treats every variable in the system as depending on its own past values and the past values of all other variables. This makes VAR models especially useful for studying how interconnected variables like GDP, inflation, and unemployment evolve together over time.
Vector Autoregression Models in Time Series
A univariate AR model predicts one variable from its own lags. A VAR generalizes this: with k variables and p lags, each variable gets its own equation in which it is regressed on p lagged values of every variable in the system.
For a two-variable VAR(1) with variables y1 and y2, the system looks like:

y1,t = c1 + a11 y1,t-1 + a12 y2,t-1 + e1,t
y2,t = c2 + a21 y1,t-1 + a22 y2,t-1 + e2,t

Each equation has its own intercept, coefficients on all lagged variables, and an error term. The cross-coefficients a12 and a21 capture the cross-variable dynamics that a univariate model would miss entirely.
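The two-equation system above can be simulated in a few lines. This is a minimal NumPy sketch; the intercepts and coefficient values are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative coefficients for a two-variable VAR(1):
# y1 depends on its own lag (0.5) and on lagged y2 (0.2);
# y2 depends on lagged y1 (0.1) and its own lag (0.4).
c = np.array([1.0, 0.5])            # intercepts c1, c2
A = np.array([[0.5, 0.2],
              [0.1, 0.4]])          # lag-1 coefficient matrix

T = 500
y = np.zeros((T, 2))
for t in range(1, T):
    shock = rng.normal(size=2)      # error terms e1,t and e2,t
    y[t] = c + A @ y[t - 1] + shock
```

Because the eigenvalues of this particular A are well inside the unit circle, the simulated series fluctuate around a fixed mean rather than drifting off.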
Common applications of VAR models include:
- Forecasting multivariate time series where variables move together (GDP, inflation, unemployment)
- Analyzing feedback effects among variables (stock prices and exchange rates influencing each other)
- Studying the impact of shocks through impulse response functions (how a monetary policy shock ripples through the economy)
- Testing causal relationships using Granger causality tests (does money supply help predict inflation, or vice versa?)
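To illustrate the Granger-causality idea in the last bullet, the sketch below compares a restricted regression (own lags only) against an unrestricted one (adding the other variable's lag) with an F statistic. The data-generating process and coefficients are invented for illustration; a real analysis would use a packaged test with proper p-values.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated data in which lagged y2 genuinely helps predict y1.
T = 400
y2 = rng.normal(size=T)
y1 = np.zeros(T)
for t in range(1, T):
    y1[t] = 0.3 * y1[t - 1] + 0.6 * y2[t - 1] + rng.normal()

def rss(X, target):
    """Residual sum of squares from an OLS fit of target on X."""
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ beta
    return resid @ resid

ones = np.ones(T - 1)
X_r = np.column_stack([ones, y1[:-1]])           # restricted: own lag only
X_u = np.column_stack([ones, y1[:-1], y2[:-1]])  # unrestricted: + lagged y2
target = y1[1:]

# Granger-style F statistic with 1 restriction: does adding lagged y2
# reduce the residual sum of squares by more than chance would allow?
n, k_u = X_u.shape
F = (rss(X_r, target) - rss(X_u, target)) / (rss(X_u, target) / (n - k_u))
```

A large F here says that past y2 carries predictive information about y1 beyond what y1's own past provides, which is exactly what "y2 Granger-causes y1" means.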

Construction of VAR Models
Building a VAR model comes down to two main decisions: which variables to include, and how many lags to use.
Choosing the lag order
The lag order p is the number of past time periods each equation looks back. Too few lags and you miss important dynamics; too many and you overfit the model and waste degrees of freedom (the number of parameters, k(kp + 1) in total, grows quickly with k and p).
To find the right p, you estimate the model at several candidate lag orders and compare them using:
- Akaike Information Criterion (AIC) and Schwarz Bayesian Information Criterion (SBIC/BIC): Both balance goodness of fit against model complexity. Lower values indicate a better trade-off. SBIC tends to favor more parsimonious (fewer-lag) models than AIC because it penalizes extra parameters more heavily.
- Likelihood ratio tests: Directly compare a model with p lags against one with p - 1 lags to see whether the extra lag adds statistically significant explanatory power.
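As a sketch of the selection step, the following fits a two-variable VAR by OLS at several candidate lag orders on simulated data whose true order is 1, and picks the order with the lowest SBIC/BIC. All coefficient values are illustrative, and the criterion formula is the standard log-determinant form.

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated two-variable VAR(1) data, so the true lag order is 1
# (coefficient values are illustrative).
T, k = 600, 2
A = np.array([[0.5, 0.2],
              [0.1, 0.4]])
y = np.zeros((T, k))
for t in range(1, T):
    y[t] = A @ y[t - 1] + rng.normal(size=k)

def var_bic(y, p):
    """SBIC/BIC of a VAR(p) fitted equation-by-equation with OLS."""
    T, k = y.shape
    n = T - p                                    # usable observations
    # Design matrix: constant plus lags 1..p of every variable.
    X = np.column_stack(
        [np.ones(n)] + [y[p - j:T - j] for j in range(1, p + 1)])
    Y = y[p:]
    B, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ B
    sigma = resid.T @ resid / n                  # residual covariance
    n_params = k * (k * p + 1)                   # grows with k^2 * p
    return np.log(np.linalg.det(sigma)) + np.log(n) * n_params / n

bics = {p: var_bic(y, p) for p in range(1, 5)}
best_p = min(bics, key=bics.get)                 # lowest BIC wins
```

Because BIC penalizes parameters heavily, it tends to land on the true, parsimonious order here; swapping the `np.log(n)` factor for 2 gives AIC, which is more willing to pick extra lags.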
Estimating the model
Once you've chosen p, each equation in the VAR is estimated separately using ordinary least squares (OLS). Because every equation has the same set of right-hand-side variables (the same lags of all variables), estimating each equation by OLS individually is actually efficient. The estimated coefficients capture how each variable responds to past movements in itself and in the other variables.
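The equation-by-equation estimation can be sketched directly: simulate a VAR(1) with known coefficients, then recover them with one OLS fit per equation. The coefficient matrix is invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulate a two-variable VAR(1) with known coefficients, then
# recover them by running OLS separately on each equation.
A_true = np.array([[0.5, 0.2],
                   [0.1, 0.4]])
T = 2000
y = np.zeros((T, 2))
for t in range(1, T):
    y[t] = A_true @ y[t - 1] + rng.normal(size=2)

# Same design matrix for both equations: constant + both lagged variables.
X = np.column_stack([np.ones(T - 1), y[:-1]])

A_hat = np.zeros((2, 2))
for i in range(2):                              # one OLS fit per equation
    beta, *_ = np.linalg.lstsq(X, y[1:, i], rcond=None)
    A_hat[i] = beta[1:]                         # drop the intercept
```

With a long simulated sample, the estimated matrix `A_hat` sits close to `A_true`, which is the sense in which single-equation OLS recovers the system's dynamics.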

Interpretation of VAR Coefficients
With multiple equations and potentially many lagged terms, interpreting raw VAR coefficients can get overwhelming. There are a few tools that help.
Coefficient interpretation
Each coefficient tells you the estimated effect of a one-unit increase in a lagged variable on the current value of the dependent variable, holding everything else constant. For example, in the two-variable system above, a12 tells you how a one-unit increase in last period's y2 is associated with a change in this period's y1. You can run standard t-tests on individual coefficients to check significance, or use joint F-tests to assess whether all lags of a particular variable matter for a given equation.
Impulse Response Functions (IRFs)
IRFs are often more informative than staring at individual coefficients. An IRF traces how a one-unit shock to one variable's error term affects all variables in the system over subsequent time periods.
For example, you could shock the oil price equation by one standard deviation and then track how GDP responds over the next 10 periods. The IRF might show GDP dipping for a few quarters before gradually returning to baseline. Confidence bands (often computed via bootstrapping) are plotted around the IRF to show how precisely the response is estimated. If the confidence band includes zero at a given horizon, the response isn't statistically significant at that point.
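For a VAR(1), the non-orthogonalized impulse response at horizon h is simply the h-th power of the coefficient matrix applied to the shock vector, so the whole IRF can be computed in a few lines. The coefficient matrix below is illustrative (and stable), and the shock is a one-unit shock to the second variable's error term; confidence bands are omitted from this sketch.

```python
import numpy as np

# Lag-1 coefficient matrix of a stable two-variable VAR(1)
# (illustrative values).
A = np.array([[0.5, 0.2],
              [0.1, 0.4]])

# A one-unit shock to the second variable's error term.
shock = np.array([0.0, 1.0])

# For a VAR(1), the (non-orthogonalized) impulse response at
# horizon h is A^h @ shock: the shock propagates through the
# lag structure and, in a stable system, dies out.
horizons = 10
irf = np.array([np.linalg.matrix_power(A, h) @ shock
                for h in range(horizons + 1)])
```

At horizon 0 only the shocked variable moves; over later horizons both variables respond through the cross-coefficients, and because the system is stable the responses shrink toward zero.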
Diagnostic Tests for VAR Models
After estimation, you need to verify that the model is well-specified. Skipping diagnostics can lead to misleading forecasts and impulse responses.
Stability check
A VAR model is stable if the effects of shocks eventually die out rather than exploding over time. To check this, compute the eigenvalues of the companion matrix (a reformulation of the VAR system). All eigenvalues must have modulus (absolute value) strictly less than 1, meaning they lie inside the unit circle. If any eigenvalue sits on or outside the unit circle, the model is unstable, and its impulse responses and forecasts are unreliable.
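The stability check can be sketched for a VAR(2): stack the two lag matrices into the companion matrix, take its eigenvalues, and verify every modulus is below 1. The lag matrices here are illustrative.

```python
import numpy as np

# Lag coefficient matrices of a two-variable VAR(2)
# (illustrative values chosen to be stable).
A1 = np.array([[0.5, 0.1],
               [0.0, 0.4]])
A2 = np.array([[0.2, 0.0],
               [0.0, 0.1]])
k = 2

# Companion matrix: rewrites the VAR(2) as an equivalent VAR(1)
#   [[A1, A2],
#    [ I,  0]]
companion = np.block([[A1, A2],
                      [np.eye(k), np.zeros((k, k))]])

moduli = np.abs(np.linalg.eigvals(companion))
is_stable = bool(np.all(moduli < 1))   # all eigenvalues inside the unit circle
```

If any modulus reached 1 or beyond, `is_stable` would be False and the model's impulse responses and forecasts should not be trusted.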
Residual diagnostics
These tests check whether the model's assumptions hold:
- Lagrange Multiplier (LM) test: Tests for serial correlation (autocorrelation) in the residuals. Significant serial correlation suggests the model hasn't captured all the temporal dynamics.
- White test: Tests for heteroscedasticity, meaning the variance of the error terms changes over time. Constant error variance is an assumption of OLS.
- Jarque-Bera test: Tests whether the residuals follow a normal distribution by checking skewness and kurtosis.
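The Jarque-Bera statistic in the last bullet is simple enough to compute by hand from the residuals' skewness and excess kurtosis. The sketch below applies it to simulated residuals; the heavy-tailed sample stands in for residuals that violate normality.

```python
import numpy as np

rng = np.random.default_rng(4)

def jarque_bera(resid):
    """Jarque-Bera statistic: combines skewness and excess kurtosis.
    Under normality it is asymptotically chi-squared with 2 df,
    so large values reject normality."""
    n = len(resid)
    z = (resid - resid.mean()) / resid.std()
    skew = np.mean(z ** 3)
    kurt = np.mean(z ** 4) - 3           # excess kurtosis
    return n / 6 * (skew ** 2 + kurt ** 2 / 4)

normal_resid = rng.normal(size=5000)     # consistent with normality
heavy_tailed = rng.standard_t(df=3, size=5000)  # fat tails

jb_normal = jarque_bera(normal_resid)    # small statistic
jb_heavy = jarque_bera(heavy_tailed)     # large statistic
```

In a VAR context you would run this on each equation's residuals (or use a multivariate version); a large statistic suggests the normality assumption behind standard inference is questionable.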
Remedial actions if diagnostics fail
If any of these tests flag problems, you have several options:
- Increase the lag order to capture dynamics the current model is missing
- Add exogenous variables or deterministic terms (like a time trend or seasonal dummies) if there are systematic patterns the VAR isn't picking up
- Transform the variables to achieve stationarity or stabilize variance (a logarithmic transformation is common for variables like GDP or prices that grow exponentially)