Vector autoregressions (VARs) extend AR models to multiple series. They are the lingua franca of empirical macroeconomics: Sims's 1980 paper "Macroeconomics and Reality", later cited in his Nobel award, introduced them as an atheoretical alternative to structural macro models. In finance, VARs underpin macro factor modelling, multivariate risk forecasting, and impulse-response analysis.
VAR(p) definition
Y_t = c + A₁ Y_{t-1} + A₂ Y_{t-2} + ... + A_p Y_{t-p} + u_t
Y_t ∈ Rⁿ, A_i ∈ Rⁿˣⁿ, u_t ~ WN(0, Σ_u)
Each component of Y_t is regressed on the recent past of all components. With n = 3 and p = 4, there are 3 equations, each with 1 + 3·4 = 13 coefficients, for 3·13 = 39 mean parameters in total, plus the n(n+1)/2 = 6 unique entries of Σ_u.
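The counting rule above can be written as a small helper. This is only an illustrative sketch; the function name var_param_count and its return layout are made up for this example.

```python
def var_param_count(n: int, p: int, intercept: bool = True) -> dict:
    """Count the free parameters of a VAR(p) on n variables."""
    per_equation = n * p + (1 if intercept else 0)  # lag coefficients (+ constant)
    mean_params = n * per_equation                  # summed over the n equations
    cov_params = n * (n + 1) // 2                   # unique entries of symmetric Sigma_u
    return {
        "per_equation": per_equation,
        "mean": mean_params,
        "covariance": cov_params,
        "total": mean_params + cov_params,
    }

# The n = 3, p = 4 case from the text.
counts = var_param_count(n=3, p=4)
print(counts)
```

For n = 3 and p = 4 this reproduces the 13 coefficients per equation, 39 mean parameters, and 6 covariance entries quoted above.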
Estimation
Because every equation has the same regressors (the lags of all variables), the system can be estimated by OLS one equation at a time. This is the special case of SUR (seemingly unrelated regressions) in which OLS = GLS, so nothing is lost by ignoring the cross-equation error correlation.
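A minimal NumPy sketch of this point, under stated assumptions: the data are simulated from a bivariate VAR(1) with made-up coefficients, and build_regressors is a hypothetical helper. It checks that running OLS separately on each equation gives the same coefficients as one multivariate least-squares fit on the shared regressor matrix.

```python
import numpy as np

def build_regressors(Y, p):
    """Stack [1, Y_{t-1}', ..., Y_{t-p}'] row by row; return (X, targets)."""
    T = Y.shape[0]
    lags = [Y[p - k:T - k] for k in range(1, p + 1)]
    X = np.column_stack([np.ones(T - p)] + lags)
    return X, Y[p:]

# Simulate a bivariate VAR(1) with illustrative (made-up) coefficients.
rng = np.random.default_rng(0)
A_true = np.array([[0.5, 0.1],
                   [0.2, 0.3]])
T = 2000
Y = np.zeros((T, 2))
for t in range(1, T):
    Y[t] = A_true @ Y[t - 1] + rng.standard_normal(2)

X, Z = build_regressors(Y, p=1)

# (a) One OLS regression per equation.
B_by_equation = np.column_stack(
    [np.linalg.lstsq(X, Z[:, j], rcond=None)[0] for j in range(Z.shape[1])]
)
# (b) All equations in a single multivariate least-squares call.
B_joint = np.linalg.lstsq(X, Z, rcond=None)[0]

A_hat = B_joint[1:].T                         # drop intercept row; row i = equation i
U = Z - X @ B_joint                           # reduced-form residuals
Sigma_hat = U.T @ U / (len(Z) - X.shape[1])   # degrees-of-freedom-adjusted Sigma_u

print(np.allclose(B_by_equation, B_joint))
print(A_hat.round(2))
```

The residual covariance Sigma_hat is the only place where the cross-equation correlation shows up; it does not feed back into the coefficient estimates.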
Lag selection
The AIC/BIC trade-off is the same as in univariate ARMA, but each additional lag adds n² coefficients, so BIC penalises extra lags heavily once n > 1. In practice, financial monthly VARs rarely have p > 4, and daily VARs rarely have p > 5.
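The sketch below shows one common way to compute the criteria, assuming the log-determinant form (log det of the residual covariance plus a penalty on the number of coefficients) and a common estimation sample across lag orders. The helper names and the simulated VAR(1) data are assumptions for illustration only.

```python
import numpy as np

def lagged_design(Y, p, start):
    """Regressors [1, Y_{t-1}, ..., Y_{t-p}] for t = start, ..., T-1."""
    T = Y.shape[0]
    cols = [np.ones(T - start)] + [Y[start - k:T - k] for k in range(1, p + 1)]
    return np.column_stack(cols), Y[start:]

def information_criteria(Y, max_lag):
    """AIC and BIC for VAR(1..max_lag), all fitted on the same sample."""
    T, n = Y.shape
    T_eff = T - max_lag                      # common sample so criteria are comparable
    out = {}
    for p in range(1, max_lag + 1):
        X, Z = lagged_design(Y, p, start=max_lag)
        B = np.linalg.lstsq(X, Z, rcond=None)[0]
        U = Z - X @ B
        logdet = np.linalg.slogdet(U.T @ U / T_eff)[1]
        k = n * (n * p + 1)                  # coefficients incl. intercepts
        out[p] = {
            "aic": logdet + 2 * k / T_eff,
            "bic": logdet + k * np.log(T_eff) / T_eff,
        }
    return out

# Simulated bivariate VAR(1) with made-up coefficients.
rng = np.random.default_rng(42)
A_true = np.array([[0.5, 0.1],
                   [0.2, 0.3]])
Y = np.zeros((500, 2))
for t in range(1, 500):
    Y[t] = A_true @ Y[t - 1] + rng.standard_normal(2)

ic = information_criteria(Y, max_lag=4)
best_aic = min(ic, key=lambda p: ic[p]["aic"])
best_bic = min(ic, key=lambda p: ic[p]["bic"])
print(best_aic, best_bic)
```

Because the BIC penalty per coefficient exceeds the AIC penalty whenever log(T) > 2, the BIC choice is never larger than the AIC choice on the same sample.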
Granger causality
X Granger-causes Y if past X helps predict Y beyond the predictive power of past Y alone. Operationally: in a VAR for (X, Y), test the joint significance of the X-lag coefficients in the Y equation. Granger causality is not the same as economic causality — it's predictability — but it remains a useful first-pass diagnostic.
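The joint test described above can be sketched as a restricted-versus-unrestricted F-test on the Y equation. The function granger_f and the simulated data (where X drives Y but not the reverse) are illustrative assumptions; a p-value would additionally need the F distribution, which is omitted here to keep the example NumPy-only.

```python
import numpy as np

def granger_f(y, x, p):
    """F-statistic for H0: lags 1..p of x have zero coefficients in the y equation."""
    T = len(y)
    ylags = np.column_stack([y[p - k:T - k] for k in range(1, p + 1)])
    xlags = np.column_stack([x[p - k:T - k] for k in range(1, p + 1)])
    target = y[p:]
    const = np.ones((T - p, 1))
    X_r = np.hstack([const, ylags])           # restricted: own lags only
    X_u = np.hstack([const, ylags, xlags])    # unrestricted: add lags of x

    def rss(X):
        beta = np.linalg.lstsq(X, target, rcond=None)[0]
        return np.sum((target - X @ beta) ** 2)

    rss_r, rss_u = rss(X_r), rss(X_u)
    df_num = p                                # number of restrictions
    df_den = (T - p) - X_u.shape[1]           # obs minus unrestricted coefficients
    f_stat = ((rss_r - rss_u) / df_num) / (rss_u / df_den)
    return f_stat, (df_num, df_den)

# Simulated example: x feeds into y with a one-period delay, y never feeds into x.
rng = np.random.default_rng(1)
T = 500
x = np.zeros(T)
y = np.zeros(T)
for t in range(1, T):
    x[t] = 0.5 * x[t - 1] + rng.standard_normal()
    y[t] = 0.3 * y[t - 1] + 0.5 * x[t - 1] + rng.standard_normal()

f_xy, df_xy = granger_f(y, x, p=2)   # does x help predict y?
f_yx, df_yx = granger_f(x, y, p=2)   # does y help predict x?
print(f_xy, df_xy, f_yx, df_yx)
```

The statistic is compared with the critical value of an F distribution with the returned degrees of freedom; a large f_xy alongside a small f_yx is the pattern one would expect from this data-generating process.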
Impulse-response functions
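An IRF traces the response of every variable in the system to a one-time shock in one variable. It is computed by inverting the VAR into its MA(∞) representation; the coefficient matrix at lag h gives the responses h periods after the shock. The dominant decay rate is governed by the eigenvalue of the companion matrix with the largest modulus, and the responses die out only if every eigenvalue lies strictly inside the unit circle.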
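As a sketch of the mechanics, the MA(∞) coefficient matrices can be read off powers of the companion matrix: the response at horizon h is the top-left n×n block of its h-th power. The coefficient matrices below are made up for illustration, and the function names are hypothetical.

```python
import numpy as np

def companion(A_list):
    """Companion matrix of a VAR(p) with coefficient matrices A_1, ..., A_p."""
    n, p = A_list[0].shape[0], len(A_list)
    F = np.zeros((n * p, n * p))
    F[:n, :] = np.hstack(A_list)            # first block row: [A_1 ... A_p]
    F[n:, :-n] = np.eye(n * (p - 1))        # identity blocks shift the lags down
    return F

def ma_coefficients(A_list, horizon):
    """Psi_0, ..., Psi_horizon: responses to a unit reduced-form shock."""
    n = A_list[0].shape[0]
    F = companion(A_list)
    power = np.eye(F.shape[0])
    psi = []
    for _ in range(horizon + 1):
        psi.append(power[:n, :n].copy())    # top-left block of F^h
        power = F @ power
    return np.array(psi)

# Illustrative VAR(2) coefficients (not estimated from any data).
A1 = np.array([[0.5, 0.1],
               [0.2, 0.3]])
A2 = np.array([[0.2, 0.0],
               [0.0, 0.1]])

psi = ma_coefficients([A1, A2], horizon=20)
rho = np.max(np.abs(np.linalg.eigvals(companion([A1, A2]))))

print(psi[1])   # equals A1
print(rho)      # largest eigenvalue modulus; below 1 means responses decay
```

Entry psi[h][i, j] is the response of variable i, h periods after a unit shock to the reduced-form residual of variable j. These are not yet structural responses, since the residuals are generally correlated.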
Identification — the deep issue
Raw VAR residuals are typically contemporaneously correlated: the matrix Σ_u is not diagonal. An IRF to 'a shock in monetary policy' therefore requires identifying which linear combination of the reduced-form shocks represents that policy shock. A structural VAR (SVAR) imposes restrictions to pin this down. The Cholesky decomposition gives one identification (a recursive ordering of the variables); long-run restrictions (Blanchard-Quah) and sign restrictions give alternatives. The choice matters for interpretation.
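To make the recursive scheme concrete, the sketch below factors an illustrative Σ_u as P P' with P lower triangular, builds orthogonalised responses for a VAR(1), and then repeats the factorisation with the variable ordering reversed. All numbers are made up; the point is only that both orderings reproduce the same Σ_u yet imply different impact responses.

```python
import numpy as np

# Illustrative VAR(1) coefficients and residual covariance (not estimated).
A = np.array([[0.5, 0.1],
              [0.2, 0.3]])
sigma_u = np.array([[1.0, 0.6],
                    [0.6, 2.0]])

# Ordering (1, 2): P is lower triangular with P @ P.T == sigma_u.
P = np.linalg.cholesky(sigma_u)

def orthogonalised_irf(A, impact, horizon):
    """Theta_h = A^h @ impact for a VAR(1), h = 0..horizon."""
    psi = np.eye(A.shape[0])
    out = []
    for _ in range(horizon + 1):
        out.append(psi @ impact)
        psi = A @ psi
    return np.array(out)

theta = orthogonalised_irf(A, P, horizon=12)

# Ordering (2, 1): permute, factor, and map back to the original variable order.
Q = np.array([[0.0, 1.0],
              [1.0, 0.0]])
P_rev = np.linalg.cholesky(Q @ sigma_u @ Q.T)
impact_rev = Q.T @ P_rev @ Q

print(P.round(3))           # variable 1 does not react on impact to shock 2
print(impact_rev.round(3))  # under the reversed ordering it does
```

Under ordering (1, 2) the zero in the top-right corner of P is an assumption, not a finding: it says the first variable cannot respond within the period to the second shock.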
VECM — VAR for cointegrated systems
ΔY_t = c + Π Y_{t-1} + Σ_{i=1}^{p-1} Γ_i ΔY_{t-i} + u_t
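When the components of Y are individually I(1) but cointegrated, a VAR in first differences is misspecified because it omits the error-correction term Π Y_{t-1}. A VAR in levels can still be estimated, but it does not impose the reduced rank implied by cointegration. The vector error-correction model splits the dynamics into long-run (Π Y_{t-1}, where rank(Π) = number of cointegrating relations) and short-run (Γ_i ΔY_{t-i}) components. The cointegrating rank and the parameters are typically estimated jointly by Johansen's maximum-likelihood procedure.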
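The link between a levels VAR and the error-correction form is purely algebraic: Π = A₁ + ... + A_p − I and Γ_i = −(A_{i+1} + ... + A_p). The sketch below applies this mapping to a made-up bivariate VAR(2) constructed to have one unit root, so Π has rank 1. It does not perform Johansen estimation; the function name and the numbers are assumptions for illustration.

```python
import numpy as np

def var_to_vecm(A_list):
    """Map levels-VAR matrices A_1..A_p to (Pi, [Gamma_1..Gamma_{p-1}])."""
    n, p = A_list[0].shape[0], len(A_list)
    Pi = sum(A_list) - np.eye(n)
    Gammas = [-sum(A_list[i:]) for i in range(1, p)]   # Gamma_i = -(A_{i+1}+...+A_p)
    return Pi, Gammas

# Illustrative VAR(2): variable 1 has a unit root, variable 2 adjusts toward it.
A1 = np.array([[1.25, 0.0],
               [0.5, 0.625]])
A2 = np.array([[-0.25, 0.0],
               [0.0, -0.125]])

Pi, Gammas = var_to_vecm([A1, A2])
rank = np.linalg.matrix_rank(Pi)

# Pi factors as alpha @ beta': beta is the cointegrating vector (Y2 - Y1),
# alpha holds the adjustment speeds of each equation.
alpha = np.array([0.0, -0.5])
beta = np.array([-1.0, 1.0])

print(Pi)
print(rank)
print(np.allclose(np.outer(alpha, beta), Pi))
```

Here only the second variable error-corrects (its adjustment coefficient is −0.5), while the first behaves like a stochastic trend that the second follows.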
Forecast error variance decomposition
Given a structural identification, the variance of the h-step-ahead forecast error of variable i can be decomposed into shares attributable to shocks in each variable. Useful for asking 'how much of equity volatility is driven by monetary policy shocks vs term premia shocks'.
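A minimal sketch of the decomposition for a VAR(1), assuming a recursive (Cholesky) identification; any other identification would simply swap in a different impact matrix. The coefficients, covariance, and the function name fevd are illustrative assumptions.

```python
import numpy as np

def fevd(A, sigma_u, horizon):
    """Forecast error variance shares for a VAR(1) under Cholesky identification.

    Returns an (n, n) array: entry [i, j] is the share of variable i's
    horizon-step forecast error variance attributed to structural shock j.
    """
    P = np.linalg.cholesky(sigma_u)    # assumed recursive identification
    n = A.shape[0]
    psi = np.eye(n)
    contrib = np.zeros((n, n))
    for _ in range(horizon):           # sum over s = 0, ..., horizon - 1
        theta = psi @ P                # structural responses at step s
        contrib += theta ** 2
        psi = A @ psi
    return contrib / contrib.sum(axis=1, keepdims=True)

# Illustrative inputs (not estimated from any data).
A = np.array([[0.5, 0.1],
              [0.2, 0.3]])
sigma_u = np.array([[1.0, 0.6],
                    [0.6, 2.0]])

shares_1 = fevd(A, sigma_u, horizon=1)
shares_12 = fevd(A, sigma_u, horizon=12)
print(shares_1.round(3))
print(shares_12.round(3))
```

At horizon 1 the first variable's variance is attributed entirely to its own shock, which is a direct consequence of the ordering rather than a feature of the data.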
VARs in finance
- Equity-bond-currency dynamics: cross-asset risk forecasting and risk-on/risk-off identification.
- Macro factor models for sovereign credit: VAR on growth, inflation, FX, debt, with credit spreads.
- Yield curve macro models: VAR on a few yield-curve factors plus macro inputs (Ang-Piazzesi 2003).
- Risk premia decomposition: separating macro from financial drivers via SVAR.
Curse of dimensionality
A VAR(p) on n variables has n²p slope parameters. For n = 20, p = 4 that is 1600 parameters, or 80 slopes plus an intercept in every equation. Estimating that by OLS on monthly data (~360 obs) leaves only a handful of observations per coefficient and over-fits badly. Bayesian VARs (Minnesota prior, Litterman 1986) and LASSO-VARs shrink coefficients toward sensible defaults and are the modern way to handle larger systems.
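The idea of shrinking toward a default can be sketched with a ridge-type estimator that pulls VAR(1) slopes toward a random walk (A = I). This is a deliberately simplified stand-in, not the Minnesota prior itself, which also scales the tightness by lag and by variable. The function name, the penalty weight lam, and the simulated data are all assumptions.

```python
import numpy as np

def shrunk_var1(Y, lam):
    """VAR(1) coefficients shrunk toward a random walk (slopes = I).

    Minimises the residual sum of squares plus lam * ||slopes - I||^2;
    the intercept is left unpenalised. Returns a (1 + n, n) matrix whose
    first row holds the intercepts.
    """
    X = np.column_stack([np.ones(len(Y) - 1), Y[:-1]])   # [1, Y_{t-1}]
    Z = Y[1:]
    k, n = X.shape[1], Y.shape[1]
    B0 = np.vstack([np.zeros((1, n)), np.eye(n)])        # shrinkage target
    penalty = lam * np.eye(k)
    penalty[0, 0] = 0.0                                  # do not shrink intercepts
    return np.linalg.solve(X.T @ X + penalty, X.T @ Z + penalty @ B0)

# Short simulated sample: 10 variables, 60 observations, made-up dynamics.
rng = np.random.default_rng(7)
n, T = 10, 60
Y = np.zeros((T, n))
for t in range(1, T):
    Y[t] = 0.5 * Y[t - 1] + rng.standard_normal(n)

B_ols = shrunk_var1(Y, lam=0.0)       # lam = 0 reproduces OLS
B_mid = shrunk_var1(Y, lam=50.0)      # moderate shrinkage
B_tight = shrunk_var1(Y, lam=1e10)    # very tight: slopes collapse to I

dist_ols = np.linalg.norm(B_ols[1:] - np.eye(n))
dist_mid = np.linalg.norm(B_mid[1:] - np.eye(n))
print(dist_ols, dist_mid)
```

The weight lam plays the role of prior tightness: at zero the data decide everything, and as it grows the estimate approaches the random-walk default regardless of the sample.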
Exercise
You estimate a VAR(2) on monthly data for inflation, output gap, and the CBR (3 variables, 240 monthly obs). (1) How many parameters does this have? (2) Granger test: F-stat of CBR-lags on output-gap equation is 3.2 with df (2, 232). Is CBR Granger-causal for output? (3) The largest eigenvalue of the companion matrix is 0.97. Interpret.