Identification, ACF/PACF, AIC/BIC — Time Series Module 4

Once you decide to fit an ARMA model, choosing the right (p, q) is a statistics-and-craft problem. Information criteria, the ACF/PACF inspection, residual diagnostics, and out-of-sample performance all play a role. The temptation to over-fit is enormous and the cost of over-fitting in forecasting is real.

Box-Jenkins methodology

Identification: examine ACF and PACF, decide tentative p and q.
Estimation: fit by conditional or exact maximum likelihood.
Diagnostic checking: examine residuals for whiteness and absence of structure.
Forecasting: produce point and interval forecasts, evaluate out-of-sample.

ACF and PACF cheat sheet

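AR(p): PACF cuts off after lag p (effectively zero beyond it); ACF decays gradually.
MA(q): ACF cuts off after lag q (effectively zero beyond it); PACF decays gradually.
ARMA(p, q): both ACF and PACF decay gradually, so neither plot pins down the orders; identification is harder.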
Non-stationary: ACF decays very slowly; first-difference and re-inspect.
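
The cheat sheet can be checked on simulated data. The following is a minimal sketch, assuming an AR(1) with coefficient 0.7 and 2000 observations (illustrative values, not taken from this module). It computes the sample ACF directly and the sample PACF through the Durbin-Levinson recursion, using only the Python standard library; the helper names sample_acf and sample_pacf are hypothetical.

```python
import math
import random


def sample_acf(x, max_lag):
    """Sample autocorrelations rho_0 .. rho_max_lag."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        for k in range(max_lag + 1)
    ]


def sample_pacf(x, max_lag):
    """Sample partial autocorrelations via the Durbin-Levinson recursion."""
    rho = sample_acf(x, max_lag)
    pacf = [1.0]
    phi_prev = []  # coefficients phi_{k-1,1} .. phi_{k-1,k-1}
    for k in range(1, max_lag + 1):
        num = rho[k] - sum(phi_prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den
        phi_prev = [
            phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)
        ] + [phi_kk]
        pacf.append(phi_kk)
    return pacf


# Simulated AR(1); the coefficient and sample size are assumptions for illustration.
random.seed(0)
n, phi = 2000, 0.7
series = [0.0]
for _ in range(n - 1):
    series.append(phi * series[-1] + random.gauss(0.0, 1.0))

acf = sample_acf(series, 5)
pacf = sample_pacf(series, 5)
band = 1.96 / math.sqrt(n)  # approximate 95% band under the white-noise null

for lag in range(1, 6):
    print(f"lag {lag}: ACF={acf[lag]: .3f}  PACF={pacf[lag]: .3f}  band=+/-{band:.3f}")
```

For an AR(1), the PACF should stand out at lag 1 and sit near the band afterwards, while the ACF should shrink geometrically. In real work a library routine would normally replace these hand-written helpers.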

Information criteria

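AIC  = -2 ln L + 2k            (k = number of estimated parameters)
BIC  = -2 ln L + k ln T        (heavier penalty than AIC once T ≥ 8)
HQIC = -2 ln L + 2k ln ln T    (penalty between AIC and BIC for T ≥ 16)

Here L is the maximized likelihood and T is the number of observations.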

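Pick the model with the smallest IC value, comparing only models fit to the same observations. AIC selects more liberally: asymptotically it does not under-fit, but it keeps a positive probability of picking a model larger than the truth. BIC selects more parsimoniously and is consistent, meaning it finds the true model with probability approaching one when that model is in the candidate set. A common practitioner default is to report both and lean toward BIC's more parsimonious choice for forecasting, then check that choice out-of-sample.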
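
To make the penalties concrete, here is a small sketch that evaluates the three criteria for a set of candidates. The log-likelihoods, parameter counts, and sample size are hypothetical numbers chosen for illustration, and the function name information_criteria is an assumption rather than any library's API.

```python
import math


def information_criteria(loglik, k, T):
    """Return AIC, BIC and HQIC for a maximized log-likelihood, k parameters, T obs."""
    return {
        "AIC": -2.0 * loglik + 2.0 * k,
        "BIC": -2.0 * loglik + k * math.log(T),
        "HQIC": -2.0 * loglik + 2.0 * k * math.log(math.log(T)),
    }


# Hypothetical candidates: (label, maximized log-likelihood, parameter count).
T = 500
candidates = [
    ("ARMA(1,0)", -1412.0, 2),
    ("ARMA(1,1)", -1409.8, 3),
    ("ARMA(2,2)", -1408.9, 5),
]

table = {label: information_criteria(ll, k, T) for label, ll, k in candidates}
best_aic = min(table, key=lambda label: table[label]["AIC"])
best_bic = min(table, key=lambda label: table[label]["BIC"])

for label, ic in table.items():
    print(f"{label}: AIC={ic['AIC']:.2f}  BIC={ic['BIC']:.2f}  HQIC={ic['HQIC']:.2f}")
print("AIC picks", best_aic, "| BIC picks", best_bic)
```

With these made-up numbers the two criteria disagree: the small gain in likelihood from the extra MA term is enough to pay AIC's penalty of 2 per parameter but not BIC's penalty of ln 500 ≈ 6.2.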

Don't grid-search blindly

Searching over (p, q) ∈ {0,...,5}² gives 36 candidates. Selecting the minimum-AIC across them is a textbook recipe for over-fitting. Use parsimony: start with (1,0), (0,1), (1,1) and only expand if diagnostics demand. The 'best' model in-sample is rarely the best out-of-sample.

Residual diagnostics

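Ljung-Box on residuals: the null is no autocorrelation up to lag h, so a p-value above 0.05 means no evidence of leftover serial correlation. For ARMA(p, q) residuals, use h - p - q degrees of freedom.
Ljung-Box on squared residuals: a rejection signals remaining conditional heteroskedasticity (motivates GARCH).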
QQ-plot vs normal: check distributional fit; expect deviations in the tails for financial residuals.
Residual time plot: look for trends, outliers, structural breaks the model missed.
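
As a rough illustration of the first two checks, the sketch below computes the Ljung-Box statistic Q = n(n+2) Σ ρ_k² / (n-k) by hand. The inputs are simulated stand-ins for residuals, and the degrees of freedom pretend that an ARMA(1,1) was fit; both are assumptions. The comparison uses the tabulated 95% chi-square quantile for 8 degrees of freedom (15.507) rather than a computed p-value, since the standard library has no chi-square distribution.

```python
import random


def ljung_box_q(resid, h):
    """Ljung-Box Q statistic using autocorrelations at lags 1..h."""
    n = len(resid)
    mean = sum(resid) / n
    dev = [e - mean for e in resid]
    c0 = sum(d * d for d in dev)
    total = 0.0
    for k in range(1, h + 1):
        rho_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        total += rho_k ** 2 / (n - k)
    return n * (n + 2) * total


# Stand-in "residuals": white noise, and an AR(1) built from the same shocks.
random.seed(1)
white = [random.gauss(0.0, 1.0) for _ in range(1000)]
autocorrelated = [white[0]]
for shock in white[1:]:
    autocorrelated.append(0.5 * autocorrelated[-1] + shock)

h, p, q = 10, 1, 1            # hypothetical ARMA(1,1) fit
CHI2_95_DF8 = 15.507          # 95% quantile of chi-square with h - p - q = 8 df

q_white = ljung_box_q(white, h)
q_ar = ljung_box_q(autocorrelated, h)
q_white_sq = ljung_box_q([e * e for e in white], h)  # squared-residual version

for name, stat in [("white", q_white), ("AR(1)", q_ar), ("white squared", q_white_sq)]:
    print(f"{name}: Q={stat:.2f}  reject at 5%: {stat > CHI2_95_DF8}")
```

The same function applied to squared residuals gives the heteroskedasticity check. In practice a statistics library would supply the p-value directly.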

Cross-validation for time series

Standard k-fold CV mixes future and past data and is invalid for time series. Use rolling-origin (walk-forward) cross-validation: fit on [1, t], predict t+1, expand window, repeat. Compute forecast errors out-of-sample and aggregate. This is the closest analogue of holding out a future test set.
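
The walk-forward loop can be sketched in a few lines. This is a minimal version under stated assumptions: the fit and predict callables, the least-squares AR(1) forecaster, the mean forecaster, and the simulated series are all illustrative stand-ins, not a prescribed implementation.

```python
import math
import random


def rolling_origin_errors(series, fit, predict, min_train):
    """Expanding-window one-step-ahead forecast errors.

    At each origin t the model sees only series[:t], then forecasts series[t].
    """
    errors = []
    for t in range(min_train, len(series)):
        history = series[:t]
        model = fit(history)
        errors.append(series[t] - predict(model, history))
    return errors


def fit_mean(history):
    return sum(history) / len(history)


def predict_mean(model, history):
    return model


def fit_ar1(history):
    """Least-squares AR(1) with intercept: x_t = c + phi * x_{t-1} + e_t."""
    x, y = history[:-1], history[1:]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    phi = sxy / sxx
    return my - phi * mx, phi


def predict_ar1(model, history):
    c, phi = model
    return c + phi * history[-1]


def rmse(errors):
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Simulated AR(1) data (coefficient 0.8 is an assumption for illustration).
random.seed(2)
data = [0.0]
for _ in range(399):
    data.append(0.8 * data[-1] + random.gauss(0.0, 1.0))

err_mean = rolling_origin_errors(data, fit_mean, predict_mean, min_train=100)
err_ar1 = rolling_origin_errors(data, fit_ar1, predict_ar1, min_train=100)
print(f"mean forecaster RMSE: {rmse(err_mean):.3f}")
print(f"AR(1) forecaster RMSE: {rmse(err_ar1):.3f}")
```

Refitting at every origin is the expensive but honest choice; a fixed-length rolling window is a common variant when old data are thought to be less relevant.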

Structural breaks

Chow test (known break date) and supremum-Wald / sup-LM (unknown break date) detect parameter changes. Financial series are riddled with breaks: regime shifts in monetary policy, currency-board adoptions, post-2008 zero-rate regime. Modelling these explicitly is critical; ignoring them yields parameter estimates that are weighted averages over incompatible regimes.
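
For the known-break-date case, the Chow statistic compares the pooled residual sum of squares with the sum from separate fits on each side of the break: F = [(RSS_pooled - RSS_1 - RSS_2) / k] / [(RSS_1 + RSS_2) / (n - 2k)]. The sketch below applies it to an AR(1) regression with intercept (k = 2) on simulated data with an intercept shift at a known date. The data-generating values are assumptions, and with a lagged dependent variable the F distribution is only an asymptotic approximation.

```python
import random


def ols_rss(x, y):
    """Residual sum of squares from regressing y on an intercept and x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))


def chow_f(x, y, break_index, k=2):
    """Chow F statistic for a break at a known index; k = parameters per regime."""
    rss_pooled = ols_rss(x, y)
    rss_split = ols_rss(x[:break_index], y[:break_index]) + ols_rss(
        x[break_index:], y[break_index:]
    )
    n = len(x)
    return ((rss_pooled - rss_split) / k) / (rss_split / (n - 2 * k))


def simulate(intercepts, phi=0.3):
    """AR(1) path whose intercept follows the given sequence (illustrative)."""
    path = [0.0]
    for c in intercepts:
        path.append(c + phi * path[-1] + random.gauss(0.0, 1.0))
    return path


random.seed(3)
broken = simulate([0.0] * 200 + [2.0] * 200)   # intercept shifts at observation 200
stable = simulate([0.0] * 400)                 # no break

f_break = chow_f(broken[:-1], broken[1:], 200)
f_stable = chow_f(stable[:-1], stable[1:], 200)
print(f"F with break:    {f_break:.2f}")
print(f"F without break: {f_stable:.2f}")
```

Each statistic would be compared with an F(k, n - 2k) critical value. When the break date is unknown, scanning this statistic over candidate dates and taking the maximum leads to the sup-type tests mentioned above, which need their own critical values.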

Exercise

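You fit ARMA(1,1), ARMA(2,1), and ARMA(1,2) to log returns of a 5-year daily series (1260 obs). The maximized log-likelihoods are -2150, -2148.5, and -2149, respectively. (1) Compute AIC and BIC for each, stating how you count k (for example k = p + q, or k = p + q + 2 if the constant and innovation variance are included). (2) Which model would AIC pick? Which would BIC pick? (3) Discuss the discrepancy.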