Forecasting is the deliverable of most time-series modelling. The forecast itself, the uncertainty around it, and the diagnostics that justify both are the three things every forecasting framework must produce. Two facts dominate: forecast error variance grows with the horizon, and for stationary series long-horizon point forecasts converge to the unconditional mean.
Optimal point forecast
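Under squared-error loss, the optimal h-step-ahead forecast is the conditional expectation: X̂_{t+h|t} = E[X_{t+h} | F_t], where F_t is the information available at time t. Under absolute-error loss, it is the conditional median. Under asymmetric losses the optimum is some other functional of the conditional distribution; under quantile (pinball) loss, for example, it is the corresponding conditional quantile.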
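The link between loss function and optimal forecast can be checked numerically. The sketch below is an illustration only: the sample values, the grid, and the helper names (argmin_on_grid, pinball_loss) are assumptions made for this example, and an empirical sample stands in for the conditional distribution.

```python
def argmin_on_grid(loss, sample, grid):
    """Return the grid point that minimises the total loss over the sample."""
    return min(grid, key=lambda g: sum(loss(x, g) for x in sample))


def squared_loss(x, g):
    return (x - g) ** 2


def absolute_loss(x, g):
    return abs(x - g)


def pinball_loss(tau):
    """Quantile (pinball) loss at level tau, as a function of (outcome, forecast)."""
    def loss(x, g):
        return tau * (x - g) if x >= g else (1 - tau) * (g - x)
    return loss


# Hypothetical, right-skewed sample: mean 4, median 3.
sample = [1.0, 2.0, 3.0, 4.0, 10.0]
grid = [i / 100 for i in range(0, 1001)]  # candidate forecasts 0.00 .. 10.00

best_squared = argmin_on_grid(squared_loss, sample, grid)       # sample mean
best_absolute = argmin_on_grid(absolute_loss, sample, grid)     # sample median
best_q30 = argmin_on_grid(pinball_loss(0.3), sample, grid)      # 0.3 quantile

print(best_squared, best_absolute, best_q30)
```

On a skewed sample the three optimal forecasts differ, which is why the loss function has to be chosen before the forecast is judged.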
AR(1) forecasts
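For a stationary AR(1), X_t = c + φ X_{t-1} + ε_t with |φ| < 1: X̂_{t+h|t} = c(1 + φ + ... + φ^(h-1)) + φ^h X_t. Equivalently, writing μ = c / (1 - φ) for the unconditional mean, X̂_{t+h|t} = μ + φ^h (X_t - μ). As h → ∞, X̂_{t+h|t} → μ. Forecasts revert to the mean geometrically, with the gap to μ shrinking by a factor φ at each step.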
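A minimal sketch of the AR(1) forecast, computed both by iterating the recursion and by the equivalent form c / (1 - φ) + φ^h (X_t - c / (1 - φ)). The function names and the parameter values in the usage lines are illustrative assumptions, not estimates from any data set.

```python
def ar1_forecast(c, phi, x_t, h):
    """h-step-ahead AR(1) point forecast, obtained by iterating X = c + phi * X."""
    if abs(phi) >= 1:
        raise ValueError("a stationary AR(1) requires |phi| < 1")
    forecast = x_t
    for _ in range(h):
        forecast = c + phi * forecast
    return forecast


def ar1_forecast_closed_form(c, phi, x_t, h):
    """Same forecast written as mean reversion towards mu = c / (1 - phi)."""
    if abs(phi) >= 1:
        raise ValueError("a stationary AR(1) requires |phi| < 1")
    mu = c / (1 - phi)
    return mu + phi ** h * (x_t - mu)


# Illustrative values (assumed): unconditional mean is 1.0 / (1 - 0.5) = 2.0.
for h in (1, 2, 5, 20):
    print(h, ar1_forecast(1.0, 0.5, 4.0, h), ar1_forecast_closed_form(1.0, 0.5, 4.0, h))
```

Both functions agree at every horizon, and the printed forecasts approach 2.0 as h grows.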
Forecast errors
The h-step forecast error is e_{t+h|t} = X_{t+h} - X̂_{t+h|t}. For ARMA(p, q) with Wold MA(∞) coefficients ψ_j (ψ_0 = 1) and innovation variance σ², the h-step forecast error variance is:
Var(e_{t+h|t}) = σ²(ψ_0² + ψ_1² + ... + ψ_{h-1}²)
The variance is non-decreasing in h. For stationary processes it plateaus at the unconditional variance, σ²(ψ_0² + ψ_1² + ...), as h → ∞. For a random walk ψ_j = 1 for every j, so the variance is hσ²: it grows linearly and never converges.
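The variance formula can be evaluated directly once the ψ_j are known. The sketch below uses the standard ARMA recursion ψ_0 = 1, ψ_j = θ_j + φ_1 ψ_{j-1} + ... + φ_p ψ_{j-p} (with θ_j = 0 for j > q). The function names and the sign convention (X_t = Σ φ_i X_{t-i} + ε_t + Σ θ_j ε_{t-j}) are assumptions of this example.

```python
def psi_weights(ar, ma, n):
    """First n Wold coefficients psi_0 .. psi_{n-1} of an ARMA(p, q).

    ar = [phi_1, ..., phi_p], ma = [theta_1, ..., theta_q].
    """
    psi = []
    for j in range(n):
        if j == 0:
            value = 1.0
        else:
            value = ma[j - 1] if j <= len(ma) else 0.0
            for i, phi_i in enumerate(ar, start=1):
                if i <= j:
                    value += phi_i * psi[j - i]
        psi.append(value)
    return psi


def forecast_error_variance(ar, ma, sigma2, h):
    """Var(e_{t+h|t}) = sigma2 * (psi_0^2 + ... + psi_{h-1}^2)."""
    return sigma2 * sum(p * p for p in psi_weights(ar, ma, h))


# Stationary AR(1) with phi = 0.5: variance plateaus near 1 / (1 - 0.25).
print([round(forecast_error_variance([0.5], [], 1.0, h), 4) for h in (1, 2, 3, 50)])

# Random walk (phi = 1): variance equals h * sigma2 and keeps growing.
print([forecast_error_variance([1.0], [], 1.0, h) for h in (1, 2, 3, 50)])
```

For a pure AR(1) the recursion reduces to ψ_j = φ^j, so the plateau is σ² / (1 - φ²).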
Prediction intervals
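Assuming Gaussian errors, the (1 - α) prediction interval is PI_α = X̂_{t+h|t} ± z_{α/2} · √Var(e_{t+h|t}), where z_{α/2} is the upper α/2 quantile of the standard normal (about 1.96 for a 95% interval). For non-Gaussian errors, bootstrap or simulation-based intervals are more honest. Empirical financial data are heavy-tailed, so Gaussian intervals are often too narrow.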
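A minimal sketch of the Gaussian interval, using the two-sided normal quantile for the requested coverage (about 1.96 for 95%). The function name and the inputs in the usage line are assumptions for illustration; the interval also ignores parameter-estimation uncertainty.

```python
import math
from statistics import NormalDist


def gaussian_prediction_interval(point_forecast, error_variance, coverage=0.95):
    """Symmetric Gaussian prediction interval around a point forecast."""
    if not 0 < coverage < 1:
        raise ValueError("coverage must lie strictly between 0 and 1")
    alpha = 1 - coverage
    z = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided standard normal quantile
    half_width = z * math.sqrt(error_variance)
    return point_forecast - half_width, point_forecast + half_width


# Assumed inputs: point forecast 2.0, forecast error variance 1.0.
lower, upper = gaussian_prediction_interval(2.0, 1.0, coverage=0.95)
print(lower, upper)
```

The error variance passed in would typically come from the ψ-weight formula in the previous section, evaluated at the horizon of interest.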
Forecast evaluation
- MAE — mean absolute error.
- RMSE — root mean squared error.
- MAPE — mean absolute percentage error; problematic when actual is near zero.
- Directional accuracy — fraction of correct sign predictions; relevant for trading.
- Diebold-Mariano test — formal comparison of two forecasts' loss differentials.
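The measures in the list above can be sketched in a few lines. This is an illustration under stated assumptions: the function names are made up for this example, directional accuracy treats zero as non-positive, and the Diebold-Mariano statistic shown is the simplest one-step-ahead version under squared-error loss, with no autocorrelation correction and a normal approximation for the p-value.

```python
import math
from statistics import NormalDist


def mae(actual, forecast):
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def rmse(actual, forecast):
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def mape(actual, forecast):
    """Mean absolute percentage error, in percent; undefined if any actual is zero."""
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return 100 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


def directional_accuracy(actual, forecast):
    """Fraction of periods where forecast and actual have the same sign."""
    hits = sum((a > 0) == (f > 0) for a, f in zip(actual, forecast))
    return hits / len(actual)


def diebold_mariano(errors_1, errors_2):
    """DM statistic and two-sided p-value for equal squared-error loss (h = 1)."""
    d = [e1 ** 2 - e2 ** 2 for e1, e2 in zip(errors_1, errors_2)]
    n = len(d)
    d_bar = sum(d) / n
    var_d = sum((x - d_bar) ** 2 for x in d) / n
    if var_d == 0:
        raise ValueError("loss differential has zero variance")
    stat = d_bar / math.sqrt(var_d / n)
    p_value = 2 * (1 - NormalDist().cdf(abs(stat)))
    return stat, p_value


# Hypothetical data, for illustration only.
actual = [1.0, -2.0, 3.0, -4.0]
forecast = [0.5, -1.0, -1.0, -5.0]
print(mae(actual, forecast), rmse(actual, forecast))
print(mape(actual, forecast), directional_accuracy(actual, forecast))
```

A positive DM statistic means the first forecast has the larger average squared error. For multi-step forecasts the loss differential is serially correlated, so the variance would need a long-run (HAC) estimate rather than the plain sample variance used here.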
Wold decomposition revisited
Every covariance-stationary, purely nondeterministic process can be written as an MA(∞) in its own innovations, which are white noise. The Wold theorem guarantees that this representation exists; it does not guarantee that the MA coefficients can be identified from finite data. The practical content: ARMA models are flexible enough to approximate any such process, but with finite samples only the first few ψ_j can be estimated precisely.
Combining forecasts
Equal-weighted combinations of multiple forecasts often beat individual forecasts, even sophisticated ones. Bates and Granger (1969) showed that combining forecasts can lower mean squared error, and the finding has held up across decades and domains. Bayesian model averaging is the principled generalisation; simple averaging is the practitioner's default.
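A minimal sketch of simple averaging, on made-up numbers chosen so that the two forecasts' errors partly offset each other. The function names and data are assumptions for illustration; real forecasts will not always combine this favourably.

```python
def equal_weight_combination(forecasts):
    """Average several forecast series period by period.

    forecasts is a list of equally long lists, one per model.
    """
    n_models = len(forecasts)
    return [sum(period) / n_models for period in zip(*forecasts)]


def mse(actual, forecast):
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)


# Hypothetical outcomes and two hypothetical model forecasts.
actual = [1.0, 2.0, 3.0]
model_a = [1.5, 1.5, 3.5]   # errors -0.5, +0.5, -0.5
model_b = [0.7, 2.7, 2.9]   # errors +0.3, -0.7, +0.1

combined = equal_weight_combination([model_a, model_b])
print(mse(actual, model_a), mse(actual, model_b), mse(actual, combined))
```

The gain comes from imperfectly correlated errors: averaging cancels part of each model's idiosyncratic error, which is why the combination can beat both inputs.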
Forecasting returns is hard for fundamental reasons
If you could forecast next month's stock return with an RMSE of, say, 1.5% (against an unconditional monthly volatility of about 5%), you'd be running the world's most profitable hedge fund. The fact that the best forecast models barely beat 'the unconditional mean is zero' on most return series is not a failure of the methodology. It reflects the absence of forecastable signal in efficient prices.
Exercise
Fitting an AR(1) to monthly Kenya inflation gives c = 0.5%, φ = 0.7, σ_ε = 0.4%. Current inflation X_t = 6.0%. (1) Forecast inflation 1, 3, 6, 12 months ahead. (2) Compute the 95% prediction interval at h = 12. (3) Compare to the unconditional mean.