Module 10 of 12 · 65 min read · Intermediate

Time series — stationarity and AR/MA

Why you can't just run OLS on monthly data, the unit-root problem, and a primer on ARIMA, cointegration, and what 'long-run' means.

Time series data — observations on the same variable across time — looks superficially similar to a panel with one unit, but the statistical machinery is fundamentally different. The reason is dependence: today's value almost always depends on yesterday's, in ways that break the OLS independence assumptions.

Stationarity

A series is weakly stationary if its mean and variance are constant over time and the autocovariance between yₜ and yₜ₋ₖ depends only on the lag k, not on the time index t. Most economic series in levels (GDP, prices, exchange rates, money supply) are non-stationary: they grow or drift over time.
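As a quick illustration of the definition, the sketch below simulates many paths of white noise and of a random walk and compares the variance across paths at two dates. The number of paths, the dates and the seed are arbitrary choices made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)           # fixed seed, arbitrary
n_paths, T = 2000, 200                   # illustrative sizes
shocks = rng.standard_normal((n_paths, T))

white_noise = shocks                     # y_t = eps_t (stationary)
random_walk = np.cumsum(shocks, axis=1)  # y_t = y_{t-1} + eps_t (non-stationary)

def variance_at(panel, t):
    """Variance across simulated paths at date t (1-indexed)."""
    return panel[:, t - 1].var()

for name, panel in [("white noise", white_noise), ("random walk", random_walk)]:
    v50, v200 = variance_at(panel, 50), variance_at(panel, 200)
    print(f"{name}: variance at t=50 is {v50:.1f}, at t=200 is {v200:.1f}")
```

For white noise the two variances should both be close to 1. For the random walk the variance grows roughly in proportion to t, which violates the constant-variance requirement.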

Why stationarity matters

Standard OLS inference (consistency, asymptotically normal estimates, valid t-tests) relies on stationary regressors and errors. With non-stationary series, t-statistics can be very large for relationships that are pure coincidence (spurious regression), and they tend to grow with the sample size rather than settle down. Always test for stationarity before regressing time series.

Unit roots

An AR(1) model: yₜ = ρyₜ₋₁ + εₜ

  • |ρ| < 1: stationary, shocks die out exponentially
  • ρ = 1: unit root (random walk), shocks have permanent effects
  • |ρ| > 1: explosive; rarely a sensible description of economic data outside episodes such as bubbles or hyperinflation
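The three cases can be made concrete by tracing a single one-unit shock through the AR(1) recursion: after h periods its effect is ρ^h. The function name and the example values of ρ below are illustrative assumptions.

```python
import numpy as np

def impulse_response(rho, horizon):
    """Effect on y at t+h (h = 0..horizon) of a one-unit shock at t
    in the AR(1) model y_t = rho * y_{t-1} + eps_t."""
    return float(rho) ** np.arange(horizon + 1)

for rho in (0.5, 1.0, 1.05):             # illustrative values
    irf = impulse_response(rho, 20)
    print(f"rho = {rho}: effect after 20 periods = {irf[-1]:.4f}")
```

With ρ = 0.5 the shock has all but vanished after 20 periods, with ρ = 1 it is fully intact, and with ρ = 1.05 it has grown.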

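The augmented Dickey-Fuller (ADF) test takes the unit root (ρ = 1) as the null hypothesis and rejects it when the data give strong enough evidence of stationarity (ρ < 1); lagged differences are added to the test regression to absorb autocorrelation in the errors. Phillips-Perron is an alternative that corrects the test statistic nonparametrically for autocorrelation and heteroskedasticity instead of adding lags. KPSS reverses the null: it tests stationarity and rejects when the evidence points to a unit root. Use them together, since failing to reject a unit root is not proof that one exists.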
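To show the mechanics, here is a hand-rolled sketch of the plain Dickey-Fuller regression Δyₜ = a + δ·yₜ₋₁ + eₜ with no lag augmentation, where δ = ρ − 1. The helper names, sample sizes and seed are assumptions for this example, and −2.86 is the commonly tabulated asymptotic 5% critical value for the constant-only case. For real work, a tested library implementation with lag selection is the safer choice.

```python
import numpy as np

DF_CRIT_5PCT = -2.86   # asymptotic 5% critical value, constant-only case

def df_tstat(y):
    """t-statistic on delta in dy_t = a + delta * y_{t-1} + e_t."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)
    X = np.column_stack([np.ones(len(dy)), y[:-1]])
    beta = np.linalg.lstsq(X, dy, rcond=None)[0]
    resid = dy - X @ beta
    sigma2 = resid @ resid / (len(dy) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta[1] / np.sqrt(cov[1, 1])

def simulate_ar1(rho, T, rng):
    """Simulate y_t = rho * y_{t-1} + eps_t starting from zero."""
    y = np.zeros(T)
    eps = rng.standard_normal(T)
    for t in range(1, T):
        y[t] = rho * y[t - 1] + eps[t]
    return y

rng = np.random.default_rng(1)
reject = {}
for rho in (1.0, 0.5):
    stats = np.array([df_tstat(simulate_ar1(rho, 200, rng)) for _ in range(300)])
    # Reject the unit-root null when the statistic is below the critical value
    reject[rho] = np.mean(stats < DF_CRIT_5PCT)
    print(f"rho = {rho}: unit root rejected in {reject[rho]:.0%} of samples")
```

The rejection rate for the random walk should sit near the nominal 5%, while the stationary series should be rejected almost every time. Note that the statistic is compared with Dickey-Fuller critical values, not the usual t-table.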

Differencing

If yₜ is non-stationary, take first differences: Δyₜ = yₜ − yₜ₋₁. Often Δyₜ is stationary (we say yₜ is integrated of order 1, or I(1)). For most macro series, first differences are stationary; second differences are rarely needed.
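A minimal sketch of differencing, assuming a simulated random walk with drift (the drift value, sample size and seed are made up for the example):

```python
import numpy as np

rng = np.random.default_rng(2)
T, drift = 400, 0.3                      # illustrative values
eps = rng.standard_normal(T)
y = np.cumsum(drift + eps)               # random walk with drift, an I(1) series

dy = np.diff(y)                          # first difference: drift + eps_t
d2y = np.diff(y, n=2)                    # second difference: eps_t - eps_{t-1}

print(f"mean of first difference: {dy.mean():.2f} (drift used: {drift})")
print(f"variance of first difference:  {dy.var():.2f}")
print(f"variance of second difference: {d2y.var():.2f}")
```

The first difference recovers the drift plus the shock. Differencing a second time is unnecessary here and only adds noise, which shows up as a larger variance.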

I(1) regressions

Regressing one I(1) variable on another in levels is dangerous — risk of spurious regression. Either difference both sides (and lose long-run information), or test for cointegration.

Spurious regression

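Granger and Newbold (1974): generate two independent random walks and regress one on the other. The R² is often high and the t-statistic is usually far beyond conventional critical values, so the usual 5% test rejects 'no relationship' most of the time. Yet there is no relationship: each series is an accumulation of independent shocks, and the two happen to wander in similar or opposite directions over the sample. The regression is picking up shared drift, not causation.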
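The experiment is easy to replicate in spirit. The sketch below is not the original study's design; the sample length, number of replications and seed are assumptions. It regresses one independent random walk on another, first in levels and then in first differences, and counts how often a nominal 5% t-test rejects a zero slope.

```python
import numpy as np

def slope_tstat(y, x):
    """Conventional OLS t-statistic on the slope in y = a + b * x + e."""
    X = np.column_stack([np.ones(len(x)), x])
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - 2)
    se = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])
    return beta[1] / se

rng = np.random.default_rng(3)
n_sims, T = 500, 100                     # illustrative sizes
levels, diffs = [], []
for _ in range(n_sims):
    x = np.cumsum(rng.standard_normal(T))   # two independent random walks
    y = np.cumsum(rng.standard_normal(T))
    levels.append(abs(slope_tstat(y, x)) > 1.96)
    diffs.append(abs(slope_tstat(np.diff(y), np.diff(x))) > 1.96)

share_levels, share_diffs = np.mean(levels), np.mean(diffs)
print(f"levels:      'significant' slope in {share_levels:.0%} of samples")
print(f"differences: 'significant' slope in {share_diffs:.0%} of samples")
```

In levels the false-positive rate is far above 5%; after differencing it falls back to roughly the nominal level.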

The 'Spurious Correlations' viral website (margarine consumption vs divorce rate, etc.) is mostly this phenomenon — independent series with strong trends.

Cointegration

If two non-stationary series move together such that some linear combination of them IS stationary, they are cointegrated. The interpretation: they share a common stochastic trend, and any divergence is temporary.

yₜ = α + β · xₜ + uₜ, where uₜ is stationary even though yₜ and xₜ are not

Engle-Granger test: estimate the cointegrating regression, save residuals, test residuals for stationarity (ADF on residuals with adjusted critical values). Johansen's test handles multivariate cases.
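The two Engle-Granger steps can be sketched by hand as follows. The simulated data, the function name and the seed are assumptions for illustration. The value −3.34 is the approximate asymptotic 5% critical value usually tabulated for two variables with a constant, and should be checked against proper tables (or a library routine) before use on real data.

```python
import numpy as np

EG_CRIT_5PCT = -3.34   # approximate asymptotic 5% value, two variables, constant

def engle_granger_tstat(y, x):
    """Return (cointegrating coefficients, DF t-statistic on the residuals)."""
    # Step 1: cointegrating regression y_t = alpha + beta * x_t + u_t
    X = np.column_stack([np.ones(len(x)), x])
    coef = np.linalg.lstsq(X, y, rcond=None)[0]
    u = y - X @ coef
    # Step 2: Dickey-Fuller regression on residuals, du_t = delta * u_{t-1} + e_t
    du, u_lag = np.diff(u), u[:-1]
    delta = (u_lag @ du) / (u_lag @ u_lag)
    e = du - delta * u_lag
    se = np.sqrt((e @ e) / (len(du) - 1) / (u_lag @ u_lag))
    return coef, delta / se

rng = np.random.default_rng(4)
T = 300
x = np.cumsum(rng.standard_normal(T))    # I(1) regressor
u = np.zeros(T)
shock = rng.standard_normal(T)
for t in range(1, T):
    u[t] = 0.5 * u[t - 1] + shock[t]     # stationary deviation from equilibrium
y = 1.0 + 2.0 * x + u                    # cointegrated with x by construction

coef, stat = engle_granger_tstat(y, x)
print(f"estimated beta = {coef[1]:.3f}, residual DF statistic = {stat:.2f}")
print("reject 'no cointegration' at 5%:", stat < EG_CRIT_5PCT)
```

The residual test uses more negative critical values than the ordinary ADF test because the residuals come from an estimated regression, which makes them look more stationary than the true errors.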

Examples of cointegration

  • Real interest rate parity: domestic real rate and foreign real rate cointegrated
  • PPP: nominal exchange rate and price ratio cointegrated (long-run, slowly)
  • Term structure: long-rate and short-rate cointegrated within currency
  • Income and consumption: cointegrated under permanent-income hypothesis

Error-correction models

When yₜ and xₜ are cointegrated, the error-correction model captures both the long-run equilibrium and short-run dynamics:

Δyₜ = γ · (yₜ₋₁ − α − β · xₜ₋₁) + θ · Δxₜ + εₜ

γ < 0 measures speed of return to equilibrium. The bracket is the disequilibrium term — how far off the long-run relationship the previous period was. Δxₜ captures contemporaneous short-run effects. ECM is the workhorse for any model where there's a clear long-run anchor with short-run noise.
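One common way to estimate this equation is in two steps: fit the long-run relationship in levels, then regress Δyₜ on the lagged residual and Δxₜ. The sketch below does this on simulated data; the 'true' parameter values, sample size and seed are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(5)
T = 500
alpha, beta, gamma, theta = 1.0, 2.0, -0.3, 0.5   # assumed true values

# Simulate x as a random walk and y from the error-correction equation
x = np.cumsum(rng.standard_normal(T))
eps = 0.5 * rng.standard_normal(T)
y = np.empty(T)
y[0] = alpha + beta * x[0]
for t in range(1, T):
    gap = y[t - 1] - alpha - beta * x[t - 1]      # last period's disequilibrium
    y[t] = y[t - 1] + gamma * gap + theta * (x[t] - x[t - 1]) + eps[t]

# Step 1: long-run relationship in levels
X1 = np.column_stack([np.ones(T), x])
alpha_hat, beta_hat = np.linalg.lstsq(X1, y, rcond=None)[0]
gap_hat = y - alpha_hat - beta_hat * x

# Step 2: short-run dynamics, dy_t on lagged gap and dx_t
X2 = np.column_stack([gap_hat[:-1], np.diff(x)])
gamma_hat, theta_hat = np.linalg.lstsq(X2, np.diff(y), rcond=None)[0]

half_life = np.log(0.5) / np.log(1.0 + gamma_hat)
print(f"gamma_hat = {gamma_hat:.3f}, theta_hat = {theta_hat:.3f}")
print(f"half-life of a deviation: {half_life:.1f} periods")
```

The half-life line translates γ into something interpretable: with γ near −0.3, about half of any gap from the long-run relationship is closed within two periods.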

ARMA and ARIMA

AR(p): yₜ = c + φ₁yₜ₋₁ + ... + φₚyₜ₋ₚ + εₜ (autoregressive, p lags).

MA(q): yₜ = μ + εₜ + θ₁εₜ₋₁ + ... + θ_q·εₜ₋q (moving average, q lagged shocks).

ARMA(p,q) combines both. ARIMA(p,d,q) takes d-th differences first to handle non-stationarity. Box-Jenkins methodology: identify p,d,q from autocorrelation and partial-autocorrelation plots, estimate, validate via residual diagnostics.
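The identification step relies on the different autocorrelation signatures of AR and MA processes. As a sketch (coefficients, sample size and seed are illustrative assumptions), the code below compares the sample autocorrelation function of a simulated AR(1), whose autocorrelations decay geometrically as φ^k, with that of an MA(1), whose autocorrelations are θ/(1 + θ²) at lag 1 and zero afterwards.

```python
import numpy as np

def sample_acf(y, max_lag):
    """Sample autocorrelations at lags 0..max_lag."""
    y = np.asarray(y, dtype=float) - np.mean(y)
    denom = y @ y
    return np.array([1.0] + [(y[k:] @ y[:-k]) / denom for k in range(1, max_lag + 1)])

rng = np.random.default_rng(6)
T, phi, theta = 5000, 0.7, 0.6           # illustrative values
eps = rng.standard_normal(T + 1)

ar = np.zeros(T)                         # AR(1): y_t = phi * y_{t-1} + eps_t
for t in range(1, T):
    ar[t] = phi * ar[t - 1] + eps[t]
ma = eps[1:] + theta * eps[:-1]          # MA(1): y_t = eps_t + theta * eps_{t-1}

acf_ar, acf_ma = sample_acf(ar, 4), sample_acf(ma, 4)
for k in range(1, 5):
    theory_ar = phi ** k
    theory_ma = theta / (1 + theta ** 2) if k == 1 else 0.0
    print(f"lag {k}: AR sample {acf_ar[k]:.2f} (theory {theory_ar:.2f}), "
          f"MA sample {acf_ma[k]:.2f} (theory {theory_ma:.2f})")
```

A slowly decaying autocorrelation function points towards AR terms, while a sharp cut-off after lag q points towards MA(q). The partial autocorrelation function shows the mirror-image pattern.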

Newey-West and HAC standard errors

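Time series errors are often autocorrelated even when the regression itself is correctly specified, which makes the usual OLS standard errors unreliable (typically too small when the autocorrelation is positive). Newey-West (1987) provides heteroskedasticity-and-autocorrelation-consistent (HAC) standard errors. A common rule of thumb sets the lag truncation (bandwidth) at roughly T^(1/3). Use HAC SEs by default when estimating regressions on time-series data.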
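For intuition about what the correction does, here is a minimal hand-written version of the Newey-West estimator with Bartlett weights and no small-sample adjustment. The function names, the simulated data and the seed are assumptions for the example; library implementations may differ in details such as degrees-of-freedom corrections.

```python
import numpy as np

def ols_with_hac(y, X, lags):
    """OLS coefficients with conventional and Newey-West (Bartlett) standard errors."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    u = y - X @ beta
    Xu = X * u[:, None]                  # row t holds x_t * u_t
    S = Xu.T @ Xu                        # lag-0 (heteroskedasticity) term
    for l in range(1, lags + 1):
        w = 1.0 - l / (lags + 1.0)       # Bartlett weight, declines with the lag
        G = Xu[l:].T @ Xu[:-l]           # lag-l cross products
        S += w * (G + G.T)
    cov_hac = XtX_inv @ S @ XtX_inv
    cov_ols = (u @ u / (n - k)) * XtX_inv
    return beta, np.sqrt(np.diag(cov_ols)), np.sqrt(np.diag(cov_hac))

def ar1(rho, T, rng):
    """Simulate a zero-mean AR(1) series."""
    out, e = np.zeros(T), rng.standard_normal(T)
    for t in range(1, T):
        out[t] = rho * out[t - 1] + e[t]
    return out

rng = np.random.default_rng(8)
T = 400
x = ar1(0.8, T, rng)                     # persistent regressor
y = 1.0 + 0.5 * x + ar1(0.8, T, rng)     # persistent, independent errors

lags = int(T ** (1 / 3))                 # rule-of-thumb truncation
X = np.column_stack([np.ones(T), x])
beta, se_ols, se_hac = ols_with_hac(y, X, lags)
print(f"lags = {lags}, slope = {beta[1]:.3f}")
print(f"conventional SE = {se_ols[1]:.3f}, HAC SE = {se_hac[1]:.3f}")
```

With positively autocorrelated regressor and errors, the HAC standard error on the slope comes out noticeably larger than the conventional one.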

Granger causality

Test whether past values of x help predict y beyond what y's own lags explain. Predictive notion, not causal — 'Granger-causes' is a regrettable misnomer. Useful for checking lead-lag relationships and as a falsification test (does the wrong direction of 'causation' show up?). Always run on stationary series.
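The test amounts to an F-test comparing a regression of y on its own lags with one that also includes lags of x. A minimal sketch on simulated stationary data follows; the data-generating process, function names, lag length and seed are assumptions for the example, and by construction x helps predict y but not the reverse.

```python
import numpy as np

def rss(y, X):
    """Residual sum of squares from an OLS fit of y on X."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    r = y - X @ beta
    return r @ r

def granger_f(y, x, p):
    """F-statistic for H0: lags 1..p of x add nothing beyond y's own lags."""
    T = len(y)
    target = y[p:]
    own = np.column_stack([y[p - k:T - k] for k in range(1, p + 1)])
    other = np.column_stack([x[p - k:T - k] for k in range(1, p + 1)])
    const = np.ones((T - p, 1))
    X_r = np.hstack([const, own])            # restricted: own lags only
    X_u = np.hstack([const, own, other])     # unrestricted: add lags of x
    rss_r, rss_u = rss(target, X_r), rss(target, X_u)
    dof = len(target) - X_u.shape[1]
    return ((rss_r - rss_u) / p) / (rss_u / dof)

rng = np.random.default_rng(7)
T = 500
x, y = np.zeros(T), np.zeros(T)
ex, ey = rng.standard_normal(T), rng.standard_normal(T)
for t in range(1, T):
    x[t] = 0.5 * x[t - 1] + ex[t]
    y[t] = 0.3 * y[t - 1] + 0.5 * x[t - 1] + ey[t]   # x leads y

f_xy = granger_f(y, x, p=1)                  # does x help predict y?
f_yx = granger_f(x, y, p=1)                  # does y help predict x?
print(f"x -> y: F = {f_xy:.1f}")
print(f"y -> x: F = {f_yx:.1f}")
```

With one restriction and a few hundred observations, the 5% critical value of the F distribution is a little under 4, so the first statistic should be far above it. The reverse direction serves as the falsification check.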

Exercise

You have monthly data on Kenya's CPI and the KES/USD exchange rate from 2010-2025. You want to estimate the pass-through from FX depreciation to CPI. Outline the steps before running the regression.

LeadAfrik Public Economics Hub