LLN, CLT, and modes of convergence — Stats for Finance Module 5

The law of large numbers and the central limit theorem are the two licences that justify nearly every statistical procedure in finance. They explain why we can speak confidently about averages, why the normal distribution is everywhere, and why both promises become unreliable when their assumptions break.

Law of large numbers

Let X₁, X₂, ... be i.i.d. with finite mean μ. The sample mean X̄_n = (1/n) Σ Xᵢ converges to μ as n → ∞.

Weak LLN: X̄_n converges in probability to μ. P(|X̄_n - μ| > ε) → 0 for any ε > 0.
Strong LLN: X̄_n converges almost surely to μ. P(lim X̄_n = μ) = 1.
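
A minimal simulation sketch of the statement above, using only the Python standard library. The values μ = 0.5 and σ = 2.0, the normal distribution, and the helper name running_means are illustrative assumptions, not part of the module. It tracks the sample mean along one path and prints its distance from μ as n grows.

```python
import random

def running_means(draws):
    """Return the sequence of sample means X̄_1, X̄_2, ..., X̄_n."""
    means, total = [], 0.0
    for i, x in enumerate(draws, start=1):
        total += x
        means.append(total / i)
    return means

rng = random.Random(0)      # fixed seed so the run is reproducible
mu, sigma = 0.5, 2.0        # illustrative values, not taken from the text
draws = [rng.gauss(mu, sigma) for _ in range(100_000)]
means = running_means(draws)

# The error shrinks along the path, which is what the LLN promises.
for n in (10, 100, 1_000, 10_000, 100_000):
    err = abs(means[n - 1] - mu)
    print(f"n={n:>7}  sample mean={means[n - 1]:.4f}  |error|={err:.4f}")
```

A single path illustrates the strong LLN; repeating the experiment over many independent paths and counting how often the error exceeds a fixed ε would illustrate the weak LLN.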

LLN can fail spectacularly

The Cauchy distribution has no mean (the defining integral diverges), so the LLN does not apply to it. The sample average of n i.i.d. standard Cauchy variates is itself standard Cauchy for every n, so it never settles down. Equity returns are not Cauchy, but they are heavy-tailed: the LLN still holds, but the sample mean converges extremely slowly. Treating the sample mean as the true mean is a poor approximation in finance for any horizon shorter than decades.
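
The contrast can be checked numerically. The sketch below is an illustration under assumed settings (1,000 replications, sample sizes 10 and 1,000, helper names invented here). It compares the interquartile range (IQR) of sample means for standard Cauchy draws against standard normal draws. If the sample mean of Cauchy variates is again standard Cauchy, its IQR should stay near 2 however large n is, while the normal IQR should shrink like 1/√n.

```python
import math
import random
import statistics

def cauchy_draw(rng):
    """Standard Cauchy variate via the inverse CDF, tan(pi * (U - 1/2))."""
    return math.tan(math.pi * (rng.random() - 0.5))

def normal_draw(rng):
    """Standard normal variate, used as the well-behaved benchmark."""
    return rng.gauss(0.0, 1.0)

def iqr_of_sample_means(draw, n, reps, rng):
    """Interquartile range of `reps` independent sample means of size n."""
    means = [sum(draw(rng) for _ in range(n)) / n for _ in range(reps)]
    q1, _median, q3 = statistics.quantiles(means, n=4)
    return q3 - q1

rng = random.Random(1)   # fixed seed for reproducibility
reps = 1_000             # illustrative choice
iqr = {}
for n in (10, 1_000):
    iqr["cauchy", n] = iqr_of_sample_means(cauchy_draw, n, reps, rng)
    iqr["normal", n] = iqr_of_sample_means(normal_draw, n, reps, rng)
    print(f"n={n:>5}  IQR of Cauchy means={iqr['cauchy', n]:.3f}  "
          f"IQR of normal means={iqr['normal', n]:.4f}")
```

The IQR is used instead of the standard deviation because the Cauchy distribution has no finite variance, so a sample standard deviation would itself be unstable.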

Central limit theorem (classical)

Let X₁, X₂, ... be i.i.d. with mean μ and finite variance σ². Then:

√n · (X̄_n - μ) / σ  →d  N(0, 1)

Equivalently, for large n, X̄_n is approximately N(μ, σ²/n). The standard error σ/√n shrinks only like 1/√n: halving it takes four times as much data.
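
A quick numerical check of the classical CLT, as a sketch under assumed settings: Exponential(1) draws (so μ = σ = 1), n = 50, and 5,000 replications. None of these choices come from the module. If the theorem applies, the standardized means should have mean near 0, standard deviation near 1, and roughly 95% of them should fall inside ±1.96.

```python
import math
import random

def standardized_mean(n, rng):
    """sqrt(n) * (X̄_n - mu) / sigma for n Exponential(1) draws, where mu = sigma = 1."""
    xbar = sum(rng.expovariate(1.0) for _ in range(n)) / n
    return math.sqrt(n) * (xbar - 1.0) / 1.0

rng = random.Random(2)   # fixed seed for reproducibility
n, reps = 50, 5_000      # illustrative choices
z = [standardized_mean(n, rng) for _ in range(reps)]

z_mean = sum(z) / reps
z_sd = math.sqrt(sum((v - z_mean) ** 2 for v in z) / (reps - 1))
coverage = sum(abs(v) < 1.96 for v in z) / reps

print(f"mean of z      = {z_mean:.3f}  (CLT limit: 0)")
print(f"std dev of z   = {z_sd:.3f}  (CLT limit: 1)")
print(f"P(|z| < 1.96)  = {coverage:.3f}  (CLT limit: about 0.95)")
```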

How large is 'large n'?

Common rule of thumb: n ≥ 30. But the sample size needed depends entirely on the underlying distribution. For symmetric, light-tailed data, n = 20 is often enough. For heavy-tailed financial returns, n = 1000 may not be enough: the Berry-Esseen bound on the approximation error is proportional to E|X - μ|³ / (σ³ √n), and the third absolute moment is large (or infinite) for skewed or heavy-tailed data.
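
To see the rule of thumb bend, the sketch below compares how often the standardized sample mean lands in each 5% normal tail at n = 30, for a symmetric distribution (Uniform(0, 1)) and a skewed one (lognormal with log-mean 0 and log-sd 1). The distributions, the replication count, and the function name tail_frequencies are assumptions made for illustration. Under a good normal approximation, both tail frequencies should be close to 0.05.

```python
import math
import random

def tail_frequencies(draw, mu, sigma, n, reps, rng, z_crit=1.645):
    """Fraction of standardized sample means below -z_crit and above +z_crit."""
    lower = upper = 0
    for _ in range(reps):
        xbar = sum(draw(rng) for _ in range(n)) / n
        z = math.sqrt(n) * (xbar - mu) / sigma
        lower += z < -z_crit
        upper += z > z_crit
    return lower / reps, upper / reps

rng = random.Random(3)   # fixed seed for reproducibility
n, reps = 30, 5_000      # n = 30 is the rule-of-thumb sample size

# Uniform(0, 1): mean 1/2, variance 1/12.
uni = tail_frequencies(lambda r: r.random(), 0.5, math.sqrt(1.0 / 12.0), n, reps, rng)

# Lognormal(0, 1): mean exp(1/2), variance (e - 1) * e.
logn_mu = math.exp(0.5)
logn_sigma = math.sqrt((math.e - 1.0) * math.e)
logn = tail_frequencies(lambda r: r.lognormvariate(0.0, 1.0),
                        logn_mu, logn_sigma, n, reps, rng)

print(f"uniform    lower tail={uni[0]:.3f}  upper tail={uni[1]:.3f}  (nominal 0.05 each)")
print(f"lognormal  lower tail={logn[0]:.3f}  upper tail={logn[1]:.3f}  (nominal 0.05 each)")
```

For the skewed case the two tails should come out visibly unequal, which is the practical meaning of a slow Berry-Esseen rate: a one-sided test or VaR-style quantile based on the normal approximation is miscalibrated at this sample size.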

CLT for products doesn't hold

The CLT applies to sums. Multi-period gross returns are products of one-period gross returns, so their distribution is lognormal, not normal, even when one-period log returns are exactly normal, and it is dramatically non-normal when one-period returns are heavy-tailed. When invoking CLT-style approximations, sum log returns instead of multiplying gross returns.
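
The difference between the sum and the product shows up in skewness. The sketch below assumes i.i.d. normal one-period log returns with mean 1% and volatility 5% over 120 periods. These parameters and the helper name skewness are illustrative assumptions. It compares the skewness of the cumulative log return (a sum) with that of the cumulative gross return (a product).

```python
import math
import random

def skewness(values):
    """Sample skewness: third central moment over variance to the power 3/2."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

rng = random.Random(5)            # fixed seed for reproducibility
periods, reps = 120, 5_000        # illustrative choices
mu_log, sigma_log = 0.01, 0.05    # assumed one-period log-return mean and volatility

log_sums, gross_products = [], []
for _ in range(reps):
    total_log = sum(rng.gauss(mu_log, sigma_log) for _ in range(periods))
    log_sums.append(total_log)                  # sum of log returns
    gross_products.append(math.exp(total_log))  # product of gross returns

skew_logs = skewness(log_sums)
skew_gross = skewness(gross_products)
print(f"skewness of summed log returns   = {skew_logs:.3f}")
print(f"skewness of compounded gross ret = {skew_gross:.3f}")
```

The summed log returns should look symmetric, while the compounded gross return should be clearly right-skewed, as expected for a lognormal variable.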

Functional CLT / Donsker's theorem

Beyond the classical CLT: a properly rescaled random walk converges as a stochastic process to Brownian motion. This is the foundation of the entire continuous-time-finance edifice — the bridge from discrete-time models to Black-Scholes that we develop in Stochastic Calculus Module 1.
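
A rough numerical illustration of the rescaling, assuming a simple ±1 random walk (so the step variance is 1) with n = 400 steps and 2,000 simulated paths. The function names are invented for this sketch. With W_n(t) = S_⌊nt⌋ / √n, Brownian motion predicts Var W(t) = t and Cov(W(s), W(t)) = min(s, t), so the checks below look at t = 1/2 and t = 1.

```python
import math
import random

def rescaled_walk(n, rng):
    """Return (W_n(1/2), W_n(1)) where W_n(t) = S_floor(n t) / sqrt(n) for +/-1 steps."""
    s = 0
    s_half = 0
    for k in range(1, n + 1):
        s += 1 if rng.random() < 0.5 else -1
        if k == n // 2:
            s_half = s          # position of the walk at time t = 1/2
    return s_half / math.sqrt(n), s / math.sqrt(n)

def covariance(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)

rng = random.Random(6)   # fixed seed for reproducibility
n, reps = 400, 2_000     # illustrative choices
paths = [rescaled_walk(n, rng) for _ in range(reps)]
w_half = [p[0] for p in paths]
w_one = [p[1] for p in paths]

var_half = covariance(w_half, w_half)
var_one = covariance(w_one, w_one)
cov_half_one = covariance(w_half, w_one)
print(f"Var W(1/2)       = {var_half:.3f}  (Brownian motion: 0.5)")
print(f"Var W(1)         = {var_one:.3f}  (Brownian motion: 1.0)")
print(f"Cov(W(1/2),W(1)) = {cov_half_one:.3f}  (Brownian motion: 0.5)")
```

Matching two moments at two time points is only a sanity check of the covariance structure; Donsker's theorem itself is a statement about the whole path.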

Modes of convergence

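Almost sure convergence: P(lim Xₙ = X) = 1. Implies convergence in probability; this is what the strong LLN gives.
Convergence in probability: P(|Xₙ - X| > ε) → 0 for every ε > 0. Implied by both a.s. and L² convergence.
Convergence in distribution: F_n(x) → F(x) at every continuity point x of F. The weakest; what the CLT gives.
Convergence in mean square (L²): E[(Xₙ - X)²] → 0. Implies convergence in probability, but neither implies nor is implied by a.s. convergence. Used in the Itô integral construction.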
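
The modes can genuinely disagree. A standard textbook-style counterexample, sketched below with names chosen for this illustration, takes U uniform on (0, 1) and sets Xₙ = n if U ≤ 1/n and Xₙ = 0 otherwise. For any fixed U > 0 the sequence is 0 once n > 1/U, so Xₙ → 0 almost surely and in probability, with P(Xₙ ≠ 0) = 1/n. Yet E[Xₙ²] = n² · (1/n) = n, which diverges, so there is no L² convergence.

```python
import random

def x_n(n, u):
    """X_n = n if U <= 1/n, else 0 (counterexample sequence, defined for this sketch)."""
    return float(n) if u <= 1.0 / n else 0.0

rng = random.Random(4)   # fixed seed for reproducibility
us = [rng.random() for _ in range(200_000)]

p_nonzero = {}
second_moment = {}
for n in (10, 100, 1_000):
    values = [x_n(n, u) for u in us]
    p_nonzero[n] = sum(v != 0.0 for v in values) / len(values)
    second_moment[n] = sum(v * v for v in values) / len(values)
    print(f"n={n:>5}  P(X_n != 0) ~ {p_nonzero[n]:.4f} (exact {1 / n:.4f})  "
          f"E[X_n^2] ~ {second_moment[n]:.1f} (exact {n})")
```

The probability of a non-zero value shrinks while the second moment grows, because the rare non-zero values become large fast enough to dominate the expectation.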

Delta method

If √n (X̄_n - μ) →d N(0, σ²) and g is differentiable at μ with g'(μ) ≠ 0, then √n (g(X̄_n) - g(μ)) →d N(0, σ² (g'(μ))²). This is how you get standard errors for non-linear functions of estimators. For example, a Sharpe ratio's standard error follows from the multivariate version applied to the joint distribution of the mean and variance estimators.
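
A small simulation sketch of the one-dimensional case, under assumptions chosen only for illustration: exponential data with mean 2 (so σ = 2), the transform g(x) = log x, n = 200, and 3,000 replications. The delta method predicts a standard error for g(X̄_n) of |g'(μ)| σ / √n, which here equals 1/√n, and the code compares that with the simulated standard deviation of log X̄_n.

```python
import math
import random

def g(x):
    """Non-linear transform applied to the sample mean (illustrative choice)."""
    return math.log(x)

def g_prime(x):
    """Derivative of g, needed for the delta-method standard error."""
    return 1.0 / x

rng = random.Random(8)        # fixed seed for reproducibility
mean_x, sigma_x = 2.0, 2.0    # an exponential with mean 2 has standard deviation 2
n, reps = 200, 3_000          # illustrative choices

estimates = []
for _ in range(reps):
    xbar = sum(rng.expovariate(1.0 / mean_x) for _ in range(n)) / n
    estimates.append(g(xbar))

avg = sum(estimates) / reps
simulated_se = math.sqrt(sum((e - avg) ** 2 for e in estimates) / (reps - 1))
delta_se = abs(g_prime(mean_x)) * sigma_x / math.sqrt(n)

print(f"simulated SE of g(X̄_n)  = {simulated_se:.4f}")
print(f"delta-method SE          = {delta_se:.4f}")
```

In practice μ and σ are unknown, so the same formula is evaluated at their sample estimates.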

Exercise

A strategy has a true mean daily return μ = 0.05% and daily volatility σ = 1.0%. Returns are i.i.d. normal. (1) After T days of trading, what is the standard error of the realised mean return? (2) How many days T are needed so that the standard error of the realised mean is less than μ itself? (3) Interpret.