Module 10 of 12 · 55 min read · Intermediate

The multivariate normal

Mean vector, covariance matrix. Marginal and conditional distributions. Mahalanobis distance. Why so much quant assumes MVN.

The multivariate normal (MVN) is the most widely used joint distribution in quantitative finance. It is closed under marginalisation, conditioning, and linear transformation; its dependence structure is fully captured by the covariance matrix; and it has zero tail dependence: for any correlation strictly between -1 and 1, joint extremes become asymptotically independent. That last property is the catch, but for most short-horizon risk modelling MVN is still the appropriate first model.

Definition and density

math
X ~ N(μ, Σ), X ∈ Rⁿ, Σ positive definite
f(x) = (2π)^(-n/2) |Σ|^(-1/2) exp(-(1/2)(x - μ)ᵀ Σ⁻¹ (x - μ))

μ is the mean vector. Σ is the n×n covariance matrix. The exponent contains the Mahalanobis distance squared: (x - μ)ᵀΣ⁻¹(x - μ).
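
As a minimal sketch of how the density formula above might be evaluated numerically, the function below computes the log-density through a Cholesky factor instead of forming Σ⁻¹ and |Σ| directly. The name `mvn_logpdf` and the example values of `mu`, `Sigma`, and `x` are assumptions made for illustration only.

```python
import numpy as np

def mvn_logpdf(x, mu, Sigma):
    """Log-density of N(mu, Sigma) at a single point x (Sigma positive definite)."""
    n = len(mu)
    L = np.linalg.cholesky(Sigma)                 # Sigma = L @ L.T
    z = np.linalg.solve(L, x - mu)                # z = L^{-1} (x - mu)
    maha_sq = z @ z                               # (x - mu)^T Sigma^{-1} (x - mu)
    log_det = 2.0 * np.sum(np.log(np.diag(L)))    # log|Sigma|
    return -0.5 * (n * np.log(2 * np.pi) + log_det + maha_sq)

# Illustrative inputs (hypothetical values)
mu = np.array([0.0, 0.0])
Sigma = np.array([[1.0, 0.5],
                  [0.5, 2.0]])
x = np.array([0.3, -0.2])

log_f = mvn_logpdf(x, mu, Sigma)

# Cross-check against the density formula written out term by term
d = x - mu
direct = ((2 * np.pi) ** (-len(mu) / 2)
          * np.linalg.det(Sigma) ** (-0.5)
          * np.exp(-0.5 * d @ np.linalg.inv(Sigma) @ d))
print(np.exp(log_f), direct)
```

Working in logs avoids underflow when n is large or x is far from μ. The direct formula is included only as a check.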

Mahalanobis distance

The Mahalanobis distance d(x) = √((x - μ)ᵀΣ⁻¹(x - μ)) is a unit-free distance between a point and a distribution. It replaces 'how many standard deviations away' in the multivariate case: it standardises by the full covariance, not just the marginal scales.

  • For X ~ N(μ, Σ) with known μ and Σ, the squared Mahalanobis distance follows a χ² distribution with n degrees of freedom.
  • Used for multivariate outlier detection, statistical arbitrage signals, model anomaly detection.
  • Sensitive to Σ-estimation error: garbage Σ gives garbage distances.
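
The χ² property can be checked by simulation. The sketch below is one possible implementation, assuming μ and Σ are known. The helper name `mahalanobis_sq`, the seed, and the 3×3 covariance are hypothetical choices. A χ² variable with n degrees of freedom has mean n and variance 2n, so with n = 3 the sample mean and variance of the squared distances should land near 3 and 6.

```python
import numpy as np

def mahalanobis_sq(X, mu, Sigma):
    """Squared Mahalanobis distance of each row of X from N(mu, Sigma)."""
    diff = X - mu
    # Solve Sigma @ y = diff.T instead of forming the inverse explicitly
    y = np.linalg.solve(Sigma, diff.T)
    return np.sum(diff * y.T, axis=1)

rng = np.random.default_rng(0)
mu = np.array([0.0, 0.0, 0.0])
Sigma = np.array([[1.0, 0.3, 0.1],
                  [0.3, 2.0, 0.4],
                  [0.1, 0.4, 1.5]])   # illustrative positive-definite matrix

X = rng.multivariate_normal(mu, Sigma, size=50_000)
d2 = mahalanobis_sq(X, mu, Sigma)

print(d2.mean())   # should be close to n = 3
print(d2.var())    # should be close to 2n = 6
```

For outlier detection, a point would typically be flagged when its squared distance exceeds a chosen χ²_n quantile. That threshold is a modelling choice and is not shown here.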

Marginal and conditional distributions

Partition X = (X₁, X₂)ᵀ with corresponding partitions of μ and Σ. Then:

  • Marginal: X₁ ~ N(μ₁, Σ₁₁). Just drop the rows and columns of μ and Σ that belong to X₂.
  • Conditional: X₁ | X₂ = x₂ ~ N(μ₁ + Σ₁₂Σ₂₂⁻¹(x₂ - μ₂), Σ₁₁ - Σ₁₂Σ₂₂⁻¹Σ₂₁).
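
The conditional formula translates almost line for line into code. The following is a sketch rather than a reference implementation: the function name `condition_mvn` and the index-list interface are assumptions, and the example uses a standard bivariate normal with correlation 0.5 so the result can be checked by hand (conditional mean ρ·x₂ = 1.0, conditional variance 1 - ρ² = 0.75).

```python
import numpy as np

def condition_mvn(mu, Sigma, idx1, idx2, x2):
    """Mean and covariance of X[idx1] | X[idx2] = x2 for X ~ N(mu, Sigma)."""
    mu1, mu2 = mu[idx1], mu[idx2]
    S11 = Sigma[np.ix_(idx1, idx1)]
    S12 = Sigma[np.ix_(idx1, idx2)]
    S22 = Sigma[np.ix_(idx2, idx2)]
    # K = Sigma_12 Sigma_22^{-1}, obtained by solving S22 @ K.T = S12.T (S22 is symmetric)
    K = np.linalg.solve(S22, S12.T).T
    cond_mean = mu1 + K @ (x2 - mu2)
    cond_cov = S11 - K @ S12.T        # Sigma_11 - Sigma_12 Sigma_22^{-1} Sigma_21
    return cond_mean, cond_cov

# Standard bivariate normal with correlation 0.5, conditioning on X2 = 2
mu = np.array([0.0, 0.0])
Sigma = np.array([[1.0, 0.5],
                  [0.5, 1.0]])
cond_mean, cond_cov = condition_mvn(mu, Sigma, [0], [1], np.array([2.0]))
print(cond_mean, cond_cov)
```

Note that the conditional covariance does not depend on the observed value x₂, only on which components are conditioned on.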

Linear regression is a conditional MVN computation

If (Y, X) are jointly normal, the best predictor of Y given X in the mean-squared-error sense is the conditional mean, a linear function of X with coefficients Σ_{YX} Σ_{XX}⁻¹. The conditional variance Σ_{YY} - Σ_{YX}Σ_{XX}⁻¹Σ_{XY} is the residual variance. The Frisch-Waugh-Lovell theorem, partial correlation, and OLS all follow from this single algebraic fact.
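
To see the link to OLS concretely, the sketch below simulates jointly normal (Y, X₁, X₂) and compares the population coefficients Σ_{XX}⁻¹Σ_{XY} with an ordinary least-squares fit. The covariance matrix, variable ordering, seed, and sample size are all hypothetical choices made for this illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical joint covariance, ordered as (Y, X1, X2)
Sigma = np.array([[1.0, 0.6, 0.3],
                  [0.6, 1.0, 0.2],
                  [0.3, 0.2, 1.0]])
mu = np.zeros(3)

data = rng.multivariate_normal(mu, Sigma, size=100_000)
y, X = data[:, 0], data[:, 1:]

# Population quantities from the conditional-MVN formulas
beta_pop = np.linalg.solve(Sigma[1:, 1:], Sigma[1:, 0])   # Sigma_XX^{-1} Sigma_XY
resid_var_pop = Sigma[0, 0] - Sigma[0, 1:] @ beta_pop     # Sigma_YY - Sigma_YX Sigma_XX^{-1} Sigma_XY

# OLS with an intercept on the simulated sample
design = np.column_stack([np.ones(len(y)), X])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
beta_ols = coef[1:]
resid_var_ols = np.var(y - design @ coef)

print(beta_pop, beta_ols)
print(resid_var_pop, resid_var_ols)
```

With this many draws, the OLS slopes and residual variance should sit close to their population counterparts. The small gap between them is sampling error.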

Linear transformations

math
X ~ N(μ, Σ), Y = AX + b ⟹ Y ~ N(Aμ + b, AΣAᵀ)

Every linear combination of MVN components is normal. This is why portfolio returns are normal under the MVN assumption — and why so much of risk modelling collapses to elegant closed-form expressions.
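
A portfolio is the simplest case of this rule: A is a single row of weights and b = 0. The sketch below applies the transformation formula directly. The function name `linear_transform_mvn` and all numerical values are illustrative assumptions.

```python
import numpy as np

def linear_transform_mvn(mu, Sigma, A, b):
    """Parameters of Y = A X + b when X ~ N(mu, Sigma)."""
    return A @ mu + b, A @ Sigma @ A.T

# Hypothetical two-asset example
mu = np.array([0.05, 0.03])
Sigma = np.array([[0.04, 0.01],
                  [0.01, 0.09]])
w = np.array([0.5, 0.5])                 # portfolio weights

# Treat the weights as a 1 x n matrix A, with no shift
port_mean, port_cov = linear_transform_mvn(mu, Sigma, w[None, :], np.zeros(1))
port_vol = np.sqrt(port_cov[0, 0])

# Analytically: mean = 0.04, variance = 0.25 * (0.04 + 0.09 + 2 * 0.01) = 0.0375
print(port_mean[0], port_cov[0, 0], port_vol)
```

The same function handles several portfolios at once: stacking k weight vectors as the rows of A returns the k×k covariance matrix between the portfolios.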

Sampling from MVN

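Generate Z ~ N(0, I) (i.i.d. standard normals), then set X = μ + LZ, where L = chol(Σ) is the lower-triangular Cholesky factor with LLᵀ = Σ. By the linear-transformation rule, X has mean μ and covariance L I Lᵀ = Σ. Computing L is much cheaper than computing the eigen-square-root, and it is the production-standard approach.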

python
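import numpy as np

def sample_mvn(mu, Sigma, n_draws):
    """Draw n_draws samples from N(mu, Sigma); returns an array of shape (len(mu), n_draws)."""
    mu = np.asarray(mu, dtype=float)
    L = np.linalg.cholesky(Sigma)            # lower-triangular factor, Sigma = L @ L.T
    Z = np.random.randn(len(mu), n_draws)    # i.i.d. standard normals, one column per draw
    return mu[:, None] + L @ Z               # shift and correlate each column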
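
A quick sanity check on the Cholesky recipe is to confirm that the factor reproduces Σ and that the sample moments of the draws approach μ and Σ. The sketch below repeats the sampling steps inline so that it runs on its own. The seed, sample size, and parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(42)
mu = np.array([0.01, 0.02])                 # illustrative mean vector
Sigma = np.array([[0.04, 0.018],
                  [0.018, 0.09]])           # illustrative covariance (correlation 0.3)

L = np.linalg.cholesky(Sigma)
Z = rng.standard_normal((len(mu), 200_000))
X = mu[:, None] + L @ Z                     # one draw per column

factor_ok = np.allclose(L @ L.T, Sigma)     # the factor reproduces Sigma
sample_mean = X.mean(axis=1)
sample_cov = np.cov(X)                      # np.cov treats rows as variables by default

print(factor_ok)
print(sample_mean)
print(sample_cov)
```

If `np.linalg.cholesky` raises `LinAlgError`, the input matrix is not positive definite. This often happens with estimated covariance matrices that are rank-deficient.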

Empirical adequacy

Daily equity log returns are usually well approximated as MVN within sample, but they show clear non-normality in the joint tails (clustered crashes, correlation breakdowns). Weekly and monthly returns are closer to MVN. For risk modelling beyond a one-day horizon, supplement the MVN baseline with stress tests, regime models, or t-copulas with normal marginals.

Exercise

Two assets have annualised expected returns (10%, 8%) and a covariance matrix with diagonal (0.04, 0.0225) and off-diagonal 0.012. (1) What is the correlation between the two assets? (2) Compute the conditional expected return on asset 1 given that asset 2 returned 4%. (3) Compute the conditional variance.
