Symmetric matrices, the spectral theorem, PSD — Linear Algebra Module 8

Symmetric matrices are the heroes of applied linear algebra. Covariance matrices are symmetric. Hessians of twice continuously differentiable functions are symmetric. Kernel matrices are symmetric. The spectral theorem says that every real symmetric matrix has real eigenvalues and an orthonormal basis of eigenvectors.

The spectral theorem (real symmetric version)

If A is real and symmetric, then:

All eigenvalues of A are real.
Eigenvectors corresponding to distinct eigenvalues are orthogonal.
A admits an orthonormal eigenbasis: there exists an orthogonal Q (QᵀQ = I) such that A = QΛQᵀ.

A = Q Λ Qᵀ     (real symmetric A)
A = Σᵢ λᵢ qᵢ qᵢᵀ  (sum-of-rank-one outer products)
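
The two forms of the decomposition can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the matrix A is a made-up example, and `np.linalg.eigh` is used because it is NumPy's routine for symmetric input.

```python
import numpy as np

# Hypothetical 3x3 symmetric matrix, chosen only for illustration.
A = np.array([[4.0, 1.0, 0.5],
              [1.0, 3.0, 0.2],
              [0.5, 0.2, 2.0]])

# eigh returns real eigenvalues in ascending order and orthonormal
# eigenvectors as the columns of Q.
eigvals, Q = np.linalg.eigh(A)

# Q is orthogonal: Q^T Q = I.
orthogonality_error = np.abs(Q.T @ Q - np.eye(3)).max()

# A = Q Lambda Q^T.
A_from_factors = Q @ np.diag(eigvals) @ Q.T

# A = sum_i lambda_i q_i q_i^T (Q.T iterates over the columns q_i of Q).
A_from_rank_one = sum(lam * np.outer(q, q) for lam, q in zip(eigvals, Q.T))

print("eigenvalues:", eigvals)
print("max |Q^T Q - I|:", orthogonality_error)
print("max |A - Q Lambda Q^T|:", np.abs(A - A_from_factors).max())
print("max |A - sum of outer products|:", np.abs(A - A_from_rank_one).max())
```

All three printed errors should be at the level of floating-point round-off.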

Why this is so good for finance

The covariance Σ of returns can be written Σ = Σᵢ λᵢ qᵢ qᵢᵀ. The qᵢ are orthogonal directions of mutually uncorrelated risk (uncorrelated, not necessarily independent, unless returns are jointly normal); the λᵢ are the variances along each direction. PCA, factor extraction, and risk decomposition all start from this decomposition.

Positive semi-definite (PSD) and positive definite (PD)

A is PSD if A is symmetric and vᵀAv ≥ 0 for all v.
A is PD if A is symmetric and vᵀAv > 0 for all v ≠ 0.
Equivalent: A is PSD iff all eigenvalues ≥ 0; A is PD iff all eigenvalues > 0.
Equivalent: A is PD iff there exists an invertible L such that A = LLᵀ; the Cholesky factor is the unique such L that is lower triangular with positive diagonal.
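
These equivalences suggest two practical tests. The sketch below assumes NumPy; the helper names `is_psd` and `is_pd` and the tolerance are illustrative choices, not standard library functions. The PSD test uses the eigenvalue criterion, and the PD test relies on `np.linalg.cholesky` raising `LinAlgError` when the factorisation breaks down.

```python
import numpy as np

def is_psd(A, tol=1e-10):
    """Symmetric with all eigenvalues >= -tol (tol is an assumed round-off allowance)."""
    A = np.asarray(A, dtype=float)
    if not np.allclose(A, A.T):
        return False
    return bool(np.linalg.eigvalsh(A).min() >= -tol)

def is_pd(A):
    """Symmetric and the Cholesky factorisation succeeds."""
    A = np.asarray(A, dtype=float)
    if not np.allclose(A, A.T):   # cholesky itself does not check symmetry
        return False
    try:
        np.linalg.cholesky(A)
        return True
    except np.linalg.LinAlgError:
        return False

pd_matrix = np.array([[2.0, 1.0], [1.0, 2.0]])      # eigenvalues 1 and 3
psd_singular = np.array([[1.0, 1.0], [1.0, 1.0]])   # eigenvalues 0 and 2
indefinite = np.array([[1.0, 2.0], [2.0, 1.0]])     # eigenvalues -1 and 3

for name, M in [("pd_matrix", pd_matrix),
                ("psd_singular", psd_singular),
                ("indefinite", indefinite)]:
    print(name, "PSD:", is_psd(M), "PD:", is_pd(M))
```

The tolerance matters near the boundary: a matrix that is PSD in exact arithmetic can show eigenvalues of order -1e-16 after round-off.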

Covariance matrices must be PSD

For any vector v, vᵀΣv = Var(vᵀX) ≥ 0 (variances can't be negative). If your estimated Σ has a negative eigenvalue, it is not a valid covariance matrix. The usual causes are floating-point round-off or assembling Σ entry by entry from covariances estimated over different sample periods. Always check the eigenvalues; if any are negative, clip them to zero or use a nearest-PSD projection.
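
Eigenvalue clipping can be sketched as follows, assuming NumPy. The helper `clip_to_psd` and the matrix `S_bad` are hypothetical; `S_bad` mimics a correlation matrix pieced together from mutually inconsistent pairwise estimates. For a symmetric input, clipping negative eigenvalues to zero gives the nearest PSD matrix in the Frobenius norm.

```python
import numpy as np

def clip_to_psd(S, floor=0.0):
    """Symmetrise S, then raise any eigenvalue below `floor` up to `floor`."""
    S = np.asarray(S, dtype=float)
    S_sym = 0.5 * (S + S.T)
    eigvals, Q = np.linalg.eigh(S_sym)
    clipped = np.clip(eigvals, floor, None)
    # (Q * clipped) scales column i of Q by clipped[i], i.e. Q @ diag(clipped).
    return (Q * clipped) @ Q.T

# Hypothetical inconsistent "correlation" matrix: its eigenvalues are
# 1.9, 1.9 and -0.8, so it is not PSD.
S_bad = np.array([[ 1.0, 0.9, -0.9],
                  [ 0.9, 1.0,  0.9],
                  [-0.9, 0.9,  1.0]])

S_psd = clip_to_psd(S_bad)

print("eigenvalues before:", np.linalg.eigvalsh(S_bad))
print("eigenvalues after: ", np.linalg.eigvalsh(S_psd))
print("diagonal after:    ", np.diag(S_psd))
```

Note that clipping changes the diagonal, so a clipped correlation matrix no longer has exact unit variances; rescaling or a dedicated nearest-correlation method is needed if that property matters.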

Quadratic forms and portfolio variance

The portfolio variance wᵀΣw is a quadratic form. Using the spectral decomposition:

wᵀ Σ w = wᵀ (Σᵢ λᵢ qᵢ qᵢᵀ) w = Σᵢ λᵢ (qᵢᵀ w)²

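The portfolio's variance is the eigenvalue-weighted sum of squared projections onto the principal components. For a fixed length ‖w‖₂, variance is minimised by concentrating w in the directions of small λᵢ and maximised by concentrating it in the directions of large λᵢ; among unit-norm vectors the extremes are attained at the eigenvectors of the smallest and largest eigenvalues. The Markowitz minimum-variance portfolio, however, is constrained by the budget 1ᵀw = 1 rather than by ‖w‖₂ = 1, and its solution is w = Σ⁻¹1 / (1ᵀΣ⁻¹1). In the eigenbasis, Σ⁻¹1 = Σᵢ (qᵢᵀ1 / λᵢ) qᵢ, so this portfolio tilts heavily towards low-variance directions but coincides with the smallest eigenvector only in special cases.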
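
A small numerical check of the quadratic-form identity, assuming NumPy. The covariance matrix and the weights are made-up illustrative numbers. For comparison, the sketch also computes the closed-form minimiser under the budget constraint 1ᵀw = 1, namely Σ⁻¹1 / (1ᵀΣ⁻¹1), and the smallest eigenvector rescaled to sum to one.

```python
import numpy as np

# Hypothetical annualised covariance of three assets (illustrative numbers).
Sigma = np.array([[0.040, 0.012, 0.006],
                  [0.012, 0.025, 0.004],
                  [0.006, 0.004, 0.010]])
w = np.array([0.5, 0.3, 0.2])        # hypothetical weights summing to 1

eigvals, Q = np.linalg.eigh(Sigma)   # ascending eigenvalues, columns q_i

# Direct quadratic form versus the spectral form sum_i lambda_i (q_i^T w)^2.
var_direct = w @ Sigma @ w
var_spectral = np.sum(eigvals * (Q.T @ w) ** 2)

# Among unit-norm vectors, the smallest eigenvector attains variance lambda_min.
q_min = Q[:, 0]
var_q_min = q_min @ Sigma @ q_min

# Budget-constrained minimum variance: w = Sigma^{-1} 1 / (1^T Sigma^{-1} 1).
ones = np.ones(3)
x = np.linalg.solve(Sigma, ones)
w_mv = x / (ones @ x)
var_mv = w_mv @ Sigma @ w_mv

# The smallest eigenvector rescaled so that its entries sum to one.
q_budget = q_min / q_min.sum()
var_q_budget = q_budget @ Sigma @ q_budget

print("w^T Sigma w (direct, spectral):", var_direct, var_spectral)
print("unit-norm minimum variance vs lambda_min:", var_q_min, eigvals[0])
print("budget-constrained variance (Sigma^-1 1, rescaled eigenvector):",
      var_mv, var_q_budget)
```

On this example the rescaled eigenvector satisfies the budget constraint but has a variance no smaller than that of Σ⁻¹1 / (1ᵀΣ⁻¹1), which is what the constrained optimum guarantees.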

Square roots of PSD matrices

If A = QΛQᵀ is PSD, define A^(1/2) = Q Λ^(1/2) Qᵀ, where Λ^(1/2) has √λᵢ on the diagonal. Then A^(1/2) · A^(1/2) = A. This is the matrix square root that appears in simulation: if z ~ N(0, I) then A^(1/2) z ~ N(0, A).
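
The construction and its use in simulation can be sketched as follows, assuming NumPy. The helper name `psd_sqrt`, the target covariance, the seed, and the sample size are all illustrative assumptions.

```python
import numpy as np

def psd_sqrt(A):
    """Symmetric square root Q diag(sqrt(lambda)) Q^T of a PSD matrix."""
    eigvals, Q = np.linalg.eigh(A)
    # Assumption: tiny negative eigenvalues are round-off and are clipped to 0.
    root = np.sqrt(np.clip(eigvals, 0.0, None))
    return (Q * root) @ Q.T

A = np.array([[2.0, 0.6],
              [0.6, 1.0]])               # hypothetical PD target covariance
A_half = psd_sqrt(A)

rng = np.random.default_rng(0)           # fixed seed, for reproducibility only
z = rng.standard_normal((2, 200_000))    # each column is a draw z ~ N(0, I)
x = A_half @ z                           # each column is a draw x ~ N(0, A)
sample_cov = np.cov(x)                   # rows are variables, columns are draws

print("max |A_half @ A_half - A|:", np.abs(A_half @ A_half - A).max())
print("sample covariance of x:\n", sample_cov)
```

The sample covariance should match A up to Monte Carlo error, which shrinks like one over the square root of the number of draws.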

Cholesky vs eigen-square-root

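Both A^(1/2) (eigen) and L (Cholesky, where A = LLᵀ) can be used to simulate from N(0, A). Cholesky is faster: both methods are O(n³), but Cholesky needs only about n³/3 floating-point operations, several times fewer than a full symmetric eigendecomposition. Cholesky requires A to be PD, and standard routines may fail on a singular PSD matrix, whereas the eigen-square-root works for any PSD A. The eigen-square-root is symmetric (A^(1/2) is symmetric, L is not), which matters in some derivations but rarely in code.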
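
A short comparison of the two factors, assuming NumPy and a made-up PD matrix. Both factors reproduce A, and they differ only by an orthogonal matrix U with L = A^(1/2) U, which is why either one turns z ~ N(0, I) into a draw from N(0, A).

```python
import numpy as np

# Hypothetical PD covariance matrix (illustrative numbers).
A = np.array([[4.0, 2.0, 0.4],
              [2.0, 3.0, 0.5],
              [0.4, 0.5, 1.0]])

# Cholesky factor: lower triangular, L @ L.T == A.
L = np.linalg.cholesky(A)

# Eigen square root: symmetric, A_half @ A_half == A.
eigvals, Q = np.linalg.eigh(A)
A_half = (Q * np.sqrt(eigvals)) @ Q.T

# U = A_half^{-1} L is orthogonal, so the two factors differ by a rotation
# or reflection of the standard normal input.
U = np.linalg.solve(A_half, L)

print("max |L L^T - A|:          ", np.abs(L @ L.T - A).max())
print("max |A_half A_half - A|:  ", np.abs(A_half @ A_half - A).max())
print("L is lower triangular:    ", np.allclose(L, np.tril(L)))
print("A_half is symmetric:      ", np.allclose(A_half, A_half.T))
print("max |U U^T - I|:          ", np.abs(U @ U.T - np.eye(3)).max())
```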

Condition number

For a PD matrix A, κ(A) = λ_max / λ_min. A condition number near 1 is excellent; as a rule of thumb, values above 10⁴ indicate ill-conditioning, and solving a linear system in A can lose roughly log₁₀ κ significant digits. Covariance matrices estimated from short samples or from many highly correlated assets routinely have κ > 10⁸, at which point anything involving Σ⁻¹ amplifies rounding and estimation error so strongly that the output is unreliable. This is the algebraic reason for shrinkage estimators and factor-based covariance models.
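
The effect of near-collinearity, and of a simple shrinkage step, can be illustrated as follows. This is a sketch assuming NumPy; the two-asset covariance, the shrinkage intensity `delta`, and the scaled-identity target are illustrative choices rather than a recommended estimator.

```python
import numpy as np

def condition_number(A):
    """lambda_max / lambda_min for a symmetric PD matrix."""
    eigvals = np.linalg.eigvalsh(A)   # ascending order
    return eigvals[-1] / eigvals[0]

# Two hypothetical assets with 20% volatility and correlation 0.9999.
vol, rho = 0.2, 0.9999
Sigma = vol**2 * np.array([[1.0, rho],
                           [rho, 1.0]])

# For this 2x2 case kappa = (1 + rho) / (1 - rho), roughly 2e4.
kappa_raw = condition_number(Sigma)

# Linear shrinkage towards a scaled identity; delta = 0.1 is an assumed intensity.
delta = 0.1
target = (np.trace(Sigma) / 2) * np.eye(2)
Sigma_shrunk = (1 - delta) * Sigma + delta * target
kappa_shrunk = condition_number(Sigma_shrunk)

print("condition number before shrinkage:", kappa_raw)
print("condition number after shrinkage: ", kappa_shrunk)
```

Shrinkage lifts the smallest eigenvalue and lowers the largest, so even a modest intensity reduces κ by orders of magnitude here.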

Exercise

A 3-asset covariance matrix has eigenvalues 0.04, 0.01, 0.0001 (variances per year). (1) Compute the condition number. (2) Among portfolios with unit Euclidean norm (‖w‖₂ = 1), the minimum-variance one is the eigenvector of the smallest eigenvalue. What annual volatility does it achieve? (3) Why might this 'free' low-vol portfolio be dangerous in practice?