The fundamental linear-algebra problem: given a matrix A and a vector b, find x such that Ax = b. Every regression, every portfolio optimisation first-order condition, every Kalman update reduces to a system of this form. Understanding when it has a solution, when the solution is unique, and how to find it numerically is foundational.
The four cases of Ax = b
Let A be m×n. The system Ax = b admits a solution iff b lies in the column space of A. When a solution exists, it is unique iff the columns of A are linearly independent (rank n). The four cases:
- m = n, full rank: unique solution x = A⁻¹b. Square, invertible — the textbook case.
- m > n, full column rank: overdetermined. Generally no exact solution; the least-squares solution minimises ‖Ax - b‖ (Module 6).
- m < n, full row rank: underdetermined. Infinitely many solutions; the minimum-norm solution picks the one with smallest ‖x‖.
- Rank-deficient: existence and uniqueness fail jointly; you need the pseudo-inverse (Module 9).
Gaussian elimination
The classical algorithm: forward elimination reduces A to upper-triangular form U; back substitution then solves Ux = c. The operations — adding a multiple of one row to another, swapping rows, scaling a row — preserve the solution set.
Row operations are matrix multiplications
Every elementary row operation is left-multiplication by an elementary matrix E. The whole elimination process is therefore a product of such matrices applied to A. This is the algebraic origin of the LU decomposition.
LU decomposition
Gaussian elimination factors A as A = LU, where L is lower-triangular with 1s on the diagonal and U is upper-triangular. Once you have L and U, solving Ax = b for any number of right-hand sides costs only O(n²) per solve, versus O(n³) for the original elimination.
Step 1: Solve Ly = b (forward sub, O(n²))Step 2: Solve Ux = y (back sub, O(n²))
Why LU matters in production
In a Monte Carlo simulation you solve Ax = b thousands of times with the same A but different b. Factor A = LU once, in O(n³); then every subsequent solve is O(n²). This is a 100× to 10000× speed-up for typical n.
Pivoting
Without row swapping (pivoting), elimination breaks if a diagonal entry hits zero, and becomes numerically unstable if a pivot is very small. Partial pivoting — swap rows so the largest available element is the pivot — is the default in every production library. Full pivoting also swaps columns and is rarely needed.
Cholesky for SPD matrices
If A is symmetric and positive-definite (SPD) — covariance matrices, Hessians of strictly convex functions — then A = LLᵀ where L is lower-triangular. This is the Cholesky decomposition. It costs half as much as general LU and is the algorithm of choice for solving systems involving covariance matrices and for simulating correlated normals.
import numpy as npL = np.linalg.cholesky(Sigma) # Sigma = L Lᵀz = np.random.randn(n) # i.i.d. N(0,1)r = L @ z # r ~ N(0, Sigma)
Exercise
You need to simulate 10,000 daily return paths over 252 days for a 500-asset portfolio with covariance Σ. Naive implementation: for each day, compute Σ^(1/2) z. (1) Why is this wasteful? (2) Outline the efficient approach using Cholesky. (3) Estimate the speed-up.