Numerical methods are where the maths meets the machine. Every formula in this course will eventually be evaluated in floating-point arithmetic, and every approximation will accumulate error. Understanding floating-point representation, machine epsilon, and conditioning is the prerequisite for trusting any number a computer hands you.
IEEE 754 double precision
A double-precision float uses 64 bits: 1 sign + 11 exponent + 52 mantissa. This gives roughly 15-17 significant decimal digits and a range of about ±10^308. Machine epsilon (the smallest x such that 1 + x ≠ 1 in fp) is ε_mach ≈ 2.22 × 10⁻¹⁶.
Three sources of error
- Representation error: not every real number can be exactly represented. 0.1 in binary is a repeating fraction, so 0.1 + 0.2 ≠ 0.3 exactly.
- Round-off error: arithmetic operations introduce tiny errors of relative size ≈ ε_mach. Accumulated over many operations, these can grow.
- Cancellation error: subtracting two near-equal numbers loses precision. Computing (1 + 10⁻¹⁵) - 1 returns ~10⁻¹⁵ correctly but with much fewer significant digits than the inputs.
Catastrophic cancellation in finance
Classic example: the variance computed as E[X²] - E[X]² when these two terms are nearly equal. The naive formula loses precision because both terms are large but their difference is small. The two-pass / Welford algorithm computes the variance directly without subtracting near-equal quantities — always use it.
Condition number
For a function f, the relative condition number is |f'(x) · x / f(x)|. For matrix problems, κ(A) = ‖A‖ · ‖A⁻¹‖. Large condition number means small input changes produce large output changes — the problem is ill-conditioned.
Condition number ε rule of thumb
Solving Ax = b in floating-point loses roughly log₁₀(κ(A)) digits of accuracy. With ε_mach = 10⁻¹⁶ and κ = 10¹⁰, you have only 6 reliable digits in the answer. Always know your κ before trusting numerical results.
Numerical stability of algorithms
An algorithm is backward-stable if the computed answer is the exact answer for a slightly-perturbed input. Most modern numerical-library algorithms (LU with partial pivoting, QR via Householder, SVD via Golub-Kahan-Reinsch) are backward-stable. Naive algorithms (normal equations for least squares, classical Gram-Schmidt for QR) often are not.
Vectorisation and SIMD
Modern CPUs do 4-16 floating-point operations per cycle when data is properly aligned. NumPy, BLAS, and LAPACK exploit this; Python loops do not. The 100× speed gap between numpy.dot and a Python for-loop is the SIMD gap.
Avoiding loss of precision
- Use log-space when probabilities are tiny. log(p₁ · p₂ · ... · p_N) = Σ log p_i.
- Reorder sums to add similar-magnitude numbers first (Kahan summation for ultimate precision).
- Avoid catastrophic cancellation: rewrite (1 - cos(x)) as 2 sin²(x/2) for small x; rewrite √(x² + 1) - x as 1/(√(x² + 1) + x) for large x.
- When the input range is unknown, normalize before computing (max-shift trick for softmax/logsumexp).
Single vs double precision
Doubles (64-bit) are the default in scientific Python. Singles (32-bit, ε ≈ 10⁻⁷) save memory and run faster on GPUs. Production option-pricing code typically uses doubles; large-scale ML often uses singles or even half-precision floats with explicit mixed-precision arithmetic.
Exercise
Why does the naive sample variance s² = (1/n) Σ (x_i - x̄)² perform better numerically than the algebraically-equivalent s² = (1/n) Σ x_i² - x̄²? Construct a numerical example to show the difference.