Probability is the language we use to talk about uncertainty. Every model in modern finance — bond pricing, options, portfolio risk, credit default — is at heart a probability calculation. Mastering the foundations means more than memorising formulas; it means developing the reflexes that catch the casino-style errors that keep showing up on trading floors and in regulatory filings.
Sample spaces, events, axioms
A probability space is a triple (Ω, F, P). Ω is the sample space — the set of all possible outcomes. F is a σ-algebra of events (subsets of Ω we can assign probabilities to). P is a function from F to [0, 1] satisfying three axioms (Kolmogorov):
- Non-negativity: P(A) ≥ 0 for any event A.
- Normalisation: P(Ω) = 1.
- Countable additivity: for disjoint events A₁, A₂, ..., P(∪Aᵢ) = Σ P(Aᵢ).
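On a finite sample space the axioms can be checked by brute force. The sketch below assumes a single roll of a fair six-sided die (an illustrative choice, not an example from this module), takes F to be the set of all subsets of Ω, and tests additivity on every disjoint pair of events. The names omega, prob and power_set are hypothetical helpers.

```python
from fractions import Fraction
from itertools import chain, combinations

# Sample space: one roll of a fair six-sided die (illustrative assumption).
omega = frozenset(range(1, 7))

def prob(event):
    """Uniform probability measure: P(A) = |A| / |Omega|."""
    return Fraction(len(event), len(omega))

def power_set(s):
    """All subsets of s; for a finite sample space this is the largest sigma-algebra."""
    items = sorted(s)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [frozenset(c) for c in subsets]

events = power_set(omega)

# Axiom 1: non-negativity.
non_negative = all(prob(a) >= 0 for a in events)
# Axiom 2: normalisation.
normalised = prob(omega) == 1
# Axiom 3: additivity, checked on every disjoint pair (the finite case).
additive = all(
    prob(a | b) == prob(a) + prob(b)
    for a in events
    for b in events
    if not (a & b)
)

print(non_negative, normalised, additive)  # True True True
```

Exact fractions are used instead of floats so that the additivity check is an equality test rather than a tolerance test.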
Everything else — independence, conditional probability, Bayes' rule, the entire edifice of probability theory — follows from these three axioms. Re-deriving the basics with this in mind is the difference between treating probability as ritual and treating it as a discipline.
Conditional probability and independence
P(A | B) = P(A ∩ B) / P(B),  P(B) > 0
A, B independent ⟺ P(A ∩ B) = P(A) P(B)
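Both definitions can be computed by enumeration on a small sample space. This is a minimal sketch assuming two fair dice; the events and the helper names prob, cond_prob and independent are illustrative, not part of the module.

```python
from fractions import Fraction
from itertools import product

# Sample space: ordered outcomes of two fair dice (illustrative assumption).
omega = list(product(range(1, 7), repeat=2))

def prob(event):
    """P(event) under the uniform measure; event is a predicate on outcomes."""
    return Fraction(sum(1 for w in omega if event(w)), len(omega))

def cond_prob(a, b):
    """P(A | B) = P(A and B) / P(B), defined only when P(B) > 0."""
    p_b = prob(b)
    if p_b == 0:
        raise ValueError("P(B) must be positive")
    return prob(lambda w: a(w) and b(w)) / p_b

def independent(a, b):
    """True when P(A and B) equals P(A) * P(B)."""
    return prob(lambda w: a(w) and b(w)) == prob(a) * prob(b)

def first_is_3(w):
    return w[0] == 3

def sum_is_7(w):
    return w[0] + w[1] == 7

def sum_is_8(w):
    return w[0] + w[1] == 8

print(cond_prob(sum_is_7, first_is_3))    # 1/6, same as P(sum is 7)
print(independent(sum_is_7, first_is_3))  # True
print(cond_prob(sum_is_8, first_is_3))    # 1/6, but P(sum is 8) = 5/36
print(independent(sum_is_8, first_is_3))  # False
```

Knowing the first die shows 3 leaves the chance of a total of 7 unchanged but raises the chance of a total of 8, so only the first pair of events is independent.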
Independence ≠ uncorrelatedness
Independence of two random variables X and Y is a strict condition on the joint distribution: P(X ∈ A, Y ∈ B) = P(X ∈ A) P(Y ∈ B) for every pair of sets A and B. Uncorrelatedness is a weaker condition on second moments only: E[XY] = E[X]E[Y]. Independent ⟹ uncorrelated (whenever the moments exist), but not the other way around. A 'zero correlation' on a desk often hides strong tail dependence, the failure mode that 2008 exposed.
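A small exact counterexample makes the gap concrete. The sketch assumes X is uniform on {-1, 0, 1} and Y = X² (a textbook construction chosen for illustration): the covariance is exactly zero, yet Y is fully determined by X.

```python
from fractions import Fraction

# X is uniform on {-1, 0, 1}; Y = X**2 is a deterministic function of X.
support = [-1, 0, 1]
p = Fraction(1, 3)  # probability of each value of X

e_x = sum(p * x for x in support)            # E[X]
e_y = sum(p * x**2 for x in support)         # E[Y]
e_xy = sum(p * x * x**2 for x in support)    # E[XY] = E[X**3]
covariance = e_xy - e_x * e_y

# Independence would require P(X=0, Y=0) == P(X=0) * P(Y=0).
p_x0 = sum(p for x in support if x == 0)
p_y0 = sum(p for x in support if x**2 == 0)
p_joint = sum(p for x in support if x == 0 and x**2 == 0)

print(covariance)            # 0
print(p_joint, p_x0 * p_y0)  # 1/3 1/9
```

Zero covariance here says only that there is no linear relationship; the joint probability 1/3 against the product 1/9 shows the variables are dependent.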
Bayes' rule
P(A | B) = P(B | A) P(A) / P(B),  P(B) > 0
Bayes' rule is arguably the most-used theorem in applied probability. Rearranging conditional probabilities is just bookkeeping, but Bayes' rule is the bookkeeping that lets you reason about hypotheses given evidence: it is the foundation of Bayesian inference (Module 8) and of much of modern credit scoring.
Base-rate neglect — the most expensive cognitive bias in finance
A credit screen has 99% true-positive and 99% true-negative rates. The base rate of fraudsters is 0.5%. If the screen flags an applicant as fraudulent, the probability they actually are is P(F|+) = (0.99·0.005) / (0.99·0.005 + 0.01·0.995) ≈ 0.33 — only one in three. Most people, including most analysts, intuit ~99%. Internal anti-fraud and AML systems calibrated on this intuition flood case-workers with false positives.
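The calculation above can be wrapped in a small function to see how strongly the answer depends on the base rate. This is a sketch only: posterior_given_flag and its parameter names are hypothetical, and the alternative base rates in the loop are illustrative values, not figures from any real screen.

```python
def posterior_given_flag(sensitivity, specificity, base_rate):
    """P(fraud | flagged) by Bayes' rule.

    sensitivity: P(flag | fraud), the true-positive rate
    specificity: P(no flag | not fraud), the true-negative rate
    base_rate:   P(fraud) before seeing the flag
    """
    true_pos = sensitivity * base_rate
    false_pos = (1 - specificity) * (1 - base_rate)
    return true_pos / (true_pos + false_pos)

# The screen from the text: 99% / 99% with a 0.5% base rate.
p = posterior_given_flag(0.99, 0.99, 0.005)
print(f"{p:.3f}")  # 0.332

# Same screen, different (illustrative) base rates.
for base_rate in (0.001, 0.005, 0.05, 0.5):
    post = posterior_given_flag(0.99, 0.99, base_rate)
    print(f"base rate {base_rate:.3f} -> P(fraud | flag) = {post:.3f}")
```

In natural frequencies: out of 100,000 applicants, 500 are fraudsters and 495 of them are flagged, while 995 of the 99,500 honest applicants are flagged too, so 495 of 1,490 flags are genuine.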
Common counting traps
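- Birthday paradox: 23 random people give a > 50% chance of two sharing a birthday (assuming 365 equally likely birthdays). The number of pairs that could collide, n(n−1)/2, grows quadratically in n: 253 pairs for n = 23.
- Monty Hall: switching doors gives a 2/3 win rate, not 1/2. The host knows where the prize is and always opens a losing door, so his choice carries information about the remaining door.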
- Conjunction fallacy: P(A ∩ B) ≤ min(P(A), P(B)). Surveys violate this routinely (Linda the bank teller).
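The first two results above can be verified exactly rather than argued. The sketch below assumes 365 equally likely, independent birthdays and the standard Monty Hall rules (the host always opens an unpicked losing door, choosing at random when two are available); the function names are hypothetical.

```python
from fractions import Fraction

def birthday_collision_prob(n, days=365):
    """P(at least two of n people share a birthday), uniform independent birthdays."""
    p_all_distinct = Fraction(1)
    for k in range(n):
        p_all_distinct *= Fraction(days - k, days)
    return 1 - p_all_distinct

def monty_hall_win_prob(switch):
    """Exact win probability by enumerating car position, first pick and host's door."""
    doors = range(3)
    total = Fraction(0)
    for car in doors:
        for pick in doors:
            # The host may open any door that is neither the pick nor the car.
            openable = [d for d in doors if d != pick and d != car]
            for opened in openable:
                weight = Fraction(1, 9) / len(openable)
                if switch:
                    final = next(d for d in doors if d not in (pick, opened))
                else:
                    final = pick
                if final == car:
                    total += weight
    return total

print(float(birthday_collision_prob(22)))  # just under 0.5
print(float(birthday_collision_prob(23)))  # just over 0.5
print(monty_hall_win_prob(switch=True))    # 2/3
print(monty_hall_win_prob(switch=False))   # 1/3
```

Enumeration is used instead of simulation so the answers are exact fractions with no sampling noise.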
Exercise
A trading desk reports a strategy with an annualised Sharpe ratio of 2 after 1 year (252 trading days). Assuming daily returns are i.i.d. normal and the true Sharpe ratio is zero, what is the probability of observing a measured Sharpe ratio ≥ 2 by chance?