Risk management is the difference between traders who survive a decade and those who blow up in year two. The skills are: estimating potential loss, sizing positions appropriately, and enforcing discipline when losses occur. None of these are intellectually difficult; all of them are emotionally demanding. The graveyard of traders is full of people who knew the rules and broke them under pressure.
Value-at-Risk (VaR) — the formula
VaR estimates the loss threshold that should be exceeded only X% of the time over a defined horizon, where X is 100% minus the confidence level. The conventional choice is 99% confidence at a 1-day horizon. A 99% 1-day VaR of USD 1 million means that on 99 of 100 trading days losses should remain below USD 1 million, and on the remaining 1 day in 100 losses can exceed USD 1 million, sometimes by a lot.
Parametric (variance-covariance) VaR, the most-used analytical form:

\[ \mathrm{VaR} = V \times z \times \sigma \times \sqrt{h} \]

where:
- V = portfolio market value in the base currency (the dollar exposure being measured)
- z = the z-score of the standard normal corresponding to the confidence level (one-tailed)
  - 95% confidence → z ≈ 1.645
  - 99% confidence → z ≈ 2.326
  - 99.9% confidence → z ≈ 3.090
- σ = the annualised volatility of portfolio returns, in decimal form (e.g. 0.20 for a 20% annual-vol portfolio)
- h = the horizon for which VaR is being measured, in years
  - 1-day VaR uses h = 1/252 (252 trading days a year)
  - 1-month VaR uses h ≈ 1/12
- √h = the square-root-of-time scaling that converts annualised σ into the standard deviation over the horizon h; assumes returns are i.i.d. (independent and identically distributed) over the period.

Worked example: $50m portfolio, 16% annualised vol, 99% 1-day VaR.

σ_1day = 0.16 × √(1/252) = 0.16 × 0.0630 = 0.01008 (about 1.0% daily)
VaR = $50m × 2.326 × 0.16 × √(1/252)
    = $50m × 2.326 × 0.01008
    = $1.172m

Reading: there is a 1% probability that losses will exceed $1.17m over any single trading day, assuming returns are normally distributed. The number is a threshold, not a maximum.
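As a rough sketch, the parametric formula can be checked in a few lines of Python. The function name `parametric_var` and its parameter names are illustrative, not taken from any library; the z-score comes from the standard library's `statistics.NormalDist` rather than a lookup table.

```python
from math import sqrt
from statistics import NormalDist


def parametric_var(value, annual_vol, confidence=0.99, horizon_years=1 / 252):
    """Parametric VaR = V * z * sigma * sqrt(h), assuming normal i.i.d. returns."""
    z = NormalDist().inv_cdf(confidence)  # one-tailed z-score, about 2.326 at 99%
    return value * z * annual_vol * sqrt(horizon_years)


# Worked example from the text: $50m portfolio, 16% annualised vol, 99% 1-day.
var_99 = parametric_var(50_000_000, 0.16)
print(f"99% 1-day VaR: ${var_99:,.0f}")  # roughly $1.17m
```

Changing `confidence` or `horizon_years` shows how quickly the number moves with the inputs, which is a useful reminder that VaR is an estimate and not a measured quantity.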
Three methods to compute VaR
- Parametric (variance-covariance): assume returns are normally distributed and use the formula above. Simple and fast, but the normal-distribution assumption underestimates tail risk because real return distributions are fat-tailed: financial returns show far more extreme moves than a normal distribution predicts.
- Historical simulation: take the daily market moves from the last 1-3 years, apply each one to the current portfolio positions, and rank the resulting P&L; the loss at the 1st percentile of that distribution is the 99% VaR. No distributional assumptions; captures real tail behaviour but is limited by what history happens to contain.
- Monte Carlo: simulate thousands of possible market scenarios using stochastic models calibrated to volatility and correlation estimates; observe the 99th percentile loss across the simulated scenarios. Most flexible (handles options and non-linear instruments naturally); computationally intensive.
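The historical and Monte Carlo methods above share one step: build a set of scenario returns, then read off the loss at the chosen percentile. The sketch below shows that step using only the standard library. It is a deliberately simplified illustration: the function names are hypothetical, the portfolio is treated as a single linear position, and the Monte Carlo model is a single normal factor, whereas a production engine would reprice every instrument under each scenario.

```python
import random
from math import sqrt


def historical_var(returns, value, confidence=0.99):
    """Loss at the chosen confidence level across a set of return scenarios."""
    losses = sorted(-r * value for r in returns)  # ascending: largest losses last
    index = min(int(confidence * len(losses)), len(losses) - 1)
    return losses[index]


def monte_carlo_var(value, annual_vol, confidence=0.99,
                    horizon_years=1 / 252, n_paths=100_000, seed=42):
    """Toy Monte Carlo: draw normal horizon returns, then read off the quantile."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    sigma_h = annual_vol * sqrt(horizon_years)
    simulated = [rng.gauss(0.0, sigma_h) for _ in range(n_paths)]
    return historical_var(simulated, value, confidence)


mc_var = monte_carlo_var(50_000_000, 0.16)
print(f"Monte Carlo 99% 1-day VaR: ${mc_var:,.0f}")
```

Because the simulated returns here are normal, the Monte Carlo figure should land close to the parametric one; the methods only diverge once the scenario generator or the instruments are non-normal or non-linear.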
VaR's blind spot
VaR tells you the threshold loss at the 99th percentile but says nothing about what happens beyond that threshold. The 1% of days when losses exceed VaR are exactly when most trader blow-ups happen. Long-Term Capital Management (1998), the 2008 crisis and the COVID-19 sell-off of March 2020 were all moments when actual losses dwarfed VaR estimates. Senior risk managers complement VaR with stress testing (specific historical scenarios applied to current positions) and Expected Shortfall (ES, also called CVaR), the average loss conditional on being in the worst 1% of cases. ES is a coherent risk measure in a way VaR is not (it is subadditive, so merging two portfolios never shows more risk than the sum of the parts) and is increasingly favoured by regulators.
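To make the gap between VaR and Expected Shortfall concrete, here is a minimal sketch of two ES estimates. The first uses the closed form that holds if returns are normal (an assumption, and the same one the parametric VaR makes); the second simply averages the worst scenarios in a sample. Both function names are illustrative.

```python
from math import sqrt
from statistics import NormalDist


def normal_expected_shortfall(value, annual_vol, confidence=0.99,
                              horizon_years=1 / 252):
    """ES under normal returns: V * sigma_h * pdf(z) / (1 - confidence)."""
    std_normal = NormalDist()
    z = std_normal.inv_cdf(confidence)
    sigma_h = annual_vol * sqrt(horizon_years)
    return value * sigma_h * std_normal.pdf(z) / (1 - confidence)


def historical_expected_shortfall(returns, value, confidence=0.99):
    """Average loss across the worst (1 - confidence) share of scenarios."""
    losses = sorted((-r * value for r in returns), reverse=True)  # worst first
    n_tail = max(1, round((1 - confidence) * len(losses)))
    tail = losses[:n_tail]
    return sum(tail) / len(tail)


es_normal = normal_expected_shortfall(50_000_000, 0.16)
print(f"99% 1-day ES (normal): ${es_normal:,.0f}")  # above the $1.17m VaR
```

Even under the thin-tailed normal assumption, ES for the $50m example sits noticeably above the VaR; with fat-tailed historical data the gap is typically wider.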
Position sizing — the Kelly criterion
If you have an edge, how big should each bet be? Too small and you don't earn enough on the edge; too big and a string of losses bankrupts you before the edge has time to play out. The Kelly criterion gives the mathematically optimal bet size for maximising the long-run geometric growth rate of wealth. It was first derived by Bell Labs researcher John Kelly in 1956 in a paper on information transmission and was popularised in finance by Edward Thorp.
\[ f^* = \frac{b \cdot p - q}{b} \]

where:
- f* = optimal fraction of total bankroll to bet on a single play, expressed as a decimal between 0 and 1; a value below 0 means the bet has negative edge, so don't take it
- b = the net odds received on the win: the multiple of stake you receive on top of the stake if you win
  - 1:1 even-money bet → b = 1
  - bet 100 to win 150 net (a 1.5:1 payoff) → b = 1.5
- p = probability of winning, expressed as a decimal between 0 and 1 (the trader's estimate of their edge, often imprecise)
- q = probability of losing, equal to 1 − p (the residual probability)
- b · p − q = the expected value of the bet per unit staked; must be positive for Kelly to recommend taking it

Worked example: 55% win rate, 1:1 even-money payoff
b = 1.0, p = 0.55, q = 0.45
f* = (1.0 × 0.55 − 0.45) / 1.0 = 0.10
→ optimal stake is 10% of bankroll per trade

Worked example: 50% win rate, 1.5:1 payoff (win 1.5× stake)
b = 1.5, p = 0.50, q = 0.50
f* = (1.5 × 0.50 − 0.50) / 1.5 = 0.167
→ optimal stake is 16.7% of bankroll per trade
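The Kelly formula is short enough to express directly in code. The sketch below reproduces the two worked examples; the function name `kelly_fraction` and its input checks are illustrative choices, not part of any standard library.

```python
def kelly_fraction(p, b):
    """Full-Kelly fraction f* = (b*p - q) / b for win probability p and net odds b."""
    if not 0 <= p <= 1:
        raise ValueError("p must be a probability between 0 and 1")
    if b <= 0:
        raise ValueError("b (net odds) must be positive")
    q = 1 - p  # probability of losing
    return (b * p - q) / b


print(f"{kelly_fraction(0.55, 1.0):.3f}")  # 55% win rate, even money
print(f"{kelly_fraction(0.50, 1.5):.3f}")  # 50% win rate, 1.5:1 payoff
```

A negative return value signals a negative-edge bet, so a caller would normally treat anything at or below zero as "no trade".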
Why each Kelly variable matters
- f* is the answer — the recommended bet size as a fraction of bankroll. Higher f* means more aggressive sizing.
- b captures the asymmetry of the pay-off. Positive-asymmetry bets (limited downside, large upside — e.g. long options) get larger Kelly fractions than even-money bets with the same win rate.
- p is the most-disputed input because it is genuinely uncertain. Most traders systematically over-estimate p; realised win rates tend to run below self-reported ones. This is why full-Kelly bets often blow up: not because the formula is wrong, but because the inputs are.
- The numerator (b · p − q) is the expected value of the bet per unit staked. When EV = 0, Kelly is zero (don't bet). When EV > 0, Kelly scales the bet to long-run-optimal size. When EV < 0, Kelly is negative, which mathematically means 'take the other side'; if the other side is not available, the answer is simply not to bet.
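The sensitivity to p described above can be seen by evaluating the quantity Kelly maximises, the expected log growth per bet, g(f) = p·ln(1 + b·f) + q·ln(1 − f). The sketch below uses made-up numbers: a trader who believes p = 0.55 while the true win rate is 0.52, on even-money bets. The function names are illustrative.

```python
from math import log


def kelly_fraction(p, b):
    """Full-Kelly fraction f* = (b*p - q) / b."""
    return (b * p - (1 - p)) / b


def log_growth(f, p, b):
    """Expected log growth per bet when staking fraction f of bankroll."""
    return p * log(1 + b * f) + (1 - p) * log(1 - f)


believed_p, true_p, b = 0.55, 0.52, 1.0  # illustrative numbers only
f_believed = kelly_fraction(believed_p, b)  # 0.10, sized on the optimistic estimate
f_true = kelly_fraction(true_p, b)          # 0.04, what the real edge supports

print(f"growth at believed Kelly: {log_growth(f_believed, true_p, b):+.5f}")
print(f"growth at true Kelly:     {log_growth(f_true, true_p, b):+.5f}")
```

In this example a three-point over-estimate of p turns a positive-edge strategy into one with negative expected log growth, because the stake ends up more than twice what the true edge supports.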
Why practitioners use half-Kelly or quarter-Kelly
Full Kelly is mathematically optimal but produces brutally volatile equity curves: 50% drawdowns are common even with positive long-run expectation. Its optimality also assumes the edge inputs (p, b) are known exactly. In practice they are estimated with error, and small over-estimates of p produce large over-bets that compound badly through losing streaks. Most professional traders therefore use half-Kelly or quarter-Kelly, betting 25-50% of the full Kelly recommendation. This sacrifices a fraction of long-run growth (under the usual continuous-time approximation, half-Kelly keeps about three-quarters of the full-Kelly growth rate) for much smaller drawdowns. The trade is almost always worth it because most traders cannot emotionally survive 50% drawdowns even when the math says they should.
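A small simulation makes the drawdown trade-off tangible. The sketch below replays one random sequence of 1,000 even-money trades at a 55% win rate, once at full Kelly (10%) and once at half Kelly (5%), and measures the maximum peak-to-trough drawdown of each equity curve. The seed, trade count and function names are arbitrary illustrative choices, and a single path is an anecdote, not a distribution.

```python
import random


def simulate_equity(outcomes, fraction, b=1.0, start=1.0):
    """Equity curve when staking a fixed fraction of current equity per trade."""
    equity = [start]
    for won in outcomes:
        stake = fraction * equity[-1]
        equity.append(equity[-1] + (b * stake if won else -stake))
    return equity


def max_drawdown(equity_curve):
    """Largest peak-to-trough decline, as a fraction of the running peak."""
    peak = equity_curve[0]
    worst = 0.0
    for equity in equity_curve:
        peak = max(peak, equity)
        worst = max(worst, 1 - equity / peak)
    return worst


rng = random.Random(7)  # fixed seed so both sizings see the same trades
outcomes = [rng.random() < 0.55 for _ in range(1000)]

full = simulate_equity(outcomes, 0.10)
half = simulate_equity(outcomes, 0.05)
dd_full, dd_half = max_drawdown(full), max_drawdown(half)
print(f"max drawdown, full Kelly: {dd_full:.1%}")
print(f"max drawdown, half Kelly: {dd_half:.1%}")
```

Running it with several different seeds gives a feel for how wide the range of full-Kelly drawdowns is, which is the practical argument for fractional sizing.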
Stop-loss discipline
Every trade should have a predefined stop — a price at which you exit if the trade goes wrong. The stop should be set: (1) before entering, when you can think clearly; (2) at a level where being stopped tells you your thesis was wrong; (3) consistently — not moved further away when the trade goes against you ('it'll come back'). Moving stops adversely is the single most common blow-up pattern. Senior traders enforce: 'if you can't accept the stop, don't take the trade.'
The risk-management rules that work
- Never risk more than 1-2% of capital on a single trade.
- Never have more than 20-30% of capital exposed across all trades at once.
- Diversify across uncorrelated bets, and assume correlations will rise during stress, so the diversification benefit shrinks exactly when it is needed most.
- Sleep on it before adjusting any rule mid-drawdown.
- Journal every trade — entry, thesis, stop, exit, lesson. Review monthly.
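The first two rules translate naturally into arithmetic once a stop is defined. One common way to apply them, shown here as a sketch and not as the only interpretation, is to size each position so that being stopped out loses the chosen risk fraction, and to read "capital exposed" as the total market value of open positions. Both function names and the example prices are hypothetical.

```python
def position_size(capital, entry, stop, risk_fraction=0.01):
    """Units to trade so that a stop-out loses risk_fraction of capital."""
    risk_per_unit = abs(entry - stop)
    if risk_per_unit == 0:
        raise ValueError("stop must differ from entry")
    return (capital * risk_fraction) / risk_per_unit


def within_exposure_cap(capital, open_position_values, cap_fraction=0.30):
    """True if total open exposure stays within cap_fraction of capital."""
    return sum(open_position_values) <= cap_fraction * capital


# Hypothetical trade: 1,000,000 of capital, buy at 50.00, stop at 47.50, risk 1%.
units = position_size(1_000_000, entry=50.0, stop=47.5)
exposure = units * 50.0
print(f"units: {units:,.0f}, exposure: {exposure:,.0f}")
print(within_exposure_cap(1_000_000, [exposure, 90_000]))
```

Note that a tighter stop allows a larger position for the same risk budget, so the exposure cap acts as a separate brake on trades with very close stops.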
Exercise
A trader has KES 10m of capital. They want to trade Kenyan equities with a strategy that wins 55% of the time and pays 1:1 (each win is the same size as each loss). (1) What is the full-Kelly position size? (2) Why might they use 'half-Kelly' instead? (3) How would you expect the worst drawdown at full-Kelly to compare with half-Kelly? (4) What stop-loss level would you require per trade?