Skip to content
← World Cup 2026 predictions
Methodology · v1Snapshot · 2026-05-14

Methodology.

Everything that goes into the model, exposed. No black boxes. Every equation, every parameter, every choice that shaped the output. Critique welcome.

1. The problem

Given a fixture between two football teams, predict three things: the probability the home team wins, the probability of a draw, and the probability the away team wins. As a secondary output, predict the most likely scoreline and an expected-goals estimate for each side.

Doing this honestly requires (a) a strength measure for each team, (b) a scoring model that translates strength into goal expectations, (c) a stochastic model for goal counts, and (d) some adjustment for home advantage. We use two independent specifications and present them side-by-side.

2. Inputs

Each of the 48 qualified teams has two strength estimates:

  • FIFA Men's World Ranking points as of 2026-05-14. Computed by FIFA using their points-exchange formula introduced in 2018 (modelled on Elo). The April 2026 ranking was the last update before the tournament; the next one (9 June 2026) lands just before kick-off.
  • Approximate Elo ratings from eloratings.net. Elo updates after every match, so we use the early-April 2026 snapshot to keep the cut-off consistent with FIFA.

We use Elo as the active rating in both models. FIFA points are reported for transparency and calibration checks. Where a small federation lacks a reliable Elo, we estimate from CONFED ranks and recent qualifier results and flag the row as ESTIMATED.

We deliberately do not use: bookmaker odds (would defeat the point), squad transfer-market valuations (proprietary and unstable), injury or fitness data (not reliably available before kick-off), or recent-form metrics other than what is already absorbed by the Elo updating process.

3. Model 1 — Dixon-Coles Poisson

From Dixon & Coles (1997, Applied Statistics), the canonical match-prediction model in academic football statistics. Assume each team scores goals as a Poisson process with rate λ. The joint distribution of (home goals, away goals) is approximately the product of two independent Poissons, with a small correction for low-score correlation.

For a match where Elo_H is the home Elo and Elo_A the away Elo:

λ_baseline(elo) = 1.35 + 0.40 · (elo - 1500) / 500
λ_H = max(0.05, λ_baseline(Elo_H) + home_adv(H) - 0.5 · λ_baseline(Elo_A) + 0.5 · 1.35)
λ_A = max(0.05, λ_baseline(Elo_A) + away_adv(A) - 0.5 · λ_baseline(Elo_H) + 0.5 · 1.35)

P(home h, away a) = Poisson(h; λ_H) · Poisson(a; λ_A) · τ(h, a)

τ(0,0) = 1 - λ_H · λ_A · ρ      with ρ = -0.12
τ(0,1) = 1 + λ_H · ρ
τ(1,0) = 1 + λ_A · ρ
τ(1,1) = 1 - ρ
τ(h,a) = 1   otherwise

The cross-term subtracts half the opponent's baseline rate so a strong team facing a strong defence still scores less than the same strong team facing a weak defence. This isn't the original Dixon-Coles attack/defence parameterisation — we collapse to a single Elo per team for simplicity — but it produces statistically similar outputs at the international-fixture level.

The τ adjustment is the Dixon-Coles correction. Without it, independent Poisson under-predicts draws (specifically 0-0 and 1-1) at the rates actually observed in football data. ρ = -0.12 is the value the original paper estimated, replicated by many subsequent studies.

Outcome probabilities sum the score grid:

P(home win) = Σ_{h > a} P(h, a)
P(draw)     = Σ_{h = a} P(h, a)
P(away win) = Σ_{h < a} P(h, a)

4. Model 2 — Elo baseline

The simpler companion model uses the standard Elo win-probability formula. With effective Elo ratings adjusted for home advantage (see §5):

Δ = Elo_H_eff - Elo_A_eff
p_home_no_draw = 1 / (1 + 10^(-Δ / 400))
p_draw = 0.25     (empirical base rate for World Cup matches)
P(home win) = p_home_no_draw · (1 - p_draw)
P(away win) = (1 - p_home_no_draw) · (1 - p_draw)

λ_H = max(0.15, 1.35 + Δ/1000 + home_adv(H))
λ_A = max(0.15, 1.35 - Δ/1000 + home_adv(A))

This is a textbook Elo system with a flat draw share. It will under-fit certain effects (no low-score correlation, draws not endogenous to team strength) but it doubles as a sanity check on Dixon-Coles. When the two models agree, we're more confident. When they disagree by more than 10 percentage points, we flag the match.

5. Home advantage

Three regimes:

  • Host home advantage (Mexico, Canada, USA playing on their own soil): +0.30 expected goals (Dixon-Coles) or +100 Elo points (Elo baseline). The continental average for the empirical World Cup home effect is between 0.25 and 0.40 goals; we sit at the middle.
  • Continental home advantage (CONCACAF teams playing on CONCACAF soil but not at their literal home): +0.10 goals or +40 Elo. Mexico playing in Boston gets a smaller boost than Mexico playing in Mexico City.
  • Neutral: zero. A European team playing another European team in Toronto gets no home effect.

We do not adjust for altitude, climate, or specific venue effects in v1. These are real but second-order and difficult to defend without far more data than we have.

6. Group-stage standings

For each group, expected points per team is the sum across three matches of:

xPts = 3 · P(win) + 1 · P(draw)

The expected-points table gives a reasonable ordering of likely group winners and runners-up but it underestimates variance. A team with xPts=4.5 and another with xPts=4.2 are statistically very close; the difference is small relative to the standard error.

6½. Tournament-wide Monte Carlo (v1.1)

For knockout-stage projections and the "champion%" numbers on the front page, we run a full tournament simulation 10,000 times. In each path:

  1. For every group-stage fixture, sample home and away goals from independent Poissons with the Dixon-Coles λ values. Tally points, GD, GF.
  2. Determine the top 2 from each group plus the 8 best 3rd-placed teams (sorted by points, GD, GF with a uniform random tie-breaker).
  3. Pair the 32 qualified teams into the knockout bracket (v1.1 uses randomised seeded pairing; v1.2 will adopt the exact FIFA bracket pairing once published per group).
  4. For each knockout match, sample the outcome from the ensemble win/draw/loss probabilities. Resolve draws to extra-time/penalties by picking the higher-Elo side, or a 50/50 toss when the Elo gap is under 25 points.
  5. Continue through R16, QF, SF, final. Record which team reached each stage.

With 10,000 paths, the Monte Carlo standard error on a 10% probability is roughly √(0.1·0.9/10,000) ≈ 0.3 percentage points — small enough that the headline numbers are stable across re-runs. We use a fixed seed (42) so the reported numbers are exactly reproducible.

The Monte Carlo runs are written to lib/probability-lab/predictions-snapshot.json before kick-off and committed to the repository. That file is the immutable audit trail — its git history is the proof we didn't edit predictions after the fact.

7. Accuracy metrics

After each match we will compute and publish:

  • Brier score (multi-class): mean squared error of the probability vector versus the one-hot outcome. Lower is better. Random guessing on three outcomes gives ~0.67; a perfect model gives 0.
  • Log-loss: negative log of the probability the model assigned to the actual outcome. Lower is better. Penalises confident wrong predictions heavily.
  • Calibration plot: bucket predictions by predicted probability (0-10%, 10-20%, …), then check the empirical frequency in each bucket. If the model is well-calibrated, the points hug the diagonal.
  • Accuracy on the predicted winner: percentage of matches where the favoured team actually won. A blunter measure but the one most readers immediately want to know.

8. Known limitations

  • Elo ratings are slow to update. A team peaking in form at tournament time (Croatia 2018, Morocco 2022) will be undervalued by Elo entering the tournament.
  • We do not model squad-level effects. A key injury in the days before a match cannot be reflected.
  • Knockout-stage Monte Carlo (v1.1) resolves draws to extra-time/penalties by a higher-Elo rule with a 50/50 toss for tight matches (<25 Elo gap). This is a simplification; in reality, penalty-shootout outcomes are closer to a true 50/50 with a small home-side effect. v1.2 will model extra time and penalties as separate stages.
  • The knockout-bracket pairing in v1.1's Monte Carlo is a randomised seeded pairing rather than the exact FIFA bracket. v1.2 will adopt the published bracket per group-position.
  • The ensemble averages two correlated models. A genuinely independent third model — a neural network on historical fixtures, perhaps — could materially improve calibration. v2.
  • No betting-market data, by design. If you wanted a number that's sharper than what we publish, the betting market is probably more accurate; we are intentionally not trying to compete with that. We are trying to publish a transparent, citeable, applied-probability artefact.

9. References

  • Dixon, M. J. & Coles, S. G. (1997). "Modelling Association Football Scores and Inefficiencies in the Football Betting Market." Journal of the Royal Statistical Society: Series C, 46(2): 265-280.
  • Elo, A. E. (1978). The Rating of Chessplayers, Past and Present. Arco. (The original Elo formulation; football adaptations widely documented since.)
  • Karlis, D. & Ntzoufras, I. (2003). "Analysis of Sports Data by Using Bivariate Poisson Models." Journal of the Royal Statistical Society: Series D, 52(3): 381-393.
  • FIFA. FIFA Men's World Ranking — Procedure. inside.fifa.com/fifa-world-ranking/men
  • World Football Elo Ratings. eloratings.net.

10. Reproducing this

Everything is open. The data file is at lib/probability-lab/world-cup-2026.ts and the prediction engine at lib/probability-lab/predictor.ts. Both have type signatures and inline comments matching the equations above. If you find an error, please open an issue or send a correction via the contribute page.