World Cup 2026 Predictions — Methodology | Probability Lab

1. The problem

Given a fixture between two football teams, predict three things: the probability the home team wins, the probability of a draw, and the probability the away team wins. As a secondary output, predict the most likely scoreline and an expected-goals estimate for each side.

Doing this honestly requires (a) a strength measure for each team, (b) a scoring model that translates strength into goal expectations, (c) a stochastic model for goal counts, and (d) some adjustment for home advantage. We use two independent specifications and present them side-by-side.

2. Inputs

Each of the 48 qualified teams has two strength estimates:

FIFA Men's World Ranking points as of 2026-05-14. Computed by FIFA using their points-exchange formula introduced in 2018 (modelled on Elo). The April 2026 ranking was the last update before the tournament; the next one (9 June 2026) lands just before kick-off.
Approximate Elo ratings from eloratings.net. Elo updates after every match, so we use the early-April 2026 snapshot to keep the cut-off consistent with FIFA.

We use Elo as the active rating in both models. FIFA points are reported for transparency and calibration checks. Where a small federation lacks a reliable Elo, we estimate from CONFED ranks and recent qualifier results and flag the row as ESTIMATED.

We deliberately do not use: bookmaker odds (would defeat the point), squad transfer-market valuations (proprietary and unstable), injury or fitness data (not reliably available before kick-off), or recent-form metrics other than what is already absorbed by the Elo updating process.

3. Model 1 — Dixon-Coles Poisson

From Dixon & Coles (1997, Applied Statistics), the canonical match-prediction model in academic football statistics. Assume each team scores goals as a Poisson process with rate λ. The joint distribution of (home goals, away goals) is approximately the product of two independent Poissons, with a small correction for low-score correlation.

For a match where Elo_H is the home Elo and Elo_A the away Elo:

λ_baseline(elo) = 1.35 + 0.40 · (elo - 1500) / 500
λ_H = max(0.05, λ_baseline(Elo_H) + home_adv(H) - 0.5 · λ_baseline(Elo_A) + 0.5 · 1.35)
λ_A = max(0.05, λ_baseline(Elo_A) + away_adv(A) - 0.5 · λ_baseline(Elo_H) + 0.5 · 1.35)

P(home h, away a) = Poisson(h; λ_H) · Poisson(a; λ_A) · τ(h, a)

τ(0,0) = 1 - λ_H · λ_A · ρ      with ρ = -0.12
τ(0,1) = 1 + λ_H · ρ
τ(1,0) = 1 + λ_A · ρ
τ(1,1) = 1 - ρ
τ(h,a) = 1   otherwise

The cross-term subtracts half the opponent's baseline rate so a strong team facing a strong defence still scores less than the same strong team facing a weak defence. This isn't the original Dixon-Coles attack/defence parameterisation — we collapse to a single Elo per team for simplicity — but it produces statistically similar outputs at the international-fixture level.

The τ adjustment is the Dixon-Coles correction. Without it, independent Poisson under-predicts draws (specifically 0-0 and 1-1) at the rates actually observed in football data. ρ = -0.12 is the value the original paper estimated, replicated by many subsequent studies.

Outcome probabilities sum the score grid:

P(home win) = Σ_{h > a} P(h, a)
P(draw)     = Σ_{h = a} P(h, a)
P(away win) = Σ_{h < a} P(h, a)

4. Model 2 — Elo baseline

The simpler companion model uses the standard Elo win-probability formula. With effective Elo ratings adjusted for home advantage (see §5):

Δ = Elo_H_eff - Elo_A_eff
p_home_no_draw = 1 / (1 + 10^(-Δ / 400))
p_draw = 0.25     (empirical base rate for World Cup matches)
P(home win) = p_home_no_draw · (1 - p_draw)
P(away win) = (1 - p_home_no_draw) · (1 - p_draw)

λ_H = max(0.15, 1.35 + Δ/1000 + home_adv(H))
λ_A = max(0.15, 1.35 - Δ/1000 + home_adv(A))

This is a textbook Elo system with a flat draw share. It will under-fit certain effects (no low-score correlation, draws not endogenous to team strength) but it doubles as a sanity check on Dixon-Coles. When the two models agree, we're more confident. When they disagree by more than 10 percentage points, we flag the match.

5. Home advantage

Three regimes:

Host home advantage (Mexico, Canada, USA playing on their own soil): +0.30 expected goals (Dixon-Coles) or +100 Elo points (Elo baseline). The continental average for the empirical World Cup home effect is between 0.25 and 0.40 goals; we sit at the middle.
Continental home advantage (CONCACAF teams playing on CONCACAF soil but not at their literal home): +0.10 goals or +40 Elo. Mexico playing in Boston gets a smaller boost than Mexico playing in Mexico City.
Neutral: zero. A European team playing another European team in Toronto gets no home effect.

We do not adjust for altitude, climate, or specific venue effects in v1. These are real but second-order and difficult to defend without far more data than we have.

6. Group-stage standings

For each group, expected points per team is the sum across three matches of:

xPts = 3 · P(win) + 1 · P(draw)

The expected-points table gives a reasonable ordering of likely group winners and runners-up but it underestimates variance. A team with xPts=4.5 and another with xPts=4.2 are statistically very close; the difference is small relative to the standard error.

6½. Tournament-wide Monte Carlo (v1.1)

For knockout-stage projections and the "champion%" numbers on the front page, we run a full tournament simulation 10,000 times. In each path:

For every group-stage fixture, sample home and away goals from independent Poissons with the Dixon-Coles λ values. Tally points, GD, GF.
Determine the top 2 from each group plus the 8 best 3rd-placed teams (sorted by points, GD, GF with a uniform random tie-breaker).
Pair the 32 qualified teams into the knockout bracket (v1.1 uses randomised seeded pairing; v1.2 will adopt the exact FIFA bracket pairing once published per group).
For each knockout match, sample the outcome from the ensemble win/draw/loss probabilities. Resolve draws to extra-time/penalties by picking the higher-Elo side, or a 50/50 toss when the Elo gap is under 25 points.
Continue through R16, QF, SF, final. Record which team reached each stage.

With 10,000 paths, the Monte Carlo standard error on a 10% probability is roughly √(0.1·0.9/10,000) ≈ 0.3 percentage points — small enough that the headline numbers are stable across re-runs. We use a fixed seed (42) so the reported numbers are exactly reproducible.

The Monte Carlo runs are written to lib/probability-lab/predictions-snapshot.json before kick-off and committed to the repository. That file is the immutable audit trail — its git history is the proof we didn't edit predictions after the fact.

7. Accuracy metrics

After each match we will compute and publish:

Brier score (multi-class): mean squared error of the probability vector versus the one-hot outcome. Lower is better. Random guessing on three outcomes gives ~0.67; a perfect model gives 0.
Log-loss: negative log of the probability the model assigned to the actual outcome. Lower is better. Penalises confident wrong predictions heavily.
Calibration plot: bucket predictions by predicted probability (0-10%, 10-20%, …), then check the empirical frequency in each bucket. If the model is well-calibrated, the points hug the diagonal.
Accuracy on the predicted winner: percentage of matches where the favoured team actually won. A blunter measure but the one most readers immediately want to know.

8. Known limitations

Elo ratings are slow to update. A team peaking in form at tournament time (Croatia 2018, Morocco 2022) will be undervalued by Elo entering the tournament.
We do not model squad-level effects. A key injury in the days before a match cannot be reflected.
Knockout-stage Monte Carlo (v1.1) resolves draws to extra-time/penalties by a higher-Elo rule with a 50/50 toss for tight matches (<25 Elo gap). This is a simplification; in reality, penalty-shootout outcomes are closer to a true 50/50 with a small home-side effect. v1.2 will model extra time and penalties as separate stages.
The knockout-bracket pairing in v1.1's Monte Carlo is a randomised seeded pairing rather than the exact FIFA bracket. v1.2 will adopt the published bracket per group-position.
The ensemble averages two correlated models. A genuinely independent third model — a neural network on historical fixtures, perhaps — could materially improve calibration. v2.
No betting-market data, by design. If you wanted a number that's sharper than what we publish, the betting market is probably more accurate; we are intentionally not trying to compete with that. We are trying to publish a transparent, citeable, applied-probability artefact.

9. References

Dixon, M. J. & Coles, S. G. (1997). "Modelling Association Football Scores and Inefficiencies in the Football Betting Market." Journal of the Royal Statistical Society: Series C, 46(2): 265-280.
Elo, A. E. (1978). The Rating of Chessplayers, Past and Present. Arco. (The original Elo formulation; football adaptations widely documented since.)
Karlis, D. & Ntzoufras, I. (2003). "Analysis of Sports Data by Using Bivariate Poisson Models." Journal of the Royal Statistical Society: Series D, 52(3): 381-393.
FIFA. FIFA Men's World Ranking — Procedure. inside.fifa.com/fifa-world-ranking/men
World Football Elo Ratings. eloratings.net.

10. Reproducing this

Everything is open. The data file is at lib/probability-lab/world-cup-2026.ts and the prediction engine at lib/probability-lab/predictor.ts. Both have type signatures and inline comments matching the equations above. If you find an error, please open an issue or send a correction via the contribute page.

Methodology.