Backtest · 3 tournaments · 39 matches

What the model got right last time.

Before the 2026 World Cup starts, here's how the same Dixon-Coles + Elo ensemble would have done on the knockout stages of three recently completed major tournaments. The test is out-of-sample: pre-tournament Elo only, no hindsight.

Winner accuracy: 77% (30 of 39)

Mean Brier: 0.168 (coin flip: 0.250)

Mean log loss: 0.515 (coin flip: 0.693)

Naive Elo baseline: 77% (higher-Elo always)

What the numbers mean — honestly.

77% of 39 matches (30 correct picks) is meaningfully better than chance (50%), and the mean Brier score of 0.168 beats the 0.250 a coin-flip forecaster would score. It's in the same range as published academic models on comparable datasets.

The headline winner accuracy is the same as a naive baseline that always picks the higher-Elo side (77%). That's an honest finding: the Dixon-Coles addition doesn't move the picks much because both methods share the same Elo input. Where Dixon-Coles does help is in confidence calibration — the probability assigned to the winner — which shows up in the Brier and log-loss scores rather than the binary pick.

The misses are worth noting. The model failed to predict major upsets: Morocco beating Spain and Portugal at WC 2022, Switzerland beating Italy at Euro 2024, and Uruguay beating Brazil at Copa 2024. Pre-tournament Elo simply doesn't capture the in-tournament form of a side hitting its stride. This is a known limitation, and it can't be fixed without live in-tournament updates.

Sample size warning. 39 matches is not enough to draw strong statistical conclusions about model calibration. The 95% confidence interval on a 77% accuracy estimate from 39 trials is roughly ±13 percentage points. Three tournaments' worth of knockout matches is illustrative, not definitive.
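
The ±13 point figure can be reproduced with a normal-approximation (Wald) interval. This is a minimal sketch that assumes the Wald method was used; the page doesn't say which interval it used, and the function name `waldHalfWidth` is made up for illustration.

```typescript
// Normal-approximation (Wald) half-width for a binomial proportion.
// Assumption: this is the method behind the "roughly ±13 points" figure.
function waldHalfWidth(successes: number, trials: number, z = 1.96): number {
  const p = successes / trials;
  return z * Math.sqrt((p * (1 - p)) / trials);
}

const correct = 30; // correct picks across the three tournaments
const matches = 39; // knockout matches in the backtest
const accuracy = correct / matches;
const halfWidth = waldHalfWidth(correct, matches);

console.log(`accuracy: ${(accuracy * 100).toFixed(1)}%`);             // 76.9%
console.log(`95% half-width: ±${(halfWidth * 100).toFixed(1)} pts`);  // ±13.2 pts
```

With only 39 trials the interval runs from roughly 64% to 90%, which is why the per-tournament differences (81%, 73%, 75%) shouldn't be read as meaningful.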

2022 · 16 matches

2022 FIFA World Cup

First World Cup in the Middle East. Argentina won their third title, beating France on penalties after a 3-3 draw. 16 knockout matches.

Accuracy: 81.3% (13 of 16)

Mean Brier: 0.167

Mean log loss: 0.519

vs naive Elo: 81%

| Round | Match | Predicted | Actual | Conf. |
|---|---|---|---|---|
| R16 | Netherlands v USA | Netherlands | Netherlands | 50% |
| R16 | Argentina v Australia | Argentina | Argentina | 58% |
| R16 | France v Poland | France | France | 51% |
| R16 | England v Senegal | England | England | 49% |
| R16 | Japan v Croatia (pens) | Croatia | Croatia | 71% |
| R16 | Brazil v Korea | Brazil | Brazil | 60% |
| R16 | Morocco v Spain (pens) | Spain | Morocco | 47% |
| R16 | Portugal v Switzerland | Portugal | Portugal | 48% |
| QF | Croatia v Brazil (pens) | Brazil | Croatia | 47% |
| QF | Netherlands v Argentina (pens) | Argentina | Argentina | 71% |
| QF | Morocco v Portugal | Portugal | Morocco | 23% |
| QF | England v France | France | France | 42% |
| SF | Argentina v Croatia | Argentina | Argentina | 52% |
| SF | France v Morocco | France | France | 51% |
| 3P | Croatia v Morocco | Croatia | Croatia | 44% |
| F | Argentina v France (pens) | Argentina | Argentina | 72% |

2024 · 15 matches

UEFA Euro 2024

Hosted by Germany. Spain won a record fourth Euros, beating England 2-1 in the final at Berlin's Olympiastadion. 15 knockout matches (no 3rd-place game).

Accuracy: 73.3% (11 of 15)

Mean Brier: 0.168

Mean log loss: 0.507

vs naive Elo: 73%

| Round | Match | Predicted | Actual | Conf. |
|---|---|---|---|---|
| R16 | Switzerland v Italy | Italy | Switzerland | 36% |
| R16 | Germany v Denmark | Germany | Germany | 41% |
| R16 | England v Slovakia | England | England | 55% |
| R16 | Spain v Georgia | Spain | Spain | 61% |
| R16 | France v Belgium | France | France | 38% |
| R16 | Portugal v Slovenia (pens) | Portugal | Portugal | 83% |
| R16 | Romania v Netherlands | Netherlands | Netherlands | 56% |
| R16 | Austria v Türkiye | Austria | Türkiye | 33% |
| QF | Spain v Germany | Spain | Spain | 43% |
| QF | Portugal v France (pens) | Portugal | France | 62% |
| QF | Netherlands v Türkiye | Netherlands | Netherlands | 51% |
| QF | England v Switzerland (pens) | England | England | 70% |
| SF | Spain v France | Spain | Spain | 40% |
| SF | Netherlands v England | Netherlands | England | 34% |
| F | Spain v England | Spain | Spain | 42% |

2024 · 8 matches

2024 Copa América

Hosted by the United States. Argentina won a record 16th title, beating Colombia 1-0 in extra time at Hard Rock Stadium in Miami Gardens, Florida. 8 knockout matches.

Accuracy: 75.0% (6 of 8)

Mean Brier: 0.173

Mean log loss: 0.519

vs naive Elo: 75%

| Round | Match | Predicted | Actual | Conf. |
|---|---|---|---|---|
| QF | Argentina v Ecuador (pens) | Argentina | Argentina | 82% |
| QF | Venezuela v Canada (pens) | Canada | Canada | 65% |
| QF | Colombia v Panama | Colombia | Colombia | 50% |
| QF | Uruguay v Brazil (pens) | Brazil | Uruguay | 53% |
| SF | Argentina v Canada | Argentina | Argentina | 58% |
| SF | Colombia v Uruguay | Uruguay | Colombia | 27% |
| 3P | Uruguay v Canada (pens) | Uruguay | Uruguay | 76% |
| F | Argentina v Colombia | Argentina | Argentina | 55% |

How the backtest works.

  1. Take each team's Elo rating as of the start of the tournament. Sources: eloratings.net historical snapshots, approximate to ±25 Elo where the exact snapshot wasn't recoverable. Same scale we use for the 2026 predictions.
  2. For every knockout match, run the same Dixon-Coles + Elo ensemble that powers the live WC 2026 page. The model produces P(home win), P(draw), P(away win) for regulation time.
  3. Since knockout matches must produce a winner, collapse the draw probability to whichever side has the higher regulation win probability. This matches how the live bracket simulator resolves draws.
  4. Compare the model's pick to the actual winner (including penalty-shootout outcomes). Compute Brier score and log loss against the binary outcome.
  5. As a sanity baseline, also compute what a naive "always pick higher-Elo" rule would have predicted. If the model doesn't materially beat this baseline, that's an honest finding to report, not to hide.
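
Steps 3 to 5 can be sketched roughly as follows. This is an illustration under stated assumptions, not the contents of backtest-runner.ts: the type and function names (`RegulationProbs`, `collapseDraw`, `brier`, `logLoss`, `naiveEloPick`) are hypothetical, the tie-break toward the home side is an assumption, and the example probabilities are made up rather than taken from any match above.

```typescript
// Hypothetical types and helpers; the real runner may be structured differently.
interface RegulationProbs {
  home: number; // P(home win) in regulation time
  draw: number; // P(draw) in regulation time
  away: number; // P(away win) in regulation time
}

type Side = "home" | "away";

// Step 3: fold the draw probability into whichever side is favoured in regulation.
function collapseDraw(p: RegulationProbs): { pick: Side; pHome: number } {
  const homeFavoured = p.home >= p.away; // tie-break toward home is an assumption
  const pHome = homeFavoured ? p.home + p.draw : p.home;
  return { pick: homeFavoured ? "home" : "away", pHome };
}

// Step 4: binary Brier score, the squared error against the 0/1 outcome.
function brier(pHome: number, winner: Side): number {
  const outcome = winner === "home" ? 1 : 0;
  return (pHome - outcome) ** 2;
}

// Step 4: log loss, the negative log of the probability given to the actual winner.
function logLoss(pHome: number, winner: Side, eps = 1e-12): number {
  const pWinner = winner === "home" ? pHome : 1 - pHome;
  return -Math.log(Math.max(pWinner, eps)); // clamp to avoid log(0)
}

// Step 5: naive baseline that always picks the higher-rated side.
function naiveEloPick(homeElo: number, awayElo: number): Side {
  return homeElo >= awayElo ? "home" : "away";
}

// Made-up example: a regulation-time forecast where the home side then wins.
const example: RegulationProbs = { home: 0.45, draw: 0.28, away: 0.27 };
const { pick, pHome } = collapseDraw(example);
console.log(pick, pHome.toFixed(2));                 // home 0.73
console.log(brier(pHome, "home").toFixed(3));        // 0.073
console.log(logLoss(pHome, "home").toFixed(3));      // 0.315

// A coin-flip forecaster reproduces the reference values quoted on this page.
console.log(brier(0.5, "home"));                     // 0.25
console.log(logLoss(0.5, "home").toFixed(3));        // 0.693
```

Because the naive rule and the ensemble share the same Elo input, `pick` and `naiveEloPick` will usually agree; any difference between the two methods shows up in `pHome`, and therefore in the Brier and log-loss scores.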

The backtest dataset lives in lib/probability-lab/backtest-data.ts and the runner in lib/probability-lab/backtest-runner.ts. Both are committed to git — any reader can verify the predictions match what the live ensemble produces.