Every prediction runs through the five steps below (the last one is optional), and their outputs are combined into a final result.
1. Data collection: last 10 matches for each team plus H2H history, fetched from the API
2. Weighted team profiles: decay weighting (0.9ⁿ), so recent games count more
3. Poisson model: expected goals → 6×6 score matrix → Win / Draw / Loss %
4. Rule-based markets: Goals, Corners, Cards, BTTS thresholds with confidence scores
5. ML overlay (optional): Random Forest trained on real historical matches
What data is collected
- Last 10 home team matches from the API
- Last 10 away team matches
- Last 10 head-to-head meetings between the two teams
Data providers (35+ leagues)
| Provider | Leagues | Key required |
| --- | --- | --- |
| ESPN (unofficial) | 11 leagues: PL, La Liga, Bundesliga, Serie A, Ligue 1, MLS, Eredivisie, Primeira Liga, Super Lig, Champions League, Europa League | No |
| OpenLigaDB | 3 German leagues: Bundesliga, 2. Bundesliga, 3. Liga | No |
| football-data.org | Top European leagues: PL, Bundesliga, Serie A, La Liga, Ligue 1, Eredivisie, Primeira Liga, Brasileirão, CL, EL | Free tier |
| API-Football.com | Championship, Ligue 2 + extras | Yes |
| TheSportsDB | MLS, Brasileirão, and others | Free tier |
Exponential decay weighting
Matches are sorted oldest → newest. Each gets a weight of 0.9ⁿ, where n is the number of matches played since then (n = 0 for the most recent), so the most recent game counts the most:
weights = [0.9⁴, 0.9³, 0.9², 0.9¹, 0.9⁰]
= [0.656, 0.729, 0.810, 0.900, 1.000]
A goal in the most recent match is worth ~52% more than a goal from the fifth most recent match (1.000 ÷ 0.656 ≈ 1.52), so the model reacts faster to streaks.
avgGoals = Σ(weight × goals) ÷ Σ(weights)
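The weighted average above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: the function name `decay_weighted_average` and the sample goal counts are assumptions made for the example.

```python
def decay_weighted_average(values, decay=0.9):
    """Weighted mean of `values`, ordered oldest -> newest.

    Each value gets weight decay**n, where n is the number of matches
    played since (n = 0 for the most recent match).
    """
    if not values:
        raise ValueError("need at least one match")
    count = len(values)
    weights = [decay ** (count - 1 - i) for i in range(count)]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


# Hypothetical goals scored in the last five matches, oldest first.
goals = [0, 1, 1, 2, 3]
print(round(decay_weighted_average(goals), 3))
```

Because the recent high-scoring matches carry the largest weights, the result sits above the plain mean of 1.4 for this sample.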
Daily Picks & Weekly Picks
The Today's Games landing page and Daily/Weekly Picks features automatically scan all 35+ leagues, predict every fixture, and surface the highest-confidence bets across all markets (goals, BTTS, corners, outcome). No manual league selection needed.
Step 1: Expected goals (λ)
Using a Dixon-Coles / Maher-inspired formula:
λhome = homeAttack × awayDefence × 1.1
λaway = awayAttack × homeDefence
The 1.1 multiplier captures home advantage (+10%). If venue-specific stats are available, it drops to 1.03, since most of the advantage is already reflected in the home/away split averages.
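A small sketch of the two formulas, assuming the attack and defence ratings are plain numbers that have already been computed. How those ratings are derived or normalised is not stated here, so the function name, the `venue_split` flag, and the sample values are all hypothetical.

```python
HOME_ADVANTAGE = 1.1         # default multiplier from the text
HOME_ADVANTAGE_SPLIT = 1.03  # used when venue-specific stats are available


def expected_goals(home_attack, home_defence, away_attack, away_defence,
                   venue_split=False):
    """Return (lambda_home, lambda_away) for one fixture."""
    factor = HOME_ADVANTAGE_SPLIT if venue_split else HOME_ADVANTAGE
    lam_home = home_attack * away_defence * factor
    lam_away = away_attack * home_defence
    return lam_home, lam_away


# Hypothetical ratings, for illustration only.
print(expected_goals(1.6, 0.9, 1.2, 1.1))
print(expected_goals(1.6, 0.9, 1.2, 1.1, venue_split=True))
```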
Step 2: Poisson formula
Probability of scoring exactly k goals given expected rate λ:
P(X = k) = e^(−λ) × λ^k / k!
Example: λ = 1.8 expected goals:
P(0 goals) = e⁻¹·⁸ × 1.8⁰ / 0! = 16.5%
P(1 goal) = e⁻¹·⁸ × 1.8¹ / 1! = 29.8%
P(2 goals) = e⁻¹·⁸ × 1.8² / 2! = 26.8%
P(3 goals) = e⁻¹·⁸ × 1.8³ / 3! = 16.1%
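The worked example can be reproduced with the standard library alone. The helper name `poisson_pmf` is chosen for this sketch.

```python
import math


def poisson_pmf(k, lam):
    """P(X = k) for a Poisson variable with expected rate lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


for k in range(4):
    print(f"P({k} goals) = {poisson_pmf(k, 1.8):.1%}")
# P(0 goals) = 16.5%
# P(1 goals) = 29.8%
# P(2 goals) = 26.8%
# P(3 goals) = 16.1%
```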
Step 3: Score matrix (6×6)
Home and away goals are treated as independent, so every scoreline probability is:
P(2–1) = P(home = 2) × P(away = 1)
This creates a 6×6 grid (0 to 5 goals per side). The cell with the highest value is the most likely score.
Step 4: Sum the matrix
Home Win = Σ P(h,a) where h > a
Draw = Σ P(h,a) where h = a (diagonal)
Away Win = Σ P(h,a) where h < a
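The score matrix and the three sums can be sketched as follows. This is an illustration under stated assumptions: the grid is taken to cover 0 to 5 goals per side, the function names are invented for the example, and the λ values are hypothetical. Because scorelines above 5 goals are cut off, the three probabilities add up to slightly less than 1; whether the real engine renormalises them is not stated.

```python
import math

MAX_GOALS = 5  # assumed: scores 0..5 per side give the 6x6 grid


def poisson_pmf(k, lam):
    """P(X = k) for a Poisson variable with expected rate lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def score_matrix(lam_home, lam_away):
    """matrix[h][a] = P(home = h) * P(away = a), assuming independence."""
    home = [poisson_pmf(h, lam_home) for h in range(MAX_GOALS + 1)]
    away = [poisson_pmf(a, lam_away) for a in range(MAX_GOALS + 1)]
    return [[ph * pa for pa in away] for ph in home]


def outcome_probabilities(matrix):
    """Sum the cells where h > a, h == a, and h < a."""
    cells = [(h, a, p) for h, row in enumerate(matrix) for a, p in enumerate(row)]
    return {
        "home_win": sum(p for h, a, p in cells if h > a),
        "draw": sum(p for h, a, p in cells if h == a),
        "away_win": sum(p for h, a, p in cells if h < a),
    }


def most_likely_score(matrix):
    """Return (home_goals, away_goals) of the highest-probability cell."""
    cells = [(h, a, p) for h, row in enumerate(matrix) for a, p in enumerate(row)]
    h, a, _ = max(cells, key=lambda cell: cell[2])
    return h, a


matrix = score_matrix(1.8, 1.1)  # hypothetical expected goals
print(outcome_probabilities(matrix))
print(most_likely_score(matrix))
```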
Over / Under 2.5 goals
Under 2.5 = P(total ≤ 2) = P(0) + P(1) + P(2)
Over 2.5 = 1 − Under 2.5
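One way to compute this is to add up every scoreline whose total is 2 or fewer, again treating home and away goals as independent Poisson variables. The function name and the λ values below are assumptions for the sketch.

```python
import math


def poisson_pmf(k, lam):
    """P(X = k) for a Poisson variable with expected rate lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def over_under_25(lam_home, lam_away):
    """Under 2.5 = sum of all scorelines with h + a <= 2; Over is the complement."""
    under = sum(
        poisson_pmf(h, lam_home) * poisson_pmf(a, lam_away)
        for h in range(3)
        for a in range(3)
        if h + a <= 2
    )
    return {"under_2_5": under, "over_2_5": 1.0 - under}


print(over_under_25(1.8, 1.1))  # hypothetical expected goals
```

Since the sum of two independent Poisson variables is itself Poisson with rate λhome + λaway, the same Under 2.5 value can also be read off a single Poisson distribution for the match total.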
Both Teams to Score (BTTS)
BTTS Yes = P(home ≥ 1) × P(away ≥ 1)
= (1 − P(home=0)) × (1 − P(away=0))
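Under the Poisson model, P(0 goals) is simply e^(−λ), so the BTTS formula reduces to a one-liner. The function name and inputs are illustrative.

```python
import math


def btts_yes(lam_home, lam_away):
    """P(both teams score) = (1 - P(home = 0)) * (1 - P(away = 0))."""
    p_home_blank = math.exp(-lam_home)  # Poisson P(0) for the home side
    p_away_blank = math.exp(-lam_away)  # Poisson P(0) for the away side
    return (1 - p_home_blank) * (1 - p_away_blank)


print(f"BTTS Yes: {btts_yes(1.8, 1.1):.1%}")  # hypothetical expected goals
```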
Corners & Cards (rule-based)
| Market | Rule | Confidence |
| --- | --- | --- |
| Corners O/U 9.5 | Over if combined avg > 9.5 | min(abs(diff) / 4, 1) |
| Cards O/U 3.5 | Over if combined avg > 3.5 | min(abs(diff) / 2, 1) |
| Goals O/U 2.5 | Over if combined avg > 2.5 | min(abs(diff) / 1.5, 1) |
Here diff = combined avg − line, so confidence measures how far the average sits from the line. For Goals O/U 2.5: combined avg 4.5 → confidence 1.0; combined avg 2.6 → confidence 0.07.
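The table translates directly into a lookup plus one formula. In this sketch the dictionary keys and the function name are made up for illustration, and the combined average is assumed to be computed elsewhere.

```python
# market -> (line, divisor used to scale the distance from the line)
MARKETS = {
    "corners": (9.5, 4.0),
    "cards": (3.5, 2.0),
    "goals": (2.5, 1.5),
}


def rule_based_pick(market, combined_avg):
    """Return (pick, confidence) for one over/under market."""
    line, scale = MARKETS[market]
    diff = combined_avg - line
    pick = "Over" if diff > 0 else "Under"
    confidence = min(abs(diff) / scale, 1.0)
    return pick, confidence


print(rule_based_pick("goals", 4.5))    # ('Over', 1.0)
pick, conf = rule_based_pick("goals", 2.6)
print(pick, round(conf, 2))             # Over 0.07
```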
Overall confidence level
< 5 matches for either team → LOW
5–8 matches → MEDIUM
> 8 matches for both teams → HIGH
An optional Python ML microservice runs alongside the Poisson engine and gives a second opinion on each market.
What it predicts
- Over / Under 2.5 goals
- Both Teams to Score (BTTS)
- Corners Over / Under 9.5
How it's trained
Features fed into the model:
home_avg_goals_for / against away_avg_goals_for / against
home_avg_corners away_avg_corners
home_form_rating (0–1) away_form_rating (0–1)
home_win_rate away_win_rate
h2h_avg_total_goals h2h_played
poisson_over25_prob poisson_btts_prob
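Training data is built from real historical matches (PL, Bundesliga, Serie A, La Liga, Ligue 1; 4 seasons). Each row's features are computed over a rolling window of matches played before that fixture, so only pre-match information is used and the result being predicted cannot leak into the features.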
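The leakage-free idea can be illustrated with a short sketch: walk through matches in date order, build each row from the teams' earlier results only, and update the history after the row is written. The match dictionary keys, the function name, and the two features shown are assumptions for this example, not the project's actual schema.

```python
from collections import defaultdict, deque


def build_training_rows(matches, window=10):
    """Build one feature row per match using only earlier matches."""
    history = defaultdict(lambda: deque(maxlen=window))  # team -> recent goals scored
    rows = []
    for match in sorted(matches, key=lambda m: m["date"]):
        home_hist = history[match["home"]]
        away_hist = history[match["away"]]
        if home_hist and away_hist:  # skip fixtures with no prior data
            rows.append({
                "home_avg_goals_for": sum(home_hist) / len(home_hist),
                "away_avg_goals_for": sum(away_hist) / len(away_hist),
                "over25": int(match["home_goals"] + match["away_goals"] > 2),
            })
        # Update the history only after the row is built, so the match
        # being predicted never contributes to its own features.
        home_hist.append(match["home_goals"])
        away_hist.append(match["away_goals"])
    return rows


# Hypothetical results, for illustration only.
sample = [
    {"date": "2024-01-01", "home": "A", "away": "B", "home_goals": 2, "away_goals": 1},
    {"date": "2024-01-08", "home": "B", "away": "A", "home_goals": 0, "away_goals": 3},
]
print(build_training_rows(sample))
```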
Algorithm
Random Forest (200 trees, max depth 8), with one classifier per target market. Each model is evaluated by AUC under 5-fold cross-validation. XGBoost is also available via the /train endpoint.
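For orientation, here is a minimal scikit-learn sketch of that setup: one forest per market, scored by 5-fold cross-validated AUC. It is not the service's actual training code. The data is synthetic (generated with `make_classification`), the target names are hypothetical, and the same placeholder labels are reused for every market purely to keep the example short.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

TARGETS = ["over25", "btts", "corners_over95"]  # hypothetical label names


def evaluate_market_models(X, labels):
    """Return the mean 5-fold cross-validated AUC of one forest per market."""
    results = {}
    for target, y in labels.items():
        clf = RandomForestClassifier(n_estimators=200, max_depth=8, random_state=0)
        scores = cross_val_score(clf, X, y, cv=5, scoring="roc_auc")
        results[target] = scores.mean()
    return results


# Synthetic stand-in data: 300 rows x 14 features (the number listed above).
X, y = make_classification(n_samples=300, n_features=14, random_state=0)
labels = {target: y for target in TARGETS}  # placeholder labels, not real outcomes
print(evaluate_market_models(X, labels))
```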
To enable
# Terminal 1: start the ML service
cd ml && uvicorn app:app --port 8000
# Terminal 2: build real training data (first time only)
node scripts/build-training-data.js
# .env: enable ML predictions
ML_SERVICE_URL=http://localhost:8000