Soccer Betting Model: xG Play-by-Play Simulation

The V16.3 play-by-play (PBP) engine simulates soccer matches using FBref expected goals data, isotonic probability calibration, formation-aware tactics, and Both Teams To Score (BTTS) modeling. It covers the Premier League, La Liga, Bundesliga, Serie A, Ligue 1, Champions League, and international competitions.

Engine Overview

Soccer is a low-scoring sport with a three-way market: home win, draw, or away win. This creates unique modeling challenges. The draw is inherently difficult to predict because it requires both teams to score the same number of goals, yet its base rate (roughly 25% across major leagues) is too high to ignore, so any model must handle it explicitly. Many soccer models either ignore draws or treat them as an afterthought. The V16.3 PBP engine models all three outcomes from first principles.

The engine is built on expected goals (xG), one of the most predictive team-level metrics in soccer analytics. xG measures the quality of scoring chances created, stripping out the randomness of finishing. A team that generates 2.5 xG per match but scores only 1.8 goals is likely to see its scoring rise toward its xG (positive regression). A team that outperforms its xG by a wide margin is likely to see its scoring fall back (negative regression). By simulating matches from xG distributions rather than goal distributions, the model captures underlying team quality and anticipates regression more quickly than models based on raw results.

How the Soccer Simulation Works

1. FBref Expected Goals (xG) Integration

FBref provides some of the most comprehensive publicly available xG data in soccer, covering all major European leagues and international competitions. The engine uses team-level xG for (a measure of offensive quality) and xG against (a measure of defensive quality) as the primary inputs for simulation.

These are not simple averages. The engine decomposes xG into shot-type categories: open play, set pieces (corners, free kicks), and penalties. Each category follows a different probability distribution. Open-play xG is relatively stable and predictive. Set-piece xG is higher-variance and partially coachable. Penalty xG depends on match-specific factors (playing style, referee tendencies). By modeling each category separately, the simulation produces more realistic goal distributions than models that use aggregate xG.
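The per-category idea can be sketched in a few lines. This is a minimal illustration, not the engine's code: it assumes open-play and set-piece goals are Poisson-distributed around their xG, and that penalties are awarded at a Poisson rate and converted with a fixed probability. The function names, the 0.76 conversion rate, and the input values are all hypothetical.

```python
import math
import random

def sample_poisson(lam, rng):
    """Draw from Poisson(lam) with Knuth's method (fine for small means)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

def sample_team_goals(xg_open, xg_set_piece, expected_penalties, rng,
                      pen_conversion=0.76):
    """Sample one team's goals by summing three separately modeled sources.

    pen_conversion is an assumed penalty conversion rate, not an engine value.
    """
    open_goals = sample_poisson(xg_open, rng)
    set_piece_goals = sample_poisson(xg_set_piece, rng)
    # Penalties: first sample how many are awarded, then convert each one.
    n_pens = sample_poisson(expected_penalties, rng)
    pen_goals = sum(rng.random() < pen_conversion for _ in range(n_pens))
    return open_goals + set_piece_goals + pen_goals

rng = random.Random(42)
goals = [sample_team_goals(1.1, 0.3, 0.15, rng) for _ in range(10_000)]
print(sum(goals) / len(goals))  # sample mean of simulated goals
```

Keeping the sources separate means each one can be given its own variance or its own match-specific adjustment (for example, a referee-dependent penalty rate) without touching the others.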

2. Score-State Simulation

Soccer tactics change dramatically based on the current score. A team leading 1-0 in the 70th minute will sit deeper, concede possession, and reduce their xG rate. A team trailing 1-0 will push forward, commit more players to attack, and increase both their xG for and their xG against. This score-state dependency is critical for accurate simulation.

The engine models minute-by-minute xG rates that adjust based on the current score differential. When the simulation places one team ahead, the subsequent minutes reflect the tactical shift: the leading team's defensive xG improves while its offensive xG decreases. Trailing teams see the opposite effect. This produces realistic game flow, including the common pattern where late goals by trailing teams are more frequent than early goals by leading teams.
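A minute-by-minute loop with a score-dependent rate might look like the sketch below. It is an assumption-laden illustration: each minute is treated as a Bernoulli trial, only the attacking rate is scaled, and the 0.85 and 1.15 multipliers are hypothetical placeholders rather than the engine's fitted values.

```python
import random

def state_multiplier(goal_diff, leading_factor=0.85, trailing_factor=1.15):
    """Scale a team's attacking rate by whether it leads, trails, or is level.

    Both factors are hypothetical; a real model would fit them from data.
    """
    if goal_diff > 0:
        return leading_factor
    if goal_diff < 0:
        return trailing_factor
    return 1.0

def simulate_match(home_xg, away_xg, rng, minutes=95):
    """Simulate one match minute by minute; returns (home_goals, away_goals).

    minutes=95 stands in for 90 minutes plus stoppage time.
    """
    home, away = 0, 0
    for _ in range(minutes):
        diff = home - away
        # Per-minute scoring probability: baseline rate times state adjustment.
        p_home = home_xg / minutes * state_multiplier(diff)
        p_away = away_xg / minutes * state_multiplier(-diff)
        if rng.random() < p_home:
            home += 1
        if rng.random() < p_away:
            away += 1
    return home, away

rng = random.Random(7)
print(simulate_match(1.6, 1.1, rng))  # one simulated scoreline
```

Because the multiplier is re-evaluated every minute, a goal changes the rates for the rest of the match, which is what lets the score state feed back into the simulated game flow.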

3. Three-Way Market Modeling

Soccer's three-way moneyline (1X2) requires explicit draw probability estimation. The engine handles this by simulating the full 90-minute match (plus stoppage time) and recording whether the result is a home win, draw, or away win. Draws emerge naturally from the simulation: they occur when both teams score the same number of goals after the xG-driven scoring process completes.

A critical modeling rule: in three-way moneyline betting, a draw is a LOSS for any non-draw bet, not a push. This is different from spread betting and must be accounted for in edge calculations. The engine tracks draw probability explicitly and includes it in the expected value calculation for moneyline bets.
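The draw-as-loss rule is easy to get wrong in code, so a small sketch may help. The helper names and the toy scorelines below are hypothetical; the expected-value formula is the standard one for decimal odds, where a winning bet returns the stake times the odds and every other outcome loses the stake.

```python
def tally_outcomes(scores):
    """Turn simulated (home_goals, away_goals) pairs into 1X2 probabilities."""
    n = len(scores)
    home = sum(h > a for h, a in scores)
    draw = sum(h == a for h, a in scores)
    away = n - home - draw
    return {"home": home / n, "draw": draw / n, "away": away / n}

def moneyline_ev(p_win, decimal_odds):
    """Expected profit per unit staked on a three-way moneyline.

    Any non-winning outcome, including a draw, loses the full stake.
    """
    return p_win * (decimal_odds - 1.0) - (1.0 - p_win)

# Toy scorelines, for illustration only.
scores = [(2, 1), (1, 1), (0, 1), (1, 0)]
probs = tally_outcomes(scores)
print(probs)  # {'home': 0.5, 'draw': 0.25, 'away': 0.25}

odds_home = 2.10
correct_ev = moneyline_ev(probs["home"], odds_home)
# Wrong on purpose: treats the draw as a refunded push instead of a loss.
push_ev = probs["home"] * (odds_home - 1.0) - probs["away"]
print(round(correct_ev, 3), round(push_ev, 3))  # about 0.05 vs about 0.30
```

In this toy case, treating the draw as a push would inflate the apparent edge from about 5% to about 30% of the stake.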

4. BTTS (Both Teams To Score) Modeling

The BTTS market is one of the most popular soccer betting markets and requires its own modeling layer. The engine tracks whether each team scores at least one goal across all simulations. This is not simply a function of the total goals: a game with 3.0 expected total goals could have very different BTTS probabilities depending on whether the goals are evenly distributed (high BTTS) or concentrated in one team (low BTTS).

The BTTS model considers each team's scoring frequency (the percentage of games in which it scores), its opponent's clean sheet rate, and the specific xG matchup. A team with high xG for per game that faces a defensively solid opponent will have a different BTTS probability than the raw averages suggest.
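To see why the split of goals matters more than the total, consider a simplified closed-form check. It assumes the two teams' goals are independent Poisson variables, which is a simplification introduced here for illustration and not necessarily how the engine computes BTTS; the function names are hypothetical.

```python
import math

def btts_probability(home_lambda, away_lambda):
    """P(both teams score) under independent Poisson goal counts.

    P(team scores at least once) = 1 - exp(-lambda).
    """
    return (1 - math.exp(-home_lambda)) * (1 - math.exp(-away_lambda))

def btts_rate(scores):
    """Empirical BTTS frequency from simulated (home_goals, away_goals) pairs."""
    return sum(h > 0 and a > 0 for h, a in scores) / len(scores)

# Both matchups have 3.0 expected total goals.
print(round(btts_probability(1.5, 1.5), 2))  # even split: roughly 0.60
print(round(btts_probability(2.5, 0.5), 2))  # lopsided split: roughly 0.36
```

The second helper, `btts_rate`, corresponds more directly to the simulation-counting approach described above: it just counts the simulated matches in which both sides scored.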

5. Isotonic Calibration

Raw simulation probabilities for soccer systematically overestimate the probability of the modeled favorite and underestimate draw frequency. This is a known calibration issue in soccer modeling. The engine applies isotonic regression calibration trained on thousands of historical matches, mapping raw simulation probabilities to empirically observed outcome frequencies.

Isotonic calibration is a non-parametric method that preserves the rank ordering of probabilities while adjusting their absolute values. If the raw model assigns a 55% home win probability, but matches it has historically rated at 55% ended in a home win only 51% of the time, the calibrator maps 55% to 51%. This ensures that the probabilities used for edge detection and Kelly sizing are as accurate as possible.
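The mapping can be illustrated with a small pool-adjacent-violators (PAVA) routine, the standard algorithm behind isotonic regression. This is a pure-Python sketch with hypothetical function names and synthetic data constructed to mirror the 55% to 51% example; it is not the engine's calibrator.

```python
from bisect import bisect_left

def fit_isotonic(raw_probs, outcomes):
    """Fit a non-decreasing step function from raw probability to hit rate."""
    # Aggregate outcomes that share the same raw probability.
    agg = {}
    for x, y in zip(raw_probs, outcomes):
        s, c = agg.get(x, (0.0, 0))
        agg[x] = (s + y, c + 1)
    blocks = []  # each block: [outcome_sum, count, largest_raw_prob]
    for x in sorted(agg):
        s, c = agg[x]
        blocks.append([s, c, x])
        # Pool with the previous block while its mean exceeds the new mean.
        while (len(blocks) > 1 and
               blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s2, c2, x2 = blocks.pop()
            blocks[-1][0] += s2
            blocks[-1][1] += c2
            blocks[-1][2] = x2
    uppers = [b[2] for b in blocks]
    values = [b[0] / b[1] for b in blocks]
    return uppers, values

def calibrate(p, uppers, values):
    """Look up the calibrated probability for a raw probability p."""
    i = min(bisect_left(uppers, p), len(values) - 1)
    return values[i]

# Synthetic history: 100 matches at each raw level with 44, 51, and 60 home wins.
raw = [0.45] * 100 + [0.55] * 100 + [0.65] * 100
won = [1] * 44 + [0] * 56 + [1] * 51 + [0] * 49 + [1] * 60 + [0] * 40
uppers, values = fit_isotonic(raw, won)
print(calibrate(0.55, uppers, values))  # 0.51
```

When a higher raw probability has a lower observed hit rate than a lower one, the two groups are pooled into a single average, which is how the rank ordering is preserved. In practice a library implementation such as scikit-learn's `IsotonicRegression` would likely be a better choice than hand-rolled code.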

6. Formation and Tactical Analysis

Different formations create different attacking and defensive profiles. A 4-3-3 generates more wing play and crossing chances, while a 3-5-2 tends to create chances through central combinations. The engine uses formation data to adjust xG distributions based on the specific tactical setup each team deploys.

Formation data is particularly valuable for cup matches, international fixtures, and games where a team is expected to change their usual setup (e.g., parking the bus against a significantly stronger opponent). These tactical adjustments can shift xG distributions by 10-15%, which is material for betting purposes.
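As a rough illustration of why a 10-15% shift is material, the sketch below scales a team's xG by a formation multiplier and compares the chance of being held scoreless before and after. The multiplier table, the formation labels chosen, and the Poisson assumption are all hypothetical and only serve to show the size of the effect.

```python
import math

# Hypothetical multipliers for illustration; not values used by the engine.
FORMATION_XG_MULTIPLIER = {
    "4-3-3": 1.00,
    "3-5-2": 1.00,
    "5-4-1": 0.85,  # assumed low-block setup, 15% less attacking xG
}

def adjusted_xg(base_xg, formation):
    """Scale a team's baseline xG by its formation multiplier (default 1.0)."""
    return base_xg * FORMATION_XG_MULTIPLIER.get(formation, 1.0)

def prob_no_goal(xg):
    """P(team fails to score), assuming Poisson-distributed goals."""
    return math.exp(-xg)

base = 1.4
low_block = adjusted_xg(base, "5-4-1")
print(round(prob_no_goal(base), 3), round(prob_no_goal(low_block), 3))
```

Under these assumptions, a 15% cut in xG raises the team's chance of failing to score from about 25% to about 30%, which moves BTTS, totals, and 1X2 prices together.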

League Coverage

Premier League

England's top flight. Highest-quality xG data, most liquid betting markets, 380 matches per season.

La Liga

Spain. Possession-heavy league with distinct home/away splits and strong tactical diversity.

Bundesliga

Germany. Highest-scoring major league, strong pressing culture, significant home advantage.

Serie A

Italy. Tactically sophisticated, lower-scoring, historically strong defensive coaching tradition.

Ligue 1

France. High variance between top and bottom teams creates spread and total value.

Champions League

Europe's premier club competition. Cross-league matchups create unique modeling opportunities.

Soccer Performance

Engine Version: V16.3
Sims Per Game: 10,000
Core Metric: xG-Based
Coverage: 6 Leagues

Explore Other Models

NBA Model

Possession MC V5.0.2 — possession-by-possession simulation with Beta shooting distributions.

CBB Model

Savant Ultra v5.0.1 — 5-on-5 player simulation with EvanMiya BPR ratings.

NHL Model

V19.1 Pinnacle — MoneyPuck xG with real danger zones and per-zone goalie modeling.

NFL Model

Elite V1.1 Pinnacle — EPA metrics with CDF edge calculation and drive sequencing.

MLB Model

Elite Matchup V4.2 — count-state simulation with catcher framing and park factors.

LoL Model

Championship v2.1 — 5-layer Glicko-2 with market blend and patch-aware meta.


See Today's Soccer Picks

View xG-driven soccer picks across major European leagues with isotonic-calibrated probabilities and Kelly-optimized sizing.
