Expected goals (xG) models the probability that each shot scores: P(goal) = 1 / (1 + e^-(β₀ + β₁·distance + β₂·angle + …)). Sum shot-level xG to get match-level expected goals, feed that into a Poisson model for match outcome probabilities, and compare against market odds to find edge.
Why This Matters for Agents
An autonomous soccer betting agent needs a way to convert raw match data into win/draw/loss probabilities that are more accurate than the market. Raw goals are terrible for this — they’re high-variance Poisson events where a team averaging 1.3 goals per match scores zero 27% of the time. An agent relying on goal counts needs 50+ matches to get stable estimates. xG cuts that to 8-10 matches by measuring shot quality rather than binary outcomes.
This is Layer 4 — Intelligence. The xG model sits at the core of the agent’s prediction engine. The pipeline: pull historical shot data from StatsBomb, train a logistic regression xG model, aggregate to match-level xG, parameterize a Poisson scoring model, compute match outcome probabilities, compare against live odds from The Odds API, and flag positive EV bets. The output feeds directly into Kelly sizing for bankroll-optimal stake calculation. Every component of the Agent Betting Stack downstream depends on the quality of this Layer 4 model.
The Math
What xG Measures
Expected goals quantifies shot quality. For any shot i, xG_i is the probability that shot results in a goal, conditional on observable features:
xG_i = P(goal | distance_i, angle_i, body_part_i, assist_type_i, game_state_i, ...)
A penalty kick has xG ≈ 0.76. A header from 12 meters off a cross has xG ≈ 0.04. A shot from the edge of the 6-yard box with the keeper out of position has xG ≈ 0.45. The model assigns each shot a probability based on historical conversion rates from similar positions and contexts.
Match-level xG is the sum of all shot-level xG values:
match_xG = Σ xG_i for all shots i in the match
If a team takes 14 shots with individual xG values of [0.03, 0.07, 0.45, 0.02, 0.11, 0.08, 0.04, 0.76, 0.06, 0.03, 0.12, 0.09, 0.05, 0.02], their match xG = 1.93. They “deserved” roughly 1.93 goals based on the quality of chances created.
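In code, that aggregation is a one-liner:

```python
# Shot-level xG values for the 14 shots above
shot_xg = [0.03, 0.07, 0.45, 0.02, 0.11, 0.08, 0.04,
           0.76, 0.06, 0.03, 0.12, 0.09, 0.05, 0.02]

match_xg = sum(shot_xg)
print(round(match_xg, 2))  # 1.93
```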
The Logistic Regression Model
The standard xG model is logistic regression. The target is binary: goal (1) or no goal (0). The model estimates:
P(goal) = 1 / (1 + e^-(β₀ + β₁x₁ + β₂x₂ + ... + βₙxₙ))
Where the sigmoid function maps any linear combination of features to the (0, 1) probability range.
The core features, ranked by predictive importance:
| Feature | Variable | Description | Expected Sign |
|---|---|---|---|
| Distance | x₁ | Meters from shot location to goal center | Negative (farther = less likely) |
| Angle | x₂ | Angle in radians subtended by the goalposts at the shot location | Positive (wider angle = more likely) |
| Body part | x₃ | Binary: 1 = foot, 0 = head/other | Positive (foot shots convert better) |
| Assist type | x₄-x₆ | Dummies: through ball, cross, set piece (baseline: unassisted) | Through ball positive, cross negative |
| Game state | x₇ | Goal difference at time of shot | Slight negative (leading teams take lower-quality shots) |
| Fast break | x₈ | Binary: 1 = counter-attack, 0 = settled play | Positive (disorganized defense) |
| Big chance | x₉ | Binary: 1 = one-on-one or open goal | Strong positive |
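To make the sigmoid concrete — using the same illustrative coefficients as the synthetic-data generator in the Implementation section (these are not fitted values) — a 16-meter foot shot at a 0.4-radian angle evaluates like this:

```python
import math

# Illustrative coefficients (not fitted values): intercept, distance, angle, is_foot
beta = {'intercept': -1.5, 'distance': -0.08, 'angle': 1.2, 'is_foot': 0.3}

# 16 m foot shot at a 0.4 rad angle; all other features zero
log_odds = (beta['intercept']
            + beta['distance'] * 16
            + beta['angle'] * 0.4
            + beta['is_foot'] * 1)           # = -2.0
p_goal = 1 / (1 + math.exp(-log_odds))       # sigmoid maps to (0, 1)
print(round(p_goal, 3))  # 0.119
```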
Computing Shot Angle and Distance
Shot angle — the angle subtended by the two goalposts from the shot location — is the single most important geometric feature. Given shot coordinates (x, y) on a pitch where the goal line is at x = 0 and the goal spans y ∈ [-3.66, 3.66] (standard 7.32m goal width):
Goal posts: P₁ = (0, -3.66), P₂ = (0, 3.66)
Distance to goal center:
d = sqrt(x² + y²)
Angle (using the law of cosines):
a = |P₁ - shot|, b = |P₂ - shot|, c = |P₁ - P₂| = 7.32m
θ = arccos((a² + b² - c²) / (2ab))
Larger angles mean the shooter “sees” more of the goal. A shot from directly in front at 6 meters has a wide angle (~1.10 rad). A shot from a tight angle at the same distance might have θ ≈ 0.25 rad.
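A quick numerical check of the geometry: for a shot straight in front of goal, the law-of-cosines angle must agree with the simpler identity θ = 2·arctan(half-width / distance):

```python
import math

def shot_angle(x, y, goal_width=7.32):
    """Angle subtended by the goalposts from shot location (x, y)."""
    half_w = goal_width / 2
    a = math.hypot(x, y - half_w)   # distance to one post
    b = math.hypot(x, y + half_w)   # distance to the other post
    cos_t = (a * a + b * b - goal_width ** 2) / (2 * a * b)
    return math.acos(max(-1.0, min(1.0, cos_t)))  # clamp for numerical safety

central = shot_angle(6, 0)  # straight in front, 6 m out
assert abs(central - 2 * math.atan(3.66 / 6)) < 1e-12
print(round(central, 2))  # 1.1
```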
From Shot xG to Match Outcomes: The Poisson Bridge
Individual shot xG values aggregate to team-level scoring rates. The key insight: goals in soccer follow a Poisson distribution with parameter λ equal to the team’s xG rate.
For a match between team A (home) and team B (away):
λ_home = home_attack_strength × away_defense_weakness × league_home_avg
λ_away = away_attack_strength × home_defense_weakness × league_away_avg
Where attack strength = team xG per match / league average xG, and defense weakness = opponent xG conceded per match / league average xG conceded.
The probability of any specific scoreline (h goals for home, a goals for away):
P(home = h, away = a) = Poisson(h | λ_home) × Poisson(a | λ_away)
Where Poisson(k | λ) = (λ^k × e^(-λ)) / k!
This assumes independence between home and away scoring — a simplification that holds reasonably well empirically. For the full Poisson derivation and independence assumption analysis, see the Poisson Distribution in Sports Modeling guide.
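The pmf is easy to sanity-check by hand against scipy (the λ = 1.65 here is just an example rate):

```python
import math
from scipy.stats import poisson

lam, k = 1.65, 2
# P(k | lambda) = lambda^k * e^(-lambda) / k!
manual = lam ** k * math.exp(-lam) / math.factorial(k)
print(round(manual, 4))  # 0.2614
assert abs(manual - poisson.pmf(k, lam)) < 1e-12
```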
Match Outcome Probabilities
Sum the scoreline probabilities across all outcomes where home wins, draws, or loses:
P(home win) = Σ P(h, a) for all h > a
P(draw) = Σ P(h, a) for all h = a
P(away win) = Σ P(h, a) for all h < a
In practice, truncate at 8 or 10 goals per side — for any realistic λ, the probability of a team scoring 10+ goals is negligible (about 0.03% at λ = 2.5, and still only ~0.3% even at λ = 3.5).
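That truncation claim is cheap to verify with scipy:

```python
from scipy.stats import poisson

tail = 1 - poisson.cdf(9, 2.5)  # P(X >= 10) for lambda = 2.5
print(f"{tail:.6f}")
assert tail < 0.001  # well below any edge worth modeling
```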
Worked Examples
Example 1: Building xG Rates from Season Data
Manchester City 2024-25 Premier League through 25 matches (data from FBref):
Man City: xG for = 48.3, xG against = 22.1 (25 matches)
Per match: xGF = 1.93, xGA = 0.88
League averages: avg xGF = 1.35, avg xGA = 1.35 (by definition equal)
Home advantage factor: 1.24 (home teams score ~24% more in the Premier League)
Attack strength = 1.93 / 1.35 = 1.430
Defense strength = 0.88 / 1.35 = 0.652
Liverpool 2024-25 through 25 matches:
Liverpool: xG for = 52.8, xG against = 24.5
Per match: xGF = 2.11, xGA = 0.98
Attack strength = 2.11 / 1.35 = 1.563
Defense strength = 0.98 / 1.35 = 0.726
For Man City (home) vs Liverpool (away):
λ_home = City_attack × Liverpool_def_weakness × home_avg
= 1.430 × 0.726 × (1.35 × 1.24)
= 1.430 × 0.726 × 1.674
= 1.738
λ_away = Liverpool_attack × City_def_weakness × away_avg
= 1.563 × 0.652 × (1.35 / 1.24)
= 1.563 × 0.652 × 1.089
= 1.109
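The same λ arithmetic in code — tiny differences from the hand calculation come from rounding the intermediate strength values:

```python
league_avg = 1.35
home_adv = 1.24

city_attack = 1.93 / league_avg   # ~1.430
city_def    = 0.88 / league_avg   # ~0.652
liv_attack  = 2.11 / league_avg   # ~1.563
liv_def     = 0.98 / league_avg   # ~0.726

lam_home = city_attack * liv_def * (league_avg * home_adv)  # ~1.737
lam_away = liv_attack * city_def * (league_avg / home_adv)  # ~1.109
print(round(lam_home, 3), round(lam_away, 3))
```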
Scoreline probabilities (Poisson):
Score P(score) Outcome
0-0 0.0580 Draw
1-0 0.1008 Home win
0-1 0.0643 Away win
2-0 0.0876 Home win
1-1 0.1118 Draw
0-2 0.0357 Away win
2-1 0.0972 Home win
3-0 0.0508 Home win
...
P(Man City win) = 0.521
P(Draw) = 0.237
P(Liverpool win) = 0.242
Now compare against market odds. BetOnline might have Man City at -125 (implied 55.6%), Draw at +300 (25.0%), Liverpool at +250 (28.6%). The model puts City at 52.1%, below the market’s 55.6%, so backing City is negative EV — but neither Draw (23.7% model vs 25.0% implied) nor Liverpool (24.2% vs 28.6%) beats its implied probability either. The agent passes on this match. That is the typical case: the bookmaker’s margin means most matches offer no edge, and the agent only bets when its probability clearly exceeds the market’s implied probability.
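Match outcome probabilities for given λ values can be computed directly with an outer product of the two Poisson pmfs:

```python
import numpy as np
from scipy.stats import poisson

lam_home, lam_away = 1.738, 1.109
goals = np.arange(11)  # truncate at 10 goals per side
# M[h, a] = P(home scores h) * P(away scores a)
M = np.outer(poisson.pmf(goals, lam_home), poisson.pmf(goals, lam_away))

home_win = np.tril(M, -1).sum()  # strictly lower triangle: h > a
draw     = np.trace(M)           # diagonal: h == a
away_win = np.triu(M, 1).sum()   # strictly upper triangle: h < a
print(round(home_win, 3), round(draw, 3), round(away_win, 3))
```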
Example 2: Exploiting xG-to-Goals Divergence
Brighton 2024-25 through 20 matches:
Actual goals scored: 34 (1.70 per match)
xG: 28.6 (1.43 per match)
Overperformance: +5.4 goals (+0.27 per match)
Brighton is outscoring their xG by 18.9%. This kind of finishing overperformance regresses. The empirical data: teams overperforming xG by >15% over 20-match windows revert to within 5% of xG over the next 15 matches roughly 78% of the time.
An agent’s strategy: fade Brighton in totals markets. If the market sets Brighton’s match total at 1.75 based on actual goals, but xG says 1.43, there’s value on the Under. On a sportsbook like BetOnline, Brighton total goals Under 1.5 at +115 (implied 46.5%) is worth investigating when the xG-based Poisson model gives:
P(Brighton scores 0) = e^(-1.43) = 0.239
P(Brighton scores 1) = 1.43 × e^(-1.43) = 0.342
P(Under 1.5) = 0.239 + 0.342 = 0.581
Model says 58.1%, market says 46.5%. That’s +11.6 percentage points of edge — a strong bet by any standard.
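As a check, the Under 1.5 probability is just the Poisson CDF evaluated at 1:

```python
from scipy.stats import poisson

lam = 1.43                           # Brighton's xG per match
p_under_1_5 = poisson.cdf(1, lam)    # P(0 goals) + P(1 goal)
implied = 1 / 2.15                   # +115 American = 2.15 decimal
edge = p_under_1_5 - implied
print(f"model={p_under_1_5:.3f} implied={implied:.3f} edge={edge:.3f}")
```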
Implementation
import numpy as np
from scipy.stats import poisson
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from dataclasses import dataclass
@dataclass
class ShotFeatures:
"""Features for a single shot."""
distance: float # meters from goal center
angle: float # radians, angle subtended by goalposts
is_foot: int # 1 = foot, 0 = head/other
is_through_ball: int # 1 = assisted by through ball
is_cross: int # 1 = assisted by cross
is_set_piece: int # 1 = from set piece
is_fast_break: int # 1 = counter-attack
game_state: int # goal difference at time of shot (positive = winning)
is_big_chance: int # 1 = one-on-one or open goal
def compute_shot_angle(x: float, y: float, goal_width: float = 7.32) -> float:
"""
Compute the angle subtended by the goalposts from shot location.
Pitch coordinates: goal line at x=0, goal center at y=0.
Goal posts at (0, -goal_width/2) and (0, +goal_width/2).
Args:
x: Distance from goal line (meters, positive into pitch)
y: Lateral position (meters, 0 = center of goal)
goal_width: Width of goal in meters (default 7.32)
Returns:
Angle in radians
"""
half_w = goal_width / 2
# Vectors from shot to each post
a_sq = x**2 + (y - half_w)**2 # distance squared to right post
b_sq = x**2 + (y + half_w)**2 # distance squared to left post
c_sq = goal_width**2 # distance between posts squared
a = np.sqrt(a_sq)
b = np.sqrt(b_sq)
# Law of cosines: c² = a² + b² - 2ab*cos(θ)
cos_theta = (a_sq + b_sq - c_sq) / (2 * a * b)
cos_theta = np.clip(cos_theta, -1.0, 1.0) # numerical safety
return np.arccos(cos_theta)
def compute_distance(x: float, y: float) -> float:
"""Distance from shot location to goal center (0, 0)."""
return np.sqrt(x**2 + y**2)
class XGModel:
"""
Expected Goals model using logistic regression.
Train on shot-level data with binary target (1=goal, 0=no goal).
Predict xG for new shots. Aggregate to match-level xG.
"""
def __init__(self):
self.model = LogisticRegression(
max_iter=1000,
C=1.0, # regularization strength
solver='lbfgs'
)
self.is_fitted = False
def _features_to_array(self, shots: list[ShotFeatures]) -> np.ndarray:
"""Convert list of ShotFeatures to numpy array."""
return np.array([
[s.distance, s.angle, s.is_foot, s.is_through_ball,
s.is_cross, s.is_set_piece, s.is_fast_break,
s.game_state, s.is_big_chance]
for s in shots
])
def fit(self, shots: list[ShotFeatures], goals: list[int]) -> dict:
"""
Train the xG model on historical shot data.
Args:
shots: List of ShotFeatures for each shot
goals: List of 0/1 indicating if each shot was a goal
Returns:
Dict with training metrics
"""
X = self._features_to_array(shots)
y = np.array(goals)
self.model.fit(X, y)
self.is_fitted = True
# Cross-validated log loss
cv_scores = cross_val_score(
self.model, X, y, cv=5, scoring='neg_log_loss'
)
feature_names = [
'distance', 'angle', 'is_foot', 'through_ball',
'cross', 'set_piece', 'fast_break', 'game_state', 'big_chance'
]
return {
'n_shots': len(goals),
'goal_rate': np.mean(y),
'cv_log_loss': -cv_scores.mean(),
'coefficients': dict(zip(feature_names, self.model.coef_[0])),
'intercept': self.model.intercept_[0]
}
def predict_xg(self, shots: list[ShotFeatures]) -> np.ndarray:
"""Predict xG for each shot. Returns array of probabilities."""
if not self.is_fitted:
raise RuntimeError("Model not fitted. Call fit() first.")
X = self._features_to_array(shots)
return self.model.predict_proba(X)[:, 1]
def match_xg(self, shots: list[ShotFeatures]) -> float:
"""Sum of xG for all shots — the match-level expected goals."""
return float(np.sum(self.predict_xg(shots)))
class PoissonMatchModel:
"""
Match outcome model using Poisson-distributed goals.
Takes team xG rates as lambda parameters.
Computes scoreline and outcome probabilities.
"""
def __init__(self, max_goals: int = 8):
self.max_goals = max_goals
def compute_lambda(
self,
team_xg_per_match: float,
opponent_xg_conceded_per_match: float,
league_avg_goals: float = 1.35,
home_advantage: float = 1.24,
is_home: bool = True
) -> float:
"""
Compute expected goals (lambda) for a team in a specific match.
Args:
team_xg_per_match: Team's average xG scored per match
opponent_xg_conceded_per_match: Opponent's avg xG conceded per match
league_avg_goals: League average goals per team per match
home_advantage: Multiplicative home advantage factor
is_home: Whether this team is playing at home
Returns:
Lambda (expected goals) for the Poisson distribution
"""
attack = team_xg_per_match / league_avg_goals
defense = opponent_xg_conceded_per_match / league_avg_goals
if is_home:
base = league_avg_goals * home_advantage
else:
base = league_avg_goals / home_advantage
return attack * defense * base
def scoreline_matrix(
self,
lambda_home: float,
lambda_away: float
) -> np.ndarray:
"""
Compute probability matrix for all scorelines up to max_goals.
Returns:
2D array where [h][a] = P(home scores h, away scores a)
"""
home_probs = poisson.pmf(range(self.max_goals + 1), lambda_home)
away_probs = poisson.pmf(range(self.max_goals + 1), lambda_away)
return np.outer(home_probs, away_probs)
def match_probabilities(
self,
lambda_home: float,
lambda_away: float
) -> dict:
"""
Compute home win, draw, away win probabilities.
Returns:
Dict with 'home_win', 'draw', 'away_win' probabilities
and top scorelines.
"""
matrix = self.scoreline_matrix(lambda_home, lambda_away)
n = self.max_goals + 1
home_win = sum(
matrix[h][a] for h in range(n) for a in range(n) if h > a
)
draw = sum(matrix[h][h] for h in range(n))
away_win = sum(
matrix[h][a] for h in range(n) for a in range(n) if h < a
)
# Top 5 most likely scorelines
scorelines = []
for h in range(n):
for a in range(n):
scorelines.append((h, a, matrix[h][a]))
scorelines.sort(key=lambda x: -x[2])
return {
'home_win': home_win,
'draw': draw,
'away_win': away_win,
'lambda_home': lambda_home,
'lambda_away': lambda_away,
'top_scorelines': [
{'score': f"{h}-{a}", 'prob': p}
for h, a, p in scorelines[:5]
]
}
def over_under(
self,
lambda_home: float,
lambda_away: float,
line: float = 2.5
) -> dict:
"""
Compute over/under probabilities for total goals line.
Args:
lambda_home: Home team expected goals
lambda_away: Away team expected goals
line: Total goals line (e.g., 2.5)
Returns:
Dict with 'over' and 'under' probabilities
"""
matrix = self.scoreline_matrix(lambda_home, lambda_away)
n = self.max_goals + 1
under = sum(
matrix[h][a]
for h in range(n) for a in range(n)
if (h + a) < line
)
over = 1.0 - under
return {'over': over, 'under': under, 'line': line}
def xg_divergence_analysis(
actual_goals: list[int],
match_xg: list[float],
team_name: str = "Team"
) -> dict:
"""
Analyze the divergence between actual goals and xG.
Identifies overperformance/underperformance and estimates
regression probability.
Args:
actual_goals: List of actual goals scored per match
match_xg: List of match-level xG values
team_name: Team name for reporting
Returns:
Analysis dict with divergence metrics
"""
n = len(actual_goals)
total_goals = sum(actual_goals)
total_xg = sum(match_xg)
goals_per_match = total_goals / n
xg_per_match = total_xg / n
divergence = total_goals - total_xg
divergence_pct = (divergence / total_xg) * 100 if total_xg > 0 else 0
# Probability of observed goals under xG-based Poisson model
# For each match, compute P(actual goals | xG) under Poisson
match_probs = [
poisson.pmf(g, xg) for g, xg in zip(actual_goals, match_xg)
]
log_likelihood = np.sum(np.log(np.array(match_probs) + 1e-10))
return {
'team': team_name,
'matches': n,
'total_goals': total_goals,
'total_xg': round(total_xg, 2),
'goals_per_match': round(goals_per_match, 2),
'xg_per_match': round(xg_per_match, 2),
'divergence': round(divergence, 2),
'divergence_pct': round(divergence_pct, 1),
'direction': 'overperforming' if divergence > 0 else 'underperforming',
'log_likelihood': round(log_likelihood, 2),
'regression_signal': abs(divergence_pct) > 15
}
def find_value_bets(
model_probs: dict,
market_odds: dict,
min_edge: float = 0.03
) -> list[dict]:
"""
Compare model probabilities against market odds to find value.
Args:
model_probs: Dict with 'home_win', 'draw', 'away_win' probabilities
market_odds: Dict with same keys, values are decimal odds
min_edge: Minimum probability edge to flag (default 3%)
Returns:
List of value bet opportunities
"""
value_bets = []
for outcome in ['home_win', 'draw', 'away_win']:
model_p = model_probs[outcome]
decimal_odds = market_odds[outcome]
implied_p = 1.0 / decimal_odds
edge = model_p - implied_p
ev = model_p * (decimal_odds - 1) - (1 - model_p)
if edge >= min_edge:
value_bets.append({
'outcome': outcome,
'model_prob': round(model_p, 4),
'implied_prob': round(implied_p, 4),
'edge': round(edge, 4),
'decimal_odds': decimal_odds,
'ev_per_unit': round(ev, 4),
'kelly_fraction': round((model_p * decimal_odds - 1) / (decimal_odds - 1), 4)  # full Kelly: (p*d - 1) / (d - 1); note edge/(d - 1) understates this by a factor of d
})
value_bets.sort(key=lambda x: -x['edge'])
return value_bets
# --- Example usage ---
if __name__ == "__main__":
# Generate synthetic training data (in production, use StatsBomb)
np.random.seed(42)
n_shots = 5000
distances = np.random.uniform(3, 35, n_shots)
laterals = np.random.uniform(-15, 15, n_shots)
angles = np.array([
compute_shot_angle(d, y) for d, y in zip(distances, laterals)
])
is_foot = np.random.binomial(1, 0.72, n_shots)
is_through = np.random.binomial(1, 0.08, n_shots)
is_cross = np.random.binomial(1, 0.15, n_shots)
is_set_piece = np.random.binomial(1, 0.12, n_shots)
is_fast_break = np.random.binomial(1, 0.06, n_shots)
game_state = np.random.choice([-2, -1, 0, 1, 2], n_shots, p=[0.05, 0.15, 0.50, 0.20, 0.10])
is_big_chance = np.random.binomial(1, 0.05, n_shots)
# Simulate goals based on realistic xG relationship
log_odds = (
-1.5
- 0.08 * distances
+ 1.2 * angles
+ 0.3 * is_foot
+ 0.5 * is_through
- 0.2 * is_cross
+ 0.0 * is_set_piece
+ 0.4 * is_fast_break
- 0.05 * game_state
+ 1.8 * is_big_chance
)
true_probs = 1 / (1 + np.exp(-log_odds))
goals = np.random.binomial(1, true_probs)
shots = [
ShotFeatures(
distance=distances[i], angle=angles[i], is_foot=int(is_foot[i]),
is_through_ball=int(is_through[i]), is_cross=int(is_cross[i]),
is_set_piece=int(is_set_piece[i]), is_fast_break=int(is_fast_break[i]),
game_state=int(game_state[i]), is_big_chance=int(is_big_chance[i])
)
for i in range(n_shots)
]
# Train xG model
xg_model = XGModel()
metrics = xg_model.fit(shots, goals.tolist())
print("=== xG Model Training ===")
print(f"Shots: {metrics['n_shots']}, Goal rate: {metrics['goal_rate']:.3f}")
print(f"CV Log Loss: {metrics['cv_log_loss']:.4f}")
print(f"Coefficients: {metrics['coefficients']}")
# Match outcome prediction: Man City vs Liverpool
match_model = PoissonMatchModel()
lambda_city = match_model.compute_lambda(
team_xg_per_match=1.93,
opponent_xg_conceded_per_match=0.98,
is_home=True
)
lambda_liverpool = match_model.compute_lambda(
team_xg_per_match=2.11,
opponent_xg_conceded_per_match=0.88,
is_home=False
)
print(f"\n=== Man City (H) vs Liverpool (A) ===")
print(f"Lambda City: {lambda_city:.3f}, Lambda Liverpool: {lambda_liverpool:.3f}")
result = match_model.match_probabilities(lambda_city, lambda_liverpool)
print(f"Home win: {result['home_win']:.3f}")
print(f"Draw: {result['draw']:.3f}")
print(f"Away win: {result['away_win']:.3f}")
print(f"Top scorelines: {result['top_scorelines'][:3]}")
ou = match_model.over_under(lambda_city, lambda_liverpool, line=2.5)
print(f"Over 2.5: {ou['over']:.3f}, Under 2.5: {ou['under']:.3f}")
# Value bet detection
market_odds = {
'home_win': 1.80, # -125 American
'draw': 4.00, # +300
'away_win': 3.50 # +250
}
value = find_value_bets(result, market_odds, min_edge=0.02)
print(f"\n=== Value Bets (min 2% edge) ===")
for v in value:
print(f" {v['outcome']}: model={v['model_prob']:.1%}, "
f"implied={v['implied_prob']:.1%}, edge={v['edge']:.1%}, "
f"EV={v['ev_per_unit']:.3f}")
Limitations and Edge Cases
Small sample xG is unstable. A team’s xG per match stabilizes after roughly 8-10 league matches. For international teams that play 10-15 competitive matches per year, xG estimates carry wide confidence intervals. An agent betting on World Cup 2026 matches must use Bayesian priors from club-level data to supplement sparse international xG data — the World Cup 2026 Betting Math guide covers this approach.
xG ignores goalkeeper quality. Standard xG models condition on shot location and context but not on which goalkeeper is in net. A shot with xG = 0.15 against Thibaut Courtois converts at a different rate than the same shot against a League Two keeper. Post-shot xG (PSxG) models account for shot placement and keeper positioning but require video-derived data that most free sources lack.
xG does not capture non-shot situations. Dangerous attacks that don’t produce shots — a through ball where the striker slips, a 2-on-1 broken up at the last second — generate zero xG despite representing genuine goal threat. This means xG systematically underestimates the quality of teams that create high-danger situations but don’t get shots off.
The Poisson independence assumption fails in blowouts. When a team goes up 3-0, game dynamics change: the leading team defends deeper, the trailing team pushes forward desperately. The Poisson model assumes each team’s goals are independent, which breaks down in lopsided game states. A bivariate Poisson or Dixon-Coles adjustment (which inflates low-scoring draw probabilities) partially addresses this.
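A minimal sketch of that adjustment, using the τ function from Dixon & Coles (1997) with an illustrative ρ = -0.1 (in practice ρ is fitted from data):

```python
import numpy as np
from scipy.stats import poisson

def dixon_coles_matrix(lam_h, lam_a, rho=-0.1, max_goals=8):
    """Independent-Poisson scoreline matrix with the Dixon-Coles tau
    adjustment applied to the four low-scoring cells (0-0, 0-1, 1-0, 1-1)."""
    goals = np.arange(max_goals + 1)
    M = np.outer(poisson.pmf(goals, lam_h), poisson.pmf(goals, lam_a))
    M[0, 0] *= 1 - lam_h * lam_a * rho
    M[0, 1] *= 1 + lam_h * rho
    M[1, 0] *= 1 + lam_a * rho
    M[1, 1] *= 1 - rho
    return M

# With rho < 0 the correction inflates 0-0 and 1-1 while deflating 0-1
# and 1-0; the four deltas cancel, so total probability is unchanged.
base = np.outer(poisson.pmf(np.arange(9), 1.4), poisson.pmf(np.arange(9), 1.1))
adj = dixon_coles_matrix(1.4, 1.1)
assert abs(base.sum() - adj.sum()) < 1e-12
```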
Regression timing is unpredictable. An xG overperformer will regress — but “when” is unknown. A team overperforming by +5 xG over 20 matches might continue for another 5 matches before regressing. The agent must size bets accounting for this variance. Blindly fading every overperformer immediately loses money in the short run roughly 40% of the time.
Data source inconsistencies. StatsBomb, Opta, and Understat use different xG models with different features. StatsBomb xG ≈ 0.08 for a given shot might be Opta xG ≈ 0.11 for the same shot. An agent must choose one source and stick with it — mixing xG from different providers creates systematic bias.
FAQ
What is expected goals (xG) in soccer betting?
Expected goals (xG) is the probability that a given shot results in a goal, based on features like shot distance, angle, body part, and assist type. A shot with xG = 0.12 scores roughly 12% of the time. Match-level xG sums all individual shot xG values to estimate how many goals a team “deserved” to score, providing a more predictive metric than actual goals.
How do you build an xG model with logistic regression?
An xG model uses logistic regression where the target variable is binary (goal or no goal) and features include shot distance from goal center, shot angle, body part, assist type, and game state. The formula is P(goal) = 1 / (1 + e^-(β₀ + β₁·distance + β₂·angle + …)). Train on 10,000+ labeled shots from StatsBomb or similar data sources to get reliable coefficients.
How does xG connect to Poisson models for match prediction?
Team-level xG rates become the lambda parameters in a Poisson distribution. If the home team’s expected scoring rate is 1.65 xG and the away team’s is 1.10 xG, you compute P(home=h, away=a) = Poisson(h|1.65) * Poisson(a|1.10) for each scoreline to get win/draw/loss probabilities. This bridges shot-level analytics to match outcome betting. The full Poisson derivation is in the Poisson Distribution guide.
Why does xG outperform raw goals for predicting future results?
Goals are high-variance Poisson events. A team averaging 1.3 xG per match will score 0 goals 27% of the time and 3+ goals 14% of the time — raw goal counts are noisy over small samples. xG stabilizes much faster (within 8-10 matches) because it measures shot quality rather than binary outcomes, making it a superior predictor of future scoring.
How can betting agents exploit xG-to-goals divergence?
When a team’s actual goals significantly exceed their xG (overperformance) or fall below it (underperformance), regression toward xG-predicted levels is statistically expected. An agent can fade teams on hot shooting streaks by betting unders or against them in Asian handicap markets, and back underperforming teams whose xG suggests better results are coming. See the sharp betting hub for more regression-based strategies.
What’s Next
The xG framework converts raw shot data into match outcome probabilities — the core intelligence an agent needs for soccer betting. From here, the math extends in two directions.
- Tournament application: World Cup 2026 Betting Math adapts the xG-Poisson pipeline for tournament-specific challenges — sparse international data, group stage dynamics, and knockout round adjustments.
- The Poisson foundation: If you need a deeper dive into Poisson distributions and their properties, the Poisson Distribution in Sports Modeling guide covers everything from PMF derivation to overdispersion corrections.
- Feature engineering for models: The Feature Engineering for Sports Prediction guide covers how to select, transform, and validate features (including xG-derived features) for any sports prediction model.
- Full agent architecture: See where the xG model fits in the four-layer pipeline in the Agent Betting Stack.
- Live odds comparison: Use The Odds API pipeline to compare your xG-derived probabilities against live market odds across dozens of sportsbooks simultaneously.
