Sportsbooks price parlay legs as if they’re independent — but many legs are correlated. The true joint probability is P(A and B) = P(A) * P(B) + rho * sqrt(Var(A) * Var(B)). When your correlation estimate is more accurate than the book’s, you have edge. Same-game parlays are the largest mispricing surface in modern sports betting.

Why This Matters for Agents

Parlays are the fastest-growing product in sports betting. DraftKings reported that same-game parlays account for over 30% of handle on NFL Sundays. Sportsbooks love them because the hold percentage is 2-5x higher than straight bets. Bettors love them because the payouts look massive.

The math behind parlay pricing hinges on one assumption: the legs are independent. An autonomous agent operating at Layer 4 (Intelligence) of the Agent Betting Stack exploits the gap between that assumption and reality. The agent estimates pairwise and higher-order correlations from historical data, computes the true joint probability of a multi-leg bet, compares it against the sportsbook’s offered price, and bets only when the book’s correlation model is less accurate than its own. This is the same correlation-modeling pipeline that feeds cross-market portfolio construction — see Correlation and Portfolio Theory for Multi-Market Agents for the portfolio-level math.

The Math

The Naive Parlay Formula

A standard parlay pays as if you reinvested each leg’s winnings into the next. For n legs with decimal odds d_1, d_2, …, d_n:

Parlay Payout = Stake × Product(d_i) for i = 1 to n

For a $100 two-leg parlay at -110 (decimal 1.9091) on each leg:

Payout = $100 × 1.9091 × 1.9091 ≈ $364.46

The implied probability of hitting the parlay is the product of individual implied probabilities:

P(parlay) = Product(p_i) = 0.524 × 0.524 = 0.2746 (27.46%)

This assumes statistical independence — the outcome of Leg 1 has zero effect on the probability of Leg 2. For bets on different games, different sports, or different days, independence is reasonable. For legs within the same game, it is not.
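The naive math is a one-liner in code; a minimal sketch (helper names here are my own):

```python
from math import prod

def american_to_decimal(american: int) -> float:
    """Convert American odds to decimal odds."""
    return 1 + (100 / abs(american) if american < 0 else american / 100)

def naive_parlay(stake: float, american_legs: list[int]) -> tuple[float, float]:
    """Payout and hit probability under the independence assumption."""
    decimals = [american_to_decimal(a) for a in american_legs]
    payout = stake * prod(decimals)
    implied = prod(1 / d for d in decimals)
    return payout, implied

payout, implied = naive_parlay(100, [-110, -110])
print(f"${payout:.2f}, {implied:.2%}")  # two -110 legs: ≈ $364.46, ≈ 27.4%
```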

Why Independence Fails

Two binary events A and B are independent if and only if P(A and B) = P(A) × P(B). When they’re correlated, the true joint probability is:

P(A and B) = P(A) × P(B) + Cov(A, B)

where the covariance between two Bernoulli variables is:

Cov(A, B) = rho × sqrt(P(A)(1 - P(A)) × P(B)(1 - P(B)))

rho is the Pearson correlation coefficient, ranging from -1 to +1. When rho = 0, independence holds and the naive formula is correct. When rho != 0, the naive formula is wrong.

For a concrete example: P(A) = 0.55, P(B) = 0.60, rho = 0.40.

Cov(A, B) = 0.40 × sqrt(0.55 × 0.45 × 0.60 × 0.40)
           = 0.40 × sqrt(0.0594)
           = 0.40 × 0.2437
           = 0.0975

P(A and B) = 0.55 × 0.60 + 0.0975 = 0.3300 + 0.0975 = 0.4275

The naive estimate is 0.33 (33%). The true probability is 0.4275 (42.75%). That is a 29.5% relative error — the parlay hits far more often than the independent model predicts.
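A quick verification of the arithmetic above:

```python
import math

def correlated_joint(p_a: float, p_b: float, rho: float) -> float:
    """P(A and B) for two correlated Bernoulli events via the covariance identity."""
    cov = rho * math.sqrt(p_a * (1 - p_a) * p_b * (1 - p_b))
    return p_a * p_b + cov

joint = correlated_joint(0.55, 0.60, 0.40)
naive = 0.55 * 0.60
print(f"{joint:.4f} vs naive {naive:.4f}")  # 0.4275 vs naive 0.3300
```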

Positive vs. Negative Correlation

Positively correlated legs (rho > 0) — the parlay hits more often than independence predicts:

| Leg A | Leg B | Typical rho | Why |
| --- | --- | --- | --- |
| QB passing yards over | Team total over | +0.45 to +0.65 | More passing yards → more scoring |
| Team moneyline | Team total over | +0.30 to +0.50 | Winning teams tend to score more |
| RB rushing yards over | Team moneyline | +0.20 to +0.35 | Rushing success correlates with winning |
| WR receiving yards over | QB passing yards over | +0.50 to +0.70 | WR1 is a large share of QB output |

Negatively correlated legs (rho < 0) — the parlay hits less often than independence predicts:

| Leg A | Leg B | Typical rho | Why |
| --- | --- | --- | --- |
| Game total under | Winning margin over 14.5 | -0.25 to -0.40 | Blowouts require scoring |
| QB passing yards over | Game total under | -0.30 to -0.45 | High passing volume means high scoring |
| Team A moneyline | Team B player props over | -0.15 to -0.30 | Losing teams play differently in garbage time |

The SGP Pricing Problem

Sportsbooks face a hard problem with same-game parlays: they need to estimate correlations between thousands of prop combinations in real time. Their approaches vary:

Method 1: Blanket correlation tax. Apply a flat 10-30% reduction to the parlay payout regardless of actual leg correlation. This is crude but fast. It overpenalizes weakly correlated legs and underpenalizes strongly correlated legs.

Method 2: Lookup tables. Pre-compute correlation buckets (low/medium/high) for common prop categories. More accurate than a blanket tax, but still coarse.

Method 3: Real-time simulation. Run Monte Carlo simulations for each SGP combination using game-level models. The most accurate approach, but computationally expensive — which is why SGP odds sometimes take 5-10 seconds to load.

The exploitable gap: books using Method 1 or Method 2 systematically misprice legs whose actual correlation falls outside their assumed bucket. An agent using Method 3 (or better, a copula-based model trained on historical data) finds edges that blanket pricing creates.
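To see why a blanket tax is exploitable, compare two pairs of legs with identical marginals but different correlations under a flat payout haircut (the 20% tax and the rho values here are illustrative assumptions, not book data):

```python
import math

def correlated_joint(p_a: float, p_b: float, rho: float) -> float:
    """P(A and B) via the Bernoulli covariance identity."""
    return p_a * p_b + rho * math.sqrt(p_a * (1 - p_a) * p_b * (1 - p_b))

p = 0.55                          # both legs priced at 55%
naive_odds = 1 / (p * p)          # fair independent decimal odds ≈ 3.31
taxed_odds = naive_odds * 0.80    # Method 1: flat 20% payout haircut
taxed_implied = 1 / taxed_odds    # what the taxed price implies ≈ 37.8%

weak = correlated_joint(p, p, 0.10)    # ≈ 0.327 < 37.8% → -EV for the bettor
strong = correlated_joint(p, p, 0.60)  # ≈ 0.451 > 37.8% → +EV for the bettor
print(f"taxed implied {taxed_implied:.1%}, weak {weak:.1%}, strong {strong:.1%}")
```

The same flat haircut turns the weakly correlated pair into a losing bet and leaves the strongly correlated pair +EV, which is exactly the asymmetry described above.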

Generalizing to N Legs with Copulas

For parlays with more than two legs, pairwise correlations aren’t sufficient — you need the full joint distribution. The standard approach is a Gaussian copula.

The Gaussian copula models the joint distribution of n Bernoulli outcomes by:

  1. Mapping each marginal probability p_i to a latent-normal threshold z_i = Phi_inverse(1 - p_i), so that P(Z_i > z_i) = p_i
  2. Defining a correlation matrix R of the latent normal variables
  3. Computing the joint probability that all latent variables exceed their respective thresholds:

P(all legs hit) = P(Z_1 > z_1, Z_2 > z_2, ..., Z_n > z_n)

where (Z_1, …, Z_n) ~ N(0, R) and z_i = Phi_inverse(1 - p_i).

This is computed numerically via multivariate normal CDF integration — scipy.stats.multivariate_normal exposes a cdf method for small n, and Monte Carlo sampling from the same distribution works for larger leg counts.
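For two legs this reduces to a bivariate normal probability. A sketch using scipy's multivariate normal CDF, with one stated simplification: it treats the latent Gaussian correlation as equal to the outcome-level rho, whereas strictly the latent correlation should be calibrated so the implied Bernoulli correlation matches (which is why the result differs slightly from the pairwise covariance formula):

```python
import numpy as np
from scipy.stats import norm, multivariate_normal

def copula_joint_2leg(p_a: float, p_b: float, latent_rho: float) -> float:
    """P(both legs hit) under a Gaussian copula with latent correlation latent_rho."""
    # Thresholds chosen so that P(Z_i > z_i) = p_i
    z_a, z_b = norm.ppf(1 - p_a), norm.ppf(1 - p_b)
    cov = np.array([[1.0, latent_rho], [latent_rho, 1.0]])
    # P(Z_a > z_a, Z_b > z_b) = P(-Z_a < -z_a, -Z_b < -z_b) = CDF at (-z_a, -z_b)
    mvn = multivariate_normal(mean=np.zeros(2), cov=cov)
    return float(mvn.cdf(np.array([-z_a, -z_b])))

joint = copula_joint_2leg(0.55, 0.60, 0.40)
print(f"{joint:.4f}")  # above the independent 0.3300, below min(p_a, p_b)
```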

Worked Examples

Example 1: NFL Same-Game Parlay on DraftKings

Consider a Chiefs vs. Raiders game with these SGP legs on DraftKings:

Leg 1: Patrick Mahomes Over 275.5 passing yards  → -120 (implied 54.5%)
Leg 2: Chiefs moneyline                          → -210 (implied 67.7%)
Leg 3: Game total Over 47.5                      → -105 (implied 51.2%)

DraftKings offers a 3-leg SGP payout at +320 (implied 23.8%).

The naive independence calculation:

P(all three) = 0.545 × 0.677 × 0.512 = 0.1888 (18.88%)

The naive implied probability (18.88%) is lower than what DraftKings shows (23.8%) — this means DraftKings has already applied a correlation tax, since they’re offering worse odds than independence would produce. But is the tax calibrated correctly?

Correlation estimates from 2021-2025 NFL data:

rho(Mahomes passing over, Chiefs ML)     = +0.42
rho(Mahomes passing over, game over)     = +0.55
rho(Chiefs ML, game over)               = +0.33

Using a Gaussian copula with these correlations, the true joint probability is approximately 0.261 (26.1%).

True probability:     26.1%
DraftKings implied:   23.8%
Edge:                 26.1% - 23.8% = +2.3 percentage points
EV per dollar:        0.261 × (1 + 3.20) - 1 = 0.261 × 4.20 - 1 = +0.096 (+9.6%)

The SGP is +EV. DraftKings’s correlation tax undershoots the true correlation between these legs.
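The EV arithmetic for this SGP, as a quick check (the 26.1% copula estimate is taken as given from the correlation inputs above):

```python
true_prob = 0.261                 # copula estimate from the rho inputs above
naive_prob = 0.545 * 0.677 * 0.512
offered_decimal = 1 + 320 / 100   # +320 → decimal 4.20
offered_implied = 1 / offered_decimal

ev = true_prob * offered_decimal - 1
print(f"naive {naive_prob:.1%}, offered {offered_implied:.1%}, EV {ev:+.1%}")
# naive 18.9%, offered 23.8%, EV +9.6%
```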

Example 2: Negatively Correlated Trap

A bettor builds this parlay on BetOnline:

Leg 1: Game total Under 44.5    → -110 (implied 52.4%)
Leg 2: Chiefs -14.5 spread      → +200 (implied 33.3%)

BetOnline offers +450 (implied 18.2%).

Naive independence: 0.524 × 0.333 = 0.1745 (17.45%).

These legs are negatively correlated. A blowout (covering -14.5) requires the winning team to score heavily, which pushes the total higher. Historical rho estimate: -0.35.

Cov = -0.35 × sqrt(0.524 × 0.476 × 0.333 × 0.667) = -0.35 × 0.2354 = -0.0824

P(both) = 0.1745 + (-0.0824) = 0.0921 (9.21%)

True probability is 9.21%. BetOnline implies 18.2%. This parlay is massively -EV:

EV = 0.0921 × 5.50 - 1 = -0.493 (-49.3%)

The book isn’t adjusting for the negative correlation at all — the bettor is getting destroyed.
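The same pairwise formula, applied to these legs:

```python
import math

p_under, p_spread, rho = 0.524, 0.333, -0.35
cov = rho * math.sqrt(p_under * (1 - p_under) * p_spread * (1 - p_spread))
joint = p_under * p_spread + cov
ev = joint * 5.50 - 1              # +450 → decimal 5.50
print(f"joint {joint:.4f}, EV {ev:+.1%}")  # joint 0.0921, EV -49.3%
```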

Implementation

import numpy as np
from scipy import stats
from scipy.stats import multivariate_normal
from dataclasses import dataclass


@dataclass
class ParlayLeg:
    """Single leg of a parlay with its marginal probability."""
    name: str
    implied_prob: float
    decimal_odds: float


@dataclass
class CorrelatedParlayResult:
    """Result of correlated parlay EV analysis."""
    true_joint_prob: float
    naive_joint_prob: float
    offered_implied_prob: float
    offered_decimal_odds: float
    ev_per_dollar: float
    correlation_edge: float  # true - naive probability


def american_to_decimal(american: int) -> float:
    """Convert American odds to decimal odds."""
    if american < 0:
        return 1 + 100 / abs(american)
    return 1 + american / 100


def american_to_prob(american: int) -> float:
    """Convert American odds to implied probability (no vig removal)."""
    if american < 0:
        return abs(american) / (abs(american) + 100)
    return 100 / (american + 100)


def pairwise_correlated_prob(p_a: float, p_b: float, rho: float) -> float:
    """
    Compute P(A and B) for two correlated Bernoulli variables.

    P(A and B) = P(A)*P(B) + rho * sqrt(P(A)(1-P(A)) * P(B)(1-P(B)))

    Args:
        p_a: marginal probability of A
        p_b: marginal probability of B
        rho: Pearson correlation coefficient between A and B

    Returns:
        Joint probability P(A and B)
    """
    cov = rho * np.sqrt(p_a * (1 - p_a) * p_b * (1 - p_b))
    joint = p_a * p_b + cov
    # Clip to the Fréchet–Hoeffding bounds for a valid joint probability
    return float(np.clip(joint, max(0.0, p_a + p_b - 1.0), min(p_a, p_b)))


def copula_joint_prob(
    marginal_probs: list[float],
    correlation_matrix: np.ndarray
) -> float:
    """
    Compute joint probability of all legs hitting using a Gaussian copula.

    Maps each marginal to a latent normal variable, then computes the
    probability that all latent variables exceed their thresholds under
    the specified correlation structure.

    Args:
        marginal_probs: list of marginal probabilities for each leg
        correlation_matrix: n x n correlation matrix of latent normals

    Returns:
        Joint probability P(all legs hit)
    """
    n = len(marginal_probs)

    # Map marginal probs to normal thresholds
    # P(leg hits) = P(Z > threshold) = 1 - Phi(threshold)
    # threshold = Phi_inverse(1 - p)
    thresholds = [stats.norm.ppf(1 - p) for p in marginal_probs]

    # P(all hit) = P(Z_1 > t_1, ..., Z_n > t_n) where Z ~ N(0, R)
    # = P(-Z_1 < -t_1, ..., -Z_n < -t_n)  (negate)
    # = multivariate_normal CDF evaluated at (-t_1, ..., -t_n)
    upper = [-t for t in thresholds]

    mvn = multivariate_normal(mean=np.zeros(n), cov=correlation_matrix)

    # Use Monte Carlo integration for the CDF
    n_samples = 500_000
    samples = mvn.rvs(size=n_samples, random_state=42)
    if n == 1:
        samples = samples.reshape(-1, 1)

    hits = np.all(samples < np.array(upper), axis=1)
    joint_prob = np.mean(hits)

    return joint_prob


def evaluate_correlated_parlay(
    legs: list[ParlayLeg],
    correlation_matrix: np.ndarray,
    offered_decimal_odds: float
) -> CorrelatedParlayResult:
    """
    Full evaluation of a correlated parlay's expected value.

    Args:
        legs: list of ParlayLeg objects
        correlation_matrix: pairwise correlation matrix (n x n)
        offered_decimal_odds: the sportsbook's offered parlay decimal odds

    Returns:
        CorrelatedParlayResult with EV analysis
    """
    marginal_probs = [leg.implied_prob for leg in legs]

    # Naive (independence) probability
    naive_prob = np.prod(marginal_probs)

    # True (correlated) probability via Gaussian copula
    true_prob = copula_joint_prob(marginal_probs, correlation_matrix)

    # Offered implied probability
    offered_implied = 1.0 / offered_decimal_odds

    # EV calculation
    ev = true_prob * offered_decimal_odds - 1.0

    return CorrelatedParlayResult(
        true_joint_prob=true_prob,
        naive_joint_prob=naive_prob,
        offered_implied_prob=offered_implied,
        offered_decimal_odds=offered_decimal_odds,
        ev_per_dollar=ev,
        correlation_edge=true_prob - naive_prob
    )


def estimate_correlation_from_data(
    outcomes_a: np.ndarray,
    outcomes_b: np.ndarray
) -> tuple[float, float]:
    """
    Estimate Pearson correlation between two binary outcome series.

    Args:
        outcomes_a: array of 0/1 outcomes for event A
        outcomes_b: array of 0/1 outcomes for event B

    Returns:
        (rho, p_value) — correlation estimate and statistical significance
    """
    rho, p_value = stats.pearsonr(outcomes_a, outcomes_b)
    return rho, p_value


def kelly_correlated_parlay(
    true_prob: float,
    decimal_odds: float,
    bankroll: float,
    fraction: float = 0.25
) -> float:
    """
    Fractional Kelly sizing for a single correlated parlay.

    Uses standard Kelly on the parlay as a single bet:
    f* = (b*p - q) / b, where b = decimal_odds - 1, p = true_prob

    Args:
        true_prob: estimated true probability of all legs hitting
        decimal_odds: offered parlay decimal odds
        bankroll: current bankroll
        fraction: Kelly fraction (0.25 = quarter Kelly)

    Returns:
        Optimal bet size in dollars
    """
    b = decimal_odds - 1
    q = 1 - true_prob

    kelly_fraction = (b * true_prob - q) / b

    if kelly_fraction <= 0:
        return 0.0

    bet_size = bankroll * kelly_fraction * fraction
    return round(bet_size, 2)


# ── Worked example: Chiefs SGP ──────────────────────────────────────
if __name__ == "__main__":
    legs = [
        ParlayLeg("Mahomes O275.5 pass yds", american_to_prob(-120), american_to_decimal(-120)),
        ParlayLeg("Chiefs ML", american_to_prob(-210), american_to_decimal(-210)),
        ParlayLeg("Over 47.5 total", american_to_prob(-105), american_to_decimal(-105)),
    ]

    # Correlation matrix estimated from 2021-2025 NFL data
    corr = np.array([
        [1.00, 0.42, 0.55],
        [0.42, 1.00, 0.33],
        [0.55, 0.33, 1.00],
    ])

    # DraftKings offered +320 → decimal 4.20
    offered_odds = american_to_decimal(320)

    result = evaluate_correlated_parlay(legs, corr, offered_odds)

    print("=== Correlated Parlay Analysis ===")
    print(f"Legs: {[l.name for l in legs]}")
    print(f"Marginal probs: {[f'{l.implied_prob:.1%}' for l in legs]}")
    print()
    print(f"Naive (independent) prob:  {result.naive_joint_prob:.4f} ({result.naive_joint_prob:.1%})")
    print(f"True (correlated) prob:    {result.true_joint_prob:.4f} ({result.true_joint_prob:.1%})")
    print(f"Correlation edge:          {result.correlation_edge:+.4f} ({result.correlation_edge:+.1%})")
    print()
    print(f"Offered odds:              +{int((offered_odds - 1) * 100)} (decimal {offered_odds:.2f})")
    print(f"Offered implied prob:      {result.offered_implied_prob:.1%}")
    print(f"EV per dollar:             {result.ev_per_dollar:+.4f} ({result.ev_per_dollar:+.1%})")
    print()

    bankroll = 10_000
    bet = kelly_correlated_parlay(result.true_joint_prob, offered_odds, bankroll, fraction=0.25)
    print(f"Quarter-Kelly bet size:    ${bet:.2f} on ${bankroll:,} bankroll")

Limitations and Edge Cases

Correlation instability. Pairwise correlations between player props shift with game context. A correlation estimated from 16 regular-season games may not hold in a playoff game with a different pace, game script, or weather condition. Agents should weight recent games more heavily and maintain rolling correlation windows.

Sample size. Estimating correlation from binary outcomes requires large samples. With n = 50 games, the standard error on a Pearson r of 0.40 is approximately 0.13 — the 95% confidence interval spans 0.15 to 0.62. An agent that treats its point estimate as ground truth will overfit. Use Bayesian shrinkage toward zero for small samples.
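One simple shrinkage scheme, sketched here with an arbitrary prior-strength parameter n0 (not from any standard): weight the sample estimate by n / (n + n0), which pulls small-sample correlations toward zero and barely touches large-sample ones:

```python
def shrink_rho(sample_rho: float, n: int, n0: int = 50) -> float:
    """Shrink a sample correlation toward 0; n0 acts as prior pseudo-games at rho = 0."""
    return sample_rho * n / (n + n0)

print(shrink_rho(0.40, n=50))    # 0.2: half the raw estimate when n equals n0
print(shrink_rho(0.40, n=500))   # ≈ 0.36: large samples keep most of the signal
```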

Higher-order dependencies. The Gaussian copula captures pairwise linear correlations but misses higher-order dependencies. If three legs are jointly correlated in a way that pairwise correlations don’t capture (e.g., a specific game script triggers all three), the copula underestimates or overestimates the true joint probability. Vine copulas or simulation-based approaches handle this better but are computationally heavier.

Copula misspecification. The Gaussian copula assumes the dependence structure follows a multivariate normal distribution. Tail dependence — the tendency for extreme outcomes to co-occur — is poorly modeled by the Gaussian copula. The Clayton copula captures lower-tail dependence; the Gumbel copula captures upper-tail dependence. For most sports betting correlations, the Gaussian is adequate, but agents operating in extreme-outcome markets (e.g., large underdogs in every leg) should consider alternatives.

Sportsbook adjustment drift. Books continuously improve their SGP correlation models. An edge that existed in 2023 SGP pricing may be closed by 2026. Agents need to monitor their actual hit rate versus predicted hit rate and recalibrate when the discrepancy narrows.

Odds movement between legs. Placing one leg of a parlay can move the line for correlated legs. An agent placing a large bet on Mahomes passing yards over may see the team total line adjust before it places the second leg. Execution speed matters — this is where Layer 3 trading infrastructure intersects with Layer 4 intelligence.

FAQ

How do you calculate the true probability of a correlated parlay?

The true joint probability of a correlated parlay is P(A and B) = P(A) * P(B) + rho * sqrt(P(A)(1-P(A)) * P(B)(1-P(B))), where rho is the Pearson correlation coefficient between the two outcomes. For more than two legs, use a Gaussian copula to model the full joint distribution. The naive formula P(A) * P(B) only works when rho = 0.

Why are same-game parlays profitable for sportsbooks?

Sportsbooks apply a correlation tax of 10-30% to SGP payouts, reducing the offered odds below what independence would imply. This blanket penalty overcompensates on weakly correlated legs (like a kicker’s field goals and the opposing QB’s passing yards) while sometimes undercompensating on strongly correlated legs (like a QB’s passing touchdowns and team total). The asymmetry creates exploitable edges for agents with accurate correlation models.

What is the correlation between QB passing yards and team points in the NFL?

Historical NFL data from 2019-2024 shows a Pearson correlation of approximately 0.45-0.65 between a quarterback’s passing yards and their team’s total points scored. The exact value varies by quarterback — high-volume passers like Patrick Mahomes show stronger correlation than run-heavy offenses. This correlation means a parlay combining QB passing yards over with team total over has a higher true probability than independence would suggest.

How does correlation affect parlay expected value?

Positive correlation increases the true joint probability above the independence assumption, meaning the parlay hits more often than naive math predicts. If a sportsbook prices the parlay assuming independence, the bettor gets +EV. Negative correlation decreases the true probability, making the parlay worse than it appears. The key metric is whether the sportsbook’s correlation adjustment matches reality — mispricing in either direction creates edge.

How does portfolio theory connect to parlay correlation risk?

Parlays are concentrated bets — all legs must hit. Portfolio theory shows that diversification reduces variance. An agent that spreads capital across independent single bets has lower variance than one that concentrates it in correlated parlays. The optimal strategy uses parlays only when correlation mispricing creates +EV, and sizes them using modified Kelly that accounts for correlated outcomes.

What’s Next

Correlation in parlays is a special case of the broader correlation problem in betting portfolios. The next steps: