A prediction market is an N-player imperfect-information game. Nash equilibrium means no agent improves its return by unilateral deviation: σᵢ* = argmax E[πᵢ(σᵢ, σ₋ᵢ*)]. Your agent isn’t betting against “the market” — it’s competing against specific agent types (market makers, noise traders, informed speculators) whose strategies you can model, predict, and exploit.
Why This Matters for Agents
An autonomous betting agent that ignores game theory treats the market as a static environment — it estimates probabilities, compares them to prices, and trades when it sees edge. That works until it doesn’t. The moment the agent’s order flow becomes detectable, other agents adapt. Market makers widen spreads when they detect informed flow. Predatory traders front-run large positions. The agent’s edge evaporates because it failed to model the adversarial dynamics of the game it’s playing.
This is Layer 4 — Intelligence. Game theory sits at the top of the agent’s decision stack, above probability estimation and bet sizing. An agent’s Kelly-optimal bet size assumes a fixed edge, but that edge is a function of other agents’ strategies. The multi-armed bandit framework handles single-agent exploration-exploitation — game theory extends this to the multi-agent case where every participant’s strategy affects every other participant’s payoff. The output of the game-theoretic module feeds into the agent’s execution layer (Layer 3) via the Prediction Market API Reference, determining not just what to trade but how and when to trade it to minimize information leakage.
The Math
The Prediction Market as a Game
Formally, a prediction market is a tuple G = (N, {Sᵢ}, {Aᵢ}, {πᵢ}) where:
- N = {1, 2, …, n} is the set of agents
- Sᵢ is agent i’s signal space (private information: model outputs, news feeds, data)
- Aᵢ is agent i’s action space: {bid(p, q), ask(p, q), hold}, where p is price and q is quantity
- πᵢ(aᵢ, a₋ᵢ, ω) is agent i’s payoff given its action aᵢ, all other agents’ actions a₋ᵢ, and the realized outcome ω
A strategy for agent i is a mapping σᵢ: Sᵢ → Δ(Aᵢ), from private signals to a probability distribution over actions. An agent that always bids its true probability plays a pure strategy. An agent that randomizes between bidding and holding plays a mixed strategy.
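A strategy in this sense is straightforward to sketch in code. Everything here (the function, the thresholds, the `mix` parameter) is illustrative rather than drawn from any platform API:

```python
import numpy as np

ACTIONS = ["bid", "ask", "hold"]

def strategy(signal_p: float, market_p: float, mix: float = 0.7) -> np.ndarray:
    """Map a private probability estimate to a distribution over actions.

    mix = 1.0 is a pure strategy; mix < 1.0 randomizes between trading
    and holding, making the agent's order flow harder to read.
    """
    probs = np.zeros(3)
    if signal_p > market_p:      # perceived edge on the YES side
        probs[0] = mix           # bid
        probs[2] = 1 - mix       # hold
    elif signal_p < market_p:
        probs[1] = mix           # ask
        probs[2] = 1 - mix
    else:
        probs[2] = 1.0           # no edge: pure "hold"
    return probs

rng = np.random.default_rng(0)
action = ACTIONS[rng.choice(3, p=strategy(0.65, 0.55))]
```

With `mix = 1.0` this collapses to the pure strategy of always bidding on perceived edge.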
Nash Equilibrium in Prediction Markets
A strategy profile (σ₁*, σ₂*, …, σₙ*) is a Nash equilibrium if no agent can improve its expected payoff by unilateral deviation:
For all i ∈ N, for all σᵢ ∈ Σᵢ:
E[πᵢ(σᵢ*, σ₋ᵢ*)] ≥ E[πᵢ(σᵢ, σ₋ᵢ*)]
In plain language: given what everyone else is doing, no agent wants to change its strategy.
What does this look like concretely? Consider a two-agent Polymarket market on “Will the Fed cut rates in June 2026?”:
Agent A: Has a macro model estimating p = 0.65
Agent B: Has a different model estimating p = 0.45
Current market price: $0.55
At Nash equilibrium, Agent A’s bid at $0.52 and Agent B’s ask at $0.58 are each best responses to the other. Agent A shades its bid well below its $0.65 estimate because a fill would mean trading against Agent B, whose willingness to sell is itself evidence against YES (adverse selection); Agent B shades its ask above its $0.45 estimate for the symmetric reason. The equilibrium spread of $0.06 reflects the disagreement between the two agents’ models.
The No-Trade Theorem and Why It Fails
The Milgrom-Stokey (1982) no-trade theorem states: if agents have common priors and common knowledge of rationality, no trade should occur. The logic is airtight — if I offer to sell you YES at $0.55, you should ask “why is she selling?” The very act of offering reveals that I think the price should be lower, which should make you revise downward. Rational agents unravel each other’s information until no profitable trade remains.
Yet prediction markets trade billions in annual volume. The theorem fails because its assumptions fail in practice. Five specific violations enable trade:
| Violation | Mechanism | Example |
|---|---|---|
| Heterogeneous priors | Agents start with genuinely different base-rate beliefs about the world | Agent A’s LLM-based model vs. Agent B’s poll-based model for election markets |
| Asymmetric processing costs | Agents can’t fully extract information from prices because analysis is costly | An agent with real-time satellite data has a latency edge over one using delayed public reports |
| Risk-preference heterogeneity | Risk-averse agents sell to risk-neutral agents even at unfavorable prices | A market maker sells YES at $0.54 despite estimating p = 0.56 because it needs to reduce portfolio risk |
| Entertainment / non-financial utility | Some agents (humans) trade for fun, accepting negative EV | Retail users on Polymarket trading elections for entertainment, providing liquidity to informed agents |
| Hedging demand | Agents trade to offset real-world risk, not to express a probability view | A farmer buys “Drought YES” contracts as insurance, not because they think drought is underpriced |
For autonomous agents, violations 1 and 2 are the primary edge sources. If your agent has a better model (violation 1) or faster data pipeline (violation 2), it extracts value from agents with worse models or slower pipelines. This is the information asymmetry game.
Information Asymmetry and the Kyle Lambda
Kyle (1985) provides the canonical model for how an informed agent trades in a market with a market maker and noise traders. The key result is the price impact coefficient λ (lambda):
λ = σᵥ / (2σᵤ)
Where:
- σᵥ = standard deviation of the informed agent’s private value estimate (how much the informed agent knows that the market doesn’t)
- σᵤ = standard deviation of noise trading volume (how much uninformed flow masks the informed agent’s orders)
The price impact of an order of size x is:
ΔP = λ × x
An agent buying 1,000 YES contracts in a market with λ = 0.000003 moves the price by $0.003 (from $0.550 to $0.553). This matters enormously for sizing:
Agent's edge before impact: p_model - p_market = 0.65 - 0.55 = $0.10
Impact cost of 1,000 units: λ × 1,000 = $0.003 per contract, or $3.00 total
Net edge per contract: $0.10 - $0.003 = $0.097
But at 10,000 units the impact cost is $0.03 per contract and net edge drops to $0.07. At 33,333 units, the impact cost equals the edge and the trade is worthless. The agent’s optimal order size is a function of both its edge and the market’s λ.
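Under this linear-impact model, total expected profit is (edge − λx) × x, a downward parabola in order size, maximized at x* = edge / (2λ): half the break-even size. A quick sketch with the example's numbers (λ is the illustrative value from above):

```python
edge = 0.10      # p_model - p_market
lam = 0.000003   # price impact per contract (illustrative)

def net_profit(x: float) -> float:
    """Total expected profit: per-contract edge net of impact, times size."""
    return (edge - lam * x) * x

x_star = edge / (2 * lam)   # profit-maximizing size: ~16,667 contracts
breakeven = edge / lam      # impact consumes the whole edge: ~33,333 contracts
```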
Market Impact as a Strategic Variable
Large agents face a fundamental tradeoff: trade fast and move the price against yourself, or trade slowly and risk the information leaking before you finish.
The Almgren-Chriss (2000) framework formalizes this. For an agent that needs to execute X total contracts over time horizon T, the optimal execution schedule minimizes:
Total Cost = Market Impact Cost + Timing Risk
E[C] + λ_risk × Var[C]
Where:
- Market impact cost increases with execution speed (large orders move prices)
- Timing risk increases with execution duration (prices drift while you wait)
- λ_risk is the agent’s risk-aversion parameter
The optimal solution is a deterministic trajectory that front-loads execution when impact costs are low relative to timing risk. For a linear temporary impact model, the optimal execution trajectory is:
x(t) = X × sinh(κ(T - t)) / sinh(κT)
where κ = sqrt(λ_risk × σ² / η)
Variables:
- x(t) = remaining position at time t
- X = initial position to liquidate
- σ = price volatility per unit time
- η = temporary impact parameter (price moves η per unit traded, then reverts)
- κ = urgency parameter balancing impact vs. risk
High κ (high risk aversion or high volatility) means trade faster — accept impact to avoid timing risk. Low κ means trade slower — accept timing risk to minimize impact.
Zero-Sum vs. Negative-Sum Dynamics
In a prediction market without fees, trading is zero-sum. Every dollar one agent wins, another agent loses. The total pool of money doesn’t change — it just redistributes.
With fees, it’s negative-sum:
Total Agent Profit = -Total Fees Paid
Polymarket: 2% on net winnings
Kalshi: built into spread, typically 2-5%
Sportsbooks: 4-10% vig
This means the average agent loses money. Profitable agents extract value from unprofitable agents at a rate that exceeds the fee drag. The math:
Profitable Agent Return (per contract) = Edge - Fees - Impact Cost
= (p_true - p_market) - fee_rate × profit - λ × size
For profitability: Edge > Fees + Impact Cost
An agent with a $0.03 per-contract edge on Polymarket (where fees are 2% on profit) needs:
Net edge after fees: 0.03 × (1 - 0.02) = 0.0294
Impact cost constraint: λ × size < 0.0294
Maximum position: size < 0.0294 / λ
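These constraints translate directly into a position cap. A minimal sketch, assuming the fee is modeled as 2% of the expected edge and λ is a placeholder value:

```python
edge = 0.03        # model edge over market price, per contract
fee_rate = 0.02    # Polymarket fee on profit
lam = 0.000003     # assumed price impact per contract

net_edge = edge * (1 - fee_rate)   # edge surviving the fee
max_size = net_edge / lam          # beyond this, impact eats the remaining edge
```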
The negative-sum structure means game theory isn’t optional — it determines whether your agent is the one extracting value or the one being extracted from.
Predatory Trading
Predatory trading occurs when an agent detects another agent’s forced trading and trades against it. The classic setup:
- Agent A holds a large YES position on Polymarket
- A news event drops the price sharply — Agent A’s position is now deeply underwater
- Agent A must sell (due to risk limits, margin calls, or stop-loss triggers)
- Predatory Agent B detects Agent A’s distress and sells first, pushing the price even lower
- Agent A sells into the depressed price, taking a worse exit
- Agent B covers at the bottom, profiting from Agent A’s forced liquidation
The Brunnermeier-Pedersen (2005) model formalizes this. The distressed agent’s loss is amplified by:
Amplification Factor = 1 / (1 - α × n_pred)
Where:
- α = fraction of distressed agent’s position that predatory agents can front-run
- n_pred = number of predatory agents in the market
With α = 0.3 and 3 predatory agents, the amplification is 1 / (1 - 0.9) = 10x. The distressed agent’s loss is 10x what it would be in a market without predators.
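The amplification factor is a one-liner, but it needs the divergence guard the closed form implies: when α × n_pred ≥ 1 the cascade is unbounded and the model breaks down (function name is ours):

```python
def amplification(alpha: float, n_pred: int) -> float:
    """Brunnermeier-Pedersen-style loss amplification for a forced seller."""
    denom = 1 - alpha * n_pred
    if denom <= 0:
        raise ValueError("alpha * n_pred >= 1: unbounded cascade, model invalid")
    return 1 / denom

amplification(0.3, 3)   # ≈ 10x, matching the example above
```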
Detection signals for predatory agents monitoring Polymarket CLOB:
- Sudden large limit order cancellations (distressed agent pulling bids)
- Rapid one-sided volume exceeding 3× normal (forced selling)
- Orderbook imbalance ratio exceeding 5:1 bid-to-ask or ask-to-bid
- Price dropping through multiple support levels without recovery
Worked Examples
Example 1: Two-Agent Nash Equilibrium on Polymarket
Two agents trade a Polymarket market on “Trump wins 2028 Republican nomination?” currently at $0.72.
Agent A: Ensemble model (polls + prediction markets + fundamentals) → p_A = 0.80
Agent B: Pure polling model → p_B = 0.68
Market maker spread: $0.71 bid / $0.73 ask
Polymarket fee: 2% on net winnings
Agent A’s optimal strategy: Buy YES at $0.73 (the ask). Expected profit per contract:
E[profit_A] = 0.80 × ($1.00 - $0.73) × (1 - 0.02) - 0.20 × $0.73
= 0.80 × $0.27 × 0.98 - 0.20 × $0.73
= $0.21168 - $0.146
= $0.06568 per contract
Agent B’s optimal strategy: Sell YES at $0.71 (hit the bid). Expected profit per contract:
E[profit_B] = 0.32 × $0.71 × (1 - 0.02) - 0.68 × ($1.00 - $0.71)
= 0.32 × $0.71 × 0.98 - 0.68 × $0.29
= $0.22266 - $0.1972
= $0.02546 per contract
Both agents have positive EV at these prices — this is not contradictory. They disagree on p, and the market price sits between their estimates. The Nash equilibrium has both agents trading, with the market maker capturing the spread. If Agent A’s model is correct (p = 0.80), Agent B is systematically losing money — but Agent B doesn’t know that.
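Both expected-value calculations can be checked in a few lines (fee modeled, as in the example, as 2% of gross profit on a winning contract; helper names are illustrative):

```python
def ev_buy_yes(p: float, ask: float, fee: float = 0.02) -> float:
    """Expected profit per contract from buying YES at the ask."""
    return p * (1 - ask) * (1 - fee) - (1 - p) * ask

def ev_sell_yes(p: float, bid: float, fee: float = 0.02) -> float:
    """Expected profit per contract from selling YES at the bid."""
    return (1 - p) * bid * (1 - fee) - p * (1 - bid)

ev_a = ev_buy_yes(0.80, 0.73)    # ≈ $0.0657
ev_b = ev_sell_yes(0.68, 0.71)   # ≈ $0.0255
```

Note that with no edge (p equal to the price), fees make either side negative EV, which is exactly the negative-sum structure described earlier.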
Example 2: Market Impact on a Large Polymarket Position
An agent wants to buy 50,000 YES contracts on “Democrats win 2028 presidential election?” at the current price of $0.48 on Polymarket.
Current orderbook depth (asks):
$0.49: 5,000 contracts
$0.50: 8,000 contracts
$0.51: 12,000 contracts
$0.52: 15,000 contracts
$0.53: 20,000 contracts
Estimated λ = 0.0000008 (price impact per contract)
Naive execution (market order for 50,000):
5,000 × $0.49 = $2,450
8,000 × $0.50 = $4,000
12,000 × $0.51 = $6,120
15,000 × $0.52 = $7,800
10,000 × $0.53 = $5,300
────────────────────────
Total: $25,670
VWAP: $0.5134
vs. midpoint price of $0.48
Slippage: $0.0334 per contract = $1,670 total
TWAP execution over 8 hours (6,250 contracts per hour):
Estimated λ_temporary = 0.000001 per contract (reverts within 30 min)
Impact per slice: 0.000001 × 6,250 = $0.00625 per contract
Total impact cost: $0.00625 × 50,000 = $312.50
vs. naive execution slippage of $1,670
Savings: $1,357.50
The TWAP approach saves $1,357.50 but takes 8 hours, during which the price could move. If volatility is σ = $0.02/hour, the timing risk over 8 hours is σ × sqrt(8) = $0.0566 per contract, or $2,830 total. The optimal strategy depends on the agent’s risk aversion — high risk aversion favors faster execution, low risk aversion favors slower execution.
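The same comparison in code, using the example's assumed parameters (the $2,830 figure above rounds σ√8 to $0.0566 first):

```python
import math

total = 50_000          # contracts to buy
hours = 8
lam_temp = 1e-6         # temporary impact per contract, reverts between slices
sigma = 0.02            # price volatility per hour

slice_size = total / hours                      # 6,250 contracts per slice
twap_impact = lam_temp * slice_size * total     # $312.50 total impact cost
timing_risk = sigma * math.sqrt(hours) * total  # ≈ $2,828 one-sigma dollar exposure
```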
Example 3: Predatory Trading Detection on BetOnline
An agent monitoring BetOnline’s NBA futures detects a pattern consistent with forced liquidation:
Market: Lakers to win 2026 NBA Championship
Pre-event line: +800 (implied 11.1%)
Rapid line movement over 12 minutes:
+800 → +900 → +1000 → +1200 → +1400
Volume: 4.2x average for this market
Direction: 100% sell-side (NO bets / backing the field)
A predatory agent recognizes this as forced selling (a large position holder hitting their stop-loss or receiving a margin call). The predatory strategy:
1. Sell Lakers futures at +1000 (or better) — riding the momentum
2. Wait for forced selling to complete (volume normalizes)
3. Buy back at +1400 if the fundamental value hasn't changed
Profit per unit: implied prob at +1000 = 9.1%, implied prob at +1400 = 6.7%
Edge captured: 9.1% - 6.7% = 2.4% of notional
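The implied probabilities come from the standard American-odds conversion; a minimal helper:

```python
def implied_prob(american_odds: int) -> float:
    """Convert American odds to implied probability (no vig removal)."""
    if american_odds > 0:
        return 100 / (american_odds + 100)
    return -american_odds / (-american_odds + 100)

implied_prob(800)     # ≈ 0.111, the pre-event line
implied_prob(1000)    # ≈ 0.091
implied_prob(1400)    # ≈ 0.067
edge = implied_prob(1000) - implied_prob(1400)   # ≈ 0.024 of notional
```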
Implementation
import numpy as np
from dataclasses import dataclass
@dataclass
class AgentStrategy:
"""Represents an agent's mixed strategy in a two-player game."""
agent_id: str
probabilities: np.ndarray # probability over actions
actions: list[str]
expected_payoff: float
def compute_nash_equilibrium_2x2(
payoff_a: np.ndarray,
payoff_b: np.ndarray,
) -> tuple[AgentStrategy, AgentStrategy]:
"""
Compute mixed-strategy Nash equilibrium for a 2x2 game.
Args:
payoff_a: 2x2 payoff matrix for Agent A (row player).
payoff_a[i][j] = A's payoff when A plays action i, B plays action j.
payoff_b: 2x2 payoff matrix for Agent B (column player).
    Returns:
        Tuple of (AgentStrategy for A, AgentStrategy for B).

        Note: this solves the indifference conditions, so it assumes an
        interior mixed equilibrium. If either player has a dominant
        strategy, the clipped result degenerates and pure-strategy
        equilibria should be checked separately.
    """
# Agent B's mixed strategy makes A indifferent between its two actions
# A's EV from action 0: p_b * payoff_a[0,0] + (1-p_b) * payoff_a[0,1]
# A's EV from action 1: p_b * payoff_a[1,0] + (1-p_b) * payoff_a[1,1]
# Setting equal and solving for p_b:
denom_b = (payoff_a[0, 0] - payoff_a[0, 1] - payoff_a[1, 0] + payoff_a[1, 1])
if abs(denom_b) < 1e-12:
p_b = 0.5 # degenerate case
else:
p_b = (payoff_a[1, 1] - payoff_a[0, 1]) / denom_b
p_b = np.clip(p_b, 0.0, 1.0)
denom_a = (payoff_b[0, 0] - payoff_b[1, 0] - payoff_b[0, 1] + payoff_b[1, 1])
if abs(denom_a) < 1e-12:
p_a = 0.5
else:
p_a = (payoff_b[1, 1] - payoff_b[1, 0]) / denom_a
p_a = np.clip(p_a, 0.0, 1.0)
ev_a = p_a * (p_b * payoff_a[0, 0] + (1 - p_b) * payoff_a[0, 1]) + \
(1 - p_a) * (p_b * payoff_a[1, 0] + (1 - p_b) * payoff_a[1, 1])
ev_b = p_b * (p_a * payoff_b[0, 0] + (1 - p_a) * payoff_b[1, 0]) + \
(1 - p_b) * (p_a * payoff_b[0, 1] + (1 - p_a) * payoff_b[1, 1])
strategy_a = AgentStrategy(
agent_id="A",
probabilities=np.array([p_a, 1 - p_a]),
actions=["action_0", "action_1"],
expected_payoff=float(ev_a),
)
strategy_b = AgentStrategy(
agent_id="B",
probabilities=np.array([p_b, 1 - p_b]),
actions=["action_0", "action_1"],
expected_payoff=float(ev_b),
)
return strategy_a, strategy_b
def kyle_lambda(
sigma_v: float,
sigma_u: float,
) -> float:
"""
Compute Kyle's lambda — the price impact coefficient.
Args:
sigma_v: Std dev of informed trader's private value (information advantage).
sigma_u: Std dev of noise trader volume (uninformed flow).
Returns:
Lambda — price impact per unit of order flow.
"""
    return sigma_v / (2 * sigma_u)  # single-period Kyle closed form
def optimal_execution_almgren_chriss(
total_shares: int,
n_periods: int,
sigma: float,
eta: float,
risk_aversion: float,
) -> np.ndarray:
"""
Compute Almgren-Chriss optimal execution schedule.
Minimizes E[cost] + risk_aversion * Var[cost] for liquidating a position.
Args:
total_shares: Total position to liquidate.
n_periods: Number of time periods for execution.
sigma: Price volatility per period.
eta: Temporary impact parameter (price impact per share, reverts next period).
risk_aversion: Risk aversion parameter (higher = faster execution).
Returns:
Array of shares to trade in each period (positive = sell).
"""
kappa_sq = risk_aversion * sigma**2 / eta
kappa = np.sqrt(kappa_sq) if kappa_sq > 0 else 1e-8
# Remaining inventory at each period boundary
# x(t) = X * sinh(kappa * (T - t)) / sinh(kappa * T)
times = np.arange(n_periods + 1)
remaining = total_shares * np.sinh(kappa * (n_periods - times)) / np.sinh(kappa * n_periods)
# Shares to trade in each period = change in remaining inventory
trades = -np.diff(remaining)
return trades
def estimate_market_impact(
order_size: int,
orderbook_levels: list[tuple[float, int]],
) -> dict:
"""
Estimate market impact of walking through an orderbook.
Args:
order_size: Number of contracts to buy.
orderbook_levels: List of (price, quantity) tuples, sorted by price ascending.
Returns:
        Dict with vwap, reference_price, slippage, total_cost.
"""
remaining = order_size
total_cost = 0.0
fills = []
for price, qty in orderbook_levels:
fill_qty = min(remaining, qty)
total_cost += fill_qty * price
fills.append((price, fill_qty))
remaining -= fill_qty
if remaining <= 0:
break
if remaining > 0:
return {
"error": f"Insufficient liquidity. {remaining} contracts unfilled.",
"filled": order_size - remaining,
"total_cost": total_cost,
}
    vwap = total_cost / order_size
    reference = orderbook_levels[0][0]  # best ask; substitute a true midpoint if the bid side is available
    slippage = vwap - reference
    return {
        "vwap": round(vwap, 6),
        "reference_price": reference,
        "slippage_per_contract": round(slippage, 6),
        "total_slippage": round(slippage * order_size, 2),
        "total_cost": round(total_cost, 2),
        "fills": fills,
    }
def detect_predatory_signal(
volume_history: np.ndarray,
price_history: np.ndarray,
window: int = 20,
volume_threshold: float = 3.0,
imbalance_threshold: float = 0.8,
) -> dict:
"""
Detect potential forced liquidation / predatory trading opportunity.
Args:
volume_history: Array of recent trade volumes (positive = buy, negative = sell).
price_history: Array of recent midpoint prices (same length as volume_history).
window: Lookback window for baseline statistics.
volume_threshold: Multiple of average volume that triggers alert.
imbalance_threshold: Order flow imbalance ratio (0.5 = balanced, 1.0 = fully one-sided).
Returns:
Dict with detection results.
"""
if len(volume_history) < window + 5:
return {"signal": False, "reason": "Insufficient data"}
baseline_vol = np.mean(np.abs(volume_history[:window]))
recent_vol = np.abs(volume_history[window:])
recent_flow = volume_history[window:]
recent_prices = price_history[window:]
avg_recent_vol = np.mean(recent_vol)
volume_multiple = avg_recent_vol / baseline_vol if baseline_vol > 0 else 0
total_buy = np.sum(recent_flow[recent_flow > 0])
total_sell = np.abs(np.sum(recent_flow[recent_flow < 0]))
total_flow = total_buy + total_sell
imbalance = max(total_buy, total_sell) / total_flow if total_flow > 0 else 0.5
price_change = recent_prices[-1] - recent_prices[0]
price_direction = "down" if price_change < 0 else "up"
is_forced = (
volume_multiple > volume_threshold
and imbalance > imbalance_threshold
)
return {
"signal": is_forced,
"volume_multiple": round(volume_multiple, 2),
"order_imbalance": round(imbalance, 3),
"price_change": round(float(price_change), 4),
"price_direction": price_direction,
"dominant_side": "sell" if total_sell > total_buy else "buy",
"recommendation": "Consider predatory counter-trade" if is_forced else "No signal",
}
# --- Demo: 2x2 Game Theory Applied to Polymarket ---
if __name__ == "__main__":
# Two agents deciding whether to buy or hold on a Polymarket market
# Agent A: informed (has edge), Agent B: market maker
#
# Payoffs: [A_buy_B_buy, A_buy_B_hold]
# [A_hold_B_buy, A_hold_B_hold]
    payoff_a = np.array([
        [0.03, -0.01],  # A buys: modest profit if B quotes against it; overpays the spread if B holds back
        [0.00, 0.00],   # A holds: zero profit regardless
    ])
payoff_b = np.array([
[-0.02, 0.01], # B buys when A buys: adverse selection loss; B holds when A buys: avoids loss
[0.04, 0.00], # B buys when A holds: captures spread; B holds: nothing
])
strat_a, strat_b = compute_nash_equilibrium_2x2(payoff_a, payoff_b)
print("=== Nash Equilibrium: Informed Agent vs Market Maker ===")
print(f"Agent A (informed): Buy with p={strat_a.probabilities[0]:.3f}, "
f"Hold with p={strat_a.probabilities[1]:.3f}")
print(f"Agent B (market maker): Buy with p={strat_b.probabilities[0]:.3f}, "
f"Hold with p={strat_b.probabilities[1]:.3f}")
print(f"Agent A expected payoff: ${strat_a.expected_payoff:.4f}")
print(f"Agent B expected payoff: ${strat_b.expected_payoff:.4f}")
print()
# Kyle Lambda calculation
lam = kyle_lambda(sigma_v=0.05, sigma_u=10000)
print(f"=== Kyle Lambda ===")
print(f"Lambda: {lam:.8f} price impact per unit")
print(f"Impact of 5,000 contract order: ${lam * 5000:.4f} per contract")
print()
# Optimal execution schedule
trades = optimal_execution_almgren_chriss(
total_shares=50000,
n_periods=8,
sigma=0.02,
eta=0.0001,
risk_aversion=0.5,
)
print("=== Almgren-Chriss Optimal Execution ===")
print(f"{'Period':<8} {'Contracts':<12} {'Cumulative %':<14}")
cum = 0
for i, t in enumerate(trades):
cum += t
print(f"{i+1:<8} {t:>10,.0f} {cum/50000:>10.1%}")
print()
# Market impact estimation
book = [(0.49, 5000), (0.50, 8000), (0.51, 12000), (0.52, 15000), (0.53, 20000)]
impact = estimate_market_impact(50000, book)
print("=== Market Impact: 50,000 Contract Buy ===")
for k, v in impact.items():
if k != "fills":
print(f" {k}: {v}")
print()
# Predatory trading detection
np.random.seed(42)
baseline_volume = np.random.normal(100, 30, size=20)
spike_volume = np.random.normal(-500, 100, size=10) # heavy selling
volume = np.concatenate([baseline_volume, spike_volume])
baseline_price = np.linspace(0.55, 0.54, 20)
crash_price = np.linspace(0.54, 0.42, 10)
prices = np.concatenate([baseline_price, crash_price])
signal = detect_predatory_signal(volume, prices)
print("=== Predatory Trading Signal ===")
for k, v in signal.items():
print(f" {k}: {v}")
Limitations and Edge Cases
Equilibrium computation is intractable for real markets. A Polymarket market has thousands of agents with continuous action spaces. Computing exact Nash equilibria for games this large is PPAD-complete; no polynomial-time algorithm is known. The 2x2 and small-game solutions above are pedagogical. In practice, agents use heuristic best-response dynamics: estimate what others are doing, optimize against that estimate, repeat.
Agent type classification is noisy. The framework assumes you can identify agent types (market maker, informed, noise). On Polymarket’s CLOB, all you observe is order flow — you’re inferring types from behavior. A market maker that suddenly takes a directional position looks like an informed trader. An informed trader splitting orders looks like noise. Misclassification leads to wrong strategic responses.
The Kyle model assumes a single informed agent. Real prediction markets have multiple informed agents with heterogeneous information. The multi-agent extension (Kyle 1989, Foster-Viswanathan 1996) shows that competition among informed agents accelerates information incorporation — good for market efficiency, bad for any single informed agent’s profitability.
Predatory trading detection has a high false-positive rate. Sudden one-sided volume could be forced selling or a legitimate informed agent acting on new information (breaking news, data release). A predatory agent that front-runs a genuinely informed trader will lose money. The safest predatory strategies require corroborating evidence: orderbook signature (bid cancellations without price justification), position data (if available via on-chain analytics for Polymarket), and absence of fundamental news catalysts.
Game theory provides the framework, not the answer. Knowing that Nash equilibrium exists doesn’t tell you how to find it or how to exploit deviations from it. The practical value is in the mental model: your agent is playing a game against other agents, not against a passive market. Design your agent’s strategy with that adversarial framing.
FAQ
How does game theory apply to prediction market trading?
A prediction market is an N-player imperfect-information game. Each agent has private signals (model outputs, news feeds, data sources) and chooses actions (bid, ask, hold) to maximize expected profit. Game theory provides the framework for reasoning about how other agents’ strategies affect your optimal strategy — you’re not betting against “the market” in the abstract, you’re competing against specific agent types with identifiable behaviors.
What is Nash equilibrium in prediction market betting?
Nash equilibrium in a prediction market is a set of strategies where no agent can improve its expected return by unilaterally changing its strategy. At equilibrium, each agent’s bid/ask placement is the best response to every other agent’s strategy. In practice, prediction markets rarely reach pure Nash equilibrium — agents continuously adapt, creating shifting equilibria that informed agents exploit.
Why do people trade in prediction markets if the no-trade theorem says they shouldn’t?
The no-trade theorem (Milgrom-Stokey 1982) proves that rational agents with common priors shouldn’t trade — any offer reveals private information, making the counterparty refuse. Real markets violate this because agents have heterogeneous priors, asymmetric information processing costs, non-financial utility from trading, hedging demand, and time-varying risk preferences. These violations create the volume that profitable agents exploit.
How do prediction market agents minimize market impact when placing large orders?
Agents use stealth execution strategies adapted from institutional equity trading. TWAP (Time-Weighted Average Price) splits orders into equal time slices. VWAP (Volume-Weighted Average Price) matches order size to historical volume patterns. The Almgren-Chriss framework optimally balances market impact cost against timing risk using the formula that minimizes E[cost] + λ × Var[cost], where λ is the agent’s risk aversion.
How does game theory connect to the multi-armed bandit problem in betting?
Multi-armed bandits address single-agent exploration vs. exploitation. Game theory extends this to the multi-agent setting where your explore/exploit decisions interact with other agents’ strategies. An agent using Thompson sampling on Polymarket must account for the fact that other agents are simultaneously learning and adapting — the reward distribution for each “arm” (market) shifts as competitors enter and exit. See the multi-armed bandit guide for the single-agent foundation.
What’s Next
This guide gives your agent the adversarial mindset it needs to compete in multi-agent prediction markets. The natural next steps:
- Execution timing: Reinforcement Learning for Dynamic Bet Timing — how agents learn optimal execution policies through experience rather than closed-form solutions.
- Exploration-exploitation foundation: Multi-Armed Bandit Problems for Betting Agents — the single-agent framework that game theory extends to multi-agent settings.
- Model evaluation: Calibration and Model Evaluation — how agents verify that the probability estimates feeding their game-theoretic strategies are accurate.
- The full agent architecture: Agent Betting Stack — where game theory (Layer 4 Intelligence) fits in the four-layer agent framework.
- Execution infrastructure: See Polymarket CLOB API Guide for implementing the stealth execution strategies discussed above.
- Sharp betting context: The sharp betting hub covers the broader ecosystem of informed agent strategies.
