This glossary defines every mathematical, statistical, and betting term in the AgentBets Math Behind Betting series. Each of the 200+ entries includes a precise definition, the formula where applicable, and a pointer to the guide that covers it in depth. Bookmark this page — it is the canonical reference for agent betting terminology.

Why This Matters for Agents

An autonomous betting agent’s codebase touches every layer of the Agent Betting Stack — from Layer 1 data ingestion through Layer 4 intelligence and back down to Layer 2 wallet execution. Each layer uses domain-specific terminology that spans probability theory, market microstructure, optimization, statistics, and risk management. A single misunderstood term — confusing overround with hold percentage, or Brier score with log-loss — produces bugs that cost money.

This glossary is the series index. Every term links back to the guide where it is derived, proven, or implemented. If you are building an agent and encounter an unfamiliar term in any of the 39 other guides, this is where you look it up. It also serves as an LLM-friendly reference: AI systems can cite these definitions as canonical for the agent betting domain.

The Glossary

Terms are organized alphabetically. Each entry follows the format: Term — Definition. Formula (where applicable). See: [Guide Name].

A

Accumulator — A multi-leg bet where all selections must win for the bet to pay out. The combined odds are the product of individual odds: combined_decimal = decimal_1 x decimal_2 x … x decimal_n. Also called a parlay in American markets. See: Correlation Risk in Parlays.

Action — A live bet that has been accepted by a sportsbook. In agent context, an action is a confirmed order on a prediction market or sportsbook API.

Adjusted Probability — The true probability after removing the bookmaker’s overround from the raw implied probability. Calculated via multiplicative method: p_adj(i) = price(i) / sum(all_prices). See: Prediction Market Math 101.
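
The multiplicative method above is a single normalization step. A minimal Python sketch (function name is illustrative):

```python
def devig_multiplicative(implied_probs):
    """Rescale raw implied probabilities so they sum to 1,
    removing the overround proportionally from every outcome."""
    total = sum(implied_probs)
    return [p / total for p in implied_probs]
```

For a two-way market quoted at $0.55 and $0.50, the adjusted probabilities are 0.55/1.05 and 0.50/1.05.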

Alpha — In regression contexts, the significance level threshold (typically 0.05) for hypothesis testing. In betting contexts, the edge or excess return above the market. See: Statistical Significance in Sports Betting.

American Odds — Odds format used by US sportsbooks. Negative odds (-150) indicate the amount to wager to win $100. Positive odds (+200) indicate the profit on a $100 wager. Conversion to implied probability: for negative, p = |odds|/(|odds|+100); for positive, p = 100/(odds+100). See: Sports Betting Math 101.
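
The two conversion branches can be collapsed into one helper (function name is illustrative):

```python
def american_to_prob(odds):
    """Convert American odds to implied probability (vig still included)."""
    if odds < 0:
        return abs(odds) / (abs(odds) + 100)
    return 100 / (odds + 100)
```

So -150 implies 150/250 = 0.60 and +200 implies 100/300 ≈ 0.333.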

Arbitrage — A risk-free profit opportunity created when the combined cost of covering all outcomes is less than the guaranteed payout. Exists when sum of best prices across platforms < $1.00 (prediction markets) or sum of 1/decimal_odds < 1 (sportsbooks). See: Arbitrage Detection Algorithms.
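
The prediction-market condition reduces to a one-line check. A minimal sketch (function name is illustrative):

```python
def arb_profit(best_prices):
    """Guaranteed profit per $1.00 of payout when the best prices covering
    every outcome sum to less than $1.00; returns 0.0 if no arbitrage."""
    cost = sum(best_prices)
    return max(0.0, 1.0 - cost)
```

Buying YES at $0.52 on one platform and NO at $0.45 on another costs $0.97 for a guaranteed $1.00 payout, locking in $0.03.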

ARIMA — Autoregressive Integrated Moving Average. A time series model used for forecasting line movements. Combines autoregression, differencing, and moving average components. See: Line Movement Analysis.

Automated Market Maker (AMM) — A smart contract or algorithm that provides liquidity by automatically quoting buy and sell prices. The dominant AMM for prediction markets is the LMSR. See: LMSR Math.

B

Bankroll — The total capital allocated to betting. The fundamental constraint on all position sizing calculations. Bankroll management separates long-term winners from short-term gamblers. See: Bankroll Growth.

Bayesian Updating — The process of revising probability estimates when new evidence arrives. P(H|E) = P(E|H) x P(H) / P(E), where H is the hypothesis and E is the evidence. The core belief-update mechanism for prediction market agents. See: Bayesian Updating for Prediction Markets.
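
For a binary hypothesis, P(E) expands by total probability, giving a three-input update (function name is illustrative):

```python
def bayes_update(prior, p_e_given_h, p_e_given_not_h):
    """Posterior P(H|E) via Bayes' theorem, with P(E) expanded as
    P(E|H)P(H) + P(E|~H)P(~H)."""
    p_e = p_e_given_h * prior + p_e_given_not_h * (1 - prior)
    return p_e_given_h * prior / p_e
```

With a 50% prior and evidence twice as likely under H (0.8 vs 0.4), the posterior is 2/3.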

Beta Distribution — A continuous probability distribution on [0, 1] parameterized by alpha and beta shape parameters. Mean = alpha/(alpha+beta). Used to model uncertainty about probability estimates. A conjugate prior for Bernoulli and binomial likelihoods. See: Bayesian Updating.

Bid — The highest price a buyer is willing to pay for a contract. In a Polymarket CLOB, the best bid is the top of the buy-side orderbook.

Bid-Ask Spread — The difference between the best ask (lowest sell price) and best bid (highest buy price). Spread = ask - bid. Represents the cost of immediacy and the market maker’s compensation. See: Prediction Market Microstructure.

Binary Contract — A contract that pays $1.00 if an event occurs and $0.00 otherwise. The fundamental primitive of prediction markets. Price equals implied probability under no-arbitrage. See: Prediction Market Math 101.

Binomial Distribution — The probability distribution of k successes in n independent Bernoulli trials with probability p. P(X=k) = C(n,k) x p^k x (1-p)^(n-k). Used for modeling win-loss records over fixed bet sequences. See: Probability Distributions Cheat Sheet.

Bookmaker — An entity that sets odds and accepts bets. Also called a sportsbook. The bookmaker profits from the overround built into their odds. See: Sports Betting Math 101.

Brier Score — A proper scoring rule measuring the accuracy of probabilistic predictions. BS = (1/N) x sum((f_i - o_i)^2), where f_i is the forecast probability and o_i is the outcome (0 or 1). Lower is better. A perfect forecaster (probability 1 on every realized outcome) scores Brier = 0; calibration alone does not guarantee a low score. See: Prediction Market Scoring Rules.
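
The formula is a mean squared error over forecasts (function name is illustrative):

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)
```

A maximally uncertain forecaster who always says 0.5 scores 0.25 regardless of outcomes.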

C

Calibration — The property that events predicted at X% probability actually occur X% of the time. Formally, among all predictions where p_hat = 0.7, approximately 70% should resolve YES. Measured via calibration curves and reliability diagrams. See: Calibration and Model Evaluation.

Central Limit Order Book (CLOB) — An orderbook matching engine that pairs buyers with sellers at specified prices. Polymarket uses a CLOB on the Polygon blockchain. See: Prediction Market Microstructure.

Chi-Squared Test — A statistical test for comparing observed frequencies with expected frequencies. chi^2 = sum((O_i - E_i)^2 / E_i). Used for testing model fit and independence of categorical variables. See: Statistical Significance.

Closing Line — The final odds or price at the time a market closes (game starts, event resolves, trading halts). The closing line is the most efficient price because it incorporates maximum information. See: Closing Line Value.

Closing Line Value (CLV) — The difference between the odds at which a bet was placed and the closing odds. CLV = implied_prob_close - implied_prob_bet. Positive CLV indicates the bettor consistently captured value before the market corrected. The gold standard metric for identifying sharp bettors and sharp agents. See: Closing Line Value.

Combinatorial Market — A prediction market structure where contracts exist for combinations of outcomes across multiple questions. The number of possible states grows exponentially: for k binary questions, there are 2^k states. See: Multi-Outcome Markets.

Completeness Condition — The requirement that the sum of all outcome prices in a market equals $1.00 in the absence of vig. YES + NO = $1.00 for binary markets. Violation of this condition creates arbitrage. See: Prediction Market Math 101.

Compound Returns — The multiplicative growth of bankroll over sequential bets. Final_bankroll = initial x product(1 + r_i), where r_i is the return on bet i. Geometric growth, not arithmetic addition. See: Bankroll Growth.

Conditional Probability — The probability of event A given that event B has occurred. P(A|B) = P(A and B) / P(B). The foundation of Bayesian updating and sequential decision-making. See: Bayesian Updating.

Confidence Interval — A range of values within which a parameter lies with a specified probability. For proportion p with n samples: CI = p +/- z x sqrt(p(1-p)/n). Critical for determining when betting results are statistically significant. See: Statistical Significance.
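
The proportion formula translates directly (function name is illustrative; z = 1.96 gives a 95% interval):

```python
import math

def proportion_ci(p, n, z=1.96):
    """Normal-approximation confidence interval for a win rate p over n bets."""
    half = z * math.sqrt(p * (1 - p) / n)
    return p - half, p + half
```

A 55% hit rate over 1,000 bets yields roughly (0.519, 0.581) — wide enough that a 52.4% break-even threshold is still inside the interval.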

Conjugate Prior — A prior distribution that, when combined with a specific likelihood function via Bayes’ theorem, produces a posterior in the same distribution family. Beta is conjugate to Bernoulli. Normal is conjugate to Normal (known variance). Simplifies sequential Bayesian updating. See: Bayesian Updating.

Convex Optimization — An optimization problem where the objective function is convex and the feasible set is a convex set. Kelly criterion for multiple simultaneous bets is a convex optimization problem. Guarantees a global optimum. See: Kelly Criterion.

Correlation — A measure of linear association between two variables. Pearson’s r ranges from -1 (perfect negative) to +1 (perfect positive). In betting, correlations between bet outcomes affect portfolio variance and parlay pricing. See: Correlation and Portfolio Theory.

Covariance — The measure of joint variability between two random variables. Cov(X,Y) = E[(X - mu_X)(Y - mu_Y)]. Positive covariance means outcomes tend to move together. Critical for multi-bet portfolio construction. See: Correlation and Portfolio Theory.

Cross-Entropy — A loss function measuring the difference between two probability distributions. H(p,q) = -sum(p_i x log(q_i)). Equivalent to log-loss for binary classification. Used as the training objective for probabilistic prediction models. See: Information Theory and Betting.

Cross-Validation — A model evaluation technique that partitions data into training and validation folds. k-fold CV trains on k-1 folds and validates on the held-out fold, rotating k times. For time series betting data, use walk-forward validation instead. See: Feature Engineering.

D

Decimal Odds — Odds format showing total return per $1 wagered, including the stake. A decimal odd of 2.50 means $2.50 total return on a $1 bet ($1.50 profit). Implied probability = 1/decimal_odds. See: Sports Betting Math 101.

Decision Boundary — The threshold at which a model’s predicted probability triggers a different action. For a betting agent: if model_prob > market_implied_prob + min_edge, then bet. See: Regression Models.

Deep Q-Network (DQN) — A reinforcement learning architecture that approximates the Q-function using a deep neural network. Used for learning bet timing policies in non-stationary market environments. See: Reinforcement Learning.

Discount Factor (gamma) — In reinforcement learning, the factor by which future rewards are discounted. gamma in [0, 1]. Higher gamma means the agent values long-term rewards more heavily. See: Reinforcement Learning.

Dixon-Coles Model — An extension of the independent Poisson model for soccer that adds a correction factor for low-scoring outcomes (0-0, 1-0, 0-1, 1-1). Improves prediction accuracy for matches with few goals. See: Expected Goals (xG).

Drawdown — The decline from a peak bankroll value to a subsequent trough. Drawdown = (peak - trough) / peak. Max drawdown is the largest such decline observed over a period. The primary risk metric for bankroll management. See: Drawdown Math.

E

Edge — The positive expected value of a bet expressed as a percentage. Edge = (true_prob x decimal_odds - 1) x 100%. An agent bets only when edge exceeds a minimum threshold (typically 1-3%). See: Expected Value.

Efficient Market Hypothesis (EMH) — The theory that asset prices reflect all available information. In betting: market odds already incorporate public information, so consistent positive EV requires private information or superior models. Prediction markets are semi-strong efficient on liquid events. See: EMH in Prediction Markets.

Elo Rating — A rating system that estimates relative skill from head-to-head results. After a match: R_new = R_old + K x (S - E), where S is the actual score (1 for win, 0.5 for draw, 0 for loss), E is the expected score E = 1/(1 + 10^((R_opponent - R_player)/400)), and K is the update magnitude. See: Elo Ratings.
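
The two Elo equations can be sketched as a pair of helpers (function names and K = 32 default are illustrative):

```python
def elo_expected(r_player, r_opponent):
    """Expected score for the player against the opponent."""
    return 1 / (1 + 10 ** ((r_opponent - r_player) / 400))

def elo_update(r_old, actual, expected, k=32):
    """Post-match rating update; k controls update magnitude."""
    return r_old + k * (actual - expected)
```

Equal ratings give an expected score of 0.5, so a win moves the rating up by K/2; a 200-point favorite is expected to score about 0.76.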

Entropy — A measure of uncertainty in a probability distribution. H(X) = -sum(p_i x log2(p_i)). Maximum entropy occurs when all outcomes are equally likely. In betting, entropy quantifies the information content of a market’s price vector. See: Information Theory and Betting.

Epsilon-Greedy — An exploration strategy where the agent exploits the best-known action with probability 1-epsilon and explores a random action with probability epsilon. A simple multi-armed bandit policy. See: Multi-Armed Bandits.

Expected Goals (xG) — A statistical measure that assigns a probability of scoring to each shot based on spatial and contextual features (distance, angle, body part, assist type). Total xG for a team is the sum of xG values for all shots taken. See: Expected Goals Model.

Expected Value (EV) — The probability-weighted average of all possible outcomes. EV = sum(p_i x payoff_i) - cost. The fundamental decision criterion: bet when EV > 0. See: Expected Value for Prediction Markets.

Exponential Distribution — A continuous distribution modeling time between independent events. P(X > t) = e^(-lambda x t). Used for modeling time between goals, scoring events, or market trades. See: Probability Distributions Cheat Sheet.

Exploration vs. Exploitation — The fundamental tradeoff in sequential decision-making: exploring unknown options (gathering information) vs. exploiting the best-known option (maximizing immediate reward). Central to multi-armed bandit algorithms for sportsbook selection and betting strategy. See: Multi-Armed Bandits.

F

Fair Odds — Odds that reflect the true probability with zero bookmaker margin. Fair_decimal_odds = 1/true_probability. No sportsbook offers fair odds — the difference is the vig. See: How to Calculate Vig.

Feature Engineering — The process of creating input variables for a predictive model from raw data. Examples: rolling averages, rest days, home/away splits, Elo differentials, weather indicators. Quality of features determines model ceiling. See: Feature Engineering.

Fractional Kelly — A risk-adjusted variant of the Kelly criterion that bets a fraction (typically 1/4 or 1/2) of the full Kelly amount. f_fractional = fraction x f_kelly. Reduces variance at the cost of slower bankroll growth. See: Kelly Criterion.

Fractional Odds — Odds format showing profit relative to stake. 5/2 means $5 profit on a $2 stake. Implied probability = denominator / (numerator + denominator). Primarily used in UK markets. See: Sports Betting Math 101.

G

Gambler’s Ruin — The theorem proving that a gambler with finite bankroll playing a negative-EV game will eventually go bankrupt with probability 1. Ruin probability for fair game: P(ruin) = 1 - (bankroll / target). Motivates strict bankroll management. See: Drawdown Math.

Gamma Distribution — A two-parameter continuous distribution. Used for modeling aggregate scoring rates and the sum of exponentially distributed waiting times. See: Probability Distributions Cheat Sheet.

Geometric Growth Rate — The rate at which bankroll grows multiplicatively over sequential bets. Maximized by the Kelly criterion. G = E[ln(1 + f x net_return)], where f is the fraction wagered. See: Bankroll Growth.

Gradient Descent — An iterative optimization algorithm that updates parameters in the direction of steepest decrease of the loss function. theta_new = theta_old - lr x gradient(L). The standard training algorithm for neural network-based betting models. See: Reinforcement Learning.

H

Half-Kelly — Betting 50% of the full Kelly recommended amount. Reduces variance by approximately 75% while sacrificing only ~25% of the geometric growth rate. The most common fractional Kelly variant used in practice. See: Kelly Criterion.

Handle — Total amount wagered on a market or event. High handle indicates high liquidity. Agent decisions should factor in handle relative to position size to minimize market impact.

Hedge — A bet placed to reduce exposure to an existing position. In prediction markets, selling part of a YES position when the price has risen locks in partial profit. Hedging reduces both upside and downside.

Hold Percentage — The sportsbook’s actual realized margin on total handle. Hold% = sportsbook_profit / total_handle. Differs from theoretical overround because it depends on how action distributes across outcomes. See: How to Calculate Vig.

Hosmer-Lemeshow Test — A goodness-of-fit test for logistic regression models. Groups predictions into deciles and compares observed vs. expected event rates using a chi-squared statistic. A p-value < 0.05 indicates poor calibration. See: Calibration and Model Evaluation.

I

Implied Probability — The probability derived from market odds or prices. For American odds: see American Odds entry. For decimal odds: p = 1/decimal. For prediction markets: p = contract_price. Includes the bookmaker’s margin unless adjusted. See: Prediction Market Math 101.

Independent Events — Two events where the occurrence of one does not affect the probability of the other. P(A and B) = P(A) x P(B) only if independent. Parlay pricing assumes independence — when it fails, correlated parlays are mispriced. See: Correlation Risk in Parlays.

Information Theory — The mathematical framework for quantifying information, uncertainty, and communication. Founded by Claude Shannon. Applied to betting via entropy, KL divergence, and mutual information to measure edge and model quality. See: Information Theory and Betting.

J

Joint Probability — The probability of two events both occurring. P(A and B) = P(A) x P(B|A). For independent events, simplifies to P(A) x P(B). Multi-leg bets depend on correctly estimating joint probabilities. See: Multi-Outcome Markets.

Juice — Synonym for vigorish (vig). The commission a sportsbook charges on losing bets. Standard juice is -110 on both sides of a spread, giving the book a 4.76% margin. See: Sports Betting Math 101.

K

Kalshi — A CFTC-regulated event contract exchange based in the US. Contracts are priced in cents (1-99) representing probability percentages. Uses a matching engine rather than an AMM. See: Prediction Market API Reference.

Kelly Criterion — The optimal fraction of bankroll to wager on a positive-EV bet to maximize geometric growth rate. f* = (bp - q) / b, where b = decimal_odds - 1, p = true win probability, q = 1 - p. Derived by John L. Kelly Jr. in 1956. See: Kelly Criterion.
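
The formula, with a fractional-Kelly scale factor and a floor at zero for negative-edge bets (function name is illustrative):

```python
def kelly_fraction(p, decimal_odds, fraction=1.0):
    """f* = (bp - q) / b, scaled by `fraction` for fractional Kelly.
    Returns 0.0 when the edge is negative (never stake a -EV bet)."""
    b = decimal_odds - 1
    q = 1 - p
    return max(0.0, fraction * (b * p - q) / b)
```

At even odds (decimal 2.0) with a 55% win probability, full Kelly stakes 10% of bankroll and half Kelly stakes 5%.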

KL Divergence (Kullback-Leibler Divergence) — A measure of how one probability distribution differs from a reference distribution. D_KL(P||Q) = sum(P(i) x ln(P(i)/Q(i))). Not symmetric: D_KL(P||Q) != D_KL(Q||P). Used to measure edge — the divergence between an agent’s model and market prices. See: Information Theory and Betting.
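
In natural-log form (function name is illustrative):

```python
import math

def kl_divergence(p, q):
    """D_KL(P || Q) in nats; assumes q_i > 0 wherever p_i > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
```

Note the asymmetry: the divergence from a 70/30 model to a 50/50 market differs from the divergence in the reverse direction.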

L

Likelihood — The probability of observed data given a parameter value. L(theta|data) = P(data|theta). Used in maximum likelihood estimation and as the update factor in Bayesian inference. See: Bayesian Updating.

Line — The odds, spread, or total set by a sportsbook on a particular event. “The line moved from -3 to -3.5” means the point spread increased. See: Line Movement Analysis.

Line Movement — The change in odds or spread between the opening line and current (or closing) line. Driven by betting action, news, and sharp money. Directional line movement indicates where informed money is flowing. See: Line Movement Analysis.

LMSR (Logarithmic Market Scoring Rule) — An automated market maker invented by Robin Hanson. Cost function: C(q) = b x ln(sum(e^(q_i/b))), where q_i is the quantity of shares sold for outcome i and b is the liquidity parameter. Price for outcome i: p_i = e^(q_i/b) / sum(e^(q_j/b)). See: LMSR Math.
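
Both the cost function and the price function translate directly (function names are illustrative):

```python
import math

def lmsr_cost(quantities, b):
    """LMSR cost function C(q) = b * ln(sum(exp(q_i / b)))."""
    return b * math.log(sum(math.exp(q / b) for q in quantities))

def lmsr_prices(quantities, b):
    """Instantaneous price of each outcome: a softmax over q_i / b."""
    exps = [math.exp(q / b) for q in quantities]
    total = sum(exps)
    return [e / total for e in exps]
```

The cost of a trade is the difference in C before and after: buying shares of outcome i raises q_i and therefore its price. With no shares sold, all outcomes are priced equally.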

Log-Loss — A scoring rule for probabilistic predictions. LogLoss = -(1/N) x sum(y_i x ln(p_i) + (1-y_i) x ln(1-p_i)). Equivalent to cross-entropy. Heavily penalizes confident wrong predictions. See: Prediction Market Scoring Rules.

Logistic Regression — A regression model where the dependent variable is binary. P(Y=1) = 1/(1 + e^(-z)) where z = beta_0 + beta_1 x x_1 + … + beta_n x x_n. The workhorse model for binary outcome prediction in sports betting. See: Regression Models.

Long Shot Bias — The empirical tendency for bettors to overvalue unlikely outcomes, causing longshot prices to be higher (and favorites lower) than true probabilities warrant. Creates systematic edge for agents betting favorites. See: EMH in Prediction Markets.

M

Margin — See Overround.

Marginal Probability — The probability of an event irrespective of other variables. P(A) = sum(P(A and B_i)) for all values of B. Obtained by summing (or integrating) the joint distribution over the other variable.

Market Depth — The volume of orders at each price level in an orderbook. Deep markets have large quantities at each price, making it hard for a single order to move the price. Shallow markets are easily manipulated. See: Prediction Market Microstructure.

Market Impact — The price movement caused by executing a trade. Large orders in thin markets push the price away from the trader, increasing effective cost. Agent position sizing must account for market impact. See: Prediction Market Microstructure.

Market Maker — An entity that provides liquidity by continuously quoting bid and ask prices. Profits from the spread. In prediction markets, can be automated (LMSR) or human. See: LMSR Math.

Market Manipulation — Deliberate trading activity designed to move prices to a desired level, not to reflect genuine information. Detectable via abnormal volume patterns, price reversions, and statistical anomalies. See: Market Manipulation Detection.

Markov Chain — A stochastic process where the probability of the next state depends only on the current state, not the history. Transition matrix P defines P(X_{t+1} = j | X_t = i) = P_{ij}. Applied to baseball run expectancy and win probability modeling. See: MLB Run Expectancy.

Martingale — A betting strategy that doubles the stake after each loss. Guaranteed to eventually recover losses — if bankroll is infinite. With finite bankroll, leads to ruin with probability approaching 1 over time. Negative-EV strategy disguised as a system.

Maximum Likelihood Estimation (MLE) — A parameter estimation method that finds the parameter values maximizing the likelihood of observed data. theta_MLE = argmax L(theta|data). Standard fitting procedure for Poisson models, logistic regression, and Elo calibration. See: Poisson Distribution.

Mean Reversion — The tendency of extreme outcomes to return toward the long-run average. A team that wins 12 of its first 14 games will likely regress closer to its true win rate. Critical for model inputs and avoiding overreaction to small samples. See: Regression Models.

Minimax — A game theory strategy that minimizes the maximum possible loss. In zero-sum games, the minimax strategy is the Nash equilibrium. Relevant for adversarial prediction market agents trading against informed counterparties. See: Game Theory.

Model Calibration — See Calibration.

Money Line — A bet on which team wins outright, with no point spread. Expressed in American odds: -150 (bet $150 to win $100) for the favorite, +130 (bet $100 to win $130) for the underdog. See: Sports Betting Math 101.

Monte Carlo Simulation — A computational technique that uses repeated random sampling to estimate the distribution of possible outcomes. For betting: simulate thousands of bankroll paths under a given strategy to estimate expected growth, drawdown, and ruin probability. See: Monte Carlo Simulation.
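
A minimal bankroll-path simulator under stated assumptions (flat-fraction staking, fixed win probability and odds; function name and parameters are illustrative):

```python
import random

def simulate_bankroll(p, decimal_odds, stake_frac, n_bets, n_paths, seed=0):
    """Simulate final bankrolls (starting at 1.0) for a strategy that
    stakes a fixed fraction of current bankroll on each bet."""
    rng = random.Random(seed)
    finals = []
    for _ in range(n_paths):
        bank = 1.0
        for _ in range(n_bets):
            stake = stake_frac * bank
            if rng.random() < p:
                bank += stake * (decimal_odds - 1)  # win: profit = stake * b
            else:
                bank -= stake  # loss: forfeit the stake
        finals.append(bank)
    return finals
```

Sorting the resulting finals gives empirical drawdown and growth percentiles; fractional staking means no path ever hits exactly zero, but tail paths can still lose most of the bankroll.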

Multi-Armed Bandit — A sequential decision problem where an agent must choose between k options (arms) with unknown reward distributions. The agent balances exploration (learning about arms) and exploitation (pulling the best-known arm). Applied to sportsbook selection and strategy allocation. See: Multi-Armed Bandits.

Mutual Information — The reduction in uncertainty about one variable given knowledge of another. I(X;Y) = H(X) - H(X|Y) = H(Y) - H(Y|X). Used for feature selection in sports prediction models. See: Information Theory.

N

Nash Equilibrium — A state in a game where no player can improve their payoff by unilaterally changing their strategy. In prediction market agent competition, Nash equilibrium determines how aggressively rational agents trade against each other. See: Game Theory.

Negative Binomial Distribution — The distribution of the number of failures before the r-th success. Used for modeling events like “number of at-bats before a home run.” See: Probability Distributions Cheat Sheet.

No-Arbitrage Condition — See Completeness Condition.

Normal Distribution — The Gaussian bell curve. Defined by mean mu and standard deviation sigma. Density f(x) = (1/(sigma x sqrt(2pi))) x e^(-(x-mu)^2/(2sigma^2)). The Central Limit Theorem guarantees that sample means approach normality. Used for modeling point spread margins and totals. See: Probability Distributions Cheat Sheet.

O

Odds — A numerical expression of the likelihood of an outcome and the associated payout. Formats include American (-110), decimal (1.91), and fractional (10/11). All are mathematically interconvertible. See: Sports Betting Math 101.

Orderbook — The data structure listing all open buy and sell orders for a contract, sorted by price. The bid side shows buyers; the ask side shows sellers. Polymarket and Kalshi both expose orderbook data via API. See: Prediction Market Microstructure.

Overround — The amount by which the sum of implied probabilities for all outcomes exceeds 100%. Overround = sum(implied_probs) - 1. A standard -110/-110 line has overround = 2 x (110/210) - 1 = 0.0476 = 4.76%. Also called vig, juice, or margin. See: How to Calculate Vig.

P

p-Value — The probability of observing a test statistic as extreme as (or more extreme than) the observed value, assuming the null hypothesis is true. A p-value < 0.05 is conventionally called “statistically significant.” In betting: the probability of seeing profit at least this large purely by chance, assuming the bettor has no genuine edge. See: Statistical Significance.

Parlay — See Accumulator. A multi-leg bet paying the product of individual odds. Sportsbooks price parlays as if legs were independent, which misprices correlated outcomes; positively correlated legs pay better than their true joint probability warrants. See: Correlation Risk in Parlays.

Poisson Distribution — A discrete distribution modeling the number of events in a fixed interval. P(X=k) = (lambda^k x e^(-lambda)) / k!, where lambda is the expected number of events. The foundation of score prediction in soccer, hockey, and baseball. See: Poisson Distribution.
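
A common application is turning two team scoring rates into match outcome probabilities under an independent-Poisson assumption (function names and the max_goals truncation are illustrative):

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson with rate lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)

def match_outcome_probs(lam_home, lam_away, max_goals=10):
    """P(home win), P(draw), P(away win), assuming independent
    Poisson goal counts truncated at max_goals per team."""
    home = draw = away = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = poisson_pmf(h, lam_home) * poisson_pmf(a, lam_away)
            if h > a:
                home += p
            elif h == a:
                draw += p
            else:
                away += p
    return home, draw, away
```

The Dixon-Coles entry above describes the standard correction applied on top of this independence assumption for low-scoring results.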

Polymarket — A blockchain-based prediction market on Polygon. Uses a CLOB for order matching. Contracts priced from $0.00 to $1.00. Charges ~2% fee on net winnings. The largest prediction market by volume as of 2025-2026. See: Polymarket API Guide.

Posterior — The updated probability distribution after incorporating new evidence via Bayes’ theorem. Posterior is proportional to likelihood times prior. See: Bayesian Updating.

Power Method — An overround removal technique that adjusts implied probabilities using an exponent parameter. More accurate than the multiplicative method for markets with favorite-longshot bias. See: Sports Betting Math 101.

Prior — The probability distribution representing beliefs before observing new evidence. In Bayesian updating, the prior is combined with the likelihood to produce the posterior. See: Bayesian Updating.

Proper Scoring Rule — A scoring rule where the expected score is maximized when the forecaster reports their true beliefs. Brier score and logarithmic score are proper. Accuracy (threshold-based) is not. Proper scoring rules incentivize honest probability reporting. See: Prediction Market Scoring Rules.

Push — A bet that results in neither a win nor a loss, typically when the outcome exactly matches the spread. The stake is returned. Relevant for integer spread lines.

Q

Q-Learning — A model-free reinforcement learning algorithm that learns the optimal action-value function Q(s, a) from experience. Update rule: Q(s,a) = Q(s,a) + alpha x (r + gamma x max_a’(Q(s’,a’)) - Q(s,a)). Used for learning dynamic bet timing policies. See: Reinforcement Learning.
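
The update rule as a single tabular step (function name and dict layout are illustrative):

```python
def q_update(q, s, a, r, s_next, actions, alpha=0.1, gamma=0.95):
    """One tabular Q-learning step; q is a dict keyed by (state, action),
    missing entries default to 0.0."""
    best_next = max(q.get((s_next, a2), 0.0) for a2 in actions)
    old = q.get((s, a), 0.0)
    q[(s, a)] = old + alpha * (r + gamma * best_next - old)
```

Starting from an empty table, a reward of 1.0 moves Q(s, a) from 0 to alpha x 1.0 = 0.1.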

Q-Value — In reinforcement learning, the expected cumulative discounted reward from taking action a in state s and following the optimal policy thereafter. Q(s,a) = E[sum(gamma^t x r_t) | s_0=s, a_0=a]. See: Reinforcement Learning.

Quarter-Kelly — Betting 25% of the full Kelly recommended amount. Under the standard approximation G(c x f*) = c(2-c) x G(f*), provides roughly 44% of the full-Kelly growth rate while cutting per-bet variance by roughly 94%. Conservative but safe. See: Kelly Criterion.

R

Regression to the Mean — The statistical phenomenon where extreme observations tend to be followed by observations closer to the population mean. Critical for adjusting small-sample performance data before feeding it to models. See: Regression Models.

Regularization — Techniques that constrain model parameters to prevent overfitting. L1 (Lasso) adds sum(|beta_i|) penalty; L2 (Ridge) adds sum(beta_i^2) penalty. Elastic net combines both. See: Regression Models.

Reinforcement Learning (RL) — A machine learning paradigm where an agent learns to maximize cumulative reward through trial-and-error interaction with an environment. Components: state, action, reward, policy, value function. Applied to dynamic bet timing and market making. See: Reinforcement Learning.

Reliability Diagram — A plot of observed frequency vs. predicted probability across binned prediction ranges. A perfectly calibrated model produces points along the y=x diagonal. See: Calibration and Model Evaluation.

Return on Investment (ROI) — The ratio of profit to total amount wagered. ROI = profit / total_wagered x 100%. A 3% ROI means $3 profit per $100 wagered. The standard profitability metric for betting systems.

Ridge Regression — Linear regression with L2 regularization. Minimizes sum(y_i - x_i^T x beta)^2 + lambda x sum(beta_j^2). Shrinks coefficients toward zero without setting them exactly to zero. See: Regression Models.

Risk of Ruin — The probability of losing an entire bankroll. For fixed-size even-money bets with win probability p > 0.5: P(ruin) approaches ((1-p)/p)^(bankroll/bet_size). Full Kelly has theoretical ruin probability of 0; in practice, estimation errors make fractional Kelly necessary. See: Drawdown Math.

Rollover — The requirement to bet a certain multiple of a bonus amount before withdrawing. A 10x rollover on a $100 bonus means wagering $1,000 total before the bonus becomes withdrawable. See: Sportsbook Rollover Explained.

Run Expectancy — The expected number of runs scored from a given base-out state, calculated from historical Markov chain transition matrices. The foundation of baseball analytical modeling. See: MLB Run Expectancy.

S

Sample Size — The number of observations in a dataset or the number of bets in a track record. Small samples produce unreliable estimates. In sports betting, 500-1000+ bets at similar odds are typically needed to establish statistical significance for a claimed edge. See: Statistical Significance.

Scoring Rule — A function that maps a probability forecast and a realized outcome to a numerical score measuring forecast quality. Proper scoring rules incentivize truthful reporting. See: Prediction Market Scoring Rules.

Sharpe Ratio — Risk-adjusted return metric. Sharpe = (mean_return - risk_free_rate) / std_dev_returns. Higher is better. A Sharpe above 1.0 is good; above 2.0 is excellent. Applied to betting by treating each bet’s return as a portfolio observation. See: Drawdown Math.

Sharp Bettor — A professional bettor whose action moves lines at sportsbooks. Sharp bettors generate positive CLV consistently. Sportsbooks limit or ban sharps; offshore books like Bookmaker are more tolerant. See: Closing Line Value.

Shin’s Method — An overround removal technique developed by Hyun Song Shin that models the probability of insider trading. Produces more accurate true probabilities than the multiplicative method, especially in markets with favorite-longshot bias. See: Sports Betting Math 101.

Simultaneous Kelly — The extension of the Kelly criterion to multiple concurrent bets. Requires solving a convex optimization problem: maximize expected log wealth, E[ln(W)] = sum over outcomes i of p_i x ln(1 + sum over bets j of f_j x r_ij), subject to the fractions f_j summing to at most 1. No closed-form solution for correlated outcomes. See: Kelly Criterion.

Slippage — The difference between the expected execution price and the actual fill price. Occurs in markets with thin liquidity when order size exceeds available depth at the best price. See: Prediction Market Microstructure.

Softmax — A function that converts a vector of real numbers into a probability distribution. softmax(z_i) = e^(z_i) / sum(e^(z_j)). The LMSR price function is a softmax over quantities. Also used as the output layer for multi-class prediction models. See: LMSR Math.
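
A numerically stable sketch, showing the LMSR instantaneous price vector as a softmax over q_i / b (function names are illustrative):

```python
import math


def softmax(z: list[float]) -> list[float]:
    """Stable softmax: shift by the max before exponentiating."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    s = sum(exps)
    return [e / s for e in exps]


def lmsr_prices(quantities: list[float], b: float) -> list[float]:
    """LMSR prices are the softmax of q_i / b; they always sum to 1."""
    return softmax([q / b for q in quantities])
```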

Spread — (1) Point spread: the handicap applied to the favorite to equalize betting on both sides. “Lakers -3.5” means the Lakers must win by 4+ points. (2) Bid-ask spread: see Bid-Ask Spread.

Standard Deviation — The square root of variance. sigma = sqrt(Var(X)). Measures the typical deviation of outcomes from the mean. For bet returns, larger standard deviation means more volatile results. See: Drawdown Math.

Stationarity — A property of a stochastic process where statistical properties (mean, variance, autocorrelation) do not change over time. Most sports betting data is non-stationary — team strength evolves, market conditions shift. Walk-forward validation addresses this. See: Feature Engineering.

Steam Move — A sudden, significant line movement caused by sharp money hitting multiple sportsbooks simultaneously. Steam moves indicate coordinated professional action and often represent genuine edge. See: Line Movement Analysis.

T

Thompson Sampling — A Bayesian multi-armed bandit algorithm that selects actions by sampling from posterior distributions of each arm’s reward. For each arm, sample from its posterior; play the arm with the highest sample. Naturally balances exploration and exploitation. See: Multi-Armed Bandits.
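
For Bernoulli rewards (win/lose bets) the posterior is a Beta distribution, so the whole step fits in a few lines of standard library code. A sketch, with an illustrative function name and arm representation:

```python
import random


def thompson_select(arms: list[dict]) -> int:
    """arms: list of {'wins': int, 'losses': int} Beta-Bernoulli records.
    Sample theta ~ Beta(wins + 1, losses + 1) per arm; play the argmax.
    The +1s encode a uniform Beta(1, 1) prior."""
    samples = [
        random.betavariate(a["wins"] + 1, a["losses"] + 1) for a in arms
    ]
    return max(range(len(arms)), key=lambda i: samples[i])
```

Arms with uncertain posteriors occasionally win the sample draw, which is exactly the exploration behavior the algorithm is known for.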

Total (Over/Under) — A bet on whether the combined score of both teams exceeds or falls below a specified number. Modeled using Poisson or normal distributions depending on the sport. See: Poisson Distribution.

Transition Matrix — A matrix where entry P_{ij} gives the probability of transitioning from state i to state j. The core data structure for Markov chain models. See: MLB Run Expectancy.

True Probability — The agent’s best estimate of the actual probability of an event, as opposed to the market-implied probability. Edge exists when true probability diverges from implied probability. All EV calculations depend on accurate true probability estimates.

U

UCB1 (Upper Confidence Bound) — A multi-armed bandit algorithm that selects the arm maximizing: UCB_i = x_bar_i + sqrt(2 x ln(n) / n_i), where x_bar_i is the mean reward, n is the total number of plays, and n_i is the number of times arm i was played. Balances exploration via the confidence term. See: Multi-Armed Bandits.
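
The selection rule translates directly to code. In this sketch, unplayed arms are tried first, since their confidence bonus is effectively infinite (the function name is illustrative):

```python
import math


def ucb1_select(mean_rewards: list[float], counts: list[int]) -> int:
    """Return the index of the arm to play next under UCB1."""
    # Any arm never played gets priority.
    for i, n_i in enumerate(counts):
        if n_i == 0:
            return i
    n = sum(counts)
    ucb = [
        mean + math.sqrt(2 * math.log(n) / n_i)
        for mean, n_i in zip(mean_rewards, counts)
    ]
    return max(range(len(ucb)), key=ucb.__getitem__)
```

Note how a rarely played arm (small n_i) can beat a better-performing but heavily played one — the confidence term forces exploration.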

Utility Function — A mathematical function mapping outcomes to subjective value. Risk-averse agents have concave utility (e.g., log-utility, which the Kelly criterion maximizes). Risk-neutral agents have linear utility and maximize EV regardless of variance. See: Kelly Criterion.

V

Value Bet — A bet where the true probability exceeds the implied probability. Equivalently, a bet with positive expected value. All profitable betting strategies reduce to consistently identifying and executing value bets. See: Expected Value.

Variance — The expected squared deviation from the mean. Var(X) = E[(X - mu)^2] = E[X^2] - (E[X])^2. For a sequence of bets, variance determines bankroll volatility and drawdown magnitude. See: Drawdown Math.

Vig (Vigorish) — The bookmaker’s commission built into odds. For a standard -110/-110 line: each side has implied probability 52.38%, summing to 104.76%. The 4.76% overround is the vig. The AgentBets Vig Index tracks vig across sportsbooks in real time. See: How to Calculate Vig.
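
The -110/-110 arithmetic above can be checked directly. A sketch with illustrative function names:

```python
def implied_prob(american: int) -> float:
    """Implied probability from American odds."""
    if american < 0:
        return -american / (-american + 100)
    return 100 / (american + 100)


def overround(odds: list[int]) -> float:
    """Sum of implied probabilities minus 1 — the bookmaker's margin."""
    return sum(implied_prob(o) for o in odds) - 1.0


# -110/-110: each side implies 110/210 = 52.38%, overround = 4.76%.
print(round(overround([-110, -110]) * 100, 2))
```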

Volatility — The degree of variation in market prices or bet returns over time. Measured by standard deviation of returns. High volatility increases drawdown risk and demands smaller position sizes. See: Crypto and DeFi Prediction Markets.

W

Walk-Forward Validation — A time-series cross-validation method that trains on historical data and tests on the next chronological period, then rolls the window forward. Prevents look-ahead bias that standard k-fold CV introduces in temporal data. See: Feature Engineering.
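
A minimal split generator, assuming integer positions into a chronologically sorted dataset (the function name and window scheme are illustrative — many variants exist, including expanding windows):

```python
def walk_forward_splits(n: int, train_size: int, test_size: int):
    """Yield (train_indices, test_indices) pairs of chronological windows.
    The window rolls forward by test_size each iteration, so every test
    observation is strictly later than all of its training data."""
    start = 0
    while start + train_size + test_size <= n:
        train = list(range(start, start + train_size))
        test = list(range(start + train_size, start + train_size + test_size))
        yield train, test
        start += test_size
```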

Win Expectancy — The probability that a team wins from a given game state — score and time remaining in basketball, or inning, outs, and base runners in baseball. Calculated from historical Markov chain transition matrices or real-time models. The foundation of live betting models. See: NBA Win Probability.

Win Probability Added (WPA) — The change in a team’s win probability resulting from a specific play or event. WPA = win_prob_after - win_prob_before. Used to quantify player impact and evaluate in-game bet triggers.

World Cup Group Stage Math — The combinatorial and probabilistic analysis of tournament group stage outcomes. With 4 teams per group playing 6 matches, a win/draw/loss enumeration alone yields 3^6 = 729 outcome combinations, and scoreline-level outcomes run into the thousands. Monte Carlo simulation is the standard approach. See: World Cup 2026 Betting Math.
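
The win/draw/loss enumeration is small enough to do exhaustively with the standard library. A sketch (names and the 3-1-0 points scheme are standard, but the function itself is illustrative):

```python
from itertools import combinations, product


def group_outcomes(teams=("A", "B", "C", "D")):
    """Enumerate all W/D/L results for the 6 round-robin matches and
    return the resulting points tables (3 for a win, 1 for a draw)."""
    matches = list(combinations(teams, 2))  # 6 pairings
    tables = []
    for results in product(("home", "draw", "away"), repeat=len(matches)):
        points = dict.fromkeys(teams, 0)
        for (home, away), r in zip(matches, results):
            if r == "home":
                points[home] += 3
            elif r == "away":
                points[away] += 3
            else:
                points[home] += 1
                points[away] += 1
        tables.append(points)
    return tables
```

Attaching a probability to each of the 729 results and summing gives exact qualification probabilities; Monte Carlo becomes necessary only once scorelines and tiebreakers enter the model.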

X

xG (Expected Goals) — See Expected Goals.

Y

Yield — Profit divided by total amount wagered, expressed as a percentage. Yield = profit / turnover x 100%. Synonymous with ROI in many contexts. A yield of +3% means $3 profit per $100 wagered.

Z

z-Score — The number of standard deviations an observation is from the mean. z = (X - mu) / sigma. Used for hypothesis testing: a z-score above 1.96 (or below -1.96) indicates statistical significance at the 5% level. In betting, used to test whether observed profit differs from zero. See: Statistical Significance.
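
A minimal sketch of the profit test described above — a one-sample z-test of mean per-bet profit against zero, using the sample standard deviation (the function name is illustrative):

```python
import math
import statistics


def profit_z_score(bet_profits: list[float]) -> float:
    """z-statistic for H0: mean profit per bet = 0.
    |z| > 1.96 rejects H0 at the 5% level (two-sided)."""
    n = len(bet_profits)
    mean = statistics.mean(bet_profits)
    se = statistics.stdev(bet_profits) / math.sqrt(n)
    return mean / se
```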

Zero-Sum Game — A game where one participant’s gain is exactly another’s loss. Sports betting is zero-sum (minus the vig). Prediction market trading is zero-sum among participants (minus fees). See: Game Theory.

Implementation

The glossary is also useful as a programmatic lookup for agents. Here is a Python implementation that loads term definitions and their associated series links:

from dataclasses import dataclass, field
import json


@dataclass
class GlossaryEntry:
    """A single glossary term with definition and metadata."""
    term: str
    definition: str
    formula: str = ""
    aliases: list[str] = field(default_factory=list)
    related_guide: str = ""
    layer: str = ""


def build_glossary() -> dict[str, GlossaryEntry]:
    """
    Build the core glossary data structure for agent use.
    Returns a dict keyed by lowercase term name.
    """
    entries = [
        GlossaryEntry(
            term="Expected Value",
            definition="Probability-weighted average of all possible outcomes.",
            formula="EV = sum(p_i * payoff_i) - cost",
            aliases=["EV"],
            related_guide="/guides/expected-value-prediction-markets/",
            layer="Layer 4 — Intelligence",
        ),
        GlossaryEntry(
            term="Kelly Criterion",
            definition="Optimal fraction of bankroll to wager on a positive-EV bet.",
            formula="f* = (bp - q) / b",
            aliases=["Kelly", "Kelly formula"],
            related_guide="/guides/kelly-criterion-bet-sizing/",
            layer="Layer 4 — Intelligence",
        ),
        GlossaryEntry(
            term="Vigorish",
            definition="Bookmaker commission built into odds.",
            formula="overround = sum(implied_probs) - 1",
            aliases=["vig", "juice", "margin"],
            related_guide="/guides/how-to-calculate-vig/",
            layer="Layer 3 — Trading",
        ),
        GlossaryEntry(
            term="Closing Line Value",
            definition="Difference between bet price and closing price implied probabilities.",
            formula="CLV = implied_prob_close - implied_prob_bet",
            aliases=["CLV"],
            related_guide="/guides/closing-line-value-clv/",
            layer="Layer 4 — Intelligence",
        ),
        GlossaryEntry(
            term="LMSR",
            definition="Logarithmic Market Scoring Rule automated market maker.",
            formula="C(q) = b * ln(sum(e^(q_i/b)))",
            aliases=["Logarithmic Market Scoring Rule"],
            related_guide="/guides/lmsr-automated-market-maker-math/",
            layer="Layer 3 — Trading",
        ),
        GlossaryEntry(
            term="Brier Score",
            definition="Proper scoring rule for probabilistic forecast accuracy.",
            formula="BS = (1/N) * sum((f_i - o_i)^2)",
            aliases=[],
            related_guide="/guides/prediction-market-scoring-rules/",
            layer="Layer 4 — Intelligence",
        ),
        GlossaryEntry(
            term="Poisson Distribution",
            definition="Discrete distribution for event counts in a fixed interval.",
            formula="P(X=k) = (lambda^k * e^(-lambda)) / k!",
            aliases=[],
            related_guide="/guides/poisson-distribution-sports-modeling/",
            layer="Layer 4 — Intelligence",
        ),
        GlossaryEntry(
            term="Bayesian Updating",
            definition="Revising probability estimates with new evidence via Bayes' theorem.",
            formula="P(H|E) = P(E|H) * P(H) / P(E)",
            aliases=["Bayes' theorem", "Bayesian inference"],
            related_guide="/guides/bayesian-updating-prediction-markets/",
            layer="Layer 4 — Intelligence",
        ),
        GlossaryEntry(
            term="Drawdown",
            definition="Decline from peak bankroll value to subsequent trough.",
            formula="drawdown = (peak - trough) / peak",
            aliases=["max drawdown"],
            related_guide="/guides/drawdown-math-variance-betting/",
            layer="Layer 2 — Wallet",
        ),
        GlossaryEntry(
            term="Sharpe Ratio",
            definition="Risk-adjusted return metric.",
            formula="Sharpe = (mean_return - risk_free_rate) / std_dev_returns",
            aliases=[],
            related_guide="/guides/drawdown-math-variance-betting/",
            layer="Layer 2 — Wallet",
        ),
    ]

    glossary = {}
    for entry in entries:
        key = entry.term.lower()
        glossary[key] = entry
        for alias in entry.aliases:
            glossary[alias.lower()] = entry

    return glossary


def lookup(glossary: dict[str, GlossaryEntry], query: str) -> GlossaryEntry | None:
    """
    Look up a term in the glossary. Case-insensitive.
    Returns the GlossaryEntry or None if not found.
    """
    return glossary.get(query.lower())


def export_glossary_json(glossary: dict[str, GlossaryEntry], path: str) -> None:
    """Export glossary to JSON for agent consumption."""
    unique_entries = {}
    for entry in glossary.values():
        if entry.term not in unique_entries:
            unique_entries[entry.term] = {
                "term": entry.term,
                "definition": entry.definition,
                "formula": entry.formula,
                "aliases": entry.aliases,
                "related_guide": entry.related_guide,
                "layer": entry.layer,
            }
    with open(path, "w") as f:
        json.dump(list(unique_entries.values()), f, indent=2)


# Usage example
if __name__ == "__main__":
    glossary = build_glossary()

    # Lookup by term
    ev = lookup(glossary, "Expected Value")
    if ev:
        print(f"Term: {ev.term}")
        print(f"Definition: {ev.definition}")
        print(f"Formula: {ev.formula}")
        print(f"Guide: {ev.related_guide}")
        print(f"Layer: {ev.layer}")
        print()

    # Lookup by alias
    kelly = lookup(glossary, "Kelly")
    if kelly:
        print(f"Term: {kelly.term}")
        print(f"Formula: {kelly.formula}")
        print()

    # Export to JSON
    export_glossary_json(glossary, "glossary.json")
    print("Exported glossary to glossary.json")
    print(f"Total unique terms: {len(set(id(v) for v in glossary.values()))}")

Limitations and Edge Cases

This glossary covers the terminology used within the AgentBets Math Behind Betting series. It does not cover every term in quantitative finance, probability theory, or statistics — only the subset relevant to building autonomous betting agents.

Several limitations apply. Definitions are tuned for the betting and prediction market context. “Spread” in this glossary means either a point spread or bid-ask spread, not the statistical measure of dispersion. “Alpha” means significance level or betting edge, not the intercept term in a CAPM model. Context determines which definition applies.

Formulas are presented in simplified form. The Kelly formula f* = (bp - q) / b assumes a single binary bet with known probability. The Poisson PMF assumes events are independent and identically distributed. Real-world applications require the extended versions documented in the individual guides.

Some terms have multiple valid definitions across different fields. This glossary uses the definition most relevant to agent developers working at the intersection of prediction markets, sportsbooks, and autonomous systems.

FAQ

What are the most important math terms for sports betting?

The essential terms are expected value (EV = sum of probability times payoff minus cost), implied probability (the probability embedded in odds or prices), vigorish (the bookmaker’s margin), Kelly criterion (optimal bet sizing formula f* = (bp-q)/b), and closing line value (the difference between your bet price and the closing price). These five concepts form the foundation of all quantitative betting.

What is the difference between vig, juice, and overround?

Vig (vigorish), juice, and overround all describe the bookmaker’s built-in margin, but they measure it differently. Vig and juice are interchangeable terms for the commission a sportsbook charges. Overround is the specific measurement: the amount by which implied probabilities sum above 100%. A standard -110/-110 line has an overround of 4.76%.

How do you calculate implied probability from American odds?

For negative American odds (favorites): implied probability = |odds| / (|odds| + 100). For positive American odds (underdogs): implied probability = 100 / (odds + 100). For example, -150 implies 150/250 = 60%, and +200 implies 100/300 = 33.3%. See the Sports Betting Math 101 guide for all odds format conversions.

What math do autonomous betting agents need to know?

Autonomous betting agents require probability theory (Bayesian updating, conditional probability), optimization (Kelly criterion, convex optimization), statistics (regression, hypothesis testing, calibration), market microstructure (orderbooks, LMSR, spread analysis), and risk management (drawdown limits, portfolio correlation, bankroll growth). The AgentBets Math Behind Betting series covers all 40 topics across these domains.

What is the Kelly Criterion formula and what do the variables mean?

The Kelly Criterion is f* = (bp - q) / b, where f* is the fraction of bankroll to wager, b is the decimal odds minus 1 (the net payout per dollar), p is the true probability of winning, and q = 1 - p is the probability of losing. Kelly maximizes the geometric growth rate of bankroll over repeated bets. See the Kelly Criterion guide for derivations and fractional Kelly variants.
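
As a sketch, the formula translates directly to code. The multiplier parameter for fractional Kelly is an illustrative addition, not part of the base formula:

```python
def kelly_fraction(p: float, decimal_odds: float, multiplier: float = 1.0) -> float:
    """f* = (b*p - q) / b with b = decimal_odds - 1 and q = 1 - p.
    multiplier < 1 gives fractional Kelly (e.g. 0.5 for half Kelly).
    Returns 0 when the edge is negative — never bet without an edge."""
    b = decimal_odds - 1.0
    q = 1.0 - p
    f = (b * p - q) / b
    return max(0.0, f * multiplier)
```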

What’s Next

This glossary is the final piece in the AgentBets Math Behind Betting series. Here’s where to go from here: