Agent Intelligence Guide: LLM Analysis for Prediction Markets

The intelligence layer is the brain of your prediction market agent. Layers 1 through 3 give your agent an identity, a wallet, and the ability to execute trades. Layer 4 decides what to trade and when.

This is the hardest layer to build and the one with the most variation. There is no single correct architecture — the best agents combine multiple signal sources, calibrate confidence carefully, and adapt their strategy to market conditions. This guide walks through every component you need, with working code for each.

Prerequisites

Before building your intelligence layer, you should have:

A working Layer 3 setup (either Polymarket or Kalshi)
Python 3.10+ with pip available
An API key for at least one LLM provider (Anthropic recommended)
Familiarity with prediction market mechanics (see the glossary)

Install the core dependencies used throughout this guide:

pip install anthropic httpx pydantic numpy

The Core Decision Loop

Every prediction market agent, regardless of strategy, follows the same loop:

┌──────────────────────────────────────────────────┐
│                 DECISION LOOP                     │
│                                                   │
│   ┌─────────┐    ┌─────────┐    ┌──────────┐    │
│   │ OBSERVE │───▶│ ANALYZE │───▶│  DECIDE  │    │
│   │         │    │         │    │          │    │
│   │ Market  │    │ LLM +   │    │ Edge +   │    │
│   │ data,   │    │ signals │    │ sizing   │    │
│   │ news,   │    │ + Bayes │    │ logic    │    │
│   │ social  │    │         │    │          │    │
│   └─────────┘    └─────────┘    └────┬─────┘    │
│                                      │           │
│   ┌─────────┐                        │           │
│   │ EXECUTE │◀───────────────────────┘           │
│   │         │                                    │
│   │ Layer 3 │                                    │
│   │ trade   │                                    │
│   └─────────┘                                    │
└──────────────────────────────────────────────────┘

Observe: Pull market data from Polymarket or Kalshi APIs, fetch news and social signals, check your current positions.

Analyze: Feed observations into your intelligence pipeline — LLM evaluation, sentiment scoring, Bayesian updates, signal aggregation.

Decide: Compare your estimated probability to the market price. If the edge exceeds your threshold, calculate position size.

Execute: Send orders through Layer 3. Log the decision and result for future backtesting.

The rest of this guide breaks down the Analyze and Decide phases in detail.

LLM Prompt Patterns for Market Evaluation

LLMs are the fastest way to get a working intelligence layer. A well-crafted prompt can evaluate a prediction market with surprising accuracy — especially for markets that depend on reasoning about public information rather than private data.

Basic Market Analysis Prompt

The simplest useful pattern sends market metadata to an LLM and asks for a probability estimate:

import anthropic
import json

client = anthropic.Anthropic()  # uses ANTHROPIC_API_KEY env var

def evaluate_market(question: str, yes_price: float, volume_24h: float,
                    end_date: str, description: str) -> dict:
    """Ask an LLM to evaluate a prediction market and return structured analysis."""

    prompt = f"""You are an expert prediction market analyst. Evaluate this market
and provide your independent probability estimate.

Market question: {question}
Current Yes price: ${yes_price:.2f} (implies {yes_price * 100:.1f}% probability)
24h volume: ${volume_24h:,.0f}
Resolution date: {end_date}
Description: {description}

Instructions:
1. Reason step-by-step about the likely outcome
2. Consider base rates, recent developments, and known factors
3. Assign your probability estimate for Yes (0.0 to 1.0)
4. Rate your confidence in this estimate (1-10)
5. If your estimate differs from the market by more than 5 percentage points,
   explain why you think the market is wrong

Respond with valid JSON only:
{{
  "reasoning": "your step-by-step analysis",
  "probability": 0.XX,
  "confidence": N,
  "edge_direction": "yes" | "no" | "none",
  "edge_explanation": "why you disagree with the market, or null"
}}"""

    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        messages=[{"role": "user", "content": prompt}]
    )

    return json.loads(response.content[0].text)

This works, but it has limitations. The LLM’s knowledge has a cutoff date, it can hallucinate confidence, and it has no access to real-time information. The sections below address each of these.

Structured Output with Pydantic

For production agents, enforce output structure with Pydantic so your downstream code never breaks on malformed LLM responses:

from pydantic import BaseModel, Field

class MarketAnalysis(BaseModel):
    reasoning: str = Field(description="Step-by-step analysis")
    probability: float = Field(ge=0.0, le=1.0, description="Estimated probability of Yes")
    confidence: int = Field(ge=1, le=10, description="Confidence in estimate")
    edge_direction: str = Field(description="yes, no, or none")
    edge_explanation: str | None = Field(default=None)

def evaluate_market_structured(question: str, yes_price: float,
                                context: str) -> MarketAnalysis:
    """Evaluate a market with guaranteed structured output."""

    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        messages=[{
            "role": "user",
            "content": f"""Analyze this prediction market and respond with JSON matching
this schema: {MarketAnalysis.model_json_schema()}

Question: {question}
Current Yes price: {yes_price}
Additional context: {context}"""
        }]
    )

    return MarketAnalysis.model_validate_json(response.content[0].text)

Prompt Chaining: Research → Analyze → Decide

A single prompt can only use the LLM’s training data. For better results, chain multiple calls where earlier calls gather information that later calls analyze:

async def research_and_analyze(question: str, yes_price: float) -> MarketAnalysis:
    """Three-stage prompt chain: research, analyze, decide."""

    # Stage 1: Research — identify what information matters
    research_prompt = f"""For this prediction market question, list the 5 most important
factors that would determine the outcome. For each factor, describe what data
source could verify it (news API, social media, government data, etc.).

Question: {question}

Respond with JSON: [{{"factor": "...", "data_source": "...", "importance": 1-5}}]"""

    research = client.messages.create(
        model="claude-haiku-4-5-20251001",  # fast + cheap for research
        max_tokens=512,
        messages=[{"role": "user", "content": research_prompt}]
    )
    factors = json.loads(research.content[0].text)

    # Stage 2: Gather data for each factor (see Sentiment section below)
    context = await gather_signals(factors)

    # Stage 3: Final analysis with full context
    analysis_prompt = f"""You are analyzing a prediction market with real-time data.

Question: {question}
Current Yes price: {yes_price}

Research factors and findings:
{json.dumps(context, indent=2)}

Based on this evidence, provide your probability estimate.
Respond with JSON: {MarketAnalysis.model_json_schema()}"""

    response = client.messages.create(
        model="claude-sonnet-4-6",  # best model for final decision
        max_tokens=1024,
        messages=[{"role": "user", "content": analysis_prompt}]
    )

    return MarketAnalysis.model_validate_json(response.content[0].text)

Using claude-haiku-4-5-20251001 for research keeps costs low. Reserve claude-sonnet-4-6 or claude-opus-4-6 for the final analysis where accuracy matters most.

Model Selection

Model	Best For	Cost	Latency
`claude-opus-4-6`	Complex reasoning, multi-factor analysis	Highest	Slowest
`claude-sonnet-4-6`	Good balance of accuracy and speed	Medium	Medium
`claude-haiku-4-5-20251001`	Research, summarization, data extraction	Lowest	Fastest
GPT-4o	Alternative to Sonnet, comparable accuracy	Medium	Medium
Open-source (Llama, Mistral)	Self-hosted, no API costs, full control	Compute only	Variable

For most agents, the sweet spot is Haiku for research stages and Sonnet for final analysis. Use Opus only for high-stakes markets where the edge needs to be precise.

Sentiment Analysis Pipelines

LLMs reason well from their training data, but prediction markets move on new information. Sentiment analysis gives your agent a real-time view of public opinion.

Architecture

┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐
│ X/Twitter│  │  Reddit  │  │  News    │  │ Moltbook │
│   API    │  │   API    │  │  APIs    │  │   Feed   │
└────┬─────┘  └────┬─────┘  └────┬─────┘  └────┬─────┘
     │             │             │             │
     ▼             ▼             ▼             ▼
┌──────────────────────────────────────────────────┐
│              Sentiment Scorer                     │
│  (LLM classifies each item as bullish/bearish)   │
└──────────────────────┬───────────────────────────┘
                       │
                       ▼
              ┌─────────────────┐
              │  Weighted Score  │
              │  (-1.0 to +1.0) │
              └─────────────────┘

Fetching Signals

import httpx
from dataclasses import dataclass
from datetime import datetime

@dataclass
class Signal:
    source: str         # "twitter", "reddit", "news", "moltbook"
    text: str           # the content
    timestamp: datetime
    metadata: dict      # followers, upvotes, source credibility, etc.

async def fetch_twitter_signals(query: str, count: int = 20) -> list[Signal]:
    """Fetch recent tweets about a topic. Requires X API Bearer Token."""
    async with httpx.AsyncClient() as client:
        resp = await client.get(
            "https://api.x.com/2/tweets/search/recent",
            headers={"Authorization": f"Bearer {TWITTER_BEARER_TOKEN}"},
            params={
                "query": query,
                "max_results": count,
                "tweet.fields": "created_at,public_metrics,author_id"
            }
        )
        data = resp.json()
        return [
            Signal(
                source="twitter",
                text=tweet["text"],
                timestamp=datetime.fromisoformat(tweet["created_at"].rstrip("Z")),
                metadata=tweet.get("public_metrics", {})
            )
            for tweet in data.get("data", [])
        ]

async def fetch_reddit_signals(subreddit: str, query: str,
                                count: int = 20) -> list[Signal]:
    """Fetch recent Reddit posts. Uses public JSON endpoint."""
    async with httpx.AsyncClient() as client:
        resp = await client.get(
            f"https://www.reddit.com/r/{subreddit}/search.json",
            params={"q": query, "sort": "new", "limit": count, "t": "week"},
            headers={"User-Agent": "AgentBets/1.0"}
        )
        data = resp.json()
        return [
            Signal(
                source="reddit",
                text=f"{post['data']['title']} {post['data'].get('selftext', '')}",
                timestamp=datetime.fromtimestamp(post["data"]["created_utc"]),
                metadata={"score": post["data"]["score"],
                          "num_comments": post["data"]["num_comments"]}
            )
            for post in data["data"]["children"]
        ]

async def fetch_news_signals(query: str, count: int = 10) -> list[Signal]:
    """Fetch recent news articles via NewsAPI."""
    async with httpx.AsyncClient() as client:
        resp = await client.get(
            "https://newsapi.org/v2/everything",
            params={
                "q": query, "sortBy": "publishedAt",
                "pageSize": count, "apiKey": NEWS_API_KEY
            }
        )
        articles = resp.json().get("articles", [])
        return [
            Signal(
                source="news",
                text=f"{a['title']}. {a.get('description', '')}",
                timestamp=datetime.fromisoformat(a["publishedAt"].rstrip("Z")),
                metadata={"source_name": a["source"]["name"]}
            )
            for a in articles
        ]

Scoring Sentiment with an LLM

Rather than using traditional NLP sentiment libraries (which struggle with prediction market context), use a cheap LLM call to classify each signal:

async def score_signals(signals: list[Signal], market_question: str) -> float:
    """Score a batch of signals from -1.0 (bearish) to +1.0 (bullish)."""
    if not signals:
        return 0.0

    signal_text = "\n".join(
        f"[{s.source}] {s.text[:200]}" for s in signals[:30]  # cap context size
    )

    response = client.messages.create(
        model="claude-haiku-4-5-20251001",
        max_tokens=256,
        messages=[{
            "role": "user",
            "content": f"""Rate the overall sentiment of these signals regarding
this prediction market question:

Question: {market_question}

Signals:
{signal_text}

Score from -1.0 (strongly suggests No) to +1.0 (strongly suggests Yes).
Consider volume, recency, and source credibility.

Respond with JSON only: {{"score": X.XX, "reasoning": "brief explanation"}}"""
        }]
    )

    result = json.loads(response.content[0].text)
    return float(result["score"])

Signal Aggregation and Confidence Scoring

Individual signals are noisy. An agent that acts on a single tweet or one LLM analysis will lose money. Signal aggregation combines multiple sources into a single confidence-weighted estimate.

The Signal Aggregator

from dataclasses import dataclass, field

@dataclass
class WeightedSignal:
    name: str
    value: float       # probability estimate (0.0 to 1.0)
    confidence: float  # how much to trust this signal (0.0 to 1.0)
    weight: float      # base weight for this signal type

class SignalAggregator:
    """Combine multiple probability signals into a single estimate."""

    def __init__(self):
        self.signals: list[WeightedSignal] = []

    def add(self, name: str, value: float, confidence: float, weight: float = 1.0):
        self.signals.append(WeightedSignal(name, value, confidence, weight))

    def aggregate(self) -> dict:
        """Weighted average where each signal's influence is weight * confidence."""
        if not self.signals:
            return {"probability": 0.5, "confidence": 0.0, "n_signals": 0}

        total_weight = 0.0
        weighted_sum = 0.0

        for s in self.signals:
            effective_weight = s.weight * s.confidence
            weighted_sum += s.value * effective_weight
            total_weight += effective_weight

        if total_weight == 0:
            return {"probability": 0.5, "confidence": 0.0,
                    "n_signals": len(self.signals)}

        probability = weighted_sum / total_weight

        # Confidence increases with more agreeing signals
        agreement = 1.0 - self._signal_variance()
        avg_confidence = sum(s.confidence for s in self.signals) / len(self.signals)
        overall_confidence = min(agreement * avg_confidence, 1.0)

        return {
            "probability": round(probability, 4),
            "confidence": round(overall_confidence, 4),
            "n_signals": len(self.signals),
            "signals": [
                {"name": s.name, "value": s.value, "confidence": s.confidence}
                for s in self.signals
            ]
        }

    def _signal_variance(self) -> float:
        values = [s.value for s in self.signals]
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values) / len(values)

Usage Example

aggregator = SignalAggregator()

# LLM analysis (high weight — this is your primary signal)
llm_result = evaluate_market_structured(question, yes_price, context)
aggregator.add("llm_analysis", llm_result.probability,
               llm_result.confidence / 10, weight=3.0)

# Sentiment from social signals
sentiment = await score_signals(twitter_signals + reddit_signals, question)
sentiment_as_prob = (sentiment + 1.0) / 2.0  # convert -1..1 to 0..1
aggregator.add("social_sentiment", sentiment_as_prob, 0.5, weight=1.0)

# News sentiment
news_sentiment = await score_signals(news_signals, question)
news_as_prob = (news_sentiment + 1.0) / 2.0
aggregator.add("news_sentiment", news_as_prob, 0.6, weight=1.5)

# Polyseer analysis (if available)
polyseer_result = await get_polyseer_analysis(market_id)
if polyseer_result:
    aggregator.add("polyseer", polyseer_result["probability"],
                   polyseer_result["confidence"], weight=2.0)

result = aggregator.aggregate()
# {"probability": 0.62, "confidence": 0.71, "n_signals": 4, "signals": [...]}

Bayesian Probability Estimation

The market price is information. It represents the collective wisdom of all participants. Your agent should not ignore it — it should update from it. Bayesian estimation lets you start with the market’s probability (prior) and adjust based on your own evidence (likelihood).

How Bayesian Updating Works

Prior: Start with the market price as your initial probability estimate
Likelihood: For each piece of new evidence, estimate how likely that evidence would be if the outcome were Yes vs. No
Posterior: Apply Bayes’ theorem to get an updated probability

def bayesian_update(prior: float, evidence: list[tuple[float, float]]) -> float:
    """Update a probability estimate with new evidence.

    Args:
        prior: Starting probability (e.g., market price)
        evidence: List of (likelihood_if_yes, likelihood_if_no) tuples.
                  Each tuple represents one piece of evidence.

    Returns:
        Updated (posterior) probability.
    """
    p_yes = prior
    p_no = 1.0 - prior

    for likelihood_yes, likelihood_no in evidence:
        # Bayes' theorem: P(Yes|E) = P(E|Yes) * P(Yes) / P(E)
        p_evidence = likelihood_yes * p_yes + likelihood_no * p_no
        if p_evidence == 0:
            continue
        p_yes = (likelihood_yes * p_yes) / p_evidence
        p_no = 1.0 - p_yes

    return p_yes

Practical Example

Suppose a market asks “Will Company X announce layoffs this quarter?” and the current price is $0.30 (30% implied probability).

market_price = 0.30  # prior

evidence = [
    # Bloomberg reports hiring freeze → more likely if layoffs coming
    (0.8, 0.3),   # P(hiring freeze | layoffs) = 0.8, P(hiring freeze | no layoffs) = 0.3

    # CEO tweets "excited about growth" → less likely if layoffs coming
    (0.2, 0.7),   # P(growth tweet | layoffs) = 0.2, P(growth tweet | no layoffs) = 0.7

    # Glassdoor reviews mention "restructuring" → more likely if layoffs
    (0.7, 0.2),   # P(restructuring mentions | layoffs) = 0.7, P(no restructuring | no) = 0.2
]

posterior = bayesian_update(market_price, evidence)
# posterior ≈ 0.56 — your estimate is now 56%, market says 30%
# This is a potential 26-point edge on Yes

Estimating Likelihoods with an LLM

You can use an LLM to estimate the likelihood ratios for each piece of evidence:

def estimate_likelihood(evidence_text: str, market_question: str) -> tuple[float, float]:
    """Ask an LLM to estimate P(evidence | Yes) and P(evidence | No)."""
    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=256,
        messages=[{
            "role": "user",
            "content": f"""Given this prediction market question and a piece of evidence,
estimate two probabilities:

Question: {market_question}
Evidence: {evidence_text}

1. P(evidence | Yes): How likely would this evidence be if the answer is Yes?
2. P(evidence | No): How likely would this evidence be if the answer is No?

Both should be between 0.01 and 0.99.

Respond with JSON: {{"p_evidence_given_yes": X.XX, "p_evidence_given_no": X.XX}}"""
        }]
    )
    result = json.loads(response.content[0].text)
    return (result["p_evidence_given_yes"], result["p_evidence_given_no"])

Edge Detection and When to Bet

Your agent should only bet when it has a meaningful edge — when its probability estimate differs enough from the market price to overcome fees and uncertainty.

Defining Edge

def calculate_edge(agent_probability: float, market_price: float,
                   confidence: float, platform_fee: float = 0.02) -> dict:
    """Calculate the edge and whether it's worth betting.

    Args:
        agent_probability: Your estimated probability of Yes
        market_price: Current Yes price on the platform
        confidence: Your confidence in the estimate (0-1)
        platform_fee: Platform fee as a fraction (Polymarket ≈ 2%, Kalshi ≈ 7%)

    Returns:
        Dict with edge calculations and recommendation.
    """
    raw_edge_yes = agent_probability - market_price
    raw_edge_no = (1 - agent_probability) - (1 - market_price)

    # Confidence-weighted edge
    weighted_edge_yes = raw_edge_yes * confidence
    weighted_edge_no = raw_edge_no * confidence

    # Account for fees
    net_edge_yes = weighted_edge_yes - platform_fee
    net_edge_no = weighted_edge_no - platform_fee

    # Determine best direction
    if net_edge_yes > net_edge_no and net_edge_yes > 0:
        direction = "yes"
        net_edge = net_edge_yes
    elif net_edge_no > 0:
        direction = "no"
        net_edge = net_edge_no
    else:
        direction = "none"
        net_edge = 0.0

    return {
        "direction": direction,
        "raw_edge": round(max(raw_edge_yes, raw_edge_no), 4),
        "confidence_weighted_edge": round(max(weighted_edge_yes, weighted_edge_no), 4),
        "net_edge_after_fees": round(net_edge, 4),
        "should_bet": net_edge > 0.03,  # minimum 3% net edge threshold
        "agent_probability": agent_probability,
        "market_price": market_price,
    }

Edge Thresholds

Not every positive edge is worth trading. Set minimum thresholds based on your confidence level:

Confidence	Minimum Net Edge	Reasoning
8-10	3%	High confidence — small edge is acceptable
5-7	7%	Moderate confidence — need larger buffer
1-4	12%+	Low confidence — only bet on large mispricings

Kelly Criterion Preview

Once you’ve found an edge, how much should you bet? The Kelly criterion gives the mathematically optimal fraction of your bankroll:

def kelly_fraction(probability: float, market_price: float) -> float:
    """Calculate optimal bet size as a fraction of bankroll.

    This is full Kelly — most practitioners use half-Kelly or quarter-Kelly
    to reduce variance.
    """
    if probability <= market_price:
        return 0.0  # no edge

    # Odds offered by the market
    odds = (1 - market_price) / market_price

    # Kelly formula: f = (p * (odds + 1) - 1) / odds
    f = (probability * (odds + 1) - 1) / odds

    return max(0.0, min(f, 1.0))  # clamp to [0, 1]

# Example: you estimate 65% probability, market is at 50%
fraction = kelly_fraction(0.65, 0.50)
# fraction ≈ 0.30 — Kelly says bet 30% of bankroll
# Half-Kelly (safer): 0.15
# Quarter-Kelly (conservative): 0.075

Full Kelly is aggressive. Most successful traders use half-Kelly (multiply by 0.5) or quarter-Kelly (multiply by 0.25) to reduce the risk of large drawdowns. For a comprehensive treatment of position sizing and bankroll management, see the Risk Management Guide (coming soon).

Strategy Types

Different market conditions call for different strategies. A well-designed agent can switch between strategies or run multiple strategies in parallel.

Momentum Strategy

Bet in the direction of recent price movement. If a market has been trending toward Yes, momentum says it will continue.

def momentum_signal(price_history: list[float], lookback: int = 10) -> dict:
    """Generate a momentum signal from recent price history.

    Args:
        price_history: List of recent Yes prices (oldest first)
        lookback: Number of periods to analyze

    Returns:
        Signal with direction and strength.
    """
    if len(price_history) < lookback:
        return {"direction": "none", "strength": 0.0}

    recent = price_history[-lookback:]
    price_change = recent[-1] - recent[0]
    avg_price = sum(recent) / len(recent)

    # Normalize strength to [-1, 1]
    strength = max(-1.0, min(1.0, price_change / max(avg_price, 0.01)))

    if strength > 0.05:
        direction = "yes"
    elif strength < -0.05:
        direction = "no"
    else:
        direction = "none"

    return {"direction": direction, "strength": abs(strength)}

Best for: Markets with clear information cascades — elections after major poll releases, crypto markets after regulatory news.

Contrarian Strategy

Bet against the crowd when your signals disagree with the market direction. This works when markets overshoot on emotion.

def contrarian_signal(market_price: float, sentiment_score: float,
                      llm_probability: float, threshold: float = 0.15) -> dict:
    """Contrarian: bet against the crowd when fundamentals disagree.

    The logic: if sentiment is extremely bullish but the LLM analysis
    says the probability should be lower, the market may be overpriced.
    """
    sentiment_implied = (sentiment_score + 1.0) / 2.0  # convert to probability

    # Crowd-fundamental divergence
    crowd_optimism = sentiment_implied - llm_probability

    if crowd_optimism > threshold:
        # Crowd too bullish → bet No
        return {"direction": "no", "strength": crowd_optimism}
    elif crowd_optimism < -threshold:
        # Crowd too bearish → bet Yes
        return {"direction": "yes", "strength": abs(crowd_optimism)}
    else:
        return {"direction": "none", "strength": 0.0}

Best for: Markets driven by social media hype — celebrity predictions, meme-coin adjacent markets.

Event-Driven Strategy

React to specific news events that should move market prices. The agent monitors news feeds and trades when a catalyst appears.

Try the starter bot: The Kalshi News Bot implements this pattern with Claude analysis and one-click Railway deploy.

async def event_driven_scan(markets: list[dict],
                             news_signals: list[Signal]) -> list[dict]:
    """Scan for markets affected by recent news events."""
    opportunities = []

    for market in markets:
        # Check if any recent news is relevant to this market
        relevant_news = [
            s for s in news_signals
            if any(keyword in s.text.lower()
                   for keyword in market.get("keywords", []))
        ]

        if not relevant_news:
            continue

        # Use LLM to assess impact
        impact = client.messages.create(
            model="claude-haiku-4-5-20251001",
            max_tokens=256,
            messages=[{
                "role": "user",
                "content": f"""How does this news affect this prediction market?

Market: {market['question']}
Current price: {market['yes_price']}

News:
{chr(10).join(s.text[:150] for s in relevant_news[:5])}

Respond with JSON:
{{"impact": "bullish"|"bearish"|"neutral", "magnitude": 0.0-1.0,
  "new_probability": 0.XX, "reasoning": "brief"}}"""
            }]
        )

        result = json.loads(impact.content[0].text)
        if result["magnitude"] > 0.3:  # significant impact
            opportunities.append({
                "market": market,
                "news": relevant_news,
                "impact": result
            })

    return opportunities

Best for: Markets tied to scheduled events — earnings calls, elections, policy decisions, court rulings.

Arbitrage Strategy (Preview)

Arbitrage exploits pricing differences for the same outcome across platforms. For example, if Polymarket prices “Yes” at $0.60 and Kalshi prices the same outcome at $0.55, you can buy Yes on Kalshi and No on Polymarket for a risk-free profit.

def find_arbitrage(polymarket_yes: float, kalshi_yes: float,
                   poly_fee: float = 0.02, kalshi_fee: float = 0.07) -> dict:
    """Check for cross-platform arbitrage opportunity."""
    # Cost to buy Yes on cheaper platform + No on expensive platform
    # should be less than $1.00 for arbitrage to exist

    # Strategy 1: Buy Yes on Kalshi, No on Polymarket
    cost_1 = kalshi_yes + (1 - polymarket_yes) + poly_fee + kalshi_fee
    profit_1 = 1.0 - cost_1

    # Strategy 2: Buy Yes on Polymarket, No on Kalshi
    cost_2 = polymarket_yes + (1 - kalshi_yes) + poly_fee + kalshi_fee
    profit_2 = 1.0 - cost_2

    best = max(profit_1, profit_2)

    return {
        "has_opportunity": best > 0,
        "best_profit": round(best, 4),
        "strategy": "kalshi_yes_poly_no" if profit_1 > profit_2
                    else "poly_yes_kalshi_no",
    }

Cross-platform arbitrage is the most complex strategy to implement correctly because of settlement timing, fee structures, and resolution differences between platforms. For the complete treatment, see the Cross-Platform Arbitrage Guide (coming soon).

Tool Integration Patterns

Three tools in the ecosystem are purpose-built for the intelligence layer. Each serves a different role.

OpenClaw: Intelligence as Skills

OpenClaw is an open-source agent framework with a skill system. You can wrap your intelligence logic as OpenClaw skills, making them composable and reusable.

# Example: Register a market analysis skill with OpenClaw
# This is a simplified pattern — see OpenClaw docs for full skill API

class MarketAnalysisSkill:
    """OpenClaw skill that evaluates a prediction market."""

    name = "market_analysis"
    description = "Evaluate a prediction market using LLM analysis and sentiment"

    async def execute(self, market_id: str, platform: str = "polymarket") -> dict:
        # Fetch market data via Layer 3
        market = await fetch_market(market_id, platform)

        # Run intelligence pipeline
        llm_result = evaluate_market_structured(
            market["question"], market["yes_price"], market["description"]
        )

        signals = await fetch_twitter_signals(market["question"])
        sentiment = await score_signals(signals, market["question"])

        # Aggregate
        aggregator = SignalAggregator()
        aggregator.add("llm", llm_result.probability,
                       llm_result.confidence / 10, weight=3.0)
        aggregator.add("sentiment", (sentiment + 1) / 2, 0.5, weight=1.0)

        result = aggregator.aggregate()

        # Edge detection
        edge = calculate_edge(result["probability"], market["yes_price"],
                              result["confidence"])

        return {"analysis": result, "edge": edge, "market": market}

OpenClaw’s memory system also lets your agent learn from past trades, storing which markets and strategies performed well.

Polyseer: Multi-Agent Bayesian Analysis

Polyseer provides a ready-made intelligence pipeline with multi-agent architecture and Bayesian probability aggregation. Instead of building everything from scratch, you can use Polyseer as your primary analysis engine:

import httpx

async def get_polyseer_analysis(market_id: str) -> dict | None:
    """Fetch Polyseer's multi-agent analysis for a market."""
    async with httpx.AsyncClient() as client:
        resp = await client.get(
            f"https://api.polyseer.com/v1/analysis/{market_id}",
            headers={"Authorization": f"Bearer {POLYSEER_API_KEY}"}
        )
        if resp.status_code != 200:
            return None

        data = resp.json()
        return {
            "probability": data["aggregated_probability"],
            "confidence": data["confidence_score"],
            "agent_analyses": data["individual_agents"],
            "methodology": data["methodology"]
        }

Polyseer is particularly useful as one signal in your aggregator — it provides an independent Bayesian estimate that you can combine with your own LLM analysis and sentiment signals.

Predly: Mispricing Detection

Predly specializes in detecting mispricings. Rather than building your own edge detection from scratch, you can use Predly’s signals as a filter — only analyze markets where Predly has flagged a potential mispricing:

async def get_predly_alerts(min_confidence: float = 0.7) -> list[dict]:
    """Fetch current mispricing alerts from Predly."""
    async with httpx.AsyncClient() as client:
        resp = await client.get(
            "https://api.predly.ai/v1/alerts",
            headers={"Authorization": f"Bearer {PREDLY_API_KEY}"},
            params={"min_confidence": min_confidence}
        )
        return resp.json().get("alerts", [])

async def predly_filtered_pipeline():
    """Only run full analysis on markets Predly flags."""
    alerts = await get_predly_alerts(min_confidence=0.75)

    for alert in alerts:
        # Predly found a mispricing — run full analysis to confirm
        analysis = await research_and_analyze(
            alert["market_question"], alert["current_price"]
        )
        edge = calculate_edge(
            analysis.probability, alert["current_price"],
            analysis.confidence / 10
        )

        if edge["should_bet"]:
            print(f"Confirmed: {alert['market_question']}")
            print(f"  Predly says: {alert['predicted_direction']}")
            print(f"  Our analysis: {edge['direction']} "
                  f"with {edge['net_edge_after_fees']:.1%} edge")

This two-stage approach (Predly scans → your agent confirms) reduces LLM API costs by only running expensive analysis on promising opportunities.

Backtesting Basics

Never deploy real money on an untested strategy. Backtesting replays historical market data through your strategy logic to see how it would have performed.

Simple Historical Replay

from dataclasses import dataclass

@dataclass
class BacktestTrade:
    market_id: str
    direction: str      # "yes" or "no"
    entry_price: float
    exit_price: float   # 1.0 if correct, 0.0 if wrong
    size: float
    profit: float

def backtest_strategy(historical_markets: list[dict],
                      strategy_fn, initial_bankroll: float = 1000.0) -> dict:
    """Run a strategy against historical market data.

    Args:
        historical_markets: List of resolved markets with price history
        strategy_fn: Function that takes market data and returns a trade decision
        initial_bankroll: Starting capital

    Returns:
        Performance summary.
    """
    bankroll = initial_bankroll
    trades: list[BacktestTrade] = []

    for market in historical_markets:
        decision = strategy_fn(market)

        if decision["direction"] == "none":
            continue

        # Size the position (quarter-Kelly for safety)
        fraction = kelly_fraction(
            decision["probability"], market["entry_price"]
        ) * 0.25
        size = bankroll * fraction

        if size < 1.0:  # minimum trade size
            continue

        # Determine outcome
        correct = (
            (decision["direction"] == "yes" and market["resolved_yes"]) or
            (decision["direction"] == "no" and not market["resolved_yes"])
        )

        entry = market["entry_price"] if decision["direction"] == "yes" \
                else 1 - market["entry_price"]
        profit = (1.0 - entry) * size if correct else -entry * size

        trades.append(BacktestTrade(
            market_id=market["id"],
            direction=decision["direction"],
            entry_price=entry,
            exit_price=1.0 if correct else 0.0,
            size=size,
            profit=profit
        ))

        bankroll += profit

    # Calculate metrics
    if not trades:
        return {"trades": 0, "message": "No trades generated"}

    wins = sum(1 for t in trades if t.profit > 0)
    total_profit = sum(t.profit for t in trades)
    max_drawdown = _calculate_max_drawdown(trades, initial_bankroll)

    return {
        "trades": len(trades),
        "wins": wins,
        "win_rate": wins / len(trades),
        "total_profit": round(total_profit, 2),
        "return_pct": round(total_profit / initial_bankroll * 100, 2),
        "max_drawdown_pct": round(max_drawdown * 100, 2),
        "final_bankroll": round(bankroll, 2),
    }

def _calculate_max_drawdown(trades: list[BacktestTrade],
                             initial: float) -> float:
    peak = initial
    max_dd = 0.0
    current = initial
    for t in trades:
        current += t.profit
        peak = max(peak, current)
        dd = (peak - current) / peak
        max_dd = max(max_dd, dd)
    return max_dd

Forward-Testing

Kalshi provides a demo environment where you can test with paper money. Use it to validate your strategy in real market conditions before risking capital. Polymarket does not have a demo mode, so use historical data for Polymarket backtesting.

For a comprehensive backtesting framework including overfitting prevention, proper train/test splits, and performance metrics, see the Backtesting and Strategy Validation Guide (coming soon).

Putting It All Together

Here is a complete agent loop that combines LLM analysis, sentiment, Bayesian updating, signal aggregation, and edge detection:

import asyncio

async def agent_loop(markets: list[dict], interval_seconds: int = 300):
    """Main agent decision loop. Runs continuously."""

    while True:
        for market in markets:
            try:
                # 1. OBSERVE: Gather signals
                twitter = await fetch_twitter_signals(market["question"])
                reddit = await fetch_reddit_signals(
                    "predictionmarkets", market["question"])
                news = await fetch_news_signals(market["question"])

                # 2. ANALYZE: Run intelligence pipeline
                # 2a. LLM analysis
                llm = evaluate_market_structured(
                    market["question"], market["yes_price"],
                    market.get("description", "")
                )

                # 2b. Sentiment scoring
                social_sentiment = await score_signals(
                    twitter + reddit, market["question"])
                news_sentiment = await score_signals(news, market["question"])

                # 2c. Bayesian update from evidence
                evidence = []
                for signal in (twitter + reddit + news)[:10]:
                    lh = estimate_likelihood(signal.text, market["question"])
                    evidence.append(lh)
                bayesian_prob = bayesian_update(market["yes_price"], evidence)

                # 2d. Aggregate all signals
                agg = SignalAggregator()
                agg.add("llm", llm.probability, llm.confidence / 10, weight=3.0)
                agg.add("social", (social_sentiment + 1) / 2, 0.5, weight=1.0)
                agg.add("news", (news_sentiment + 1) / 2, 0.6, weight=1.5)
                agg.add("bayesian", bayesian_prob, 0.7, weight=2.0)

                result = agg.aggregate()

                # 3. DECIDE: Edge detection
                edge = calculate_edge(
                    result["probability"], market["yes_price"],
                    result["confidence"]
                )

                if edge["should_bet"]:
                    size = kelly_fraction(
                        result["probability"], market["yes_price"]
                    ) * 0.25  # quarter-Kelly

                    print(f"TRADE SIGNAL: {market['question']}")
                    print(f"  Direction: {edge['direction']}")
                    print(f"  Edge: {edge['net_edge_after_fees']:.1%}")
                    print(f"  Size: {size:.1%} of bankroll")

                    # 4. EXECUTE: Send to Layer 3
                    # await execute_trade(market, edge["direction"], size)

            except Exception as e:
                print(f"Error analyzing {market.get('question', 'unknown')}: {e}")
                continue

        await asyncio.sleep(interval_seconds)

The execute_trade call is commented out — uncomment it only after thorough backtesting and paper trading.

Common Pitfalls

Overfitting to recent data. A strategy that would have crushed the last 10 markets may fail on the next 10. Always use out-of-sample testing and be suspicious of strategies with win rates above 80%.

Ignoring fees in edge calculations. Polymarket charges roughly 2% in fees, Kalshi up to 7%. A 5% raw edge on Kalshi is actually negative after fees. Always calculate net edge.

LLM hallucination in probability estimates. LLMs can produce confident-sounding analysis with made-up facts. Always combine LLM output with data from verifiable sources (APIs, price feeds). Never rely on a single LLM call for a trading decision.

Not accounting for LLM knowledge cutoffs. LLMs don’t know what happened yesterday. The research → analyze → decide pattern in this guide solves this by feeding real-time data into the analysis stage.

Prompt injection risks. If your agent ingests text from social media or user-generated content, malicious actors could craft posts designed to manipulate your LLM’s analysis. Use input sanitization and see the Security Best Practices Guide for mitigation strategies.

Betting too large. Full Kelly sizing leads to large drawdowns. Start with quarter-Kelly and increase only after you have statistical evidence that your edge is real (100+ trades minimum).

Where This Fits in the Agent Betting Stack

This guide covers Layer 4 (Intelligence) of the Agent Betting Stack. It assumes you have the other layers in place:

Layer 1 — Identity: Moltbook Identity Guide — register your agent and build reputation
Layer 2 — Wallet: Wallet Comparison Guide — fund your agent with the right wallet
Layer 3 — Trading: Polymarket API Guide | Kalshi API Guide — execute trades
Cross-cutting: Security Best Practices — protect keys, prevent prompt injection

OpenClaw — Agent framework with skill system and memory
Polyseer — Multi-agent Bayesian analysis platform
Predly — AI-powered mispricing detection
Dome — Unified prediction market data API

Official Resources

Anthropic API Documentation — Claude models used in this guide
Agent Betting Glossary — Every term defined
Tool Directory — All tools in the ecosystem