Polymarket enforces all API rate limits via Cloudflare throttling: requests over the limit are queued first and rejected with HTTP 429 only if they stay over the limit. This guide covers every endpoint’s limit as of March 2026, the burst vs sustained distinction for trading endpoints, the Builder Program tiers, and production-ready retry code for autonomous agents.

How Polymarket Rate Limiting Works

Polymarket’s rate limiting is throttle-based, not reject-based. When you exceed the configured rate for any endpoint, Cloudflare queues your requests and introduces latency rather than immediately returning HTTP 429. This is an important distinction for agent builders:

Traditional rate limiting:   Request → 429 → Retry → Success
Polymarket throttling:       Request → Queued → Delayed response → Success
Polymarket hard limit:       Request → Queued → Still over limit → 429

Three things to know:

  1. Throttling comes first. Your requests slow down before they fail. If your agent suddenly sees response times spike from 50ms to 500ms, you’re being throttled.
  2. Burst allowances exist. Trading endpoints allow short spikes above the sustained rate — you can fire off a burst of orders and settle into a lower average.
  3. Sliding windows. All limits use sliding time windows (10 seconds or 10 minutes), not fixed calendar windows. There’s no “reset at the top of the minute.”
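The sliding-window behavior in point 3 can be modeled on the client side. The following is a minimal sketch, not Polymarket's actual implementation: `SlidingWindowCounter` is a hypothetical helper that remembers request timestamps and counts only those that fall inside the last `window_s` seconds.

```python
import time
from collections import deque


class SlidingWindowCounter:
    """Client-side model of a sliding-window limit (hypothetical helper)."""

    def __init__(self, limit: int, window_s: float, clock=time.monotonic):
        self.limit = limit
        self.window_s = window_s
        self.clock = clock          # Injectable so the class can be tested
        self._stamps = deque()      # Timestamps of requests still in the window

    def _evict(self, now: float) -> None:
        # Drop timestamps that have slid out of the window.
        while self._stamps and now - self._stamps[0] >= self.window_s:
            self._stamps.popleft()

    def try_acquire(self) -> bool:
        """Record a request and return True if it fits in the window."""
        now = self.clock()
        self._evict(now)
        if len(self._stamps) >= self.limit:
            return False
        self._stamps.append(now)
        return True

    def wait_time(self) -> float:
        """Seconds until the oldest request leaves the window (0 if there is room)."""
        now = self.clock()
        self._evict(now)
        if len(self._stamps) < self.limit:
            return 0.0
        return self.window_s - (now - self._stamps[0])
```

Because capacity frees up one request at a time as old timestamps expire, there is no single moment when the whole allowance comes back, which is the practical meaning of "no reset at the top of the minute."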

Rate Limit Headers

Every response from the Polymarket CLOB API includes rate limit headers. Monitor these to stay ahead of 429 errors.

X-RateLimit-Limit

The maximum number of requests allowed in the current time window.

X-RateLimit-Remaining

How many requests you have left in the current window. When this reaches 0, subsequent requests are throttled and, if you stay over the limit, rejected with 429.

X-RateLimit-Reset

Unix timestamp (in seconds) when the rate limit window resets and your allowance is restored.

Reading rate limit headers in Python:

import requests

response = requests.get(
    "https://clob.polymarket.com/price",
    params={"token_id": "<token-id>", "side": "BUY"}
)

limit = response.headers.get("X-RateLimit-Limit")
remaining = response.headers.get("X-RateLimit-Remaining")
reset_at = response.headers.get("X-RateLimit-Reset")

print(f"Limit: {limit}")
print(f"Remaining: {remaining}")
print(f"Resets at: {reset_at}")

Proactive throttling: Rather than waiting for a 429, check X-RateLimit-Remaining after each request and slow down when it gets low. A simple rule: if remaining is below 20% of the limit, add a small delay before the next request.
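The 20% rule above could be sketched as a small helper. `delay_from_headers` is a hypothetical function name, and the 0.5-second pause is an illustrative value rather than a recommendation from Polymarket; the header names are the ones described in this section.

```python
def delay_from_headers(headers, low_water=0.2, pause_s=0.5):
    """Return seconds to pause before the next request.

    Pauses when X-RateLimit-Remaining drops below `low_water` (a fraction)
    of X-RateLimit-Limit. Returns 0.0 if the headers are missing or malformed.
    """
    try:
        limit = int(headers["X-RateLimit-Limit"])
        remaining = int(headers["X-RateLimit-Remaining"])
    except (KeyError, TypeError, ValueError):
        return 0.0  # Nothing usable to act on
    if limit <= 0:
        return 0.0
    return pause_s if remaining < low_water * limit else 0.0
```

In an agent loop, this would be called as `time.sleep(delay_from_headers(response.headers))` after each request.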


Complete Rate Limit Tables (March 2026)

These tables reflect the current limits from Polymarket’s official documentation.

General Rate Limits

Endpoint            | Limit      | Window
--------------------|------------|-----------
General (all APIs)  | 15,000 req | 10 seconds
Health check (“OK”) | 100 req    | 10 seconds

The 15,000/10s general limit is the outer boundary. Individual API sections have their own lower limits that apply first.

CLOB API Rate Limits

The CLOB (Central Limit Order Book) is where all trading happens. These are the limits that matter most for autonomous agents.

General CLOB endpoints:

Endpoint                 | Limit     | Window
-------------------------|-----------|-----------
CLOB (general)           | 9,000 req | 10 seconds
GET balance-allowance    | 200 req   | 10 seconds
UPDATE balance-allowance | 50 req    | 10 seconds

Market data endpoints:

Endpoint                | Limit     | Window
------------------------|-----------|-----------
GET /book (single)      | 1,500 req | 10 seconds
POST /books (batch)     | 500 req   | 10 seconds
GET /price (single)     | 1,500 req | 10 seconds
POST /prices (batch)    | 500 req   | 10 seconds
GET /midprice (single)  | 1,500 req | 10 seconds
POST /midprices (batch) | 500 req   | 10 seconds

Ledger endpoints:

Endpoint                                 | Limit   | Window
-----------------------------------------|---------|-----------
/trades, /orders, /notifications, /order | 900 req | 10 seconds
/data/orders                             | 500 req | 10 seconds
/data/trades                             | 500 req | 10 seconds
/notifications                           | 125 req | 10 seconds

Price history & market info:

Endpoint         | Limit     | Window
-----------------|-----------|-----------
Price history    | 1,000 req | 10 seconds
Market tick size | 200 req   | 10 seconds

Authentication:

Endpoint           | Limit   | Window
-------------------|---------|-----------
API key operations | 100 req | 10 seconds

CLOB Trading Endpoints (Burst + Sustained)

Trading endpoints are the only ones with dual-tier enforcement. Both limits apply simultaneously.

Endpoint                     | Burst Limit (10s) | Sustained Limit (10min) | Effective Avg
-----------------------------|-------------------|-------------------------|--------------
POST /order                  | 3,500 (350/s)     | 36,000 (60/s)           | 60/s
DELETE /order                | 3,000 (300/s)     | 30,000 (50/s)           | 50/s
POST /orders (batch)         | 1,000 (100/s)     | 15,000 (25/s)           | 25/s
DELETE /orders (batch)       | 1,000 (100/s)     | 15,000 (25/s)           | 25/s
DELETE /cancel-all           | 250 (25/s)        | 6,000 (10/s)            | 10/s
DELETE /cancel-market-orders | 1,000 (100/s)     | 1,500 (2.5/s)           | 2.5/s

How to read this table: Your agent can burst to 3,500 order placements in a 10-second window (useful for entering multiple positions quickly), but over a 10-minute window, you’re limited to 36,000 total — an average of 60 per second. If you burn your burst budget, you need to slow down or you’ll hit the sustained limit.

The /cancel-market-orders endpoint has a notably tight sustained limit (1,500/10min) compared to its burst allowance. If your agent needs to cancel orders across many markets frequently, use the batch /orders delete endpoint instead.
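To make the burst vs sustained trade-off concrete, the arithmetic can be worked through in a few lines. This is an illustrative calculation only: the limits are copied from the table above, the function names are hypothetical, and it assumes an agent that sends at the full burst rate from a standing start.

```python
# Limits copied from the table above: (burst per 10 s, sustained per 10 min).
TRADING_LIMITS = {
    "POST /order": (3_500, 36_000),
    "DELETE /order": (3_000, 30_000),
    "POST /orders": (1_000, 15_000),
    "DELETE /orders": (1_000, 15_000),
    "DELETE /cancel-all": (250, 6_000),
    "DELETE /cancel-market-orders": (1_000, 1_500),
}


def sustained_rate(endpoint: str) -> float:
    """Average requests per second allowed over the 10-minute window."""
    _, sustained = TRADING_LIMITS[endpoint]
    return sustained / 600  # 10 minutes = 600 seconds


def seconds_at_full_burst(endpoint: str) -> float:
    """How long the 10-minute budget lasts if you send at the burst rate."""
    burst, sustained = TRADING_LIMITS[endpoint]
    burst_per_second = burst / 10
    return sustained / burst_per_second


for name in TRADING_LIMITS:
    print(f"{name}: {sustained_rate(name):g}/s sustained, "
          f"budget lasts {seconds_at_full_burst(name):.0f}s at full burst")
```

Under these assumptions, POST /order at full burst uses up the 10-minute budget in roughly 103 seconds, while /cancel-market-orders does so in 15 seconds, which is why that endpoint needs the most careful pacing.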

Gamma API Rate Limits

The Gamma API provides market metadata, events, tags, and search. These are read-only endpoints agents use for market discovery.

Endpoint                       | Limit     | Window
-------------------------------|-----------|-----------
Gamma (general)                | 4,000 req | 10 seconds
GET /events                    | 500 req   | 10 seconds
GET /markets                   | 300 req   | 10 seconds
GET /markets + /events listing | 900 req   | 10 seconds
GET comments                   | 200 req   | 10 seconds
Tags                           | 200 req   | 10 seconds
Search                         | 350 req   | 10 seconds

Data API Rate Limits

The Data API covers trades, positions, and analytics data.

Endpoint            | Limit     | Window
--------------------|-----------|-----------
Data API (general)  | 1,000 req | 10 seconds
/trades             | 200 req   | 10 seconds
/positions          | 150 req   | 10 seconds
/closed-positions   | 150 req   | 10 seconds
Health check (“OK”) | 100 req   | 10 seconds

Other API Rate Limits

Endpoint        | Limit   | Window
----------------|---------|-----------
Relayer /submit | 25 req  | 1 minute
User PNL API    | 200 req | 10 seconds

The Relayer has the tightest limit on the platform. If your agent uses gasless trading via the Builder Program, plan your submission cadence carefully — 25 per minute is about one every 2.4 seconds.
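One conservative way to hold that cadence is to space submissions evenly. The sketch below assumes a hypothetical `Pacer` helper (not part of any Polymarket SDK) that enforces a minimum interval between calls; it deliberately ignores any burst headroom the Relayer might allow.

```python
import time


class Pacer:
    """Space calls at least `per_seconds / max_calls` apart (hypothetical helper)."""

    def __init__(self, max_calls: int, per_seconds: float,
                 clock=time.monotonic, sleep=time.sleep):
        self.interval_s = per_seconds / max_calls
        self._clock = clock      # Injectable for testing
        self._sleep = sleep      # Injectable for testing
        self._next_ok = None     # Earliest time the next call may run

    def wait(self) -> float:
        """Block until the next call is allowed; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._next_ok is not None and now < self._next_ok:
            delay = self._next_ok - now
            self._sleep(delay)
            now = self._next_ok
        self._next_ok = now + self.interval_s
        return delay


# 25 submissions per 60 seconds -> one every 2.4 seconds
relayer_pacer = Pacer(max_calls=25, per_seconds=60)
```

Calling `relayer_pacer.wait()` immediately before each Relayer submission keeps the agent at or below 25 submissions in any 60-second span.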


Builder Program Tiers

The Polymarket Builder Program uses a tiered system that directly affects rate limits. Higher tiers unlock more throughput.

Tier       | Approval                 | Rate Limits                      | Extras
-----------|--------------------------|----------------------------------|-------
Unverified | None (start immediately) | Default limits (tables above)    | Gasless trading, gas-subsidized Relayer transactions (daily limit)
Verified   | Manual approval required | Increased limits over Unverified | Higher daily Relayer limit, weekly rewards, engineering support
Partner    | Enterprise application   | Highest limits                   | Revenue sharing, marketing promotion, priority access

To upgrade: Email [email protected] with your Builder API Key, use case, expected volume, and relevant links (app, docs, X profile).

If you’re building an autonomous agent that trades consistently, getting to Verified tier should be a priority. The default Unverified limits are generous for development and testing, but production agents monitoring multiple markets will hit them.

Important: If you only need more Relayer transaction capacity for your own wallet (not routing orders for others), you can get unlimited daily Relayer transactions by obtaining a Relayer API key without upgrading tiers.

For detailed Relayer Client setup including SDK packages and BuilderConfig code examples, see the Polymarket API Guide — Builder Program.


Handling 429 Errors

When throttling isn’t enough to keep you within limits, Polymarket returns HTTP 429:

{
  "error": "Too Many Requests"
}

Exponential Backoff with Jitter (Python)

This is the pattern every production agent should implement. The tenacity library handles this cleanly with py_clob_client:

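from tenacity import (
    retry,
    retry_if_exception,
    stop_after_attempt,
    wait_exponential_jitter,
)
from py_clob_client.client import ClobClient

client = ClobClient(
    "https://clob.polymarket.com",
    key="<your-private-key>",
    chain_id=137
)
client.set_api_creds(client.create_or_derive_api_creds())

def is_rate_limited(exc: BaseException) -> bool:
    """Return True if the exception looks like an HTTP 429 from the API."""
    # status_code is read defensively; the string check is a fallback.
    return getattr(exc, "status_code", None) == 429 or "429" in str(exc)

@retry(
    retry=retry_if_exception(is_rate_limited),  # Retry only rate-limit errors
    stop=stop_after_attempt(5),
    wait=wait_exponential_jitter(
        initial=1,      # Start at 1 second
        max=60,         # Cap at 60 seconds
        jitter=2        # Add up to 2s random jitter
    ),
    reraise=True,       # Surface the original error after the last attempt
)
def get_orderbook_safe(token_id: str):
    """Fetch orderbook with automatic retry on throttle."""
    return client.get_order_book(token_id)

@retry(
    # Never retry an order on an arbitrary error: it may already have been placed.
    retry=retry_if_exception(is_rate_limited),
    stop=stop_after_attempt(3),
    wait=wait_exponential_jitter(initial=2, max=30, jitter=1),
    reraise=True,
)
def place_order_safe(order):
    """Place order with retry: fewer attempts, longer initial wait."""
    return client.post_order(order)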

Why jitter matters: Without jitter, if 10 agents all hit 429 at the same time, they all retry at the same time and hit 429 again. Jitter spreads retries across a random window, breaking the thundering herd pattern.
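The effect is easy to see by computing the retry delays for a group of agents with and without jitter. This is a standalone illustration using additive jitter; `backoff_delays` is a hypothetical helper and is not how tenacity computes its waits internally.

```python
import random


def backoff_delays(attempt: int, agents: int, base: float = 1.0,
                   cap: float = 60.0, jitter: float = 2.0, seed=None):
    """Return one retry delay per agent for the given attempt number."""
    rng = random.Random(seed)
    backoff = min(base * 2 ** attempt, cap)   # Same for every agent
    # Each agent adds its own random offset in [0, jitter].
    return [backoff + rng.uniform(0, jitter) for _ in range(agents)]


no_jitter = backoff_delays(attempt=2, agents=10, jitter=0.0)
with_jitter = backoff_delays(attempt=2, agents=10, jitter=2.0)

print("Without jitter:", sorted(set(no_jitter)))           # One shared delay
print("With jitter:   ", [round(d, 2) for d in with_jitter])
```

Without jitter, all ten agents retry exactly 4 seconds later and collide again; with jitter, their retries are spread across the 4 to 6 second range.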

Manual Retry (No Dependencies)

If you don’t want to use tenacity:

import time
import random

def retry_with_backoff(fn, max_retries=5, base_delay=1.0):
    """Execute fn with exponential backoff + jitter on rate-limit errors."""
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception as e:
            rate_limited = "429" in str(e) or "Too Many Requests" in str(e)
            # Re-raise other errors immediately, and re-raise the last
            # rate-limit error once retries are spent (no pointless final sleep).
            if not rate_limited or attempt == max_retries - 1:
                raise
            delay = min(base_delay * (2 ** attempt), 60)
            jitter = random.uniform(0, delay * 0.5)
            time.sleep(delay + jitter)

# Usage
book = retry_with_backoff(
    lambda: client.get_order_book("TOKEN_ID_HERE")
)

Detecting Throttling Before 429

Smart agents detect throttling before hitting hard limits. Monitor your response times:

import time

class ThrottleDetector:
    def __init__(self, baseline_ms=100, threshold_multiplier=3):
        self.baseline_ms = baseline_ms
        self.threshold = baseline_ms * threshold_multiplier
        self.recent_latencies = []

    def record(self, latency_ms: float):
        self.recent_latencies.append(latency_ms)
        if len(self.recent_latencies) > 20:
            self.recent_latencies.pop(0)

    def _recent_avg(self) -> float:
        """Mean of the last five samples (or fewer, if that is all we have)."""
        recent = self.recent_latencies[-5:]
        return sum(recent) / len(recent)

    @property
    def is_throttled(self) -> bool:
        if len(self.recent_latencies) < 3:
            return False
        return self._recent_avg() > self.threshold

    def suggested_delay(self) -> float:
        """Return seconds to wait before next request."""
        if not self.is_throttled:
            return 0.0
        # Scale the pause with how far latency is above baseline, capped at 10s.
        return min((self._recent_avg() / self.baseline_ms) * 0.5, 10.0)

# Usage in your agent loop
detector = ThrottleDetector(baseline_ms=80)

start = time.perf_counter()  # Monotonic clock, better suited to timing intervals
book = client.get_order_book(token_id)
latency = (time.perf_counter() - start) * 1000

detector.record(latency)
if detector.is_throttled:
    time.sleep(detector.suggested_delay())

Rate Limit Budgeting for Agents

An autonomous agent typically needs to:

  1. Scan markets — Gamma API calls to find opportunities
  2. Check prices — CLOB orderbook/price queries
  3. Check positions — Data API position tracking
  4. Execute trades — CLOB order placement/cancellation

Here’s a rate budget for an agent monitoring 50 markets:

┌─────────────────────────────────────────────────┐
│  AGENT RATE BUDGET (50 markets, per 10 seconds) │
├─────────────────────┬───────────┬───────────────┤
│  Task               │  Requests │  Limit        │
├─────────────────────┼───────────┼───────────────┤
│  Gamma market scan  │  50       │  300/10s      │
│  Orderbook checks   │  50       │  1,500/10s    │
│  Price checks       │  50       │  1,500/10s    │
│  Position tracking  │  10       │  150/10s      │
│  Order placement    │  5        │  3,500/10s    │
│  Order cancellation │  5        │  3,000/10s    │
├─────────────────────┼───────────┼───────────────┤
│  TOTAL              │  170      │  9,000/10s    │
└─────────────────────┴───────────┴───────────────┘

The 170 requests per 10-second cycle are spread across three APIs: 110 go to the CLOB (about 1.2% of its 9,000/10s general limit), 50 to Gamma, and 10 to the Data API. You have significant headroom; the constraint is usually on specific endpoints, not the general cap.
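To see which endpoint is closest to its limit, the budget can be expressed as data and checked programmatically. This is a minimal sketch using the numbers from the table above; `BUDGET`, `utilization`, and `tightest` are hypothetical names chosen for illustration.

```python
# (task, requests per 10 s, endpoint limit per 10 s), from the budget table above
BUDGET = [
    ("Gamma market scan", 50, 300),
    ("Orderbook checks", 50, 1_500),
    ("Price checks", 50, 1_500),
    ("Position tracking", 10, 150),
    ("Order placement", 5, 3_500),
    ("Order cancellation", 5, 3_000),
]


def utilization(budget):
    """Map each task to the fraction of its endpoint limit it consumes."""
    return {task: used / limit for task, used, limit in budget}


def tightest(budget):
    """Return the task that is closest to its own endpoint limit."""
    usage = utilization(budget)
    return max(usage, key=usage.get)


for task, share in utilization(BUDGET).items():
    print(f"{task:<20} {share:6.1%}")
print("Tightest:", tightest(BUDGET))
```

With these numbers the Gamma market scan is the tightest line, at about 17% of the GET /markets limit, well ahead of any CLOB endpoint.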

Optimization: Use batch endpoints. Instead of 50 individual GET /book calls, use a single POST /books call. This counts as 1 request against the batch limit (500/10s) instead of 50 against the single limit (1,500/10s).

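# Bad: 50 requests against GET /book (1,500/10s limit)
for token_id in token_ids:
    book = client.get_order_book(token_id)

# Good: 1 request against POST /books (500/10s limit)
# get_order_books takes BookParams objects rather than bare token ID strings
from py_clob_client.clob_types import BookParams

books = client.get_order_books(
    [BookParams(token_id=token_id) for token_id in token_ids]
)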

Use WebSocket Instead of Polling

The single best way to reduce rate limit pressure is to stop polling. Polymarket’s WebSocket API streams real-time orderbook updates and trades — no polling required.

import json
import websockets

async def stream_orderbook(token_ids: list[str]):
    """Stream real-time orderbook updates via WebSocket."""
    uri = "wss://ws-subscriptions-clob.polymarket.com/ws/market"
    async with websockets.connect(uri) as ws:
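        # Market-channel subscription payload. Field names follow Polymarket's
        # WebSocket docs at the time of writing; verify them before relying on this.
        subscribe = {
            "type": "market",
            "assets_ids": token_ids
        }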
        await ws.send(json.dumps(subscribe))

        async for message in ws:
            data = json.loads(message)
            yield data

As of the January 2026 changelog update, the 100 token subscription limit has been removed from the Markets channel — you can subscribe to as many token IDs as your agent needs.

Agent pattern: Use WebSocket for real-time data (orderbooks, trades, price changes) and REST only for one-time lookups (market metadata, position snapshots, order placement).


Rate Limits vs Kalshi

For agents that trade across both platforms, here’s how the limits compare:

Metric          | Polymarket                           | Kalshi
----------------|--------------------------------------|----------------------------
Enforcement     | Cloudflare throttle (queue then 429) | Hard reject (immediate 429)
General limit   | 15,000 / 10s                         | Varies by endpoint
Order placement | 3,500 / 10s burst                    | Lower throughput
Authentication  | EIP-712 + HMAC (L1/L2)               | RSA-PSS or Bearer token
WebSocket       | Unlimited token subscriptions        | Per-connection limits
Upgrade path    | Builder Program tiers                | Contact sales

Polymarket’s throttle-first approach is more forgiving for agents — your requests degrade gracefully instead of failing hard. See the Prediction Market API Reference for the full cross-platform comparison.


Frequently Asked Questions

What is the Polymarket API rate limit?

Polymarket enforces rate limits via Cloudflare throttling with a general cap of 15,000 requests per 10 seconds. Specific endpoints have lower limits — CLOB general is 9,000/10s, Gamma API is 4,000/10s, and Data API is 1,000/10s. Trading endpoints like POST /order have dual-tier limits: 3,500/10s burst and 36,000 per 10 minutes sustained. See the full rate limit tables above for every endpoint.

What happens when you hit a Polymarket rate limit?

Polymarket uses Cloudflare throttling, which means requests over the limit are delayed and queued rather than immediately rejected with a 429 error. This is different from hard rate limiting — your requests slow down before they fail. If throttling is insufficient, you receive HTTP 429 Too Many Requests. See How Polymarket Rate Limiting Works for the full breakdown.

How do you handle Polymarket 429 errors in Python?

Implement exponential backoff with jitter using the tenacity library or a custom retry decorator. Start with a 1-second delay, double on each retry up to 60 seconds, and add random jitter to prevent thundering herd. The py_clob_client SDK does not handle rate limiting automatically — you need to implement retry logic yourself. See Handling 429 Errors for production-ready code.

What are Polymarket burst vs sustained rate limits?

Trading endpoints have two limit tiers. Burst limits allow short spikes over 10-second windows (e.g., 3,500 POST /order requests in 10 seconds). Sustained limits enforce a lower average over 10-minute windows (e.g., 36,000 POST /order in 10 minutes, averaging 60/s). Both limits apply simultaneously — you can spike briefly but must stay under the sustained average. See the CLOB Trading Endpoints table for all trading limits.

Does the Polymarket Builder Program increase rate limits?

Yes. The Builder Program has three tiers: Unverified (default, no approval required), Verified (manual approval, higher throughput), and Partner (enterprise tier). Gasless trading is available from the Unverified tier; higher tiers add increased rate limits, a higher daily Relayer limit, weekly rewards, and engineering support. Contact [email protected] to upgrade. See Builder Program Tiers for details.



This guide is maintained by AgentBets.ai. Found an error or API change we missed? Let us know on Twitter.

Not financial advice. Built for builders.