Polymarket Rate Limits Guide: Every Endpoint, Burst Rule & Retry Strategy (March 2026)

By Rahim · February 28, 2026 · May 25, 2026 · 14 min read

Complete Polymarket API rate limit reference for March 2026. Per-endpoint tables for CLOB, Gamma, Data API, and trading endpoints with burst vs sustained limits, 429 error handling, and production retry code.

Polymarket enforces its API rate limits via Cloudflare throttling: requests over the limit are queued and delayed first, and are rejected with HTTP 429 only if you stay over the limit. This guide covers every endpoint’s limit as of March 2026, the burst vs sustained distinction for trading endpoints, the Builder Program tiers, and production-ready retry code for autonomous agents.

How Polymarket Rate Limiting Works

Polymarket’s rate limiting is throttle-based, not reject-based. When you exceed the configured rate for any endpoint, Cloudflare queues your requests and introduces latency rather than immediately returning HTTP 429. This is an important distinction for agent builders:

Traditional rate limiting:   Request → 429 → Retry → Success
Polymarket throttling:       Request → Queued → Delayed response → Success
Polymarket hard limit:       Request → Queued → Still over limit → 429

Three things to know:

Throttling comes first. Your requests slow down before they fail. If your agent suddenly sees response times spike from 50ms to 500ms, you’re being throttled.
Burst allowances exist. Trading endpoints allow short spikes above the sustained rate — you can fire off a burst of orders and settle into a lower average.
Sliding windows. All limits use sliding time windows (10 seconds or 10 minutes), not fixed calendar windows. There’s no “reset at the top of the minute.”
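
To make the sliding-window point concrete, here is a minimal client-side sketch of a sliding-window counter. It illustrates the general technique only; it is not Polymarket's or Cloudflare's actual implementation, and the class name `SlidingWindowCounter` and the toy limit of 3 requests per 10 seconds are hypothetical.

```python
from collections import deque


class SlidingWindowCounter:
    """Counts requests in the trailing `window_s` seconds (hypothetical helper)."""

    def __init__(self, limit: int, window_s: float):
        self.limit = limit
        self.window_s = window_s
        self._stamps = deque()  # timestamps of accepted requests, oldest first

    def _evict(self, now: float) -> None:
        # Drop timestamps that have slid out of the trailing window.
        while self._stamps and now - self._stamps[0] >= self.window_s:
            self._stamps.popleft()

    def try_acquire(self, now: float) -> bool:
        """Record a request at time `now` if the window still has room."""
        self._evict(now)
        if len(self._stamps) >= self.limit:
            return False
        self._stamps.append(now)
        return True


# Toy numbers: 3 requests per 10-second window.
counter = SlidingWindowCounter(limit=3, window_s=10)
results = [counter.try_acquire(t) for t in (0.0, 1.0, 2.0, 9.0, 10.5, 11.5)]
print(results)  # → [True, True, True, False, True, True]
```

The request at t=9.0 is refused because three requests still sit inside the trailing 10 seconds. Capacity then returns one slot at a time as each earlier request ages out, rather than all at once at a fixed boundary.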

Rate Limit Headers

Every response from the Polymarket CLOB API includes rate limit headers. Monitor these to stay ahead of 429 errors.

X-RateLimit-Limit

The maximum number of requests allowed in the current time window.

X-RateLimit-Remaining

How many requests you have left in the current window. When this reaches 0, subsequent requests are throttled and, if you keep sending, rejected with 429.

X-RateLimit-Reset

Unix timestamp (in seconds) when the rate limit window resets and your allowance is restored.

Reading rate limit headers in Python:

import requests

response = requests.get(
    "https://clob.polymarket.com/price",
    params={"token_id": "<token-id>", "side": "BUY"}
)

limit = response.headers.get("X-RateLimit-Limit")
remaining = response.headers.get("X-RateLimit-Remaining")
reset_at = response.headers.get("X-RateLimit-Reset")

print(f"Limit: {limit}")
print(f"Remaining: {remaining}")
print(f"Resets at: {reset_at}")

Proactive throttling: Rather than waiting for a 429, check X-RateLimit-Remaining after each request and slow down when it gets low. A simple rule: if remaining is below 20% of the limit, add a small delay before the next request.
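
One way to turn the 20% rule into code is sketched below. The function name `proactive_delay`, the `low_water` parameter, and the choice to spread the remaining allowance evenly over the rest of the window are all assumptions made for illustration; it relies only on the three headers described above and returns 0 if they are absent.

```python
import time
from typing import Mapping, Optional


def proactive_delay(
    headers: Mapping[str, str],
    low_water: float = 0.2,
    now: Optional[float] = None,
) -> float:
    """Return seconds to pause before the next request (0.0 if none needed)."""
    try:
        limit = int(headers["X-RateLimit-Limit"])
        remaining = int(headers["X-RateLimit-Remaining"])
        reset_at = float(headers["X-RateLimit-Reset"])
    except (KeyError, TypeError, ValueError):
        return 0.0  # headers missing or malformed: nothing to act on

    # Plenty of allowance left: no delay.
    if limit <= 0 or remaining >= low_water * limit:
        return 0.0

    now = time.time() if now is None else now
    seconds_left = max(reset_at - now, 0.0)
    # Spread the remaining allowance evenly over the rest of the window.
    return seconds_left / max(remaining, 1)


# Example with made-up header values: 100 of 1,500 left, window resets in 8 s.
example = {
    "X-RateLimit-Limit": "1500",
    "X-RateLimit-Remaining": "100",
    "X-RateLimit-Reset": "1008",
}
delay = proactive_delay(example, now=1000.0)
print(f"{delay:.2f}")  # → 0.08
```

In an agent loop you would pass `response.headers` from the previous call and sleep for the returned value before sending the next request.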

Complete Rate Limit Tables (March 2026)

These tables reflect the current limits from Polymarket’s official documentation.

General Rate Limits

Endpoint	Limit	Window
General (all APIs)	15,000 req	10 seconds
Health check (“OK”)	100 req	10 seconds

The 15,000/10s general limit is the outer boundary. Individual API sections have their own lower limits that apply first.

CLOB API Rate Limits

The CLOB (Central Limit Order Book) is where all trading happens. These are the limits that matter most for autonomous agents.

General CLOB endpoints:

Endpoint	Limit	Window
CLOB (general)	9,000 req	10 seconds
GET balance-allowance	200 req	10 seconds
UPDATE balance-allowance	50 req	10 seconds

Market data endpoints:

Endpoint	Limit	Window
GET `/book` (single)	1,500 req	10 seconds
POST `/books` (batch)	500 req	10 seconds
GET `/price` (single)	1,500 req	10 seconds
POST `/prices` (batch)	500 req	10 seconds
GET `/midprice` (single)	1,500 req	10 seconds
POST `/midprices` (batch)	500 req	10 seconds

Ledger endpoints:

Endpoint	Limit	Window
`/trades`, `/orders`, `/notifications`, `/order`	900 req	10 seconds
`/data/orders`	500 req	10 seconds
`/data/trades`	500 req	10 seconds
`/notifications`	125 req	10 seconds

Price history & market info:

Endpoint	Limit	Window
Price history	1,000 req	10 seconds
Market tick size	200 req	10 seconds

Authentication:

Endpoint	Limit	Window
API key operations	100 req	10 seconds

CLOB Trading Endpoints (Burst + Sustained)

Trading endpoints are the only ones with dual-tier enforcement. Both limits apply simultaneously.

Endpoint	Burst Limit (10s)	Sustained Limit (10min)	Effective Avg
POST `/order`	3,500 (350/s)	36,000 (60/s)	60/s
DELETE `/order`	3,000 (300/s)	30,000 (50/s)	50/s
POST `/orders` (batch)	1,000 (100/s)	15,000 (25/s)	25/s
DELETE `/orders` (batch)	1,000 (100/s)	15,000 (25/s)	25/s
DELETE `/cancel-all`	250 (25/s)	6,000 (10/s)	10/s
DELETE `/cancel-market-orders`	1,000 (100/s)	1,500 (2.5/s)	2.5/s

How to read this table: Your agent can burst to 3,500 order placements in a 10-second window (useful for entering multiple positions quickly), but over a 10-minute window, you’re limited to 36,000 total — an average of 60 per second. If you burn your burst budget, you need to slow down or you’ll hit the sustained limit.

The /cancel-market-orders endpoint has a notably tight sustained limit (1,500/10min) compared to its burst allowance. If your agent needs to cancel orders across many markets frequently, use the batch /orders delete endpoint instead.
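
The interaction between the two tiers is easier to see as arithmetic. The sketch below takes the burst and sustained figures from the trading table and computes how long an agent could run flat out at the burst rate before the 10-minute budget is gone. It assumes the agent starts with an empty 10-minute window and sends at exactly the burst rate; the function name `seconds_at_full_burst` is hypothetical.

```python
# (burst limit per 10 s, sustained limit per 10 min), from the trading table
TRADING_LIMITS = {
    "POST /order": (3_500, 36_000),
    "DELETE /order": (3_000, 30_000),
    "POST /orders (batch)": (1_000, 15_000),
    "DELETE /orders (batch)": (1_000, 15_000),
    "DELETE /cancel-all": (250, 6_000),
    "DELETE /cancel-market-orders": (1_000, 1_500),
}

BURST_WINDOW_S = 10
SUSTAINED_WINDOW_S = 600


def seconds_at_full_burst(burst_limit: int, sustained_limit: int) -> float:
    """Seconds of flat-out bursting before the 10-minute budget is used up."""
    burst_rate = burst_limit / BURST_WINDOW_S  # requests per second
    return min(sustained_limit / burst_rate, SUSTAINED_WINDOW_S)


for endpoint, (burst, sustained) in TRADING_LIMITS.items():
    avg = sustained / SUSTAINED_WINDOW_S
    runway = seconds_at_full_burst(burst, sustained)
    print(f"{endpoint:30s} avg {avg:5.1f}/s  burst runway {runway:6.1f}s")
```

Under these assumptions POST /order can run at the full burst rate for roughly 103 seconds, while /cancel-market-orders uses up its entire 10-minute budget in about 15 seconds.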

Gamma API Rate Limits

The Gamma API provides market metadata, events, tags, and search. These are read-only endpoints agents use for market discovery.

Endpoint	Limit	Window
Gamma (general)	4,000 req	10 seconds
GET `/events`	500 req	10 seconds
GET `/markets`	300 req	10 seconds
GET `/markets` + `/events` listing	900 req	10 seconds
GET comments	200 req	10 seconds
Tags	200 req	10 seconds
Search	350 req	10 seconds

Data API Rate Limits

The Data API covers trades, positions, and analytics data.

Endpoint	Limit	Window
Data API (general)	1,000 req	10 seconds
`/trades`	200 req	10 seconds
`/positions`	150 req	10 seconds
`/closed-positions`	150 req	10 seconds
Health check (“OK”)	100 req	10 seconds

Other API Rate Limits

Endpoint	Limit	Window
Relayer `/submit`	25 req	1 minute
User PNL API	200 req	10 seconds

The Relayer has the tightest limit on the platform. If your agent uses gasless trading via the Builder Program, plan your submission cadence carefully — 25 per minute is about one every 2.4 seconds.
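
A simple way to hold that cadence is a pacer that enforces a minimum gap between submissions. This is a minimal sketch, not part of any Polymarket SDK: the `Pacer` class and its injectable `clock` and `sleep` parameters are hypothetical, and you would call `wait()` immediately before each Relayer submission.

```python
import time
from typing import Callable


class Pacer:
    """Enforces a minimum gap between calls (hypothetical helper)."""

    def __init__(
        self,
        max_calls: int,
        per_seconds: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ):
        self.min_gap = per_seconds / max_calls
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = clock()

    def wait(self) -> float:
        """Block until the next call is allowed; return the seconds slept."""
        now = self._clock()
        pause = max(self._next_allowed - now, 0.0)
        if pause:
            self._sleep(pause)
        # Schedule the following slot one gap after this call's slot.
        self._next_allowed = max(now, self._next_allowed) + self.min_gap
        return pause


# 25 submissions per 60 seconds works out to a 2.4-second gap.
relayer_pacer = Pacer(max_calls=25, per_seconds=60)
print(relayer_pacer.min_gap)
```

Even spacing gives up the ability to burst, but it keeps submissions within the 25-per-minute allowance without any per-window bookkeeping.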

Builder Program Tiers

The Polymarket Builder Program uses a tiered system that directly affects rate limits. Higher tiers unlock more throughput.

Tier	Approval	Rate Limits	Extras
Unverified	None — start immediately	Default limits (tables above)	Gasless trading, gas-subsidized Relayer transactions (daily limit)
Verified	Manual approval required	Increased limits over Unverified	Higher daily Relayer limit, weekly rewards, engineering support
Partner	Enterprise application	Highest limits	Revenue sharing, marketing promotion, priority access

To upgrade: Email builder@polymarket.com with your Builder API Key, use case, expected volume, and relevant links (app, docs, X profile).

If you’re building an autonomous agent that trades consistently, getting to Verified tier should be a priority. The default Unverified limits are generous for development and testing, but production agents monitoring multiple markets will hit them.

Important: If you only need more Relayer transaction capacity for your own wallet (not routing orders for others), you can get unlimited daily Relayer transactions by obtaining a Relayer API key without upgrading tiers.

For detailed Relayer Client setup including SDK packages and BuilderConfig code examples, see the Polymarket API Guide — Builder Program.

Handling 429 Errors

When throttling isn’t enough to keep you within limits, Polymarket returns HTTP 429:

{
  "error": "Too Many Requests"
}

Exponential Backoff with Jitter (Python)

This is the pattern every production agent should implement. The tenacity library handles this cleanly with py-clob-client-v2:

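from tenacity import (
    retry,
    retry_if_exception,
    stop_after_attempt,
    wait_exponential_jitter,
)
from py_clob_client_v2 import ClobClient

client = ClobClient(
    host="https://clob.polymarket.com",
    chain_id=137,  # Polygon mainnet
    key="<your-private-key>",
)
client.set_api_creds(client.create_or_derive_api_key())


def is_rate_limited(exc: BaseException) -> bool:
    """True for HTTP 429 errors. Assumes the SDK exception exposes a
    status_code attribute or includes "429" in its message."""
    return getattr(exc, "status_code", None) == 429 or "429" in str(exc)


@retry(
    retry=retry_if_exception(is_rate_limited),  # other errors surface immediately
    stop=stop_after_attempt(5),
    wait=wait_exponential_jitter(
        initial=1,      # Start at 1 second
        max=60,         # Cap at 60 seconds
        jitter=2        # Add up to 2s random jitter
    ),
    reraise=True,       # Re-raise the original error once attempts run out
)
def get_orderbook_safe(token_id: str):
    """Fetch orderbook with automatic retry on throttle."""
    return client.get_order_book(token_id)


@retry(
    retry=retry_if_exception(is_rate_limited),  # never blindly re-send an order
    stop=stop_after_attempt(3),
    wait=wait_exponential_jitter(initial=2, max=30, jitter=1),
    reraise=True,
)
def place_order_safe(order):
    """Place order with retry: fewer attempts, longer initial wait."""
    return client.post_order(order)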

Why jitter matters: Without jitter, if 10 agents all hit 429 at the same time, they all retry at the same time and hit 429 again. Jitter spreads retries across a random window, breaking the thundering herd pattern.
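
It also helps to know how long a retry policy can stall your agent in the worst case. The sketch below reproduces the delay formula that `wait_exponential_jitter` uses as I understand it, min(initial × 2^n + U(0, jitter), max), and sums the delays with the jitter pinned at its maximum. The helper names are hypothetical, and the totals exclude the time the requests themselves take.

```python
import random


def backoff_delays(attempts, initial, cap, jitter, rng=random.random):
    """Delays slept between `attempts` tries: min(initial * 2**n + U(0, jitter), cap)."""
    return [
        min(initial * 2**n + rng() * jitter, cap)
        for n in range(attempts - 1)  # no sleep after the final attempt
    ]


def worst_case_wait(attempts, initial, cap, jitter):
    """Upper bound on total sleep time, taking the maximum jitter every time."""
    return sum(backoff_delays(attempts, initial, cap, jitter, rng=lambda: 1.0))


# Same parameters as the two retry policies used for reads and order placement.
print(worst_case_wait(attempts=5, initial=1, cap=60, jitter=2))  # → 23.0
print(worst_case_wait(attempts=3, initial=2, cap=30, jitter=1))  # → 8.0
```

With these settings a throttled order book read can sleep for up to 23 seconds in total and a throttled order placement for up to 8 seconds, which is worth keeping in mind for price-sensitive orders.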

Manual Retry (No Dependencies)

If you don’t want to use tenacity:

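import time
import random

def retry_with_backoff(fn, max_retries=5, base_delay=1.0):
    """Execute fn, retrying with exponential backoff + jitter on 429s."""
    last_error = None
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception as e:
            # Only rate-limit errors are retried; anything else propagates.
            if "429" not in str(e) and "Too Many Requests" not in str(e):
                raise
            last_error = e
            if attempt == max_retries - 1:
                break  # out of attempts: skip the pointless final sleep
            delay = min(base_delay * (2 ** attempt), 60)
            jitter = random.uniform(0, delay * 0.5)
            time.sleep(delay + jitter)
    # Chain the last 429 so the original traceback is preserved.
    raise RuntimeError(f"Failed after {max_retries} attempts") from last_error

# Usage (client is the ClobClient created in the previous example)
book = retry_with_backoff(
    lambda: client.get_order_book("TOKEN_ID_HERE")
)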

Detecting Throttling Before 429

Smart agents detect throttling before hitting hard limits. Monitor your response times:

import time
from collections import deque

class ThrottleDetector:
    def __init__(self, baseline_ms=100, threshold_multiplier=3):
        self.baseline_ms = baseline_ms
        self.threshold = baseline_ms * threshold_multiplier
        # Keep only the 20 most recent samples; deque drops the oldest itself.
        self.recent_latencies = deque(maxlen=20)

    def record(self, latency_ms: float):
        self.recent_latencies.append(latency_ms)

    def _recent_avg(self) -> float:
        """Mean of the last (up to) 5 latencies."""
        recent = list(self.recent_latencies)[-5:]
        return sum(recent) / len(recent)

    @property
    def is_throttled(self) -> bool:
        if len(self.recent_latencies) < 3:
            return False
        return self._recent_avg() > self.threshold

    def suggested_delay(self) -> float:
        """Return seconds to wait before next request."""
        if not self.is_throttled:
            return 0.0
        # Uses the real sample count, so 3 or 4 samples are not averaged as 5.
        return min((self._recent_avg() / self.baseline_ms) * 0.5, 10.0)

# Usage in your agent loop (client and token_id come from your own setup)
detector = ThrottleDetector(baseline_ms=80)

start = time.perf_counter()  # monotonic clock, unaffected by system time changes
book = client.get_order_book(token_id)
latency = (time.perf_counter() - start) * 1000

detector.record(latency)
if detector.is_throttled:
    time.sleep(detector.suggested_delay())

Rate Limit Budgeting for Agents

An autonomous agent typically needs to:

Scan markets — Gamma API calls to find opportunities
Check prices — CLOB orderbook/price queries
Check positions — Data API position tracking
Execute trades — CLOB order placement/cancellation

Here’s a rate budget for an agent monitoring 50 markets:

┌─────────────────────────────────────────────────┐
│  AGENT RATE BUDGET (50 markets, per 10 seconds) │
├─────────────────────┬───────────┬───────────────┤
│  Task               │  Requests │  Limit        │
├─────────────────────┼───────────┼───────────────┤
│  Gamma market scan  │  50       │  300/10s      │
│  Orderbook checks   │  50       │  1,500/10s    │
│  Price checks       │  50       │  1,500/10s    │
│  Position tracking  │  10       │  150/10s      │
│  Order placement    │  5        │  3,500/10s    │
│  Order cancellation │  5        │  3,000/10s    │
├─────────────────────┼───────────┼───────────────┤
│  TOTAL              │  170      │  9,000/10s    │
└─────────────────────┴───────────┴───────────────┘

At 170 requests per 10-second cycle, this agent uses under 2% of the CLOB general limit, even if every request is counted against it (170 / 9,000 ≈ 1.9%). You have significant headroom: the constraint is usually on specific endpoints, not the general cap.
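
The per-endpoint picture can be checked with a few lines of arithmetic. This sketch copies the request counts and limits from the budget table and reports how much of each endpoint's limit the agent consumes; the names `BUDGET` and `utilization` are hypothetical.

```python
# (requests per 10 s cycle, endpoint limit per 10 s), from the budget table
BUDGET = {
    "Gamma market scan": (50, 300),
    "Orderbook checks": (50, 1_500),
    "Price checks": (50, 1_500),
    "Position tracking": (10, 150),
    "Order placement": (5, 3_500),
    "Order cancellation": (5, 3_000),
}


def utilization(budget):
    """Map each task to the fraction of its endpoint limit it consumes."""
    return {task: used / limit for task, (used, limit) in budget.items()}


usage = utilization(BUDGET)
tightest = max(usage, key=usage.get)
total = sum(used for used, _ in BUDGET.values())

for task, frac in usage.items():
    print(f"{task:20s} {frac:6.1%}")
print(f"Total requests per cycle: {total}")
print(f"Tightest endpoint: {tightest} ({usage[tightest]:.1%})")
```

The Gamma market scan comes out as the tightest line at about 17% of its 300/10s limit, well above the share of any CLOB endpoint, which is why per-endpoint limits deserve more attention than the general cap.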

Optimization: Use batch endpoints. Instead of 50 individual GET /book calls, use a single POST /books call. This counts as 1 request against the batch limit (500/10s) instead of 50 against the single limit (1,500/10s).

# Bad: 50 requests against GET /book (1,500/10s limit)
for token_id in token_ids:
    book = client.get_order_book(token_id)

# Good: 1 request against POST /books (500/10s limit)
books = client.get_order_books(token_ids)
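
If the batch endpoint caps how many tokens a single request may carry (this guide does not state a cap), the list has to be split into chunks first. The sketch below shows one way to do that; `chunked` is a hypothetical helper and `BATCH_SIZE = 20` is a placeholder, not a documented limit.

```python
from typing import Iterator, List, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


# Hypothetical cap; check the API docs for the real per-request maximum.
BATCH_SIZE = 20

token_ids = [f"token-{i}" for i in range(50)]  # placeholder IDs
batches = list(chunked(token_ids, BATCH_SIZE))
print([len(b) for b in batches])  # → [20, 20, 10]
```

Each batch would then be passed to `client.get_order_books(batch)`, so 50 tokens cost three batch requests instead of 50 single ones.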

Use WebSocket Instead of Polling

The single best way to reduce rate limit pressure is to stop polling. Polymarket’s WebSocket API streams real-time orderbook updates and trades — no polling required.

import json
import websockets

async def stream_orderbook(token_ids: list[str]):
    """Stream real-time orderbook updates via WebSocket."""
    uri = "wss://ws-subscriptions-clob.polymarket.com/ws/market"
    async with websockets.connect(uri) as ws:
        # Market-channel subscription: the key is plural ("assets_ids") and
        # "type" names the channel. Verify against the current WSS docs.
        subscribe = {
            "type": "market",
            "assets_ids": token_ids
        }
        await ws.send(json.dumps(subscribe))

        # Yield each decoded message to the caller as it arrives.
        async for message in ws:
            data = json.loads(message)
            yield data

As of the January 2026 changelog update, the 100 token subscription limit has been removed from the Markets channel — you can subscribe to as many token IDs as your agent needs.

Agent pattern: Use WebSocket for real-time data (orderbooks, trades, price changes) and REST only for one-time lookups (market metadata, position snapshots, order placement).

Rate Limits vs Kalshi

For agents that trade across both platforms, here’s how the limits compare:

Metric	Polymarket	Kalshi
Enforcement	Cloudflare throttle (queue then 429)	Hard reject (immediate 429)
General limit	15,000 / 10s	Varies by endpoint
Order placement	3,500 / 10s burst	Lower throughput
Authentication	EIP-712 + HMAC (L1/L2)	RSA-PSS or Bearer token
WebSocket	Unlimited token subscriptions	Per-connection limits
Upgrade path	Builder Program tiers	Contact sales

Polymarket’s throttle-first approach is more forgiving for agents — your requests degrade gracefully instead of failing hard. See the Prediction Market API Reference for the full cross-platform comparison.

Frequently Asked Questions

What is the Polymarket API rate limit?

Polymarket enforces rate limits via Cloudflare throttling with a general cap of 15,000 requests per 10 seconds. Specific endpoints have lower limits — CLOB general is 9,000/10s, Gamma API is 4,000/10s, and Data API is 1,000/10s. Trading endpoints like POST /order have dual-tier limits: 3,500/10s burst and 36,000 per 10 minutes sustained. See the full rate limit tables above for every endpoint.

What happens when you hit a Polymarket rate limit?

Polymarket uses Cloudflare throttling, which means requests over the limit are delayed and queued rather than immediately rejected with a 429 error. This is different from hard rate limiting — your requests slow down before they fail. If throttling is insufficient, you receive HTTP 429 Too Many Requests. See How Polymarket Rate Limiting Works for the full breakdown.

How do you handle Polymarket 429 errors in Python?

Implement exponential backoff with jitter using the tenacity library or a custom retry decorator. Start with a 1-second delay, double on each retry up to 60 seconds, and add random jitter to prevent thundering herd. The py_clob_client SDK does not handle rate limiting automatically — you need to implement retry logic yourself. See Handling 429 Errors for production-ready code.

What are Polymarket burst vs sustained rate limits?

Trading endpoints have two limit tiers. Burst limits allow short spikes over 10-second windows (e.g., 3,500 POST /order requests in 10 seconds). Sustained limits enforce a lower average over 10-minute windows (e.g., 36,000 POST /order in 10 minutes, averaging 60/s). Both limits apply simultaneously — you can spike briefly but must stay under the sustained average. See the CLOB Trading Endpoints table for all trading limits.

Does the Polymarket Builder Program increase rate limits?

Yes. The Builder Program has three tiers: Unverified (default, no approval required), Verified (manual approval, higher throughput), and Partner (enterprise tier). Higher tiers unlock increased rate limits, gasless trading via Safe/Proxy wallets, weekly rewards, and priority support. Contact builder@polymarket.com to upgrade. See Builder Program Tiers for details.

What’s Next

py_clob_client Reference — Every SDK method with parameters, return types, and examples
Polymarket WebSocket & Orderbook Guide — Real-time streaming to eliminate polling
Polymarket Trading Bot Quickstart — From market scanning to production deployment
Prediction Market API Reference — Side-by-side Polymarket vs Kalshi vs unified APIs
Security Best Practices — Protect API keys and wallet credentials
Agent Betting Stack — The full four-layer framework for autonomous agents
Polymarket API: The Complete Developer Guide — Full API reference with all endpoints
Polymarket Auth Troubleshooting — Fix POLY_* header and signature errors
Kalshi API Guide — Kalshi-specific rate limit handling
Offshore Sportsbook API Guide — How data access works when there are no rate limits (or no API)

This guide is maintained by AgentBets.ai. Found an error or API change we missed? Let us know on Twitter.

Not financial advice. Built for builders.
