Polymarket enforces its API rate limits through Cloudflare throttling: requests over the limit are queued and delayed first, and rejected with HTTP 429 only if they stay over the limit. This guide covers every endpoint’s limit as of March 2026, the burst vs sustained distinction for trading endpoints, the Builder Program tiers, and production-ready retry code for autonomous agents.
How Polymarket Rate Limiting Works
Polymarket’s rate limiting is throttle-based, not reject-based. When you exceed the configured rate for any endpoint, Cloudflare queues your requests and introduces latency rather than immediately returning HTTP 429. This is an important distinction for agent builders:
Traditional rate limiting: Request → 429 → Retry → Success
Polymarket throttling: Request → Queued → Delayed response → Success
Polymarket hard limit: Request → Queued → Still over limit → 429
Three things to know:
- Throttling comes first. Your requests slow down before they fail. If your agent suddenly sees response times spike from 50ms to 500ms, you’re being throttled.
- Burst allowances exist. Trading endpoints allow short spikes above the sustained rate — you can fire off a burst of orders and settle into a lower average.
- Sliding windows. All limits use sliding time windows (10 seconds for most endpoints, 10 minutes for sustained trading limits, 1 minute for the Relayer), not fixed calendar windows. There’s no “reset at the top of the minute.”
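To make the sliding-window idea concrete, here is a minimal client-side sketch. It is not Polymarket's server-side implementation; the class name `SlidingWindowLimiter` and its methods are hypothetical helpers that only mirror the "at most N requests in any window of W seconds" rule locally.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Client-side limiter: at most `max_requests` in any `window_s` span."""

    def __init__(self, max_requests: int, window_s: float, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_s = window_s
        self.clock = clock
        self._sent = deque()  # timestamps of requests still inside the window

    def wait_time(self) -> float:
        """Seconds to wait before the next request fits in the window."""
        now = self.clock()
        # Drop timestamps that have slid out of the window.
        while self._sent and now - self._sent[0] >= self.window_s:
            self._sent.popleft()
        if len(self._sent) < self.max_requests:
            return 0.0
        # The oldest request leaves the window at sent[0] + window_s.
        return self._sent[0] + self.window_s - now

    def record(self) -> None:
        """Call right after sending a request."""
        self._sent.append(self.clock())
```

In an agent loop you would sleep for `wait_time()` before each call and `record()` after it. Setting `max_requests` somewhat below the published limit leaves a margin for clock skew between your agent and the server.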
Rate Limit Headers
Every response from the Polymarket CLOB API includes rate limit headers. Monitor these to stay ahead of 429 errors.
X-RateLimit-Limit
The maximum number of requests allowed in the current time window.
X-RateLimit-Remaining
How many requests you have left in the current window. When this reaches 0, further requests are throttled and, if you stay over the limit, return 429.
X-RateLimit-Reset
Unix timestamp (in seconds) when the rate limit window resets and your allowance is restored.
Reading rate limit headers in Python:
import requests

response = requests.get(
    "https://clob.polymarket.com/price",
    params={"token_id": "<token-id>", "side": "BUY"},
    timeout=10,  # avoid hanging indefinitely while requests are queued
)

# .get() returns None when a header is absent, so this is safe either way
limit = response.headers.get("X-RateLimit-Limit")
remaining = response.headers.get("X-RateLimit-Remaining")
reset_at = response.headers.get("X-RateLimit-Reset")

print(f"Limit: {limit}")
print(f"Remaining: {remaining}")
print(f"Resets at: {reset_at}")
Proactive throttling: Rather than waiting for a 429, check X-RateLimit-Remaining after each request and slow down when it gets low. A simple rule: if remaining is below 20% of the limit, add a small delay before the next request.
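The 20% rule can be sketched as a small helper. The function name `delay_from_headers` and the "spread the remaining requests evenly over the rest of the window" policy are assumptions chosen for illustration; the only inputs taken from this guide are the three header names and the 20% threshold.

```python
import time


def delay_from_headers(headers, low_water=0.2, now=None) -> float:
    """Suggest a pause (seconds) based on X-RateLimit-* headers.

    Returns 0.0 when headers are missing or plenty of budget remains.
    """
    try:
        limit = int(headers["X-RateLimit-Limit"])
        remaining = int(headers["X-RateLimit-Remaining"])
        reset_at = float(headers["X-RateLimit-Reset"])
    except (KeyError, TypeError, ValueError):
        return 0.0  # headers absent or malformed: nothing to act on
    if limit <= 0 or remaining >= low_water * limit:
        return 0.0
    now = time.time() if now is None else now
    seconds_left = max(reset_at - now, 0.0)
    # Spread the remaining requests evenly over the rest of the window.
    return seconds_left / max(remaining, 1)
```

Call it as `time.sleep(delay_from_headers(response.headers))` after each request. Because it returns 0.0 when the headers are missing, it is safe to leave in place even on endpoints that do not send them.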
Complete Rate Limit Tables (March 2026)
These tables reflect the current limits from Polymarket’s official documentation.
General Rate Limits
| Endpoint | Limit | Window |
|---|---|---|
| General (all APIs) | 15,000 req | 10 seconds |
| Health check (“OK”) | 100 req | 10 seconds |
The 15,000/10s general limit is the outer boundary. Individual API sections have their own lower limits that apply first.
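One way to reason about nested limits, assuming a request counts against every scope it falls under, is to take the minimum. The dictionary below is a hypothetical lookup table filled with figures from this guide's tables, and `effective_limit` is an illustrative helper, not part of any SDK.

```python
# Limits per 10-second window, copied from the tables in this guide.
LIMITS_PER_10S = {
    "general": 15_000,
    "clob": 9_000,
    "GET /book": 1_500,
    "POST /books": 500,
}


def effective_limit(*scopes: str) -> int:
    """Return the tightest limit among the scopes a request falls under."""
    return min(LIMITS_PER_10S[s] for s in scopes)


print(effective_limit("general", "clob", "GET /book"))  # 1500
```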
CLOB API Rate Limits
The CLOB (Central Limit Order Book) is where all trading happens. These are the limits that matter most for autonomous agents.
General CLOB endpoints:
| Endpoint | Limit | Window |
|---|---|---|
| CLOB (general) | 9,000 req | 10 seconds |
| GET balance-allowance | 200 req | 10 seconds |
| UPDATE balance-allowance | 50 req | 10 seconds |
Market data endpoints:
| Endpoint | Limit | Window |
|---|---|---|
| GET /book (single) | 1,500 req | 10 seconds |
| POST /books (batch) | 500 req | 10 seconds |
| GET /price (single) | 1,500 req | 10 seconds |
| POST /prices (batch) | 500 req | 10 seconds |
| GET /midprice (single) | 1,500 req | 10 seconds |
| POST /midprices (batch) | 500 req | 10 seconds |
Ledger endpoints:
| Endpoint | Limit | Window |
|---|---|---|
| /trades, /orders, /notifications, /order | 900 req | 10 seconds |
| /data/orders | 500 req | 10 seconds |
| /data/trades | 500 req | 10 seconds |
| /notifications | 125 req | 10 seconds |
Price history & market info:
| Endpoint | Limit | Window |
|---|---|---|
| Price history | 1,000 req | 10 seconds |
| Market tick size | 200 req | 10 seconds |
Authentication:
| Endpoint | Limit | Window |
|---|---|---|
| API key operations | 100 req | 10 seconds |
CLOB Trading Endpoints (Burst + Sustained)
Trading endpoints are the only ones with dual-tier enforcement. Both limits apply simultaneously.
| Endpoint | Burst Limit (10s) | Sustained Limit (10min) | Effective Avg |
|---|---|---|---|
| POST /order | 3,500 (350/s) | 36,000 (60/s) | 60/s |
| DELETE /order | 3,000 (300/s) | 30,000 (50/s) | 50/s |
| POST /orders (batch) | 1,000 (100/s) | 15,000 (25/s) | 25/s |
| DELETE /orders (batch) | 1,000 (100/s) | 15,000 (25/s) | 25/s |
| DELETE /cancel-all | 250 (25/s) | 6,000 (10/s) | 10/s |
| DELETE /cancel-market-orders | 1,000 (100/s) | 1,500 (2.5/s) | 2.5/s |
How to read this table: Your agent can burst to 3,500 order placements in a 10-second window (useful for entering multiple positions quickly), but over a 10-minute window, you’re limited to 36,000 total — an average of 60 per second. If you burn your burst budget, you need to slow down or you’ll hit the sustained limit.
The /cancel-market-orders endpoint has a notably tight sustained limit (1,500/10min) compared to its burst allowance. If your agent needs to cancel orders across many markets frequently, use the batch /orders delete endpoint instead.
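The interaction between the two tiers is plain arithmetic, sketched below. The function names are hypothetical; the inputs are the burst (per 10 seconds) and sustained (per 10 minutes) figures from the table.

```python
def seconds_at_burst(burst_per_10s: int, sustained_per_10min: int) -> float:
    """How long the burst rate can run before the 10-minute budget is spent."""
    burst_rate = burst_per_10s / 10  # requests per second
    return sustained_per_10min / burst_rate


def sustained_rate(sustained_per_10min: int) -> float:
    """Average requests per second allowed over the 10-minute window."""
    return sustained_per_10min / 600


print(round(seconds_at_burst(3_500, 36_000), 1))  # 102.9 (POST /order)
print(seconds_at_burst(1_000, 1_500))             # 15.0  (cancel-market-orders)
print(sustained_rate(1_500))                      # 2.5
```

At the full POST /order burst rate of 350/s (3,500 per 10 seconds), the 36,000 sustained budget lasts roughly 103 seconds. For /cancel-market-orders the same calculation gives only 15 seconds, which is why that endpoint's sustained limit is the one to watch.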
Gamma API Rate Limits
The Gamma API provides market metadata, events, tags, and search. These are read-only endpoints agents use for market discovery.
| Endpoint | Limit | Window |
|---|---|---|
| Gamma (general) | 4,000 req | 10 seconds |
| GET /events | 500 req | 10 seconds |
| GET /markets | 300 req | 10 seconds |
| GET /markets + /events listing | 900 req | 10 seconds |
| GET comments | 200 req | 10 seconds |
| Tags | 200 req | 10 seconds |
| Search | 350 req | 10 seconds |
Data API Rate Limits
The Data API covers trades, positions, and analytics data.
| Endpoint | Limit | Window |
|---|---|---|
| Data API (general) | 1,000 req | 10 seconds |
| /trades | 200 req | 10 seconds |
| /positions | 150 req | 10 seconds |
| /closed-positions | 150 req | 10 seconds |
| Health check (“OK”) | 100 req | 10 seconds |
Other API Rate Limits
| Endpoint | Limit | Window |
|---|---|---|
| Relayer /submit | 25 req | 1 minute |
| User PNL API | 200 req | 10 seconds |
The Relayer has the tightest limit on the platform. If your agent uses gasless trading via the Builder Program, plan your submission cadence carefully — 25 per minute is about one every 2.4 seconds.
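A simple way to hold that cadence is to space submissions evenly. The `Pacer` class below is a hypothetical helper, not part of any Polymarket SDK; it assumes that evenly spaced calls are acceptable for your strategy and takes injectable `clock` and `sleep` functions so it can be tested without waiting.

```python
import time


class Pacer:
    """Space calls at least `window_s / max_calls` seconds apart."""

    def __init__(self, max_calls: int, window_s: float,
                 clock=time.monotonic, sleep=time.sleep):
        self.interval = window_s / max_calls
        self.clock = clock
        self.sleep = sleep
        self._next_ok = None  # earliest time the next call may go out

    def wait(self) -> float:
        """Block until the next slot; return the seconds slept."""
        now = self.clock()
        slept = 0.0
        if self._next_ok is not None and now < self._next_ok:
            slept = self._next_ok - now
            self.sleep(slept)
            now = self._next_ok
        self._next_ok = now + self.interval
        return slept


relayer_pacer = Pacer(max_calls=25, window_s=60)  # one slot every 2.4 s
```

Call `relayer_pacer.wait()` immediately before each Relayer submission. Even spacing gives up the ability to burst, but it guarantees the agent never has more than 25 submissions inside any 60-second span.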
Builder Program Tiers
The Polymarket Builder Program uses a tiered system that directly affects rate limits. Higher tiers unlock more throughput.
| Tier | Approval | Rate Limits | Extras |
|---|---|---|---|
| Unverified | None — start immediately | Default limits (tables above) | Gasless trading, gas-subsidized Relayer transactions (daily limit) |
| Verified | Manual approval required | Increased limits over Unverified | Higher daily Relayer limit, weekly rewards, engineering support |
| Partner | Enterprise application | Highest limits | Revenue sharing, marketing promotion, priority access |
To upgrade: Email [email protected] with your Builder API Key, use case, expected volume, and relevant links (app, docs, X profile).
If you’re building an autonomous agent that trades consistently, getting to Verified tier should be a priority. The default Unverified limits are generous for development and testing, but production agents monitoring multiple markets will hit them.
Important: If you only need more Relayer transaction capacity for your own wallet (not routing orders for others), you can get unlimited daily Relay transactions by obtaining a Relayer API key without upgrading tiers.
For detailed Relayer Client setup including SDK packages and BuilderConfig code examples, see the Polymarket API Guide — Builder Program.
Handling 429 Errors
When throttling isn’t enough to keep you within limits, Polymarket returns HTTP 429:
{
"error": "Too Many Requests"
}
Exponential Backoff with Jitter (Python)
This is the pattern every production agent should implement. The tenacity library handles this cleanly with py_clob_client:
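from tenacity import (
    retry,
    retry_if_exception,
    stop_after_attempt,
    wait_exponential_jitter,
)
from py_clob_client.client import ClobClient

client = ClobClient(
    "https://clob.polymarket.com",
    key="<your-private-key>",
    chain_id=137,
)
client.set_api_creds(client.create_or_derive_api_creds())


def is_rate_limited(exc: BaseException) -> bool:
    """True for errors that look like HTTP 429.

    Assumes the exception exposes a status_code attribute or mentions 429
    in its message; adjust to the exceptions your client actually raises.
    """
    return getattr(exc, "status_code", None) == 429 or "429" in str(exc)


@retry(
    retry=retry_if_exception(is_rate_limited),  # don't retry auth or validation errors
    stop=stop_after_attempt(5),
    wait=wait_exponential_jitter(
        initial=1,  # Start at 1 second
        max=60,     # Cap at 60 seconds
        jitter=2,   # Add up to 2s random jitter
    ),
    reraise=True,  # surface the original error once attempts run out
)
def get_orderbook_safe(token_id: str):
    """Fetch orderbook with automatic retry on throttle."""
    return client.get_order_book(token_id)


@retry(
    retry=retry_if_exception(is_rate_limited),  # never blindly resubmit a rejected order
    stop=stop_after_attempt(3),
    wait=wait_exponential_jitter(initial=2, max=30, jitter=1),
    reraise=True,
)
def place_order_safe(order):
    """Place order with retry: fewer attempts, longer initial wait."""
    return client.post_order(order)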
Why jitter matters: Without jitter, if 10 agents all hit 429 at the same time, they all retry at the same time and hit 429 again. Jitter spreads retries across a random window, breaking the thundering herd pattern.
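The effect is easy to see in a small simulation. This sketch uses "full jitter" (a uniform draw between 0 and the exponential delay), which is a different variant from the additive jitter tenacity applies; `backoff_delay` is a hypothetical helper written for this illustration.

```python
import random


def backoff_delay(attempt, base=1.0, cap=60.0, jitter=True, rng=None):
    """Delay before retry number `attempt` (0-based), capped at `cap`."""
    delay = min(base * 2 ** attempt, cap)
    if not jitter:
        return delay
    rng = rng or random
    # Full jitter: pick uniformly between 0 and the exponential delay.
    return rng.uniform(0, delay)


rng = random.Random(42)  # seeded so the simulation is repeatable
no_jitter = {backoff_delay(2, jitter=False) for _ in range(10)}
with_jitter = {backoff_delay(2, rng=rng) for _ in range(10)}

print(len(no_jitter))       # 1: all ten agents retry at the same instant
print(len(with_jitter) > 1)  # True: retry times are spread over 0 to 4 s
```

Ten simulated agents on their third attempt all wait exactly 4 seconds without jitter, so they collide again. With jitter, their retries land at different points inside the 4-second range.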
Manual Retry (No Dependencies)
If you don’t want to use tenacity:
import time
import random


def retry_with_backoff(fn, max_retries=5, base_delay=1.0):
    """Execute fn with exponential backoff + jitter on rate-limit errors."""
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception as e:
            # Only retry rate-limit errors; anything else propagates immediately
            if "429" not in str(e) and "Too Many Requests" not in str(e):
                raise
            if attempt == max_retries - 1:
                # Out of attempts: fail now instead of sleeping one last time
                raise RuntimeError(f"Failed after {max_retries} attempts") from e
            delay = min(base_delay * (2 ** attempt), 60)
            jitter = random.uniform(0, delay * 0.5)
            time.sleep(delay + jitter)


# Usage (client is the ClobClient created in the previous example)
book = retry_with_backoff(
    lambda: client.get_order_book("TOKEN_ID_HERE")
)
Detecting Throttling Before 429
Smart agents detect throttling before hitting hard limits. Monitor your response times:
import time
from collections import deque


class ThrottleDetector:
    def __init__(self, baseline_ms=100, threshold_multiplier=3):
        self.baseline_ms = baseline_ms
        self.threshold = baseline_ms * threshold_multiplier
        # Keep only the 20 most recent samples
        self.recent_latencies = deque(maxlen=20)

    def record(self, latency_ms: float):
        self.recent_latencies.append(latency_ms)

    def _recent_avg(self) -> float:
        """Average of the last (up to) 5 samples."""
        recent = list(self.recent_latencies)[-5:]
        return sum(recent) / len(recent)

    @property
    def is_throttled(self) -> bool:
        if len(self.recent_latencies) < 3:
            return False
        return self._recent_avg() > self.threshold

    def suggested_delay(self) -> float:
        """Return seconds to wait before next request."""
        if not self.is_throttled:
            return 0.0
        # Divide by the real sample count, so 3 or 4 samples are not underweighted
        return min((self._recent_avg() / self.baseline_ms) * 0.5, 10.0)


# Usage in your agent loop (client and token_id come from your own setup)
detector = ThrottleDetector(baseline_ms=80)

start = time.perf_counter()  # monotonic timer, better suited to latency than time.time()
book = client.get_order_book(token_id)
latency = (time.perf_counter() - start) * 1000
detector.record(latency)

if detector.is_throttled:
    time.sleep(detector.suggested_delay())
Rate Limit Budgeting for Agents
An autonomous agent typically needs to:
- Scan markets — Gamma API calls to find opportunities
- Check prices — CLOB orderbook/price queries
- Check positions — Data API position tracking
- Execute trades — CLOB order placement/cancellation
Here’s a rate budget for an agent monitoring 50 markets:
┌─────────────────────────────────────────────────┐
│ AGENT RATE BUDGET (50 markets, per 10 seconds) │
├─────────────────────┬───────────┬───────────────┤
│ Task │ Requests │ Limit │
├─────────────────────┼───────────┼───────────────┤
│ Gamma market scan │ 50 │ 300/10s │
│ Orderbook checks │ 50 │ 1,500/10s │
│ Price checks │ 50 │ 1,500/10s │
│ Position tracking │ 10 │ 150/10s │
│ Order placement │ 5 │ 3,500/10s │
│ Order cancellation │ 5 │ 3,000/10s │
├─────────────────────┼───────────┼───────────────┤
│ TOTAL │ 170 │ 9,000/10s │
└─────────────────────┴───────────┴───────────────┘
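At 170 requests per 10-second cycle, this agent uses under 2% of the 9,000/10s CLOB general limit, and only 110 of those requests (orderbook, price, and order calls) actually count against the CLOB. The Gamma and Data API calls draw on their own general caps (4,000/10s and 1,000/10s). You have significant headroom; the constraint is usually on specific endpoints, not the general cap.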
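To find which endpoint is actually the tightest, compare each task's request count with its own limit rather than with the general cap. The sketch below does that with the figures from the budget table; `BUDGET` and `utilization` are hypothetical names used only for this illustration.

```python
# (requests per 10 s cycle, endpoint limit per 10 s), from the budget table.
BUDGET = {
    "Gamma market scan": (50, 300),
    "Orderbook checks": (50, 1_500),
    "Price checks": (50, 1_500),
    "Position tracking": (10, 150),
    "Order placement": (5, 3_500),
    "Order cancellation": (5, 3_000),
}


def utilization(budget: dict) -> dict:
    """Percent of each endpoint limit consumed per cycle."""
    return {task: 100 * used / limit for task, (used, limit) in budget.items()}


usage = utilization(BUDGET)
tightest = max(usage, key=usage.get)
print(tightest, round(usage[tightest], 1))  # Gamma market scan 16.7
```

In this budget the Gamma market scan is the first limit the agent would reach as it scales: at about 17% utilization it caps the agent at roughly 300 markets per cycle, long before the CLOB limits matter.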
Optimization: Use batch endpoints. Instead of 50 individual GET /book calls, use a single POST /books call. This counts as 1 request against the batch limit (500/10s) instead of 50 against the single limit (1,500/10s).
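from py_clob_client.clob_types import BookParams

# Bad: 50 requests against GET /book (1,500/10s limit)
for token_id in token_ids:
    book = client.get_order_book(token_id)

# Good: 1 request against POST /books (500/10s limit)
# get_order_books takes BookParams objects rather than raw token ID strings
books = client.get_order_books([BookParams(token_id=t) for t in token_ids])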
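If your token list is very large, you may want to split it into several batch calls. This guide does not state a maximum batch size, so the size you pass to the hypothetical `chunked` helper below is an assumption to check against the official docs.

```python
def chunked(items: list, size: int):
    """Yield consecutive slices of `items` with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


print(list(chunked(["a", "b", "c", "d", "e"], 2)))
# [['a', 'b'], ['c', 'd'], ['e']]
```

Each slice then becomes one POST /books request, so 500 tokens in slices of 100 would cost 5 requests against the 500/10s batch limit.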
Use WebSocket Instead of Polling
The single best way to reduce rate limit pressure is to stop polling. Polymarket’s WebSocket API streams real-time orderbook updates and trades — no polling required.
import json
import websockets


async def stream_orderbook(token_ids: list[str]):
    """Stream real-time orderbook updates via WebSocket."""
    uri = "wss://ws-subscriptions-clob.polymarket.com/ws/market"
    async with websockets.connect(uri) as ws:
        # Subscription payload as described in Polymarket's market-channel docs
        # (the field name is "assets_ids"); verify against the current docs
        subscribe = {
            "type": "market",
            "assets_ids": token_ids,
        }
        await ws.send(json.dumps(subscribe))
        async for message in ws:
            data = json.loads(message)
            yield data
As of the January 2026 changelog update, the 100 token subscription limit has been removed from the Markets channel — you can subscribe to as many token IDs as your agent needs.
Agent pattern: Use WebSocket for real-time data (orderbooks, trades, price changes) and REST only for one-time lookups (market metadata, position snapshots, order placement).
Rate Limits vs Kalshi
For agents that trade across both platforms, here’s how the limits compare:
| Metric | Polymarket | Kalshi |
|---|---|---|
| Enforcement | Cloudflare throttle (queue then 429) | Hard reject (immediate 429) |
| General limit | 15,000 / 10s | Varies by endpoint |
| Order placement | 3,500 / 10s burst | Lower throughput |
| Authentication | EIP-712 + HMAC (L1/L2) | RSA-PSS or Bearer token |
| WebSocket | Unlimited token subscriptions | Per-connection limits |
| Upgrade path | Builder Program tiers | Contact sales |
Polymarket’s throttle-first approach is more forgiving for agents — your requests degrade gracefully instead of failing hard. See the Prediction Market API Reference for the full cross-platform comparison.
Frequently Asked Questions
What is the Polymarket API rate limit?
Polymarket enforces rate limits via Cloudflare throttling with a general cap of 15,000 requests per 10 seconds. Specific endpoints have lower limits — CLOB general is 9,000/10s, Gamma API is 4,000/10s, and Data API is 1,000/10s. Trading endpoints like POST /order have dual-tier limits: 3,500/10s burst and 36,000 per 10 minutes sustained. See the full rate limit tables above for every endpoint.
What happens when you hit a Polymarket rate limit?
Polymarket uses Cloudflare throttling, which means requests over the limit are delayed and queued rather than immediately rejected with a 429 error. This is different from hard rate limiting — your requests slow down before they fail. If throttling is insufficient, you receive HTTP 429 Too Many Requests. See How Polymarket Rate Limiting Works for the full breakdown.
How do you handle Polymarket 429 errors in Python?
Implement exponential backoff with jitter using the tenacity library or a custom retry decorator. Start with a 1-second delay, double on each retry up to 60 seconds, and add random jitter to prevent thundering herd. The py_clob_client SDK does not handle rate limiting automatically — you need to implement retry logic yourself. See Handling 429 Errors for production-ready code.
What are Polymarket burst vs sustained rate limits?
Trading endpoints have two limit tiers. Burst limits allow short spikes over 10-second windows (e.g., 3,500 POST /order requests in 10 seconds). Sustained limits enforce a lower average over 10-minute windows (e.g., 36,000 POST /order in 10 minutes, averaging 60/s). Both limits apply simultaneously — you can spike briefly but must stay under the sustained average. See the CLOB Trading Endpoints table for all trading limits.
Does the Polymarket Builder Program increase rate limits?
Yes. The Builder Program has three tiers: Unverified (default, no approval required), Verified (manual approval, higher throughput), and Partner (enterprise tier). Every tier includes gasless trading via Safe/Proxy wallets; higher tiers add increased rate limits, a higher daily Relayer limit, weekly rewards, and engineering support. Contact [email protected] to upgrade. See Builder Program Tiers for details.
What’s Next
- py_clob_client Reference — Every SDK method with parameters, return types, and examples
- Polymarket WebSocket & Orderbook Guide — Real-time streaming to eliminate polling
- Polymarket Trading Bot Quickstart — From market scanning to production deployment
- Prediction Market API Reference — Side-by-side Polymarket vs Kalshi vs unified APIs
- Security Best Practices — Protect API keys and wallet credentials
- Agent Betting Stack — The full four-layer framework for autonomous agents
- Polymarket API: The Complete Developer Guide — Full API reference with all endpoints
- Polymarket Auth Troubleshooting — Fix POLY_* header and signature errors
- Kalshi API Guide — Kalshi-specific rate limit handling
- Offshore Sportsbook API Guide — How data access works when there are no rate limits (or no API)
This guide is maintained by AgentBets.ai. Found an error or API change we missed? Let us know on Twitter.
Not financial advice. Built for builders.
