When you’re building a trading bot or agent on Polymarket, rate limits are one of the first production issues you’ll hit. This guide covers how Polymarket’s rate limiting works, what the response headers mean, and how to implement proper retry logic so your bot stays running.

For a general overview of the Polymarket API, see the Polymarket API Guide. For py_clob_client method documentation, see the py_clob_client Reference.


Rate Limit Overview

Polymarket enforces rate limits per API key, with different tiers for different endpoint categories. Exact rate limit numbers are not publicly documented and may change, but the system follows a tiered approach.

| Endpoint Category | Auth Required | Relative Limit | Examples |
| --- | --- | --- | --- |
| Public data (read) | No | Highest | GET /price, GET /midpoint, GET /book |
| Gamma API | No | High | GET /markets, GET /events |
| Authenticated read | Yes (L2) | Medium | GET /orders, GET /trades |
| Trading | Yes (L2) | Lowest | POST /order, DELETE /order |
| Batch trading | Yes (L2) | Lowest | POST /orders (but more efficient per order) |

The key insight: batch endpoints like POST /orders count as one request but can contain up to 15 orders. Using batch operations is the most effective way to maximize throughput within rate limits.


Rate Limit Headers

Every response from the Polymarket CLOB API includes rate limit headers. Monitor these to stay ahead of 429 errors.

X-RateLimit-Limit

The maximum number of requests allowed in the current time window.

X-RateLimit-Remaining

How many requests you have left in the current window. When this reaches 0, subsequent requests will return 429.

X-RateLimit-Reset

Unix timestamp (in seconds) when the rate limit window resets and your allowance is restored.

Reading rate limit headers in Python:

import requests

response = requests.get(
    "https://clob.polymarket.com/price",
    params={"token_id": "<token-id>", "side": "BUY"},
    timeout=10,  # don't hang forever on a stalled connection
)

limit = response.headers.get("X-RateLimit-Limit")
remaining = response.headers.get("X-RateLimit-Remaining")
reset_at = response.headers.get("X-RateLimit-Reset")

print(f"Limit: {limit}")
print(f"Remaining: {remaining}")
print(f"Resets at: {reset_at}")

Proactive throttling: Rather than waiting for a 429, check X-RateLimit-Remaining after each request and slow down when it gets low. A simple rule: if remaining is below 20% of the limit, add a small delay before the next request.
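
The 20% rule can be sketched as a small helper. This is a minimal sketch, not part of any SDK: `throttle_delay` and its `threshold`, `max_delay`, and `now` parameters are hypothetical names, and it assumes the three headers are present and numeric, returning no delay when they are not.

```python
import time


def throttle_delay(headers, threshold=0.2, max_delay=5.0, now=None):
    """Return seconds to pause before the next request (0.0 if no pause is needed).

    headers: any mapping with a .get() method, e.g. response.headers.
    """
    try:
        limit = int(headers.get("X-RateLimit-Limit"))
        remaining = int(headers.get("X-RateLimit-Remaining"))
        reset_at = float(headers.get("X-RateLimit-Reset"))
    except (TypeError, ValueError):
        return 0.0  # headers missing or malformed: nothing to base a delay on

    if limit <= 0 or remaining >= threshold * limit:
        return 0.0  # plenty of allowance left

    now = time.time() if now is None else now
    seconds_left = max(reset_at - now, 0.0)
    # Spread the remaining requests evenly over the rest of the window
    delay = seconds_left / max(remaining, 1)
    return min(delay, max_delay)
```

Calling `time.sleep(throttle_delay(response.headers))` after each request slows the bot down gradually as the window runs out, instead of stopping abruptly at zero.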


HTTP 429: Too Many Requests

When you exceed the rate limit, the API returns HTTP status 429.

What Causes a 429 Error

  • Sending too many requests in a short window (most common)
  • Polling endpoints rapidly instead of using WebSockets
  • Multiple bot instances sharing the same API key
  • Placing many individual orders instead of using batch endpoints
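
One way to avoid the first cause is to pace requests on the client before they are sent. The sketch below is a simple token bucket. The `TokenBucket` class and any rate or capacity you pass to it are illustrative assumptions, not Polymarket's actual limits. It only coordinates threads within one process, so separate bot instances sharing a key would still need a shared limiter or separate keys.

```python
import threading
import time


class TokenBucket:
    """Allow roughly `rate` requests per second with bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self._tokens = float(capacity)
        self._clock = clock
        self._last = clock()
        self._lock = threading.Lock()

    def try_acquire(self):
        """Take one token if available; return True on success."""
        with self._lock:
            now = self._clock()
            # Refill in proportion to elapsed time, never above capacity
            self._tokens = min(self.capacity, self._tokens + (now - self._last) * self.rate)
            self._last = now
            if self._tokens >= 1:
                self._tokens -= 1
                return True
            return False

    def acquire(self, poll_interval=0.01):
        """Block until a token is available."""
        while not self.try_acquire():
            time.sleep(poll_interval)
```

Call `bucket.acquire()` immediately before each HTTP request. Because exact limits are not published, start with a conservative rate and tune it against the X-RateLimit-Remaining values you observe.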

The 429 Response

HTTP/1.1 429 Too Many Requests
Retry-After: 5
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1709150400

The Retry-After header tells you how many seconds to wait before retrying. Always respect this value.
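
The HTTP specification allows Retry-After to be either a number of seconds or an HTTP-date. The example response above uses seconds, but a defensive parser costs little. A minimal sketch, where `parse_retry_after` and its `default` fallback are hypothetical names rather than anything provided by Polymarket or py_clob_client:

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_retry_after(value, default=1.0, now=None):
    """Convert a Retry-After header value into a non-negative wait in seconds."""
    if not value:
        return default
    try:
        return max(float(value), 0.0)  # delay-seconds form, e.g. "5"
    except ValueError:
        pass
    try:
        when = parsedate_to_datetime(value)  # HTTP-date form
    except (TypeError, ValueError):
        return default  # unparseable value: fall back to the default wait
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max((when - now).total_seconds(), 0.0)
```

Use it as `time.sleep(parse_retry_after(response.headers.get("Retry-After")))` when handling a 429.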

What py_clob_client Does on 429

The py_clob_client SDK does not automatically retry on 429 errors. It will raise an exception that you need to catch and handle. This is by design — the SDK leaves retry policy to the application developer.


Retry Strategies

Exponential Backoff with Jitter (Python)

The standard approach: wait longer between each retry, with randomness to prevent synchronized retries from multiple clients.

import time
import random
import requests

def request_with_retry(method, url, max_retries=5, **kwargs):
    """Make an HTTP request with exponential backoff on 429 errors."""
    kwargs.setdefault("timeout", 10)  # don't hang forever on a stalled connection
    response = None
    for attempt in range(max_retries):
        response = requests.request(method, url, **kwargs)

        if response.status_code != 429:
            return response
        if attempt == max_retries - 1:
            break  # out of retries; no point sleeping again

        # Default: exponential backoff with jitter, capped at 60 seconds
        wait_time = min(2 ** attempt + random.uniform(0, 1), 60)
        # Prefer Retry-After when it is a plain number of seconds
        retry_after = response.headers.get("Retry-After")
        if retry_after:
            try:
                wait_time = float(retry_after)
            except ValueError:
                pass  # e.g. an HTTP-date; keep the computed backoff

        print(f"Rate limited. Retrying in {wait_time:.1f}s (attempt {attempt + 1}/{max_retries})")
        time.sleep(wait_time)

    return response  # Last 429 response if all retries are exhausted

Exponential Backoff with Jitter (TypeScript)

async function requestWithRetry(
  fn: () => Promise<Response>,
  maxRetries: number = 5
): Promise<Response> {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const response = await fn();

    if (response.status !== 429) {
      return response;
    }

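    // Prefer Retry-After when it is a number of seconds; otherwise back off exponentially.
    // parseFloat yields NaN for a missing header or an HTTP-date, which would make setTimeout fire immediately.
    const retryAfter = parseFloat(response.headers.get("Retry-After") ?? "");
    const waitTime = Number.isFinite(retryAfter)
      ? retryAfter
      : Math.min(2 ** attempt + Math.random(), 60);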

    console.log(`Rate limited. Retrying in ${waitTime.toFixed(1)}s`);
    await new Promise((resolve) => setTimeout(resolve, waitTime * 1000));
  }

  throw new Error(`Rate limited after ${maxRetries} retries`);
}

Wrapping py_clob_client with Automatic Retry

You can add retry logic to any ClobClient method using a decorator:

import time
import random
from functools import wraps

def with_retry(max_retries=3, base_delay=1.0):
    """Decorator that retries on rate limit errors."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries):
                try:
                    return func(*args, **kwargs)
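                except Exception as e:
                    # Assumes the SDK error exposes a status_code attribute or includes
                    # "429" in its message; adjust to the exception you actually see
                    is_rate_limited = getattr(e, "status_code", None) == 429 or "429" in str(e)
                    if is_rate_limited and attempt < max_retries - 1: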
                        delay = base_delay * (2 ** attempt) + random.uniform(0, 1)
                        print(f"Rate limited, retrying in {delay:.1f}s")
                        time.sleep(delay)
                    else:
                        raise
        return wrapper
    return decorator

# Usage: wrap specific calls (client is an already-configured ClobClient)
@with_retry(max_retries=3)
def get_price_safe(client, token_id, side):
    return client.get_price(token_id=token_id, side=side)

price = get_price_safe(client, "<token-id>", "BUY")

Staying Within Rate Limits

Use WebSockets Instead of Polling

If you’re polling GET /price or GET /book repeatedly, switch to WebSockets. Polymarket’s WebSocket feed pushes updates to you in real time, eliminating the need for polling entirely.

import websocket
import json

ws = websocket.create_connection(
    "wss://ws-subscriptions-clob.polymarket.com/ws/market"
)

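# Subscribe to a market's order book updates.
# Payload shape follows Polymarket's market-channel docs at the time of writing;
# verify it against the current WebSocket documentation.
ws.send(json.dumps({
    "type": "market",
    "assets_ids": ["<token-id>"]
}))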

# Receive updates without making HTTP requests
while True:
    message = json.loads(ws.recv())
    print(f"Price update: {message}")

WebSocket connections do not count against REST API rate limits.

Batch Order Operations

Instead of placing orders one at a time, use the batch endpoint. One batch request replaces up to 15 individual requests.

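from py_clob_client.clob_types import OrderArgs, OrderType, PostOrdersArgs
from py_clob_client.order_builder.constants import BUY

# Assumes client is a configured ClobClient, tid is a token ID,
# and prices is a list of at most 15 limit prices.

# Bad: one request per order
for price in prices:
    order = OrderArgs(token_id=tid, price=price, size=50.0, side=BUY)
    signed = client.create_order(order)
    client.post_order(signed, OrderType.GTC)  # 1 request each

# Good: 1 request
# Recent py_clob_client releases take a list of PostOrdersArgs here;
# check the signature in your installed version.
orders = []
for price in prices:
    order = OrderArgs(token_id=tid, price=price, size=50.0, side=BUY)
    orders.append(PostOrdersArgs(order=client.create_order(order), orderType=OrderType.GTC))
client.post_orders(orders)  # 1 request total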
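
If you have more orders than fit in one batch, split them into groups of at most 15 and send one request per group. A minimal sketch with a hypothetical `chunked` helper. The batch size of 15 comes from the limit stated above, and the submit call is left as a comment because it depends on your client setup:

```python
def chunked(items, size=15):
    """Yield consecutive slices of `items`, each at most `size` long."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Stand-in for a list of signed orders
signed_orders = [f"order-{i}" for i in range(40)]

batches = list(chunked(signed_orders))
for batch in batches:
    # Submit each batch with your client's batch call here (one request per batch)
    print(f"Would submit {len(batch)} orders in one request")
```

Forty orders become three requests instead of forty, which matters most on the trading tier where limits are strictest.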

Cache Gamma API Responses

Market metadata from the Gamma API (market questions, token IDs, event structures) changes infrequently. Cache these locally instead of fetching them on every cycle.

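import time

import requests

class GammaCache:
    def __init__(self, ttl_seconds=300):
        self._cache = {}  # cache key -> (data, fetch time)
        self._ttl = ttl_seconds

    def get_markets(self, params):
        # Sort the params so equivalent queries share one cache entry
        key = str(sorted(params.items()))
        if key in self._cache:
            data, timestamp = self._cache[key]
            if time.time() - timestamp < self._ttl:
                return data  # still fresh, skip the HTTP request

        response = requests.get(
            "https://gamma-api.polymarket.com/markets",
            params=params,
            timeout=10,
        )
        response.raise_for_status()  # don't cache error responses
        data = response.json()
        self._cache[key] = (data, time.time())
        return data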

A 5-minute TTL (300 seconds) is reasonable for most use cases. Market metadata like questions, token IDs, and event structures rarely change.

Apply for the Market Maker Program

If you’re running a serious market-making operation and consistently hitting rate limits, Polymarket offers elevated limits through their Market Maker program. Requirements include minimum volume thresholds and consistent quoting. See the Polymarket Market Maker documentation for details.


Rate Limits by Endpoint

This table summarizes the rate limit behavior by endpoint. Exact request-per-second numbers are not published by Polymarket and may vary.

| Endpoint | Method | Tier | Notes |
| --- | --- | --- | --- |
| /price | GET | Public | Highest limits. Consider WebSocket instead |
| /midpoint | GET | Public | Same tier as /price |
| /book | GET | Public | Returns full order book. Cache when possible |
| /books | GET | Public | Batch version; more efficient than multiple /book calls |
| /order | POST | Trading | Stricter limits. Use /orders for batch |
| /orders | POST | Trading | Up to 15 orders per request |
| /order | DELETE | Trading | Single order cancellation |
| /orders | DELETE | Trading | Cancels multiple orders by ID in one request |
| /orders | GET | Authenticated | Read-only, medium limits |
| /trades | GET | Authenticated | Read-only, medium limits |

Comparison: Polymarket vs Kalshi Rate Limits

| Feature | Polymarket | Kalshi |
| --- | --- | --- |
| Rate limit style | Per API key, tiered by endpoint | Per API key, tiered by endpoint |
| 429 response | Yes, with Retry-After header | Yes, with Retry-After header |
| Rate limit headers | X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset | Similar pattern |
| Published limits | Not officially documented | Not officially documented |
| Demo environment | No sandbox available | Demo API at demo-api.kalshi.co with more generous limits |
| WebSocket alternative | Yes: wss://ws-subscriptions-clob.polymarket.com | Yes: wss://api.elections.kalshi.com/trade-api/ws/v2 |
| Batch endpoints | /orders, up to 15 per call | Individual orders only |
| Elevated limits | Market Maker program | Contact support |

For Kalshi-specific rate limit handling, see the Kalshi API Guide.


Frequently Asked Questions

What are Polymarket’s API rate limits?

Polymarket enforces rate limits that vary by endpoint tier. Public data endpoints (prices, order books) have higher limits than authenticated trading endpoints (order placement, cancellation). Exact numbers are not published but are enforced per API key. Monitor X-RateLimit-Remaining in response headers to track your usage.

How do I handle a 429 error from the Polymarket API?

When you receive HTTP 429, read the Retry-After header and wait that many seconds before retrying. If the header is missing, fall back to exponential backoff with jitter: start with a 1-second delay, double it on each retry, and add a small random component. See the retry code examples above.

Does py_clob_client handle rate limits automatically?

No. The py_clob_client SDK does not include built-in rate limit handling. You need to implement your own retry logic. See Wrapping py_clob_client with Automatic Retry for a decorator-based approach.

How can I avoid hitting Polymarket rate limits?

Four strategies: (1) Use WebSockets instead of polling for real-time data, (2) use batch order endpoints — POST /orders handles up to 15 orders in one request, (3) cache Gamma API responses for market metadata that changes infrequently, and (4) apply for the Market Maker program if you need elevated limits.


This guide is maintained by AgentBets.ai. Found an error or API change we missed? Let us know on Twitter.

Not financial advice. Built for builders.