Rate limits

Rate limits exist to keep the service fast and fair.

Limits per plan

Plan	Requests per minute	Requests per day
Free	30	1,000
Pro	300	30,000
Platform	3,000 (configurable)	unlimited

Limits are per-org, not per-key. All API keys for an org share the same bucket.

Headers

When you exceed a limit, the API responds with HTTP 429 Too Many Requests, along with these headers and a JSON body:

retry-after: 60
content-type: application/json

{
  "error": {
    "code": "rate_limited",
    "message": "Too many requests in the last minute",
    "traceId": "...",
    "retryable": true
  }
}

The retry-after value is in seconds. Wait at least that long before sending the next request.
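As a rough illustration of honoring the header, the sketch below reads a 429 response and returns how long to wait. The function name `retry_delay_seconds` and the 60-second fallback are assumptions made for this example and are not part of the API; only the `retry-after` header and the `retryable` field come from the response shown above.

```python
import json
from typing import Mapping, Optional


def retry_delay_seconds(
    status: int,
    headers: Mapping[str, str],
    body: str,
    default: float = 60.0,  # assumed fallback when the header is missing or malformed
) -> Optional[float]:
    """Return the seconds to wait before retrying, or None if no retry applies."""
    if status != 429:
        return None

    # Read the error object defensively; an unparseable body is treated as empty.
    error = {}
    try:
        payload = json.loads(body)
        if isinstance(payload, dict) and isinstance(payload.get("error"), dict):
            error = payload["error"]
    except ValueError:
        pass

    # Only an explicit "retryable": false stops the retry.
    if error.get("retryable") is False:
        return None

    # HTTP header names are case-insensitive, so normalize before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    try:
        return max(0.0, float(lowered["retry-after"]))
    except (KeyError, ValueError):
        return default
```

The caller sleeps for the returned number of seconds and then retries; a return value of None means the request should not be retried.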

Strategy

Client-side:

Implement exponential backoff with jitter
Cache responses where the input is stable (e.g., describe_image of the same image)
Batch requests where the API supports it (e.g., telemetry)
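The first item in the list above, exponential backoff with jitter, could look roughly like the following. This is a minimal sketch using the "full jitter" variant (a uniformly random delay up to an exponentially growing ceiling). The names `backoff_delay` and `call_with_retries`, the 1-second base, the 60-second cap, and the shape of the `send` callback are all assumptions chosen for illustration.

```python
import random
import time
from typing import Any, Callable, Optional, Tuple

# Hypothetical callback shape: returns (status, retry_after_seconds_or_None, payload).
SendFn = Callable[[], Tuple[int, Optional[float], Any]]


def backoff_delay(
    attempt: int,
    base: float = 1.0,
    cap: float = 60.0,
    rng: Callable[[], float] = random.random,
) -> float:
    """Full jitter: a random delay between 0 and min(cap, base * 2**attempt)."""
    return rng() * min(cap, base * (2 ** attempt))


def call_with_retries(
    send: SendFn,
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, Any]:
    """Call send() until it returns a non-429 status or attempts run out."""
    status, payload = 0, None
    for attempt in range(max_attempts):
        status, retry_after, payload = send()
        if status != 429:
            return status, payload
        if attempt == max_attempts - 1:
            break  # no point sleeping after the final attempt
        delay = backoff_delay(attempt)
        if retry_after is not None:
            # Never retry sooner than the server asked for.
            delay = max(delay, retry_after)
        sleep(delay)
    return status, payload
```

The jitter spreads retries from many clients over time so they do not all hit the start of the next window together. Because all keys in an org share one bucket, this matters even when each individual client is well behaved.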

Server-side:

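We use a fixed-window counter per (org × minute) and (org × day), stored in Cloudflare KV. Burst capacity equals the per-minute limit: an org can spend its whole allowance for the minute at once, then has to wait for the next window.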
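To make the fixed-window idea concrete, here is a small in-memory sketch. It is not the service's implementation: a plain dictionary stands in for the key-value store, and the class name `FixedWindowLimiter`, the key format, and the `allow` method are hypothetical.

```python
import time
from typing import Dict, Optional


class FixedWindowLimiter:
    """Counts requests per (org, minute) and per (org, day) window."""

    def __init__(self, per_minute: int, per_day: Optional[int]) -> None:
        self.per_minute = per_minute
        self.per_day = per_day  # None means no daily limit
        # Stand-in for the key-value store; a real store would expire old keys.
        self.counts: Dict[str, int] = {}

    def allow(self, org_id: str, now: Optional[float] = None) -> bool:
        """Return True and count the request if both windows have room."""
        now = time.time() if now is None else now
        # The window index is part of the key, so a new window starts from zero.
        minute_key = f"{org_id}:m:{int(now // 60)}"
        day_key = f"{org_id}:d:{int(now // 86400)}"

        if self.counts.get(minute_key, 0) >= self.per_minute:
            return False
        if self.per_day is not None and self.counts.get(day_key, 0) >= self.per_day:
            return False

        # Rejected requests are not counted against either window.
        self.counts[minute_key] = self.counts.get(minute_key, 0) + 1
        self.counts[day_key] = self.counts.get(day_key, 0) + 1
        return True
```

With the Free plan numbers from the table, `FixedWindowLimiter(per_minute=30, per_day=1000)` would accept 30 calls for an org within one clock minute and reject further calls until the next minute begins. The key contains the org ID and no API key, which mirrors the shared per-org bucket.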

Need more?

Email sales@wholisphere.ai and we’ll lift limits for legitimate Platform use cases. Free and Pro limits are intentionally tight to keep cloud LLM costs predictable.