Rate limits
Per-tier limits, the Retry-After header, and best practices.
Last updated May 12, 2026
Limits by tier
Only the Full Domination tier has API access, so we document a single rate-limit profile.
| Surface | Limit |
|---|---|
| Per token, per minute | 120 |
| Per token, per day | 50,000 |
| Per workspace, per minute (all tokens combined) | 300 |
| Per workspace, per day | 200,000 |
| Concurrent in-flight requests per token | 20 |
If you need higher limits, contact support — the limits are conservative defaults, not a hard ceiling.
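The per-minute and per-day limits interact: a token that calls at its full per-minute rate uses up its daily allowance well before the day ends. The arithmetic below is a quick sketch using only the numbers from the table above; the variable names are illustrative.

```javascript
// Limits taken from the table above.
const PER_TOKEN_PER_MINUTE = 120;
const PER_TOKEN_PER_DAY = 50000;
const MINUTES_PER_DAY = 24 * 60; // 1440

// Minutes of full-speed calling before the daily allowance is gone.
const minutesAtFullRate = PER_TOKEN_PER_DAY / PER_TOKEN_PER_MINUTE;

// Average rate one token can sustain for a full 24 hours.
const sustainablePerMinute = PER_TOKEN_PER_DAY / MINUTES_PER_DAY;

console.log(minutesAtFullRate.toFixed(1));    // "416.7" (just under 7 hours)
console.log(sustainablePerMinute.toFixed(1)); // "34.7"
```

In other words, a job that has to run all day should be paced at roughly 35 calls per minute per token, not 120.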
Headers on every response
Every response includes:
| Header | Meaning |
|---|---|
| X-RateLimit-Limit | The applicable per-minute limit for this token. |
| X-RateLimit-Remaining | Calls remaining in the current 60-second window. |
| X-RateLimit-Reset | UTC seconds until the window resets. |
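A client can read these headers to slow down before it ever sees a 429. The sketch below assumes the headers arrive as a plain object with lowercase names; `readRateLimit`, `isNearLimit`, and the 10% threshold are illustrative choices, not part of the API.

```javascript
// Sketch: pull the rate-limit headers out of a response.
// `headers` is assumed to be a plain object with lowercase header names.
function readRateLimit(headers) {
  const toNumber = (value) => {
    if (value == null || value === "") return null; // header missing
    const n = Number(value);
    return Number.isFinite(n) ? n : null;           // header malformed
  };
  return {
    limit: toNumber(headers["x-ratelimit-limit"]),
    remaining: toNumber(headers["x-ratelimit-remaining"]),
    reset: toNumber(headers["x-ratelimit-reset"]), // value as documented in the table above
  };
}

// Hypothetical helper: true once fewer than `threshold` of the window's calls remain.
function isNearLimit(info, threshold = 0.1) {
  if (info.limit === null || info.remaining === null) return false;
  return info.remaining < info.limit * threshold;
}

// Sample values for illustration only, not real output.
const info = readRateLimit({
  "x-ratelimit-limit": "120",
  "x-ratelimit-remaining": "7",
  "x-ratelimit-reset": "41",
});
console.log(info.remaining, isNearLimit(info)); // 7 true
```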
On a 429 response, an additional header:
| Header | Meaning |
|---|---|
| Retry-After | Seconds to wait before retrying. Always present on 429. |
What happens at the limit
A request that would exceed the per-minute limit returns immediately with:
HTTP/1.1 429 Too Many Requests
Retry-After: 32
Content-Type: application/json

{
  "error": {
    "code": "rate_limited",
    "message": "Too many requests. Retry after 32 seconds.",
    "retryAfterSec": 32
  }
}
The rejected request does NOT count against your daily allowance; it counts only against the per-minute window.
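The wait time appears twice in a 429: in the Retry-After header and in the body's retryAfterSec field. One way to read it is sketched below. The function name, the lowercase-keyed `headers` object, and the 1-second fallback are assumptions made for illustration.

```javascript
// Sketch: decide how long to wait after a 429.
// `status`, `headers` (plain object, lowercase keys) and `body` (parsed JSON)
// are assumed inputs from whatever HTTP client you use.
function retryDelaySeconds(status, headers, body) {
  if (status !== 429) return null; // not rate limited

  // Prefer the header, which is always present on 429.
  const fromHeader = Number(headers["retry-after"]);
  if (Number.isFinite(fromHeader) && fromHeader >= 0) return fromHeader;

  // Otherwise fall back to the value in the JSON error body.
  const fromBody = Number(body?.error?.retryAfterSec);
  if (Number.isFinite(fromBody) && fromBody >= 0) return fromBody;

  return 1; // assumed fallback if neither value is usable
}

const sampleBody = {
  error: { code: "rate_limited", message: "Too many requests. Retry after 32 seconds.", retryAfterSec: 32 },
};
console.log(retryDelaySeconds(429, { "retry-after": "32" }, sampleBody)); // 32
```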
Exceeding the daily allowance
Daily-allowance overruns return the same shape with a longer retryAfterSec. The daily counter resets at 00:00 UTC.
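The retryAfterSec value in the response is the figure to act on, but a scheduler may also want to know how far away the 00:00 UTC reset is without making a call. A minimal sketch, with `msUntilDailyReset` as an illustrative name:

```javascript
// Sketch: milliseconds from `now` until the next 00:00 UTC.
function msUntilDailyReset(now = new Date()) {
  const nextMidnightUtc = Date.UTC(
    now.getUTCFullYear(),
    now.getUTCMonth(),
    now.getUTCDate() + 1 // Date.UTC rolls over month and year boundaries
  );
  return nextMidnightUtc - now.getTime();
}

console.log(msUntilDailyReset(new Date("2026-05-12T23:59:30Z"))); // 30000
```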
Concurrent-limit behaviour
If you have 20 requests in flight and issue a 21st, the 21st returns:
{ "error": { "code": "concurrent_limit", "retryAfterSec": 1 } }
Cap your client's concurrency at 15–18 to leave headroom.
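One way to enforce such a cap on the client is a small promise-based limiter. This is a sketch, not an official client: `createLimiter` is an illustrative name, and 16 is just one value inside the suggested 15–18 range.

```javascript
// Sketch: run async tasks with at most `maxInFlight` running at once.
function createLimiter(maxInFlight) {
  let active = 0;     // slots currently held
  const waiting = []; // resolvers for callers waiting for a slot

  return async function run(task) {
    if (active >= maxInFlight) {
      // Wait until a finishing task hands its slot over to us.
      await new Promise((resolve) => waiting.push(resolve));
    } else {
      active++;
    }
    try {
      return await task();
    } finally {
      const next = waiting.shift();
      if (next) next(); // hand the slot to the next waiter; `active` is unchanged
      else active--;    // nobody waiting: free the slot
    }
  };
}

// Every API call in the process should go through the same limiter instance.
const limit = createLimiter(16);
```

Usage would look like `await Promise.all(ids.map((id) => limit(() => callApi(id))))`, where `callApi` stands in for your own request function. Handing the slot directly to the next waiter, instead of decrementing and re-incrementing, keeps a newly arriving call from slipping in ahead of it and pushing the count past the cap.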
Best practices
Use cursors and limit=100
Fewer, larger requests use less of your budget than many small requests. See Pagination.
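To put numbers on it, 1,000 items cost 10 requests at limit=100 but 100 requests at limit=10. The loop below sketches cursor-following in general terms. `fetchPage` is a hypothetical function you supply, and the `items` and `nextCursor` field names are assumptions; check the Pagination page for the real response shape.

```javascript
// Sketch: collect every item by following cursors, 100 items per request.
// `fetchPage({ cursor, limit })` is hypothetical and assumed to resolve to
// { items: [...], nextCursor: string | null }.
async function fetchAll(fetchPage, limit = 100) {
  const items = [];
  let cursor = null;
  let requests = 0;
  do {
    const page = await fetchPage({ cursor, limit });
    items.push(...page.items);
    cursor = page.nextCursor;
    requests++; // each page is one rate-limit unit
  } while (cursor);
  return { items, requests };
}
```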
Batch where possible
Some endpoints accept batched input (e.g. POST /v1/content/draft with up to 10 briefs). Batching is one rate-limit unit regardless of batch size.
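As a rough illustration, grouping briefs into batches of 10 before sending turns 25 single-brief calls into 3. The `chunk` helper and the placeholder briefs are illustrative; the request body for the batch endpoint is not shown here because this page does not specify it.

```javascript
// Sketch: split a list into batches of at most `size` items.
function chunk(items, size = 10) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

const briefs = Array.from({ length: 25 }, (_, i) => `brief-${i + 1}`); // placeholder data
const batches = chunk(briefs, 10);
console.log(batches.length); // 3 (requests instead of 25)
```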
Cache safely
GET responses can be cached for short windows. Many endpoints return an ETag that you can send back in If-None-Match on the next call; if the resource is unchanged you get a 304, which has no body and no rate-limit cost.
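A conditional GET can be wrapped in a small helper. The sketch below assumes `fetchFn` behaves like the standard `fetch()` (pass the global `fetch` in real use); the in-memory `Map` and the `cachedGet` name are illustrative, and a production cache would also need eviction.

```javascript
// Sketch: conditional GET with an in-memory ETag cache.
const etagCache = new Map(); // url -> { etag, body }

async function cachedGet(url, fetchFn, extraHeaders = {}) {
  const cached = etagCache.get(url);
  const headers = { ...extraHeaders };
  if (cached) headers["If-None-Match"] = cached.etag;

  const res = await fetchFn(url, { headers });

  // 304: the resource is unchanged, so reuse the body we already have.
  if (res.status === 304 && cached) return cached.body;

  if (!res.ok) throw new Error(`GET ${url} failed with status ${res.status}`);

  const body = await res.json();
  const etag = res.headers.get("etag");
  if (etag) etagCache.set(url, { etag, body }); // remember it for next time
  return body;
}
```

Pass your Authorization header through `extraHeaders`; it is left out here to keep the sketch short.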
Back off with jitter
When you hit a 429, respect Retry-After and add random jitter so that clients don't all retry at the same instant (the thundering-herd problem):
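// Retry-After is in seconds; fall back to 1 second if the header is missing or malformed.
const retryAfterSec = Number(headers["retry-after"]);
const baseDelay = (Number.isFinite(retryAfterSec) ? retryAfterSec : 1) * 1000;
// Add up to 500 ms of random jitter so clients don't all retry at once.
const jitter = Math.random() * 500;
await new Promise(r => setTimeout(r, baseDelay + jitter));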
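The snippet above covers a single wait. A complete retry loop might look like the following sketch. `doRequest` is a hypothetical function that resolves to `{ status, headers }` with lowercase header names, `sleep` is injectable so the loop can be tested without real delays, and the cap of 5 attempts is an arbitrary choice.

```javascript
// Sketch: retry a request on 429, honouring Retry-After plus jitter.
const defaultSleep = (ms) => new Promise((r) => setTimeout(r, ms));

async function withRetry(doRequest, { maxAttempts = 5, sleep = defaultSleep } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const res = await doRequest();
    if (res.status !== 429) return res;  // success or a non-rate-limit error
    if (attempt === maxAttempts) break;  // out of attempts; don't sleep again

    const retryAfterSec = Number(res.headers["retry-after"]);
    const baseDelay = (Number.isFinite(retryAfterSec) ? retryAfterSec : 1) * 1000;
    const jitter = Math.random() * 500;  // up to 500 ms of jitter
    await sleep(baseDelay + jitter);
  }
  throw new Error(`Still rate limited after ${maxAttempts} attempts`);
}
```

Capping the number of attempts matters for daily-allowance overruns, where Retry-After can be hours long; in that case it is usually better to surface the error than to hold the request open.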
Don't poll for completion
For long-running operations (audits, bulk content generation), use webhooks (see Generic webhooks) rather than polling. Polling consumes rate-limit budget for no useful work.
Test-mode keys
ad_test_ keys share the live rate limits; there's no separate sandbox bucket. Test keys differ from live keys in that they don't produce audit-log entries and are isolated from your billing surface, not in their quota.
Monitoring your usage
Settings → API → Usage shows:
- Calls in the current minute, current hour, current day.
- A 7-day trend chart.
- A per-key breakdown of what's calling.
Spikes in this view typically mean one of two things: a bug in client code, or a new integration's first ingest. Investigate.
Soft warnings
When your daily usage exceeds 80% of the allowance, the workspace owner gets an in-app banner. Budget the remaining calls carefully, because the daily counter does not reset until 00:00 UTC.