Available on Full Domination

Rate limits

Per-tier limits, the Retry-After header, and best practices.

Last updated May 12, 2026

Limits by tier

Only the Full Domination tier has API access, so this page documents a single rate-limit profile.

Surface | Limit (requests)
--- | ---
Per token, per minute | 120
Per token, per day | 50,000
Per workspace, per minute (all tokens combined) | 300
Per workspace, per day | 200,000
Concurrent in-flight requests per token | 20

If you need higher limits, contact support — the limits are conservative defaults, not a hard ceiling.
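
To make the per-token numbers concrete, the sketch below works out the even request spacing that stays at or under the per-minute limit, and how long a token running at full rate would take to use up its daily allowance. The function names are illustrative; the figures come from the table above.

```javascript
// Limits copied from the table on this page.
const PER_TOKEN_PER_MINUTE = 120;
const PER_TOKEN_PER_DAY = 50000;

// Smallest even spacing between requests that stays within a per-minute limit.
function minIntervalMs(perMinuteLimit) {
  return 60000 / perMinuteLimit;
}

// Minutes of sustained full-rate traffic before the daily allowance runs out.
function minutesToExhaustDaily(perMinuteLimit, perDayLimit) {
  return perDayLimit / perMinuteLimit;
}

console.log(minIntervalMs(PER_TOKEN_PER_MINUTE)); // 500
console.log(
  minutesToExhaustDaily(PER_TOKEN_PER_MINUTE, PER_TOKEN_PER_DAY).toFixed(1)
); // 416.7
```

In other words, one request every 500 ms is the fastest steady pace a single token can hold, and a token held at that pace would exhaust its daily allowance in just under seven hours.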

Headers on every response

Every response includes:

Header | Meaning
--- | ---
X-RateLimit-Limit | The applicable per-minute limit for this token.
X-RateLimit-Remaining | Calls remaining in the current 60-second window.
X-RateLimit-Reset | UTC seconds until the window resets.

On a 429 response, an additional header:

Header | Meaning
--- | ---
Retry-After | Seconds to wait before retrying. Always present on 429.
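
A client can read these headers on every response and slow down before it reaches the limit. The following is a minimal sketch, assuming a fetch-style headers object with a case-insensitive get(name) method. The readRateLimit and shouldThrottle helpers and the 10% threshold are illustrative choices, not part of the API.

```javascript
// Reads the documented rate-limit headers from a fetch-style Headers object
// (anything with a case-insensitive get(name) method).
function readRateLimit(headers) {
  const toNumber = (name) => {
    const raw = headers.get(name);
    return raw === null || raw === undefined ? null : Number(raw);
  };
  return {
    limit: toNumber("X-RateLimit-Limit"),
    remaining: toNumber("X-RateLimit-Remaining"),
    reset: toNumber("X-RateLimit-Reset"),
    retryAfterSec: toNumber("Retry-After"), // only sent on 429 responses
  };
}

// Hypothetical policy: slow down once fewer than 10% of the calls remain.
function shouldThrottle(info) {
  if (info.limit === null || info.remaining === null) return false;
  return info.remaining < info.limit * 0.1;
}
```

With fetch, the call would look like readRateLimit(response.headers). The sketch returns null for any header that is missing, so the caller can tell "absent" apart from zero.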

What happens at the limit

A request that would exceed the per-minute limit returns immediately with:

HTTP/1.1 429 Too Many Requests
Retry-After: 32
Content-Type: application/json

{
  "error": {
    "code": "rate_limited",
    "message": "Too many requests. Retry after 32 seconds.",
    "retryAfterSec": 32
  }
}

A rejected request does NOT count against your daily allowance; it counts only against the per-minute window.

Exceeding the daily allowance

Requests that exceed the daily allowance return the same 429 response shape, with a longer retryAfterSec. The daily counter resets at 00:00 UTC.
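
The retryAfterSec value in the response is the number to act on. If you also want a client-side estimate of the time left until the reset (for logging or job scheduling, say), it can be computed from the clock. A sketch, with secondsUntilUtcMidnight as an illustrative helper name:

```javascript
// Seconds from `now` until the next 00:00 UTC, when the daily counter resets.
function secondsUntilUtcMidnight(now = new Date()) {
  const nextMidnight = Date.UTC(
    now.getUTCFullYear(),
    now.getUTCMonth(),
    now.getUTCDate() + 1 // Date.UTC rolls over month and year boundaries
  );
  return Math.ceil((nextMidnight - now.getTime()) / 1000);
}

console.log(secondsUntilUtcMidnight(new Date("2026-05-12T18:00:00Z"))); // 21600
```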

Concurrent-limit behaviour

If you have 20 requests in flight and issue a 21st, the 21st returns:

{ "error": { "code": "concurrent_limit", "retryAfterSec": 1 } }

Cap your client's concurrency at 15–18 to leave headroom.
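
One way to enforce that cap is a small promise-based limiter that queues work once the chosen number of requests is in flight. This is a sketch rather than a recommended library: createLimiter is a hypothetical helper, and 16 is simply a value inside the suggested range.

```javascript
// Returns a function that runs async tasks with at most `maxInFlight` running at once.
function createLimiter(maxInFlight) {
  let active = 0;
  const waiting = [];

  const next = () => {
    if (active >= maxInFlight || waiting.length === 0) return;
    active += 1;
    const { task, resolve, reject } = waiting.shift();
    Promise.resolve()
      .then(task)
      .then(resolve, reject)
      .finally(() => {
        active -= 1;
        next(); // start the next queued task, if any
      });
  };

  return (task) =>
    new Promise((resolve, reject) => {
      waiting.push({ task, resolve, reject });
      next();
    });
}

// 16 sits inside the suggested 15-18 range, below the limit of 20.
const limit = createLimiter(16);
```

Wrap every API call, for example limit(() => fetch(url, options)), so that all requests made with one token share the same limiter.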

Best practices

Use cursors and limit=100

Fewer, larger requests use less of your budget than many small requests. See Pagination.
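
The saving is easy to quantify: reading 1,000 records takes 10 requests at limit=100 and 100 requests at limit=10. The sketch below shows that arithmetic and a generic cursor loop. The fetchPage callback and the { items, nextCursor } response shape are assumptions made for illustration; the Pagination page has the real parameter and field names.

```javascript
// Requests needed to read `total` records at a given page size.
function requestsNeeded(total, pageSize) {
  return Math.ceil(total / pageSize);
}

console.log(requestsNeeded(1000, 100)); // 10
console.log(requestsNeeded(1000, 10)); // 100

// Follows cursors until the server stops returning one.
// `fetchPage` and the `{ items, nextCursor }` shape are hypothetical.
async function collectAll(fetchPage) {
  const all = [];
  let cursor = null;
  do {
    const page = await fetchPage(cursor);
    all.push(...page.items);
    cursor = page.nextCursor ?? null;
  } while (cursor !== null);
  return all;
}
```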

Batch where possible

Some endpoints accept batched input (e.g. POST /v1/content/draft with up to 10 briefs). A batched request counts as one rate-limit unit regardless of batch size.
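
To take advantage of this, split your input into groups of at most 10 before sending. A minimal sketch, in which chunk and the sample brief names are illustrative:

```javascript
const MAX_BRIEFS_PER_BATCH = 10; // batch ceiling quoted for POST /v1/content/draft

// Splits an array into consecutive groups of at most `size` items.
function chunk(items, size) {
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

const briefs = Array.from({ length: 25 }, (_, i) => `brief-${i + 1}`);
const batches = chunk(briefs, MAX_BRIEFS_PER_BATCH);
console.log(batches.map((b) => b.length)); // [ 10, 10, 5 ]
```

Sent this way, 25 briefs cost 3 rate-limit units instead of 25.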

Cache safely

GET responses can be cached for short windows. Many endpoints return an ETag; send it back in If-None-Match on the next call, and an unchanged resource returns 304 with no body and no rate-limit cost.
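
A conditional GET can be sketched as follows. The cachedGet helper, the in-memory Map and the extraHeaders parameter (for whatever authentication headers you send) are assumptions for illustration. doFetch is any fetch-compatible function, such as the global fetch in Node 18+ or a browser.

```javascript
// In-memory cache keyed by URL: { etag, body }. Entries never expire in this sketch.
const etagCache = new Map();

async function cachedGet(url, doFetch, extraHeaders = {}) {
  const cached = etagCache.get(url);
  const headers = { ...extraHeaders };
  if (cached) headers["If-None-Match"] = cached.etag;

  const res = await doFetch(url, { method: "GET", headers });

  // 304: the resource is unchanged, so reuse the stored body.
  if (res.status === 304 && cached) return cached.body;

  if (!res.ok) throw new Error(`GET ${url} failed with status ${res.status}`);

  const body = await res.json();
  const etag = res.headers.get("ETag");
  if (etag) etagCache.set(url, { etag, body }); // only cache when the endpoint sends an ETag
  return body;
}
```

A production cache would also need eviction and a key that includes any query parameters or token that changes the response.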

Backoff with jitter

When you hit 429, respect Retry-After and add random jitter to avoid thundering-herd on retry:

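// Assumes `headers` is a plain object keyed by lower-cased header name.
const baseDelayMs = Number(headers["retry-after"]) * 1000; // Retry-After is in seconds
const jitterMs = Math.random() * 500; // up to 500 ms of random jitter
await new Promise((resolve) => setTimeout(resolve, baseDelayMs + jitterMs));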
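
The same idea can be wrapped into a reusable helper that retries only on 429 and gives up after a fixed number of attempts. This is a sketch under stated assumptions: withRateLimitRetry, maxAttempts, maxJitterMs and the injectable sleep are illustrative names, and send is any function that returns a fetch-style response.

```javascript
// Retries a request while it returns 429, waiting Retry-After plus jitter each time.
// After `maxAttempts` sends, the last response is returned as-is.
async function withRateLimitRetry(send, options = {}) {
  const {
    maxAttempts = 5,
    maxJitterMs = 500,
    sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
  } = options;

  for (let attempt = 1; ; attempt += 1) {
    const res = await send();
    if (res.status !== 429 || attempt >= maxAttempts) return res;

    // Retry-After is documented as always present on 429; fall back to 1 s defensively.
    const retryAfterSec = Number(res.headers.get("Retry-After"));
    const baseSec =
      Number.isFinite(retryAfterSec) && retryAfterSec > 0 ? retryAfterSec : 1;
    await sleep(baseSec * 1000 + Math.random() * maxJitterMs);
  }
}
```

Typical use would be withRateLimitRetry(() => fetch(url, options)). Capping the attempts matters for daily-allowance overruns, where the wait can be far longer than a job should block.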

Don't poll for completion

For long-running operations (audits, bulk content generation), use webhooks (see Generic webhooks) rather than polling. Polling consumes rate-limit budget for no useful work.

Test-mode keys

Keys prefixed with ad_test_ share the live rate limits; there is no separate sandbox bucket. Test keys exist so that calls produce no audit-log entries and stay isolated from your billing surface, not to give you a different quota.

Monitoring your usage

Settings → API → Usage shows:

  • Calls in the current minute, current hour, current day.
  • A 7-day trend chart.
  • A per-key breakdown of what's calling.

Spikes in this view typically mean one of two things: a bug in client code, or a new integration's first ingest. Investigate.

Soft warnings

When your daily usage exceeds 80% of the allowance, the workspace owner gets an in-app banner. Treat it as a cue to defer non-urgent calls until the daily counter resets at 00:00 UTC.
