The Hub API enforces per-tenant rate limits on every authenticated request. Limits are operator-set; you cannot raise them yourself — contact support if you need more.

Default limits

Every account starts with these limits. They cover the majority of real workloads:
Window                 Limit
Per minute             60 requests
Per hour               1,000 requests
Per day                10,000 requests
Concurrent in-flight   20 requests

Optional sub-limits per individual sk_live_ key may also apply (e.g., to compartmentalize a CI key that shouldn’t burn through your account budget). When set, they’re listed alongside the key in the Settings → API Keys page.
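
One way to stay under the concurrent in-flight limit from the table above is to gate outgoing calls on the client side. The following is a minimal sketch, not part of any Kataven SDK: run_bounded and MAX_IN_FLIGHT are hypothetical names, and each job stands in for whatever coroutine issues a Hub API request.

```python
import asyncio
from typing import Awaitable, Callable, Iterable, List, TypeVar

T = TypeVar("T")

# Account default from the limits table; a per-key sub-limit may be lower.
MAX_IN_FLIGHT = 20


async def run_bounded(
    jobs: Iterable[Callable[[], Awaitable[T]]],
    max_in_flight: int = MAX_IN_FLIGHT,
) -> List[T]:
    """Run every job, never allowing more than max_in_flight at once."""
    sem = asyncio.Semaphore(max_in_flight)

    async def guarded(job: Callable[[], Awaitable[T]]) -> T:
        # Waiting here holds a job back until a slot frees up.
        async with sem:
            return await job()

    # gather preserves the order of the input jobs in its result list.
    return await asyncio.gather(*(guarded(job) for job in jobs))
```

A cap like this only bounds concurrency from a single process. If several workers share one key, their combined in-flight count can still exceed the limit, so 429 handling remains necessary.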

What 429 responses look like

Exceeding any limit returns:
HTTP/1.1 429 Too Many Requests
Retry-After: 60
Content-Type: text/plain

rate_limited: per-minute (60) exceeded

The Retry-After header gives the number of seconds until the window resets. Both the Python and Node SDKs surface this as a typed RateLimitError with the retryAfter field populated.
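
If you call the API without an SDK, the 429 response can be parsed by hand. The sketch below assumes the plain-text body always follows the shape of the single example above; that is an inference from one sample, not a documented contract. parse_429 and RateLimitInfo are hypothetical names.

```python
import re
from typing import Mapping, NamedTuple, Optional


class RateLimitInfo(NamedTuple):
    retry_after: float      # seconds to wait, from the Retry-After header
    window: Optional[str]   # e.g. "per-minute", if the body could be parsed
    limit: Optional[int]    # e.g. 60, if the body could be parsed


# Assumed body shape: "rate_limited: per-minute (60) exceeded"
_BODY_RE = re.compile(
    r"rate_limited:\s*(?P<window>[\w-]+)\s*\((?P<limit>[\d,]+)\)\s*exceeded"
)


def parse_429(
    headers: Mapping[str, str],
    body: str,
    default_retry_after: float = 60.0,
) -> RateLimitInfo:
    """Extract the wait time and, when possible, which limit was hit."""
    # Header names are case-insensitive, so normalise before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    try:
        retry_after = max(0.0, float(lowered.get("retry-after", "")))
    except ValueError:
        # Header missing or not a number: fall back to a conservative wait.
        retry_after = default_retry_after

    match = _BODY_RE.search(body)
    if match is None:
        return RateLimitInfo(retry_after, None, None)
    limit = int(match.group("limit").replace(",", ""))
    return RateLimitInfo(retry_after, match.group("window"), limit)
```

Treat the parsed window and limit as diagnostic detail only. The Retry-After value is what should drive retry timing.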

Counters

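Limits are tracked with expiring counters in Redis. Each window starts at the first request and resets when its counter expires, so it behaves as a fixed window rather than a true sliding one: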
  • Per-minute counter — INCR’d on each request, expires 60s after first request in the window.
  • Per-hour and per-day counters — same pattern.
  • Concurrent counter — INCR’d on request start, DECR’d on response (best-effort; idle counters expire after 1h).
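
The increment-and-expire pattern in the bullets above can be modelled in a few lines. This is an in-memory illustration of the behaviour as described, not Kataven's server code: a dictionary stands in for Redis, and WindowCounter is a hypothetical name.

```python
import time
from typing import Callable, Dict, Tuple


class WindowCounter:
    """Counter whose window opens at the first request and expires later."""

    def __init__(
        self,
        limit: int,
        window_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self._clock = clock
        # key -> (count, expires_at); plays the role of an INCR'd Redis key with a TTL
        self._state: Dict[str, Tuple[int, float]] = {}

    def hit(self, key: str) -> Tuple[bool, float]:
        """Record one request. Returns (allowed, seconds until the window resets)."""
        now = self._clock()
        entry = self._state.get(key)
        if entry is None or now >= entry[1]:
            # No live counter: this request opens a new window.
            count, expires_at = 0, now + self.window_seconds
        else:
            count, expires_at = entry
        count += 1
        self._state[key] = (count, expires_at)
        if count <= self.limit:
            return True, 0.0
        # Over the limit: the caller must wait for the counter to expire.
        return False, expires_at - now
```

For example, WindowCounter(60, 60) mirrors the default per-minute limit. The second value returned on a rejected request corresponds to what a Retry-After header would carry.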

Backoff guidance

When you receive a 429:
  1. Honor Retry-After. Sleep that many seconds before retrying.
  2. Add jitter. Sleep for Retry-After + random(0, Retry-After/10) to avoid synchronized retries from many clients.
  3. Don’t retry indefinitely. After 3 failed retries, surface the error to your caller.
The SDKs do steps 1-2 automatically when their retry: true option is set.
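
If you are not using the SDKs' built-in retry, the three steps above can be sketched as follows. RateLimited is a hypothetical stand-in for whatever error your client raises on a 429 (the SDKs' RateLimitError plays that role), and call_with_retries is an illustrative helper, not an SDK function.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical error carrying the Retry-After value in seconds."""

    def __init__(self, retry_after: float) -> None:
        super().__init__(f"rate limited, retry after {retry_after}s")
        self.retry_after = retry_after


def backoff_delay(retry_after: float) -> float:
    # Steps 1 and 2: honor Retry-After, plus up to 10% random jitter.
    return retry_after + random.uniform(0, retry_after / 10)


def call_with_retries(
    fn: Callable[[], T],
    max_retries: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on RateLimited at most max_retries times."""
    retries = 0
    while True:
        try:
            return fn()
        except RateLimited as err:
            if retries >= max_retries:
                # Step 3: stop retrying and surface the error to the caller.
                raise
            retries += 1
            sleep(backoff_delay(err.retry_after))
```

With the defaults, a call that keeps hitting the limit is attempted four times in total (one initial attempt plus three retries) before the error propagates. The sleep parameter exists so tests can substitute a no-op instead of actually waiting.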

How limits are configured

Operator-controlled. Limits live in the system DB’s account_rate_limits table and are tuned per-tenant via the Operator Console → Accounts page (Kataven staff only). If you need a higher limit:
  1. Open a support ticket with the use case (concurrent bot fleet, bulk import, etc.).
  2. We adjust your tier and set the new caps. They take effect within about 60 seconds, once the cached limits are invalidated.
We don’t expose the operator console to customers because rate limits are also a billing/abuse control — not a self-service knob.