Make your limits predictable, not mysterious
The fastest way to frustrate developers is to throttle their requests without telling them the rules. A clear rate limit policy turns an opaque 429 into something a client can plan around. This builder generates a complete policy section for your API docs: the algorithm, per-tier limits, the response headers you send, a concrete 429 example, and the backoff behavior you expect from clients.
How it works
You pick a limiting algorithm and a scope. With a token-bucket or leaky-bucket model, the tool distinguishes a sustained rate (the steady refill — e.g. 600 requests per minute) from a burst allowance (a short spike the bucket absorbs before throttling kicks in). Scope decides what the limit is counted against: an API key, a client IP, or an authenticated user.
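To make the sustained rate and burst allowance concrete, here is a minimal token-bucket sketch. It is an illustration under stated assumptions, not this tool's implementation: the class name `TokenBucket`, its parameters, and the choice to pass the clock value in explicitly are all hypothetical. In practice you would keep one bucket per scope key (API key, IP, or user).

```python
class TokenBucket:
    """Illustrative token bucket; names and parameters are assumptions."""

    def __init__(self, rate_per_sec: float, burst: int):
        self.rate_per_sec = rate_per_sec  # sustained refill rate
        self.burst = burst                # bucket capacity (burst allowance)
        self.tokens = float(burst)        # start with a full bucket
        self.last = 0.0                   # timestamp of the previous check

    def allow(self, now: float) -> bool:
        """Return True if a request arriving at time `now` (seconds) is admitted."""
        elapsed = max(0.0, now - self.last)
        # Refill at the sustained rate, never beyond the burst capacity.
        self.tokens = min(float(self.burst), self.tokens + elapsed * self.rate_per_sec)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0  # each admitted request spends one token
            return True
        return False


# 600 requests per minute sustained (10 per second), with a burst of 20.
bucket = TokenBucket(rate_per_sec=600 / 60, burst=20)
admitted = sum(bucket.allow(now=0.0) for _ in range(25))
print(admitted)  # the bucket absorbs the burst, then throttles
```

With these numbers, an idle client can send 20 requests at once, after which requests are admitted at roughly 10 per second.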
For each tier you define the limit, window, and burst. The policy then documents the RateLimit-Limit, RateLimit-Remaining, and RateLimit-Reset headers sent on every response, plus an optional Retry-After header on 429s, and renders a realistic 429 response with a JSON error body.
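As a rough illustration of the 429 response described above, the sketch below assembles the headers and a JSON error body. The function name `build_throttled_response`, the body field names (`error`, `code`, `message`, `retry_after`), and the choice to express `RateLimit-Reset` as seconds until the window resets are assumptions for the example; your generated policy may use different values.

```python
import json


def build_throttled_response(limit: int, reset_after_sec: int):
    """Return (status, headers, body) for a throttled request.

    Hypothetical helper: the body schema is illustrative only.
    """
    headers = {
        "Content-Type": "application/json",
        "RateLimit-Limit": str(limit),
        "RateLimit-Remaining": "0",               # quota is exhausted on a 429
        "RateLimit-Reset": str(reset_after_sec),  # assumed: seconds until reset
        "Retry-After": str(reset_after_sec),      # optional hint for clients
    }
    body = json.dumps({
        "error": {
            "code": "rate_limited",
            "message": "Rate limit exceeded. Retry after the indicated delay.",
            "retry_after": reset_after_sec,
        }
    })
    return 429, headers, body


status, headers, body = build_throttled_response(limit=600, reset_after_sec=30)
print(status, headers["Retry-After"])
print(body)
```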
Tips and example
- Always surface remaining quota in headers so well-built clients can slow down on their own before they hit the limit.
- Separate sustained and burst rates. Allowing a small burst makes the API feel responsive without letting clients sustain an abusive rate.
- Document the backoff you expect: honor Retry-After, then exponential backoff with jitter. Without jitter, every throttled client retries in lockstep and re-overloads you.
- Tie tiers to plans (Free, Pro, Enterprise) so higher limits become a tangible reason to upgrade.
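
The backoff behavior in the tips above could look something like this on the client side. This is a sketch under assumptions: the function name `retry_delay`, the 1-second base, and the 60-second cap are hypothetical, and it only handles the delay-in-seconds form of Retry-After (an HTTP-date value falls through to backoff). The jitter here is the "full jitter" variant, which picks a random delay between zero and the exponential ceiling.

```python
import random
from typing import Callable, Optional


def retry_delay(
    attempt: int,
    retry_after: Optional[str] = None,
    base: float = 1.0,
    cap: float = 60.0,
    rng: Callable[[], float] = random.random,
) -> float:
    """Seconds to wait before retry number `attempt` (0-based)."""
    # Honor the server's Retry-After hint first when it is a number of seconds.
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # not a plain number (e.g. an HTTP date); use backoff instead
    # Exponential ceiling: base, 2*base, 4*base, ... limited by cap.
    ceiling = min(cap, base * (2 ** attempt))
    # Full jitter spreads clients out so they do not retry in lockstep.
    return rng() * ceiling


print(retry_delay(0, retry_after="5"))  # server hint wins
print(retry_delay(3))                   # random value between 0 and 8 seconds
```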