Why RequestRocket has the most flexible rate limiter around

Most rate limiters are too coarse

A typical API gateway rate limiter answers one question: “has this client exceeded N requests per minute?” That’s useful as a blunt instrument but breaks down the moment you need to enforce different limits for different use cases sharing the same infrastructure.

Consider a realistic scenario: your platform has a free tier (60 RPM), a standard tier (300 RPM), and an enterprise tier (unlimited). You have three AI agent types — a research agent, a billing agent, and a support bot — each with different acceptable call patterns. You have both synchronous and async proxy routes, and a handful of high-priority endpoints that should never be throttled even if other limits are hit.

A single global rate limiter can’t express this. You end up either over-restricting high-priority traffic or under-protecting your vendor quotas.

RequestRocket’s layered approach

RequestRocket models rate limiting across multiple layers that can be composed:

Layer 1: Client-level defaults

The client configuration sets the account-wide baseline: the limits a new proxy starts with when it doesn't specify its own:

PUT /clients/{clientId}/configuration
{
  "configuration": {
    "limits": {
      "requestsPerMinute": 300,
      "requestsPerDay": 20000
    }
  }
}

These are the default values new proxies are created with. Each proxy gets its own copy of these limits at creation time and can have them updated independently at any point. The platform maximum is 1,000 requests per minute and 20,000 per day.
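As a rough illustration, a client-side pre-flight check can catch values above the platform maximum before the PUT is sent. This is only a sketch: `build_limits_body` is a hypothetical helper, the requirement that limits be positive is an assumption, and how the API itself responds to out-of-range values is not covered here.

```python
# Platform maximums as stated in the text above.
PLATFORM_MAX_PER_MINUTE = 1000
PLATFORM_MAX_PER_DAY = 20000


def build_limits_body(requests_per_minute: int, requests_per_day: int) -> dict:
    """Return a request body for PUT /clients/{clientId}/configuration.

    Hypothetical helper: raises ValueError locally instead of relying on
    the API to reject the value.
    """
    # Assumption: limits must be positive integers.
    if not 0 < requests_per_minute <= PLATFORM_MAX_PER_MINUTE:
        raise ValueError("requestsPerMinute must be between 1 and 1000")
    if not 0 < requests_per_day <= PLATFORM_MAX_PER_DAY:
        raise ValueError("requestsPerDay must be between 1 and 20000")
    return {
        "configuration": {
            "limits": {
                "requestsPerMinute": requests_per_minute,
                "requestsPerDay": requests_per_day,
            }
        }
    }


body = build_limits_body(300, 20000)
```

The returned dictionary matches the JSON body shown above and can be serialised with any HTTP client.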

Layer 2: Per-proxy rate limits

Each proxy has its own independent rate limits. You can set them explicitly at creation time, or update them on an existing proxy at any time:

PUT /clients/{clientId}/proxies/{proxyId}
{
  "proxyMaxRequestsPerMinute": 60,
  "proxyMaxRequestsPerDay": 2000
}

This is where the composability becomes useful. An account might have a default of 300 RPM, but a proxy serving a cost-sensitive AI agent can be capped at 60 RPM — tight enough to protect the vendor quota budget for that specific integration, without constraining other proxies in the same account.

Different agents, different proxies, different limits — all within a single account without needing separate account structures.
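The relationship between account defaults and per-proxy limits can be modelled in a few lines of Python. This is an illustrative sketch, not RequestRocket's implementation: `ClientDefaults`, `Proxy`, and `create_proxy` are hypothetical names, and the model assumes that a proxy's copied limits stay untouched when another proxy is updated.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ClientDefaults:
    requests_per_minute: int
    requests_per_day: int


@dataclass
class Proxy:
    name: str
    max_requests_per_minute: int
    max_requests_per_day: int


def create_proxy(name: str, defaults: ClientDefaults,
                 per_minute: Optional[int] = None,
                 per_day: Optional[int] = None) -> Proxy:
    """Copy the account defaults into the new proxy unless explicit limits are given."""
    return Proxy(
        name=name,
        max_requests_per_minute=defaults.requests_per_minute if per_minute is None else per_minute,
        max_requests_per_day=defaults.requests_per_day if per_day is None else per_day,
    )


defaults = ClientDefaults(requests_per_minute=300, requests_per_day=20000)
support_bot = create_proxy("support-bot", defaults)  # inherits 300 / 20000
research_agent = create_proxy("research-agent", defaults, per_minute=60, per_day=2000)

# Tightening one proxy later leaves the other proxy and the defaults alone.
support_bot.max_requests_per_minute = 120
```

Because each proxy holds its own values rather than a reference to the defaults, limits can diverge per integration without any change to the account structure.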

Layer 3: Retry behaviour as implicit throttling

Each proxy also controls how aggressively the gateway retries failed upstream requests. Setting these caps is another form of rate control — they bound the worst-case retry amplification from a single failed call:

PUT /clients/{clientId}/proxies/{proxyId}
{
  "proxyMaxRetries": 3,
  "proxyMaxBackoff": 15,
  "proxyMaxRequestsPerMinute": 50,
  "proxyMaxRequestsPerDay": 5000
}

An async proxy with proxyMaxRetries: 3 converts a single failed request into at most 4 total attempts (the original plus 3 retries). With exponential backoff capped at 15 seconds by proxyMaxBackoff, the total retry window is bounded, which prevents retry avalanches against a transiently degraded upstream. Meanwhile, proxyMaxRequestsPerMinute and proxyMaxRequestsPerDay keep the proxy from overwhelming target APIs with requests.
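The retry arithmetic can be worked through with a short Python sketch using the proxyMaxRetries: 3 and proxyMaxBackoff: 15 values from the configuration above. The 1-second base delay and the doubling factor are assumptions made for illustration; the text only states that backoff is exponential and capped.

```python
from typing import List


def retry_schedule(max_retries: int, max_backoff: float,
                   base_delay: float = 1.0) -> List[float]:
    """Delay in seconds before each retry: base_delay * 2**i, capped at max_backoff.

    base_delay and the factor of 2 are illustrative assumptions.
    """
    return [min(base_delay * 2 ** i, max_backoff) for i in range(max_retries)]


max_retries, max_backoff = 3, 15
delays = retry_schedule(max_retries, max_backoff)

total_attempts = 1 + max_retries            # original call plus retries
assumed_total_wait = sum(delays)            # depends on the assumed base delay
upper_bound_wait = max_retries * max_backoff  # holds for any base delay

print(delays, total_attempts, assumed_total_wait, upper_bound_wait)
```

Whatever the actual base delay is, the time spent backing off for one failed call cannot exceed retries × cap, which is 45 seconds for these settings. That product is the figure to weigh against the upstream's quota window.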

Layer 4: Rule-based access as request filtering

Rules can block entire categories of requests before they consume any quota. An allow list that permits only specific paths means requests to unconfigured paths are rejected immediately at the gateway — they don’t count against upstream rate limits because they never reach the upstream:

POST /clients/{clientId}/proxies/{proxyId}/rules
{
  "effect": "allow",
  "methods": ["POST"],
  "path": {
    "path": { "pattern": "^/v1/chat/completions$" },
    "presence": "must_exist"
  },
  "priority": 10,
  "notes": "Allow only chat completions — all other paths denied by default"
}

Combined with proxyDefaultRuleEffect: "deny", this is a soft rate limit on endpoint diversity — agents can only call the paths you’ve explicitly provisioned, not the full vendor API surface.
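To make the allow-list behaviour concrete, here is a minimal Python model of rule evaluation with a default effect. It is a sketch under stated assumptions, not the gateway's actual algorithm: the `Rule` class and `evaluate` function are hypothetical, and the sketch assumes that rules are checked in ascending priority order and that the first matching rule decides the outcome.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class Rule:
    effect: str          # "allow" or "deny"
    methods: List[str]
    pattern: str         # regular expression matched against the request path
    priority: int


def evaluate(rules: List[Rule], method: str, path: str,
             default_effect: str = "deny") -> str:
    """Return the effect of the first matching rule, else the default effect.

    Assumption: lower priority numbers are evaluated first.
    """
    for rule in sorted(rules, key=lambda r: r.priority):
        if method in rule.methods and re.search(rule.pattern, path):
            return rule.effect
    return default_effect


rules = [Rule("allow", ["POST"], r"^/v1/chat/completions$", 10)]

print(evaluate(rules, "POST", "/v1/chat/completions"))  # matches the allow rule
print(evaluate(rules, "POST", "/v1/embeddings"))        # falls through to the default
print(evaluate(rules, "GET", "/v1/chat/completions"))   # method not listed
```

In this model, only the first call would be forwarded upstream; the other two are decided at the gateway and never touch the vendor's quota.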

Layer 5: Credential-level rules

Rules can be applied at the credential level rather than the proxy level. Credential-level rules apply across all proxies that use that credential — so a single set of restrictions follows the credential wherever it’s used:

POST /clients/{clientId}/credentials/{credentialId}/rules
{
  "effect": "allow",
  "methods": ["GET"],
  "path": {
    "path": { "pattern": "^/v1/(contacts|companies)" },
    "presence": "must_exist"
  },
  "priority": 10,
  "notes": "Read-only access to CRM endpoints — applies to all proxies using this credential"
}

This means you can scope a credential to read-only access and know that no proxy configuration can override it, because credential-level rules take precedence over per-proxy rules.

Practical configuration for common scenarios

Scenario: different limits per agent type

For agents in the same account with different call budgets, set per-proxy limits directly on each proxy:

PUT /clients/{clientId}/proxies/{researchAgentProxyId}
{
  "proxyMaxRequestsPerMinute": 60,
  "proxyMaxRequestsPerDay": 2000
}

PUT /clients/{clientId}/proxies/{billingAgentProxyId}
{
  "proxyMaxRequestsPerMinute": 30,
  "proxyMaxRequestsPerDay": 500
}

Each proxy enforces its own budget. A research agent that hits its daily cap doesn’t affect the billing agent at all — their limits are completely independent of each other.
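The independence of the two budgets can be illustrated with a small fixed-window counter per proxy. This is a sketch only: the text does not say which limiting algorithm RequestRocket uses, so fixed windows are an assumption, and `FixedWindow` and `ProxyBudget` are hypothetical names.

```python
class FixedWindow:
    """Counts requests inside fixed, non-overlapping time windows."""

    def __init__(self, limit: int, window_seconds: int) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self.window_id = None
        self.count = 0

    def _roll(self, now: float) -> None:
        # Reset the counter when the clock moves into a new window.
        window_id = int(now // self.window_seconds)
        if window_id != self.window_id:
            self.window_id = window_id
            self.count = 0

    def has_room(self, now: float) -> bool:
        self._roll(now)
        return self.count < self.limit

    def record(self, now: float) -> None:
        self._roll(now)
        self.count += 1


class ProxyBudget:
    """Per-proxy budget: a request must fit both the minute and the day window."""

    def __init__(self, per_minute: int, per_day: int) -> None:
        self.minute = FixedWindow(per_minute, 60)
        self.day = FixedWindow(per_day, 86400)

    def allow(self, now: float) -> bool:
        if self.minute.has_room(now) and self.day.has_room(now):
            self.minute.record(now)
            self.day.record(now)
            return True
        return False


research = ProxyBudget(per_minute=60, per_day=2000)
billing = ProxyBudget(per_minute=30, per_day=500)

# Exhaust the research agent's minute budget; billing is unaffected.
research_results = [research.allow(now=0.0) for _ in range(61)]
billing_ok = billing.allow(now=0.0)
```

Each `ProxyBudget` owns its counters, so the 61st research request is refused while the billing proxy's first request in the same second still goes through.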

Scenario: protect high-value endpoints from retry noise

For proxies where upstream 429 responses should be handled with patience rather than speed, set conservative retry settings for your asynchronous call handling:

PUT /clients/{clientId}/proxies/{highValueProxyId}
{
  "proxyMaxRetries": 1,
  "proxyMaxBackoff": 30
}

One retry, with up to 30 seconds of backoff. When a request exceeds the upstream's per-minute quota, the gateway waits out the backoff window before retrying, rather than hammering the endpoint and deepening the quota hole.

Next steps

Rate limiting configuration is available in every RequestRocket account. Read the RequestRocket configuration documentation for full limit parameters, or start for free.
