News KrakenD Enterprise v2.6 released with OpenTelemetry, FIPS-140, gRPC server and more

Enterprise Documentation

Recent changes

You are viewing a previous version of KrakenD Enterprise Edition (v2.1) , go to the latest version

Service rate limit (stateless)

Document updated on Oct 26, 2021

The service rate limit feature allows you to set the maximum requests per second a user or group of users can do to KrakenD and works analogously to the endpoint rate limit. There are two different strategies to set limits that you can use, simultaneously or individually:

  • Service rate-limit: Defines the rate-limit that all users of your API can do together, sharing the same counter. For instance, you might want to limit the interaction from users to KrakenD to 10,000 requests/second to avoid a possible DDoS propagating to your backend services.
  • Client rate-limit: Rate-limits what an individual user can do against all endpoints. For instance, a personal JWT token can make 100 requests/second to the API.

Both types keep in-memory an updated Token Bucket expressing the remaining requests for a user or group of users.

For additional types of rate-limiting see the Traffic management overview.

Service rate-limiting (max_rate)

The service rate limit acts on the number of simultaneous transactions KrakenD allows to process. This type of limit protects the service from all customers. In addition, these limits mitigate abusive actions such as rapidly writing content, aggressive polling, or excessive API calls.

It consumes an insignificant amount of memory as it only needs one counter for the whole system.

When the users connected to an endpoint together exceed the max_rate, KrakenD starts to reject connections with a status code 503 Service Unavailable and enables a Spike Arrest policy

Service client rate-limiting (client_max_rate)

The client or user rate limit applies to an individual user to all endpoints. Each endpoint can additionally have sub-limit rates, but all users are subject to the same rate.

The client_max_rate tracks one counter per different client.

When a single user connected to an endpoint exceeds their client_max_rate, KrakenD starts to reject connections with a status code 429 Too Many Requests and enables a Spike Arrest policy

Service rate-limiting configuration

The configuration allows you to use both types of rate limits at the same time:

{
    "version": 3,
    "extra_config": {
      "qos/ratelimit/service": {
          "max_rate": 50,
          "client_max_rate": 5,
          "strategy": "ip"
        }
    }
}
Fields of "qos/ratelimit/router"
* required fields
Minimum configuration needs any of: max_rate , or client_max_rate
capacity

integer
Number of tokens you can store in the Token Bucket. Translates into the maximum requests this endpoint will accept for all users at a given time.
client_capacity

integer
Number of tokens you can store in the Token Bucket for each individual user. Traduces into maximum concurrent requests this endpoint will accept for the connected user. The client is defined by the strategy field. The client_max_rate keeps a counter for every client and endpoint.
client_max_rate

number
Number of tokens added per second to the Token Bucket for each individual user (user quota). Use decimals for per-hour and per-minute strategies. The remaining tokens for a user are the requests per second a specific user can do. The client is defined by strategy. Instead of counting all the connections to the endpoint as the option above, the client_max_rate keeps a counter for every client and endpoint. Keep in mind that every KrakenD instance keeps its counters in memory for every single client.
key

string
Available when using client_max_rate. Sets the header containing the user identification (e.g., Authorization) or IP (e.g.,X-Original-Forwarded-For). When the header contains a list of space-separated IPs, it will take the IP from the client that hit the first trusted proxy.
Example: "X-TOKEN"
max_rate

number
Sets the number of tokens added per second to the Token Bucket. Use decimals for per-hour and per-minute strategies. The remaining tokens in the bucket are the maximum requests the endpoint can handle at once. The absence of max_rate in the configuration or 0 is the equivalent to no limitation.
strategy

string
Available when using client_max_rate. Sets the strategy you will use to set client counters. Choose ip when the restrictions apply to the client’s IP address, or set it to header when there is a header that identifies a user uniquely. That header must be defined with the key entry.
Possible values are: "ip" , "header"

A note on strategies:

  • "strategy": "ip" When the restrictions apply to the client’s IP, and every IP is considered to be a different user. Optionally a key can be used to extract the IP from a custom header:
    • E.g, set "key": "X-Original-Forwarded-For" to extract the IP from a header containing a list of space-separated IPs (will take the first one).
  • "strategy": "header" When the criteria for identifying a user comes from the value of a key inside the header. With this strategy, the key must also be present.
    • E.g., set "key": "X-TOKEN" to use the X-TOKEN header as the unique user identifier.
Scarf

Unresolved issues?

The documentation is only a piece of the help you can get! Whether you are looking for Open Source or Enterprise support, see more support channels that can help you.