Rate Limit Design

Overview

Rate limiting is a feature that lets the user cap the number of incoming requests at a predefined value, based on attributes within the traffic flow.

Here are some reasons why a user may want to implement rate limits:

To prevent malicious activity such as DDoS attacks.
To prevent applications and their resources (such as a database) from getting overloaded.
To create API limits based on user entitlements.

Scope Types

The rate limit type here describes the scope of rate limits.

Global - The rate limit is shared across all instances of Envoy proxy where it is applied. For example, if the data plane has 2 replicas of Envoy running and the rate limit is 10 requests/second, the limit is reached if 5 requests pass through the first replica and 5 requests pass through the second replica within the same second.
Local - The rate limit is specific to each instance/replica of Envoy running. Note: this is not part of the initial design and will be added as a future enhancement.

Match Types

Rate limit a specific traffic flow

Here is an example of a rate limit implemented by the application developer to limit a specific user by matching on a custom x-user-id header with the value one.

apiVersion: gateway.envoyproxy.io/v1alpha1
kind: BackendTrafficPolicy
metadata:
  name: ratelimit-specific-user
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: example
  rateLimit:
    type: Global
    global:
      rules:
      - clientSelectors:
        - headers:
          - name: x-user-id
            value: one
        limit:
          requests: 10
          unit: Hour
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: example
spec:
  parentRefs:
  - name: eg
  hostnames:
  - www.example.com
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /foo
    filters:
    - type: ExtensionRef
      extensionRef:
        group: gateway.envoyproxy.io
        kind: RateLimitFilter
        name: ratelimit-specific-user
    backendRefs:
    - name: backend
      port: 3000

Rate limit all traffic flows

Here is an example of a rate limit implemented by the application developer that limits the total requests made to a specific route, to safeguard the health of internal application components. In this case, no header match is specified, and the rate limit is applied to all traffic flows accepted by this HTTPRoute.

apiVersion: gateway.envoyproxy.io/v1alpha1
kind: BackendTrafficPolicy
metadata:
  name: ratelimit-all-requests
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: example
  rateLimit:
    type: Global
    global:
      rules:
      - limit:
          requests: 1000
          unit: Second
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: example
spec:
  parentRefs:
  - name: eg
  hostnames:
  - www.example.com
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /foo
    filters:
    - type: ExtensionRef
      extensionRef:
        group: gateway.envoyproxy.io
        kind: RateLimitFilter
        name: ratelimit-all-requests
    backendRefs:
    - name: backend
      port: 3000

Rate limit per distinct value

Here is an example of a rate limit implemented by the application developer to limit each unique user by matching on a custom x-user-id header. Here, user A (recognised from the traffic flow using the header x-user-id and value a) will be rate limited at 10 requests/hour, and so will user B (recognised from the traffic flow using the header x-user-id and value b). Each distinct header value gets its own counter.

apiVersion: gateway.envoyproxy.io/v1alpha1
kind: BackendTrafficPolicy
metadata:
  name: ratelimit-per-user
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: example
  rateLimit:
    type: Global
    global:
      rules:
      - clientSelectors:
        - headers:
          - type: Distinct
            name: x-user-id
        limit:
          requests: 10
          unit: Hour
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: example
spec:
  parentRefs:
  - name: eg
  hostnames:
  - www.example.com
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /foo
    filters:
    - type: ExtensionRef
      extensionRef:
        group: gateway.envoyproxy.io
        kind: RateLimitFilter
        name: ratelimit-per-user
    backendRefs:
    - name: backend
      port: 3000

Rate limit per source IP

Here is an example of a rate limit implemented by the application developer that limits the total requests made to a specific route by matching on source IP. In this case, requests from x.x.x.x will be rate limited at 10 requests/hour.

apiVersion: gateway.envoyproxy.io/v1alpha1
kind: BackendTrafficPolicy
metadata:
  name: ratelimit-per-ip
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: example
  rateLimit:
    type: Global
    global:
      rules:
      - clientSelectors:
        - sourceIP: x.x.x.x/32
        limit:
          requests: 10
          unit: Hour
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: example
spec:
  parentRefs:
  - name: eg
  hostnames:
  - www.example.com
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /foo
    filters:
    - type: ExtensionRef
      extensionRef:
        group: gateway.envoyproxy.io
        kind: RateLimitFilter
        name: ratelimit-per-ip
    backendRefs:
    - name: backend
      port: 3000

Rate limit based on JWT claims

Here is an example of a rate limit implemented by the application developer that limits the total requests made to a specific route by matching on a JWT claim. The claim is first copied into a request header (claimToHeaders), and the rate limit rule then matches on that header. In this case, requests with JWT claim information of {"name":"John Doe"} will be rate limited at 10 requests/hour.

apiVersion: gateway.envoyproxy.io/v1alpha1
kind: SecurityPolicy
metadata:
  name: jwt-example
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: example
  jwt:
    providers:
      - name: example
        remoteJWKS:
          uri: https://raw.githubusercontent.com/envoyproxy/gateway/main/examples/kubernetes/jwt/jwks.json
        claimToHeaders:
        - claim: name
          header: custom-request-header
---
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: BackendTrafficPolicy
metadata:
  name: ratelimit-specific-user
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: example
  rateLimit:
    type: Global
    global:
      rules:
      - clientSelectors:
        - headers:
          - name: custom-request-header
            value: John Doe
        limit:
          requests: 10
          unit: Hour
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: example
spec:
  parentRefs:
  - name: eg
  hostnames:
  - "www.example.com"
  rules:
  - backendRefs:
    - group: ""
      kind: Service
      name: backend
      port: 3000
      weight: 1
    matches:
    - path:
        type: PathPrefix
        value: /foo

Multiple RateLimitFilters, rules and clientSelectors

Users can create multiple RateLimitFilters and apply them to the same HTTPRoute. In that case, each RateLimitFilter is applied to the route and is matched (and limited) independently of the others.
Rate limits are applied for each RateLimitFilter rule when ALL the conditions under clientSelectors hold true.

Here is an example highlighting this:

apiVersion: gateway.envoyproxy.io/v1alpha1
kind: BackendTrafficPolicy
metadata:
  name: ratelimit-all-safeguard-app
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: example
  rateLimit:
    type: Global
    global:
      rules:
      - limit:
          requests: 100
          unit: Second
---
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: BackendTrafficPolicy
metadata:
  name: ratelimit-per-user
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: example
  rateLimit:
    type: Global
    global:
      rules:
      - clientSelectors:
        - headers:
          - type: Distinct
            name: x-user-id
        limit:
          requests: 100
          unit: Hour
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: example
spec:
  parentRefs:
  - name: eg
  hostnames:
  - www.example.com
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /foo
    filters:
    - type: ExtensionRef
      extensionRef:
        group: gateway.envoyproxy.io
        kind: RateLimitFilter
        name: ratelimit-per-user
    - type: ExtensionRef
      extensionRef:
        group: gateway.envoyproxy.io
        kind: RateLimitFilter
        name: ratelimit-all-safeguard-app
    backendRefs:
    - name: backend
      port: 3000

The user has created two RateLimitFilters and attached them to an HTTPRoute. The first (ratelimit-all-safeguard-app) ensures that the backend does not get overwhelmed with requests: any excess requests are rate limited irrespective of the attributes within the traffic flow. The second (ratelimit-per-user) rate limits each distinct client, differentiated using the x-user-id header, so that no single client makes excessive requests to the backend.
If user baz (identified by the header and value x-user-id: baz) sends 90 requests within the first second, and user bar sends 11 requests during that same second, then bar's 11th request is the 101st request within that second. The rule defined in ratelimit-all-safeguard-app is activated, and Envoy Gateway rate limits that request from bar (and any other request sent within that second). After 1 second, the rate limit counter associated with the ratelimit-all-safeguard-app rule is reset and evaluated again.
If user bar goes on to send 90 more requests within the hour, bringing bar's total request count to 101, the rate limit rule defined in ratelimit-per-user is activated, and bar's requests are rate limited again until the hour interval ends.
Within the same hour, if baz sends 11 more requests, bringing baz's total request count to 101, the rate limit rule defined in ratelimit-per-user is activated for baz as well, and baz's requests are also rate limited until the hour interval ends.
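
As stated at the start of this section, a rule's limit applies only when all the conditions under clientSelectors hold true. The following is a minimal sketch of a single rule with two header conditions. The x-org-id header, its value, and the policy name are hypothetical and chosen only for illustration; the field layout simply mirrors the examples above.

```yaml
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: BackendTrafficPolicy
metadata:
  name: ratelimit-user-in-org   # hypothetical name
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: example
  rateLimit:
    type: Global
    global:
      rules:
      - clientSelectors:
        - headers:
          # Both header conditions must match for the limit below to apply.
          - name: x-user-id
            value: one
          - name: x-org-id      # hypothetical header, for illustration only
            value: acme
        limit:
          requests: 10
          unit: Hour
```

Under the semantics described in this section, a request carrying x-user-id: one but a different (or missing) x-org-id would not be counted against this limit.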

Design Decisions

The initial design uses an extension filter to apply the rate limit functionality on a specific HTTPRoute. This was preferred over the PolicyAttachment extension mechanism because it is unclear whether rate limits will need to be enforced or overridden by the platform administrator.
The RateLimitFilter can only be applied as a filter to an HTTPRouteRule, which applies it across all backends within an HTTPRoute; it cannot be applied as a filter within an HTTPBackendRef for a specific backend.
The HTTPRoute API has a matches field within each rule to select a specific traffic flow to be routed to the destination backend. The RateLimitFilter API, which can be attached to an HTTPRoute via an extensionRef filter, also has a clientSelectors field within each rule to select attributes within the traffic flow to rate limit specific clients. The two levels of selectors/matches allow for flexibility and aim to hold match information specific to their use, allowing the author/owner of each configuration to be different. They also allow the clientSelectors field within the RateLimitFilter to be enhanced in the future with other matchable attributes, such as IP subnet, that are not relevant in the HTTPRoute API.

Implementation Details

Global Rate limiting

Global rate limiting in Envoy Proxy can be achieved using the following:
- Actions can be configured per xDS Route.
- If the match criteria defined within these actions is met for a specific HTTP Request, a set of key value pairs called descriptors defined within the above actions is sent to a remote rate limit service, whose configuration (such as the URL for the rate limit service) is defined using a rate limit filter.
- Based on information received by the rate limit service and its programmed configuration, a decision is computed, whether to rate limit the HTTP Request or not, and is sent back to Envoy, which enforces this decision on the data plane.
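
To make the mechanism above more concrete, here is a hedged sketch of what the Envoy-side configuration could look like for the x-user-id: one example. It is not the output of Envoy Gateway's translator: the domain, the cluster names, and the descriptor value are assumptions made for illustration. The first fragment shows rate limit actions attached to a route, and the second shows the HTTP rate limit filter that points at the remote rate limit service.

```yaml
# Fragment 1: route-level actions. When the header matches, Envoy sends the
# descriptor entry ("header_match", "user-one") to the rate limit service.
routes:
- match:
    prefix: /foo
  route:
    cluster: backend                   # assumed cluster name
    rate_limits:
    - actions:
      - header_value_match:
          descriptor_value: user-one   # assumed descriptor value
          headers:
          - name: x-user-id
            string_match:
              exact: one
---
# Fragment 2: HTTP filter that tells Envoy where the rate limit service lives.
http_filters:
- name: envoy.filters.http.ratelimit
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.http.ratelimit.v3.RateLimit
    domain: example-domain             # assumed domain
    rate_limit_service:
      transport_api_version: V3
      grpc_service:
        envoy_grpc:
          cluster_name: rate_limit_cluster   # assumed cluster name
```

For a limit per distinct header value, a request_headers action (with header_name and descriptor_key) could be used instead, so that the header's actual value becomes the descriptor value rather than a fixed string.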

Envoy Gateway will leverage this Envoy Proxy feature as follows:
- The user facing RateLimitFilter API will be translated into rate limit actions as well as rate limit service configuration to implement the desired API intent.
- Envoy Gateway will use the existing reference implementation of the rate limit service.
  - The Infrastructure administrator will need to enable the rate limit service using new settings that will be defined in the EnvoyGateway config API.
- The xDS IR will be enhanced to hold the user facing rate limit intent.
- The xDS Translator will be enhanced to translate the rate limit field within the xDS IR into Rate limit Actions as well as instantiate the rate limit filter.
- A new runner called rate-limit will be added that subscribes to the xDS IR messages and translates them into a new Rate Limit Infra IR, which contains the rate limit service configuration as well as other information needed to deploy the rate limit service.
- The infrastructure service will be enhanced to subscribe to the Rate Limit Infra IR and deploy a provider specific rate limit service runnable entity.
- A Status field within the RateLimitFilter API will be added to reflect whether the specific configuration was programmed correctly in these multiple locations or not.
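
As a rough illustration of the rate limit service configuration mentioned in the list above, the reference rate limit service reads YAML files made up of a domain and a tree of descriptors. The sketch below shows what such a file might look like for the 10 requests/hour x-user-id: one example. The domain and the descriptor key and value are assumptions and are not necessarily what Envoy Gateway generates.

```yaml
domain: example-domain      # assumed; must match the domain set in Envoy's rate limit filter
descriptors:
- key: header_match         # assumed descriptor key sent by Envoy
  value: user-one           # assumed descriptor value sent by Envoy
  rate_limit:
    unit: hour
    requests_per_unit: 10
```

The service looks up the descriptor it receives from Envoy in this tree and, if an entry matches, counts the request against that entry's rate_limit.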

Last modified February 3, 2025: add title for adopters (#5196) (da429be)