Retry
4 minute read
v1.3
, Envoy Gateway supports HTTPRoute Retries(GEP-1731),
this setting in the core Gateway API takes precedence over the BackendTrafficPolicy configuration.A retry setting specifies the maximum number of times an Envoy proxy attempts to connect to a service if the initial call fails. Retries can enhance service availability and application performance by making sure that calls don’t fail permanently because of transient problems such as a temporarily overloaded service or network. The interval between retries prevents the called service from being overwhelmed with requests.
Envoy Gateway supports the following retry settings:
- NumRetries: is the number of retries to be attempted. Defaults to 2.
- RetryOn: specifies the retry trigger condition.
- PerRetryPolicy: is the retry policy to be applied per retry attempt.
Envoy Gateway introduces a new CRD called BackendTrafficPolicy that allows the user to describe their desired retry settings. This instantiated resource can be linked to a Gateway, HTTPRoute or GRPCRoute resource.
Note: There are distinct circuit breaker counters for each BackendReference
in an xRoute
rule. Even if a BackendTrafficPolicy
targets a Gateway
, each BackendReference
in that gateway still has separate circuit breaker counter.
Prerequisites
Follow the steps below to install Envoy Gateway and the example manifest. Before proceeding, you should be able to query the example backend using HTTP.
Expand for instructions
Install the Gateway API CRDs and Envoy Gateway using Helm:
helm install eg oci://docker.io/envoyproxy/gateway-helm --version v1.3.0 -n envoy-gateway-system --create-namespace
Install the GatewayClass, Gateway, HTTPRoute and example app:
kubectl apply -f https://github.com/envoyproxy/gateway/releases/download/v1.3.0/quickstart.yaml -n default
Verify Connectivity:
You can also test the same functionality by sending traffic to the External IP. To get the external IP of the Envoy service, run:
export GATEWAY_HOST=$(kubectl get gateway/eg -o jsonpath='{.status.addresses[0].value}')
Note: In certain environments, the load balancer may be exposed using a hostname, instead of an IP address. If so, replace
ip
in the above command withhostname
.Curl the example app through Envoy proxy:
curl --verbose --header "Host: www.example.com" http://$GATEWAY_HOST/get
Get the name of the Envoy service created by the example Gateway:
export ENVOY_SERVICE=$(kubectl get svc -n envoy-gateway-system --selector=gateway.envoyproxy.io/owning-gateway-namespace=default,gateway.envoyproxy.io/owning-gateway-name=eg -o jsonpath='{.items[0].metadata.name}')
Port forward to the Envoy service:
kubectl -n envoy-gateway-system port-forward service/${ENVOY_SERVICE} 8888:80 &
Curl the example app through Envoy proxy:
curl --verbose --header "Host: www.example.com" http://localhost:8888/get
Test and customize retry settings
Before applying a BackendTrafficPolicy
with retry setting to a route, let’s test the default retry settings.
curl -v -H "Host: www.example.com" "http://${GATEWAY_HOST}/status/500"
It will return 500
response immediately.
* Trying 172.18.255.200:80...
* Connected to 172.18.255.200 (172.18.255.200) port 80
> GET /status/500 HTTP/1.1
> Host: www.example.com
> User-Agent: curl/8.4.0
> Accept: */*
>
< HTTP/1.1 500 Internal Server Error
< date: Fri, 01 Mar 2024 15:12:55 GMT
< content-length: 0
<
* Connection #0 to host 172.18.255.200 left intact
Let’s create a BackendTrafficPolicy
with a retry setting.
The request will be retried 5 times with a 100ms base interval and a 10s maximum interval.
cat <<EOF | kubectl apply -f -
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: BackendTrafficPolicy
metadata:
name: retry-for-route
spec:
targetRefs:
- group: gateway.networking.k8s.io
kind: HTTPRoute
name: backend
retry:
numRetries: 5
perRetry:
backOff:
baseInterval: 100ms
maxInterval: 10s
timeout: 250ms
retryOn:
httpStatusCodes:
- 500
triggers:
- connect-failure
- retriable-status-codes
EOF
Save and apply the following resource to your cluster:
---
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: BackendTrafficPolicy
metadata:
name: retry-for-route
spec:
targetRefs:
- group: gateway.networking.k8s.io
kind: HTTPRoute
name: backend
retry:
numRetries: 5
perRetry:
backOff:
baseInterval: 100ms
maxInterval: 10s
timeout: 250ms
retryOn:
httpStatusCodes:
- 500
triggers:
- connect-failure
- retriable-status-codes
Execute the test again.
curl -v -H "Host: www.example.com" "http://${GATEWAY_HOST}/status/500"
It will return 500
response after a few while.
* Trying 172.18.255.200:80...
* Connected to 172.18.255.200 (172.18.255.200) port 80
> GET /status/500 HTTP/1.1
> Host: www.example.com
> User-Agent: curl/8.4.0
> Accept: */*
>
< HTTP/1.1 500 Internal Server Error
< date: Fri, 01 Mar 2024 15:15:53 GMT
< content-length: 0
<
* Connection #0 to host 172.18.255.200 left intact
Let’s check the stats to see the retry behavior.
egctl x stats envoy-proxy -n envoy-gateway-system -l gateway.envoyproxy.io/owning-gateway-name=eg,gateway.envoyproxy.io/owning-gateway-namespace=default | grep "envoy_cluster_upstream_rq_retry{envoy_cluster_name=\"httproute/default/backend/rule/0\"}"
You will expect to see the stats.
envoy_cluster_upstream_rq_retry{envoy_cluster_name="httproute/default/backend/rule/0"} 5
Feedback
Was this page helpful?
Glad to hear it! Please tell us how we can improve.
Sorry to hear that. Please tell us how we can improve.