Accelerating TLS Handshakes using Private Key Provider in Envoy
8 minute read
TLS operations can be accelerated or the private key can be protected using specialized hardware. This can be leveraged in Envoy using Envoy Private Key Provider is added to Envoy.
Today, there are two private key providers implemented in Envoy as contrib extensions:
Both of them are used to accelerate the TLS handshake through the hardware capabilities.
This task will walk you through the steps required to configure TLS Termination mode for TCP traffic while also using the Envoy Private Key Provider to accelerate the TLS handshake by leveraging QAT and the HW accelerator available on Intel SPR/EMR Xeon server platforms.
Prerequisites
For QAT
Install Linux kernel 5.17 or similar
Ensure the node has QAT devices by checking the QAT physical function devices presented. Supported Devices
echo `(lspci -d 8086:4940 && lspci -d 8086:4941 && lspci -d 8086:4942 && lspci -d 8086:4943 && lspci -d 8086:4946 && lspci -d 8086:4947) | wc -l` supported devices found.
Enable IOMMU from BIOS
Enable IOMMU for Linux kernel
Figure out the QAT VF device id
lspci -d 8086:4941 && lspci -d 8086:4943 && lspci -d 8086:4947
Attach the QAT device to vfio-pci through kernel parameter by the device id gotten from previous command.
cat /etc/default/grub: GRUB_CMDLINE_LINUX="intel_iommu=on vfio-pci.ids=[QAT device id]" update-grub reboot
Once the system is rebooted, check if the IOMMU has been enabled via the following command:
dmesg| grep IOMMU [ 1.528237] DMAR: IOMMU enabled
Enable virtual function devices for QAT device
modprobe vfio_pci rmmod qat_4xxx modprobe qat_4xxx qat_device=$(lspci -D -d :[QAT device id] | awk '{print $1}') for i in $qat_device; do echo 16|sudo tee /sys/bus/pci/devices/$i/sriov_numvfs; done chmod a+rw /dev/vfio/*
Increase the container runtime memory lock limit (using the containerd as example here)
mkdir /etc/systemd/system/containerd.service.d cat <<EOF >>/etc/systemd/system/containerd.service.d/memlock.conf [Service] LimitMEMLOCK=134217728 EOF
Restart the container runtime (for containerd, CRIO has similar concept)
systemctl daemon-reload systemctl restart containerd
Install Intel® QAT Device Plugin for Kubernetes
kubectl apply -k 'https://github.com/intel/intel-device-plugins-for-kubernetes/deployments/qat_plugin?ref=main'
Verification of the plugin deployment and detection of QAT hardware can be confirmed by examining the resource allocations on the nodes:
kubectl get node -o yaml| grep qat.intel.com
For CryptoMB:
It required the node with 3rd generation Intel Xeon Scalable processor server processors, or later.
For kubernetes Cluster, if not all nodes that support Intel® AVX-512 in Kubernetes cluster, you need to add some labels to divide these two kinds of nodes manually or using NFD.
kubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/default?ref=v0.15.1
Checking the available nodes with required cpu instructions:
Check the node labels if using NFD:
kubectl get nodes -l feature.node.kubernetes.io/cpu-cpuid.AVX512F,feature.node.kubernetes.io/cpu-cpuid.AVX512DQ,feature.node.kubernetes.io/cpu-cpuid.AVX512BW,feature.node.kubernetes.io/cpu-cpuid.AVX512VBMI2,feature.node.kubernetes.io/cpu-cpuid.AVX512IFMA
Check CPUIDS manually on the node if without using NFD:
cat /proc/cpuinfo |grep avx512f|grep avx512dq|grep avx512bw|grep avx512_vbmi2|grep avx512ifma
Installation
Follow the steps from the Quickstart to install Envoy Gateway.
Enable the EnvoyPatchPolicy feature, which will allow us to directly configure the Private Key Provider Envoy Filter, since Envoy Gateway does not directly expose this functionality.
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
name: envoy-gateway-config
namespace: envoy-gateway-system
data:
envoy-gateway.yaml: |
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: EnvoyGateway
gateway:
controllerName: gateway.envoyproxy.io/gatewayclass-controller
extensionApis:
enableEnvoyPatchPolicy: true
EOF
Save and apply the following resource to your cluster:
---
apiVersion: v1
kind: ConfigMap
metadata:
name: envoy-gateway-config
namespace: envoy-gateway-system
data:
envoy-gateway.yaml: |
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: EnvoyGateway
gateway:
controllerName: gateway.envoyproxy.io/gatewayclass-controller
extensionApis:
enableEnvoyPatchPolicy: true
After updating the
ConfigMap
, you will need to restart theenvoy-gateway
deployment so the configuration kicks inkubectl rollout restart deployment envoy-gateway -n envoy-gateway-system
Create gateway for TLS termination
Follow the instructions in TLS Termination for TCP to setup a TCP gateway to terminate the TLS connection.
Update GatewayClass for using the envoyproxy image with contrib extensions and requests required resources.
cat <<EOF | kubectl apply -f -
apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
name: eg
spec:
controllerName: gateway.envoyproxy.io/gatewayclass-controller
parametersRef:
group: gateway.envoyproxy.io
kind: EnvoyProxy
name: custom-proxy-config
namespace: envoy-gateway-system
EOF
Save and apply the following resource to your cluster:
---
apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
name: eg
spec:
controllerName: gateway.envoyproxy.io/gatewayclass-controller
parametersRef:
group: gateway.envoyproxy.io
kind: EnvoyProxy
name: custom-proxy-config
namespace: envoy-gateway-system
Change EnvoyProxy configuration for QAT
Using the envoyproxy image with contrib extensions and add qat resources requesting, ensure the k8s scheduler find out a machine with required resource.
cat <<EOF | kubectl apply -f -
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: EnvoyProxy
metadata:
name: custom-proxy-config
namespace: envoy-gateway-system
spec:
concurrency: 1
provider:
type: Kubernetes
kubernetes:
envoyService:
type: NodePort
envoyDeployment:
container:
image: envoyproxy/envoy-contrib-dev:latest
resources:
requests:
cpu: 1000m
memory: 4096Mi
qat.intel.com/cy: '1'
limits:
cpu: 1000m
memory: 4096Mi
qat.intel.com/cy: '1'
EOF
Save and apply the following resource to your cluster:
---
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: EnvoyProxy
metadata:
name: custom-proxy-config
namespace: envoy-gateway-system
spec:
concurrency: 1
provider:
type: Kubernetes
kubernetes:
envoyService:
type: NodePort
envoyDeployment:
container:
image: envoyproxy/envoy-contrib-dev:latest
resources:
requests:
cpu: 1000m
memory: 4096Mi
qat.intel.com/cy: '1'
limits:
cpu: 1000m
memory: 4096Mi
qat.intel.com/cy: '1'
Change EnvoyProxy configuration for CryptoMB
Using the envoyproxy image with contrib extensions and add the node affinity to scheduling the Envoy Gateway pod on the machine with required CPU instructions.
cat <<EOF | kubectl apply -f -
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: EnvoyProxy
metadata:
name: custom-proxy-config
namespace: envoy-gateway-system
spec:
concurrency: 1
provider:
type: Kubernetes
kubernetes:
envoyService:
type: NodePort
envoyDeployment:
container:
image: envoyproxy/envoy-contrib-dev:latest
resources:
requests:
cpu: 1000m
memory: 4096Mi
limits:
cpu: 1000m
memory: 4096Mi
pod:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: feature.node.kubernetes.io/cpu-cpuid.AVX512F
operator: Exists
- key: feature.node.kubernetes.io/cpu-cpuid.AVX512DQ
operator: Exists
- key: feature.node.kubernetes.io/cpu-cpuid.AVX512BW
operator: Exists
- key: feature.node.kubernetes.io/cpu-cpuid.AVX512IFMA
operator: Exists
- key: feature.node.kubernetes.io/cpu-cpuid.AVX512VBMI2
operator: Exists
EOF
Save and apply the following resource to your cluster:
---
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: EnvoyProxy
metadata:
name: custom-proxy-config
namespace: envoy-gateway-system
spec:
concurrency: 1
provider:
type: Kubernetes
kubernetes:
envoyService:
type: NodePort
envoyDeployment:
container:
image: envoyproxy/envoy-contrib-dev:latest
resources:
requests:
cpu: 1000m
memory: 4096Mi
limits:
cpu: 1000m
memory: 4096Mi
pod:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: feature.node.kubernetes.io/cpu-cpuid.AVX512F
operator: Exists
- key: feature.node.kubernetes.io/cpu-cpuid.AVX512DQ
operator: Exists
- key: feature.node.kubernetes.io/cpu-cpuid.AVX512BW
operator: Exists
- key: feature.node.kubernetes.io/cpu-cpuid.AVX512IFMA
operator: Exists
- key: feature.node.kubernetes.io/cpu-cpuid.AVX512VBMI2
operator: Exists
Or using preferredDuringSchedulingIgnoredDuringExecution
for best effort scheduling, or not doing any node affinity, just doing the random scheduling. The CryptoMB private key provider supports software fallback if the required CPU instructions aren’t here.
Apply EnvoyPatchPolicy to enable private key provider
Benchmark before enabling private key provider
First follow the instructions in TLS Termination for TCP to do the functionality test.
Ensure the cpu frequency governor set as performance
.
export NUM_CPUS=`lscpu | grep "^CPU(s):"|awk '{print $2}'`
for i in `seq 0 1 $NUM_CPUS`; do sudo cpufreq-set -c $i -g performance; done
Using the nodeport as the example, fetch the node port from envoy gateway service.
export ENVOY_SERVICE=$(kubectl get svc -n envoy-gateway-system --selector=gateway.envoyproxy.io/owning-gateway-namespace=default,gateway.envoyproxy.io/owning-gateway-name=eg -o jsonpath='{.items[0].metadata.name}')
export NODE_PORT=$(kubectl -n envoy-gateway-system get svc/$ENVOY_SERVICE -o jsonpath='{.spec.ports[0].nodePort}')
echo "127.0.0.1 www.example.com" >> /etc/hosts
Benchmark the gateway with fortio.
fortio load -c 10 -k -qps 0 -t 30s -keepalive=false https://www.example.com:${NODE_PORT}
For QAT
cat <<EOF | kubectl apply -f -
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: EnvoyPatchPolicy
metadata:
name: key-provider-patch-policy
namespace: default
spec:
targetRef:
group: gateway.networking.k8s.io
kind: Gateway
name: eg
type: JSONPatch
jsonPatches:
- type: "type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.Secret"
name: default/example-cert
operation:
op: add
path: "/tls_certificate/private_key_provider"
value:
provider_name: qat
typed_config:
"@type": "type.googleapis.com/envoy.extensions.private_key_providers.qat.v3alpha.QatPrivateKeyMethodConfig"
private_key:
inline_string: |
abcd
poll_delay: 0.001s
fallback: true
- type: "type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.Secret"
name: default/example-cert
operation:
op: copy
from: "/tls_certificate/private_key"
path: "/tls_certificate/private_key_provider/typed_config/private_key"
EOF
Save and apply the following resource to your cluster:
---
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: EnvoyPatchPolicy
metadata:
name: key-provider-patch-policy
namespace: default
spec:
targetRef:
group: gateway.networking.k8s.io
kind: Gateway
name: eg
type: JSONPatch
jsonPatches:
- type: "type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.Secret"
name: default/example-cert
operation:
op: add
path: "/tls_certificate/private_key_provider"
value:
provider_name: qat
typed_config:
"@type": "type.googleapis.com/envoy.extensions.private_key_providers.qat.v3alpha.QatPrivateKeyMethodConfig"
private_key:
inline_string: |
abcd
poll_delay: 0.001s
fallback: true
- type: "type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.Secret"
name: default/example-cert
operation:
op: copy
from: "/tls_certificate/private_key"
path: "/tls_certificate/private_key_provider/typed_config/private_key"
For CryptoMB
cat <<EOF | kubectl apply -f -
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: EnvoyPatchPolicy
metadata:
name: key-provider-patch-policy
namespace: default
spec:
targetRef:
group: gateway.networking.k8s.io
kind: Gateway
name: eg
type: JSONPatch
jsonPatches:
- type: "type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.Secret"
name: default/example-cert
operation:
op: add
path: "/tls_certificate/private_key_provider"
value:
provider_name: cryptomb
typed_config:
"@type": "type.googleapis.com/envoy.extensions.private_key_providers.cryptomb.v3alpha.CryptoMbPrivateKeyMethodConfig"
private_key:
inline_string: |
abcd
poll_delay: 0.001s
fallback: true
- type: "type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.Secret"
name: default/example-cert
operation:
op: copy
from: "/tls_certificate/private_key"
path: "/tls_certificate/private_key_provider/typed_config/private_key"
EOF
Save and apply the following resource to your cluster:
---
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: EnvoyPatchPolicy
metadata:
name: key-provider-patch-policy
namespace: default
spec:
targetRef:
group: gateway.networking.k8s.io
kind: Gateway
name: eg
type: JSONPatch
jsonPatches:
- type: "type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.Secret"
name: default/example-cert
operation:
op: add
path: "/tls_certificate/private_key_provider"
value:
provider_name: cryptomb
typed_config:
"@type": "type.googleapis.com/envoy.extensions.private_key_providers.cryptomb.v3alpha.CryptoMbPrivateKeyMethodConfig"
private_key:
inline_string: |
abcd
poll_delay: 0.001s
fallback: true
- type: "type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.Secret"
name: default/example-cert
operation:
op: copy
from: "/tls_certificate/private_key"
path: "/tls_certificate/private_key_provider/typed_config/private_key"
Benchmark after enabling private key provider
First follow the instructions in TLS Termination for TCP to do the functionality test again.
Benchmark the gateway with fortio.
fortio load -c 64 -k -qps 0 -t 30s -keepalive=false https://www.example.com:${NODE_PORT}
You will see a performance boost after private key provider enabled. For example, you will get results as below.
Without private key provider:
All done 43069 calls (plus 10 warmup) 6.966 ms avg, 1435.4 qps
With CryptoMB private key provider, the QPS is over 2 times than without private key provider.
All done 93983 calls (plus 128 warmup) 40.880 ms avg, 3130.5 qps
With QAT private key provider, the QPS is over 3 times than without private key provider
All done 134746 calls (plus 128 warmup) 28.505 ms avg, 4489.6 qps
Feedback
Was this page helpful?
Glad to hear it! Please tell us how we can improve.
Sorry to hear that. Please tell us how we can improve.