Network emulation (netem) CNI-plugin for Kubernetes
Facts
- Git Repo: https://github.com/ERIGrid2/k8s-netem
- Helm Chart: https://github.com/ERIGrid2/charts/tree/master/charts/netem
- State: under testing
Introduction
k8s-netem adds support for traffic control in Kubernetes pods.
It allows the user to configure one or more traffic profiles to impair network traffic between pods in the cluster and between pods and external networks.
The traffic profiles are implemented as a custom resource definition (CRD) which the user can add and modify in the Kubernetes database using standard tools like kubectl or a Kubernetes web interface.
These TrafficProfiles can use a spec.podSelector or CIDRs to match a set of source and destination pods/networks for which the impairment should be configured.
In addition, the impairment can be restricted to a set of UDP or TCP port numbers, Ether-types and IP protocols.
k8s-netem will continuously watch for new or modified TrafficProfiles as well as Pods and update the traffic control configuration appropriately.
The TrafficProfile custom resource is inspired by the Kubernetes NetworkPolicy resource and has been extended to accommodate the traffic control parameters as well as additional filters for Ether-types and IP protocols.
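For illustration, assuming a profile manifest has been saved to a file such as profile.yaml (a hypothetical name), it can be managed with the usual kubectl verbs:

```sh
# Create or update a TrafficProfile from a manifest (file name is hypothetical)
kubectl apply -f profile.yaml

# List the profiles known to the cluster and inspect one of them
kubectl get trafficprofiles
kubectl describe trafficprofiles profile-builtin   # name taken from the Builtin example below

# Adjust impairment parameters in place
kubectl edit trafficprofiles profile-builtin
```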
Features
- Network emulation and rate limiting
- Support for ingress (WIP) and egress traffic
- Requires no modification of existing manifests
- Complex ingress/egress filters inspired by Kubernetes' network policies
- Matches cross-pod flows as well as flows from/to specific external CIDRs
- Matches UDP/TCP ports
- Matches Ether-types
- Matches IP Protocols
- Live filter updates based on podSelectors
- Support for multiple TrafficProfiles per Pod
- Extensible with additional controller types
Employed technologies
- Linux: traffic control (tc), netem, nftables, IPsets
- Kubernetes: custom resource definitions (CRDs), mutating admission webhooks, sidecar containers
Controllers
Currently k8s-netem supports two types of controllers:
Builtin
The builtin TC controller uses iproute2's tc
command to configure Linux's traffic control subsystem by adding queuing disciplines and filters.
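As an illustration, the commands below show roughly what such a configuration can look like; the device name, handles and fwmark value are assumptions for this sketch, not the exact commands issued by the controller:

```sh
# Root prio qdisc as a classification entry point (handles are arbitrary)
tc qdisc add dev eth0 root handle 1: prio

# netem qdisc implementing the impairment, attached to one of the prio bands
tc qdisc add dev eth0 parent 1:3 handle 30: netem delay 200ms 50ms loss 0.5%

# Steer packets carrying firewall mark 1 into the impaired band
tc filter add dev eth0 parent 1: protocol ip handle 1 fw flowid 1:3
```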
Flexe (VTT Network Emulator)
The Flexe controller is software developed by VTT. Like the builtin TC controller it is based on iproute2's tc command, but adds further functionality: it has been extended to run several traffic profiles over time and provides a REST API for changing the qdisc, filter and netem configuration.
Example TrafficProfile for Flexe controller
```yaml
---
apiVersion: k8s-netem.riasc.eu/v1
kind: TrafficProfile
metadata:
  name: profile-delay-jitter-flexe
spec:
  podSelector:
    matchLabels:
      traffic-profile: profile-delay-jitter-flexe
  type: Flexe
  parameters:
    segments:
    - repeat: True
      profiles:
      - name: ethernet
        parameters:
          runTime: 30
          bandwidthUp: 100000
          bandwidthDown: 100000
          delay: 0.25
          delayVariation: 0.25
      - name: 3g
        parameters:
          runTime: 30
          bandwidthUp: 256
          bandwidthDown: 256
          delay: 200
          delayVariation: 50
          loss: 0.5
          duplication: 0.1
          corruption: 0.1
          reorder: 0.2
  egress:
  - to:
    - ipBlock:
        cidr: 1.1.1.1/32
    - podSelector:
        matchLabels:
          component: example
    ports:
    - port: 443
      protocol: TCP
    - port: 53
      protocol: UDP
  - to:
    - ipBlock:
        cidr: 8.8.8.8/32
  - ports:
    - port: 80
      protocol: tcp
```
Example TrafficProfile for Builtin controller
```yaml
---
apiVersion: k8s-netem.riasc.eu/v1
kind: TrafficProfile
metadata:
  name: profile-builtin
spec:
  podSelector:
    matchLabels:
      traffic-profile: builtin
  type: Builtin
  parameters:
    netem:
      delay: 0.2      # seconds
      loss_ratio: 0.2 # in [0, 1]
  egress:
  - to:
    - ipBlock:
        cidr: 1.1.1.1/32
    - podSelector:
        matchLabels:
          component: example
    inetProtos:
    - icmp
```
Example Pod
```yaml
apiVersion: v1
kind: Pod
metadata:
  labels:
    component: example
    traffic-profile: builtin
  name: example-pod-1
spec:
  containers:
  - command:
    - ping
    - 1.1.1.1
    image: nicolaka/netshoot
    name: ping-cloudflare
    securityContext:
      capabilities:
        add:
        - NET_ADMIN
```
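To try the Builtin example end-to-end, the two manifests above can be applied and the effect observed directly in the ping output (the file names are hypothetical):

```sh
kubectl apply -f profile-builtin.yaml
kubectl apply -f example-pod-1.yaml

# With delay: 0.2 (seconds) active, the reported round-trip times
# to 1.1.1.1 should increase by roughly 200 ms
kubectl logs -f example-pod-1 -c ping-cloudflare
```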
Approach
- User creates a new TrafficProfile CR
- User creates one or more Pods which match the podSelector of the TrafficProfile CR
- A mutating admission webhook will inject a sidecar container into the newly created Pods
- The sidecar container will configure the network traffic controller by:
  - Watching for new/modified TrafficProfiles matching the podSelector
  - Watching for new/modified Pods which match the ingress/egress peer podSelectors:
    - New matching Pods will be added to IPsets
    - Previously matching Pods which have been deleted will be removed from the IPsets
  - Configuring the traffic impairment by configuring one or more netem qdiscs and attaching them to their dedicated IPset filters (see the sketch below)
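The ipset commands below sketch the kind of bookkeeping described in the last two steps; the set name is hypothetical and the addresses stand in for pod IPs:

```sh
# Create a set holding the peer pod IPs of one egress rule
ipset create tp-example-egress hash:ip

# A new Pod matching the peer podSelector appeared
ipset add tp-example-egress 10.42.0.15

# A previously matching Pod was deleted
ipset del tp-example-egress 10.42.0.15
```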
Implementation details
Installation
k8s-netem can be deployed using the dedicated k8s-netem Helm chart.
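One way to install it is from a local checkout of the chart repository listed in the Facts section; the release name and namespace below are arbitrary:

```sh
git clone https://github.com/ERIGrid2/charts.git
helm install k8s-netem ./charts/charts/netem --namespace k8s-netem --create-namespace
```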
Custom Resources
k8s-netem defines a new CRD: k8s-netem.riasc.eu/v1/trafficprofiles.
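Whether the CRD has been registered can be checked against the API server:

```sh
# The TrafficProfile resource should show up under its API group
kubectl api-resources --api-group=k8s-netem.riasc.eu
```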
Mutating Admission Webhook
The mutating admission webhook is invoked by the Kubernetes API server for each created, modified or deleted Pod resource.
The webhook will check if any of the existing TrafficProfiles targets the Pod. If this is the case, an additional sidecar container will be injected into the Pod.
Note: Currently, the webhook will only inject the sidecar if the TrafficProfile already exists at the time of the Pod creation or update. k8s-netem will not re-create Pods after a new TrafficProfile is added to the cluster. It is the user's responsibility to re-create Pods in order for the sidecars to be injected.
Sidecar Containers
The sidecar container will run alongside the user containers for the full life-cycle of the Pod. It is tasked with synchronizing the TrafficProfiles with the kernel TC / IPset data structures.
This means that modifications of existing TrafficProfiles by the user (e.g. to adjust impairment parameters) are synced to the Linux kernel configuration.
At the same time the sidecar container will watch for new or deleted Pods which match the ingress/egress peer podSelectors and add their podIPs to the respective IPsets which are used by the TC filters.
Flow classification
k8s-netem uses nftables to classify network traffic flows based on the spec.egress and spec.ingress filters.
Updating these filters is cheap, as the CIDRs, ports, Ether-types and IP protocols used in them are stored in nftables sets which can be manipulated without affecting the other filters.
Each TrafficProfile allocates a dedicated firewall mark (fwmark) from a per-Pod pool of marks.
The nftables rules mark the selected traffic flows with this fwmark, which is later matched by a TC filter.
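The rules below are a minimal sketch of this scheme; the table, chain and set names as well as the mark value are assumptions and do not reflect the exact rules generated by k8s-netem:

```sh
# Table and chain hooking into the egress path
nft add table inet netem_demo
nft add chain inet netem_demo egress '{ type filter hook postrouting priority 0 ; }'

# Set holding the destination CIDRs of one egress filter
nft add set inet netem_demo egress_dst '{ type ipv4_addr ; flags interval ; }'
nft add element inet netem_demo egress_dst '{ 1.1.1.1/32 }'

# Mark matching flows; the fwmark is later matched by a tc fw filter
nft add rule inet netem_demo egress ip daddr @egress_dst tcp dport 443 meta mark set 1
```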
Support for multiple simultaneous TrafficProfiles, Controllers and Interfaces per Pod
k8s-netem supports multiple TrafficProfiles matching the same Pod via their spec.podSelector.
Each profile can also target a specific interface within the container by using a regular expression in spec.interfaceFilter.
Multiple TrafficProfiles targeting the same interface within the same Pod are supported as long as they share the same spec.type.
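As a sketch of how this can be combined, the hypothetical profile below restricts the Builtin controller to interfaces matching a regular expression; only the fields described in this document are used, and the concrete values are assumptions:

```sh
kubectl apply -f - <<EOF
apiVersion: k8s-netem.riasc.eu/v1
kind: TrafficProfile
metadata:
  name: profile-eth-only        # hypothetical name
spec:
  podSelector:
    matchLabels:
      traffic-profile: builtin
  type: Builtin
  interfaceFilter: "eth.*"      # regular expression selecting the target interfaces
  parameters:
    netem:
      delay: 0.1                # seconds
EOF
```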