
kube-proxy has an option called --proxy-mode, and according to the help message it can be set to userspace or iptables (see below).

# kube-proxy -h
Usage of kube-proxy:
...
      --proxy-mode="": Which proxy mode to use: 'userspace' (older, stable) or 'iptables' (experimental). If blank, look at the Node object on the Kubernetes API and respect the 'net.experimental.kubernetes.io/proxy-mode' annotation if provided.  Otherwise use the best-available proxy (currently userspace, but may change in future versions).  If the iptables proxy is selected, regardless of how, but the system's kernel or iptables versions are insufficient, this always falls back to the userspace proxy.
...

I can't figure out what userspace mode means here.

Can anyone tell me how kube-proxy works when it runs in userspace mode?

cizixs
ax003d

1 Answer


Userspace and iptables refer to what actually handles the connection forwarding. In both cases, local iptables rules are installed to intercept outbound TCP connections that have a destination IP address associated with a service.

In the userspace mode, the iptables rule forwards to a local port where a Go binary (kube-proxy) is listening for connections. The binary (running in userspace) terminates the connection, establishes a new connection to a backend for the service, and then forwards requests to the backend and responses back to the local process. An advantage of the userspace mode is that because the connections are created from an application, if the connection is refused, the application can retry to a different backend.
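To make the userspace flow concrete, here is a minimal Python sketch of the idea: accept the intercepted connection, open a new connection to a backend (moving on to the next backend if one refuses), and copy bytes in both directions. This is not kube-proxy's actual code (kube-proxy is written in Go); the function names `connect_with_retry`, `pipe`, `handle_client`, and `serve` are made up for illustration, and the backend list is assumed to be a plain list of (host, port) pairs.

```python
import socket
import threading


def connect_with_retry(backends, timeout=1.0):
    """Try each (host, port) backend in turn; return (socket, backend) for the first that accepts."""
    last_error = None
    for backend in backends:
        try:
            return socket.create_connection(backend, timeout=timeout), backend
        except OSError as exc:  # e.g. ConnectionRefusedError or a timeout
            last_error = exc
    raise ConnectionError(f"no backend reachable: {last_error}")


def pipe(src, dst):
    """Copy bytes from src to dst until src closes, then half-close dst."""
    try:
        while True:
            data = src.recv(4096)
            if not data:
                break
            dst.sendall(data)
    except OSError:
        pass
    finally:
        try:
            dst.shutdown(socket.SHUT_WR)
        except OSError:
            pass


def handle_client(client, backends):
    """Terminate the client's connection and relay it over a new backend connection."""
    try:
        upstream, _ = connect_with_retry(backends)
    except ConnectionError:
        client.close()
        return
    upstream.settimeout(None)  # the connect timeout should not apply to the relay
    forward = threading.Thread(target=pipe, args=(client, upstream), daemon=True)
    forward.start()
    pipe(upstream, client)  # responses flow back on this thread
    forward.join()
    client.close()
    upstream.close()


def serve(listener, backends):
    """Accept loop: one thread per intercepted connection."""
    while True:
        client, _ = listener.accept()
        threading.Thread(
            target=handle_client, args=(client, backends), daemon=True
        ).start()
```

The retry lives entirely in `connect_with_retry`: the client only ever sees one connection to the proxy, so a refused backend is invisible to it. The cost is that every byte crosses the kernel/userspace boundary on the way in and again on the way out.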

In iptables mode, the iptables rules are installed to directly forward packets that are destined for a service to a backend for the service. This is more efficient than moving the packets from the kernel to kube-proxy and then back to the kernel, so it results in higher throughput and better tail latency. The main downside is that it is more difficult to debug, because instead of a local binary that writes a log to /var/log/kube-proxy you have to inspect logs from the kernel processing iptables rules.
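For intuition about how the kernel picks a backend without any proxy process in the path: as far as I know, the iptables rules for a service form a chain that is evaluated in order, where each rule matches with some probability (the iptables statistic match in random mode) and the last rule always matches. The Python sketch below only models that selection logic; it is not how kube-proxy is implemented, and the names `rule_probabilities`, `effective_share`, and `pick_backend` are hypothetical.

```python
import random
from fractions import Fraction


def rule_probabilities(n):
    """Match probability for each of n sequential rules; the last rule always matches."""
    return [Fraction(1, n - i) for i in range(n)]


def effective_share(n):
    """Share of new connections each backend receives when rules are evaluated in order."""
    shares = []
    remaining = Fraction(1)  # fraction of traffic that reaches the current rule
    for p in rule_probabilities(n):
        shares.append(remaining * p)
        remaining *= 1 - p
    return shares


def pick_backend(backends, rng=random):
    """Walk the rules in order and return the first backend whose rule matches."""
    for backend, p in zip(backends, rule_probabilities(len(backends))):
        if rng.random() < p:  # the final rule has p == 1, so it always matches
            return backend


print(effective_share(3))
# [Fraction(1, 3), Fraction(1, 3), Fraction(1, 3)]
```

With three backends the per-rule probabilities are 1/3, 1/2, and 1, which works out to an even one-third share each. Note that nothing in this selection checks whether the chosen backend is actually reachable, which is why there is no retry in this mode.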

In both cases there will be a kube-proxy binary running on your machine. In userspace mode it inserts itself as the proxy; in iptables mode it configures iptables rather than proxying connections itself. The same binary works in both modes, and the behavior is switched via a flag or by setting an annotation in the apiserver for the node.

Robert Bailey
  • ax003d you should accept the answer if it's satisfactory. – smparkes Mar 19 '16 at 18:11
  • I also understood that the userspace mode is able to do retries should a pod be unavailable, whereas in iptables mode it's just statistical load balancing? If that's correct, would it be worth mentioning as one of the key differences? – Timo Reimann Dec 18 '16 at 09:13
  • @TimoReimann - that is true. In the iptables mode you can end up black-holing some traffic if the set of endpoints is out of date. – Robert Bailey Dec 19 '16 at 09:08
  • _An advantage of the userspace mode is that because the connections are created from an application, if the connection is refused, the application can retry to a different backend._ **What exactly is application here?** _The binary (running in userspace) terminates the connection_ **Can you expand on this please?** – Nick Jan 04 '18 at 18:03
  • The application is the code running inside a container in Kubernetes that is connecting to another container via a service address. If you had a web server and a database, and the web server was connecting to the database via a service called `backend`, then the web server would be the application. – Robert Bailey Jan 05 '18 at 15:52
  • The binary is the kube-proxy binary and it runs in user space (as a container on each node in the cluster). Terminating the connections in user space is less efficient than letting the kernel rewrite the packet's destination addresses because the packets traverse the user space / kernel boundary multiple times before being sent out on the wire. – Robert Bailey Jan 05 '18 at 15:54
  • Does the kube-proxy binary run in the "pause" container? When I do docker ps, I don't see it there? – Nick Jan 06 '18 at 15:36
  • When operating in userspace mode, you say that __the application can retry to a different backend__, but is the retrying done automatically by the kube-proxy? Because, you said that the application is my web server, but I don't think my webserver is in charge of handling the retry to a different pod backend. – Nick Jan 06 '18 at 15:44
  • kube-proxy is not in the pause binary. The pause binary holds the network namespace for all containers that share the same pod. kube-proxy either runs as a stand-alone binary or inside of a container, depending on your distribution of Kubernetes. – Robert Bailey Jan 06 '18 at 17:12
  • Yes, it's the proxy that can retry a different backend (without the application code needing to change). Because the user space proxy can detect that a connection was not possible, it can try a different backend. With iptables, the packets get rewritten but nothing is checking that they make it to a destination, so any retries would need to be done at the application layer. – Robert Bailey Jan 06 '18 at 17:13