In part I of this series, we introduced the Kubernetes networking model and how IP packets traverse between Pods.
We also talked about how a Kubernetes Service exposes an application via a single IP address, abstracting away the Pods behind it.
There are four types of services in Kubernetes.
ClusterIP
NodePort
LoadBalancer
ExternalName
We introduced the first two types of Services in Part I, and discussed how a NodePort Service exposes an application running inside a Kubernetes cluster to clients outside the cluster.
While it’s simple to implement, a NodePort Service has three main drawbacks.
Load sharing becomes the responsibility of the client
A NodePort Service is available as
<node-ip>:<port>
on every node in the cluster. Clients must share the traffic across the nodes; otherwise, some nodes could become congested. Clients must also be aware of node availability so that they can exclude faulty or unavailable nodes. As a result, a NodePort Service cannot practically expose an application to thousands of end users.
Nodes get exposed to external networks
To access a NodePort Service, the IP addresses of the worker nodes must be reachable by clients outside the Kubernetes cluster. So it's impractical to deliver an application over the Internet via a NodePort Service. Even within a secure campus network, making the nodes directly accessible to clients is a security risk.
Clients must talk to a non-standard port
Kubernetes assigns a random port to a NodePort Service from a pre-configured range (30000 to 32767 by default). Since each NodePort Service requires a unique port, it's impossible to retain the standard port when exposing an application to clients. Using a non-standard port is cumbersome and unacceptable when delivering multiple applications to thousands of users.
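As a concrete sketch, a NodePort Service might look like the following manifest; the Service name, label selector, and port numbers are hypothetical:

```yaml
# Hypothetical NodePort Service: exposes the Pods labeled app=web
# on a port from the node-port range (30000-32767) on every node.
apiVersion: v1
kind: Service
metadata:
  name: web-nodeport
spec:
  type: NodePort
  selector:
    app: web
  ports:
    - port: 80          # Service (cluster-internal) port
      targetPort: 8080  # container port on the Pods
      nodePort: 30080   # must fall within the configured range
```

Clients would then reach the application at <node-ip>:30080 on any node, which illustrates all three drawbacks above.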
LoadBalancer Services
LoadBalancer is a type of Kubernetes Service that overcomes the limitations of NodePort. The Service type LoadBalancer needs a load balancer integrated with the Kubernetes cluster via a load balancer controller.
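Declaring a LoadBalancer Service is as simple as changing the Service type; the names and ports in this sketch are assumptions:

```yaml
# Hypothetical LoadBalancer Service: the load balancer controller
# watches for this object and programs an external load balancer.
apiVersion: v1
kind: Service
metadata:
  name: web-lb
spec:
  type: LoadBalancer
  selector:
    app: web
  ports:
    - port: 80         # port exposed by the load balancer
      targetPort: 8080 # container port on the Pods
```

Once the controller provisions the load balancer, its external address appears in the Service's status.loadBalancer.ingress field.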
The load balancer controller watches the Kubernetes API for events related to Kubernetes Services, such as Service creation or endpoint changes, and updates the traffic routing rules in the load balancer.
Most load balancers reside outside the cluster, but MetalLB, an open-source load balancer for Kubernetes, is deployed inside the cluster.
Load balancer outside the cluster
Most commercial load balancer appliances as well as virtual load balancer services from the leading cloud providers reside outside the Kubernetes cluster.
The load balancer controller often runs in a Pod within the cluster, but it can also be located outside the cluster as long as it has network reachability to the Kubernetes control plane.
When running Kubernetes in a public cloud, the load balancer controller is an add-on that you must enable in your clusters. The controller add-on provisions a load balancer for the cluster whenever you create a Service of type LoadBalancer.
In on-premise Kubernetes clusters, you must separately integrate a load balancer solution.
A load balancer located outside the cluster can distribute traffic to target pods via a ClusterIP Service or a NodePort Service.
Distributing traffic via ClusterIP Service
To distribute traffic via a ClusterIP Service, the Pod IP addresses must be reachable from the load balancer without NAT.
Most cloud providers’ CNI plugins, as well as open-source CNI plugins like Calico, can make Pod IP addresses routable so that the load balancer can distribute traffic directly to Pods. The mechanism for making Pod IP addresses routable depends on the particular plugin; we will cover it in a future post.
Making Pod IPs routable has one drawback: with several large clusters running thousands of Pods, you could exhaust your organization's private IP address space.
In such cases, you could use a NodePort Service instead.
Distributing traffic via NodePort Service
Instead of using a ClusterIP service, a load balancer can use a NodePort Service to distribute traffic to target Pods.
The Pod IP addresses need not be reachable to the load balancer as the nodes act as the load balancer targets.
The graphic below shows a load balancer distributing traffic to two NodePort Services: Service-1 via port 32060 and Service-2 via port 32070.
The load balancer exposes the two applications to clients with two unique IP addresses via the standard HTTP port, overcoming the drawbacks of exposing a NodePort Service directly to clients.
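The Services in the graphic could pin their node ports explicitly so the load balancer has stable targets. Here is a sketch of Service-1; Service-2 would be analogous with nodePort 32070. The label selector and container port are assumptions:

```yaml
# Hypothetical definition of Service-1 from the graphic, pinning
# the node port to 32060 so the load balancer can target it.
apiVersion: v1
kind: Service
metadata:
  name: service-1
spec:
  type: NodePort
  selector:
    app: app-1        # hypothetical Pod label
  ports:
    - port: 80
      targetPort: 8080
      nodePort: 32060 # the load balancer forwards to <node-ip>:32060
```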
We have described two methods where a load balancer outside the cluster distributes traffic to target applications inside the cluster.
Next, we are going to talk about MetalLB which is a load balancer inside the cluster.
MetalLB
MetalLB is an open-source load balancer for Kubernetes. Some Kubernetes distributions include MetalLB as an add-on; if not, you can install it by following the instructions in the MetalLB documentation.
MetalLB consists of a controller, which is a Kubernetes Deployment, and a speaker, which is a Kubernetes DaemonSet.
MetalLB can load balance in either layer 2 or layer 3 (BGP) mode.
MetalLB - Layer 2 mode
In layer 2 mode, MetalLB selects one node in the cluster as the leader node. The leader node responds to ARP requests for the Kubernetes Service IP address.
The client sends all traffic destined for the Service to the leader node, which is responsible for sharing the load across the Pods belonging to the Service. kube-proxy handles this load sharing, as described under Kubernetes Services in the first article of this series.
If the leader node fails, another node assumes the responsibility of the leader node.
The layer 2 mode is simple to implement, but routing all traffic through a single node could congest that node under high traffic.
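With MetalLB's CRD-based configuration (v0.13 and later), layer 2 mode is set up roughly as follows; the pool name and address range are assumptions:

```yaml
# Hypothetical layer 2 configuration: MetalLB assigns Service IPs
# from the pool, and the leader node answers ARP requests for them.
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: lb-pool
  namespace: metallb-system
spec:
  addresses:
    - 192.168.10.240-192.168.10.250
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: lb-l2
  namespace: metallb-system
spec:
  ipAddressPools:
    - lb-pool
```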
MetalLB - Layer 3 mode
In layer 3 mode, MetalLB creates a BGP session from each node in the cluster to a gateway router. The nodes that run the workload Pods advertise the Service IP address, and the gateway router load-shares the traffic across those nodes.
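A minimal layer 3 (BGP) setup could be sketched like this; the ASNs, peer address, and names are assumptions:

```yaml
# Hypothetical BGP configuration: each speaker peers with the gateway
# router and advertises Service IPs drawn from an address pool.
apiVersion: metallb.io/v1beta2
kind: BGPPeer
metadata:
  name: gateway-router
  namespace: metallb-system
spec:
  myASN: 64500        # ASN used by the cluster nodes
  peerASN: 64501      # ASN of the gateway router
  peerAddress: 10.0.0.1
---
apiVersion: metallb.io/v1beta1
kind: BGPAdvertisement
metadata:
  name: lb-bgp
  namespace: metallb-system
spec:
  ipAddressPools:
    - lb-pool         # assumes an IPAddressPool named lb-pool exists
```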
The layer 3 mode overcomes the scalability limitation of layer 2 mode but has one drawback.
The BGP-based load-sharing mechanism is stateless in most routers. A router manages the load-sharing destinations in a hash table, where the hash is calculated from the header fields of the IP packet.
When a Pod is added or removed, the router rebuilds the hash table, causing ongoing traffic to land on different Pods. Clients will experience a momentary disconnection. Considering the ephemeral nature of Pods, such disconnections could occur often and may not be acceptable.
Scalability of LoadBalancer Services
The Service type LoadBalancer is the most scalable method of exposing applications in Kubernetes to outside clients. You could use the NodePort type for specific use cases, but you need a load balancer to scale your applications to thousands of users.
Kubernetes Ingress is also a method of exposing HTTP-based applications to external clients. We will talk about it in an upcoming post.