Kubernetes Services
This document assumes you already read
Kubernetes
Now that we know how to
build containers and deploy them to
Kubernetes, we want to access the containers and provide a service to the outside world.
Our great tutorial service will be a cgi script that tells us a few things about itself via http.
Tutorial Setup
As of writing (2020-04) the only client system is
asl503
. Please remember to adjust all the "my-" prefixes to something meaningful.
To interact with our service from inside kubernetes we will need an interactive shell in the same namespace.
Kubernetes Shell
Let's create a pod
my-shell.yaml
apiVersion: v1
kind: Pod
metadata:
name: my-shell
labels:
app: my-tutorial
spec:
containers:
- name: busybox
image: busybox:1.28
stdin: true
tty: true
command: ['sh']
run it, attach to it, and set a prompt so you can recognize it. Leave this open; the tutorial will reference it as kubeshell.
[handel@vmlb016 ~]$ kubectl apply -f my-shell.yaml
pod/my-shell created
[handel@vmlb016 ~]$ kubectl attach -ti my-shell
Defaulting container name to busybox.
Use 'kubectl describe pod/my-shell -n acoap' to see all of the containers in this pod.
If you don't see a command prompt, try pressing enter.
/ # export PS1='kubeshell # '
kubeshell #
Note: there is a bug in busybox > 1.28 which breaks nslookup
https://github.com/kubernetes/kubernetes/issues/66924
We will do a lot of nslookup later in this tutorial, so stay on 1.28, or always query with an explicit record type:
nslookup -type=a question
http echo
Let's start a http server that provides us with some information. Create descriptor
my-echo.yaml
apiVersion: v1
kind: Pod
metadata:
name: my-echo
labels:
app: my-tutorial
svc: my-echo
spec:
containers:
- name: http-echo
image: aco/http-echo:stable
That is a pod with two labels (app and svc) that runs the container http-echo with the default command specified in the image.
[handel@vmlb016 ~]$ kubectl apply -f my-echo.yaml
pod/my-echo created
check that it is running
[handel@vmlb016 ~]$ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
my-echo 1/1 Running 0 9s 10.253.1.28 vmlb042
...
ok, it is running and has the ip 10.253.1.28. This is a kubernetes pod ip address and is not accessible from the outside. But we have our kubernetes shell
kubeshell # wget -q -O - http://10.253.1.28/cgi-bin/index.cgi | grep HOSTNAME
HOSTNAME=my-echo
delete it, redeploy it, check the ip
[handel@vmlb016 ~]$ kubectl delete pod my-echo; kubectl apply -f my-echo.yaml; sleep 5; kubectl get pod my-echo -o wide
pod "my-echo" deleted
pod/my-echo created
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
my-echo 1/1 Running 0 5s 10.253.1.29 vmlb042
the ip has changed, so using the ip to access the http service is not a good idea. Maybe the name is better?
kubeshell # wget -q -O - http://my-echo/cgi-bin/index.cgi | grep HOSTNAME
wget: bad address 'my-echo'
No. Doesn't work.
Service
A service is a kubernetes internal named endpoint to access pods.
Create
Write a descriptor
my-service.yaml
apiVersion: v1
kind: Service
metadata:
name: my-service
labels:
app: my-tutorial
spec:
type: ClusterIP
clusterIP: None
selector:
svc: my-echo
ports:
- protocol: TCP
port: 80
The creative name "my-service" will point to any pod that has the label svc=my-echo, and it exposes tcp port 80. The service itself is part of the app my-tutorial.
A service definition without a cluster IP is called a headless service.
Apply it
[handel@vmlb016 ~]$ kubectl apply -f my-service.yaml
service/my-service created
[handel@vmlb016 ~]$ kubectl get service -o wide
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
my-service ClusterIP None 80/TCP 16s svc=my-echo
Let's see if kubeshell can access it
kubeshell # wget -q -O - http://my-service/cgi-bin/index.cgi | grep HOSTNAME
HOSTNAME=my-echo
it can.
To inspect what is connected to our service we can get a service description
[handel@vmlb016 ~]$ kubectl describe service my-service
Name: my-service
Namespace: acoap
Labels: app=my-tutorial
Annotations: kubectl.kubernetes.io/last-applied-configuration:
{"apiVersion":"v1","kind":"Service","metadata":{"annotations":{},"labels":{"app":"my-tutorial"},"name":"my-service","namespace":"acoap"},"...
Selector: svc=my-echo
Type: ClusterIP
IP: None
Port: <unset> 80/TCP
TargetPort: 80/TCP
Endpoints: 10.253.1.29:80
Session Affinity: None
Events: <none>
The service has no IP, as we specified that we don't want one by providing None.
It uses port 80, but the port has not been given a name like http.
It points to port 80 on 10.253.1.29, which is our pod's IP address.
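As a sketch, the port could also be given a name and an explicit target port. The name http and the targetPort line are additions for illustration (targetPort defaults to port, so this behaves exactly like the descriptor above):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-service
  labels:
    app: my-tutorial
spec:
  type: ClusterIP
  clusterIP: None
  selector:
    svc: my-echo
  ports:
    - name: http        # named port, shows up instead of <unset> in describe
      protocol: TCP
      port: 80          # port the service exposes
      targetPort: 80    # port on the pod; defaults to port if omitted
```

Named ports become useful once a service exposes more than one port, or when the pod port differs from the service port.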
DNS
Wait, no IP address but we could access it? Yes, kubernetes simply points a DNS record to the IP address of the pod. If the pod changes, it updates the DNS record.
The accelerator uses the subdomain
acc.gsi.de
, Kubernetes internally uses the subdomain
k8s.acc.gsi.de
, and each kubernetes cluster uses a letter, e.g.
a.k8s.acc.gsi.de
The schema to construct a dns record for a service is
servicename.namespace.svc.cluster.k8s.acc.gsi.de
In this example we end up with
my-service.acoap.svc.a.k8s.acc.gsi.de
our kubeshell is in the same subdomain and will find my-service by the short name alone. The resolver tries multiple search domains, appending them in turn until it finds a match
kubeshell # cat /etc/resolv.conf
search acoapp.svc.a.k8s.acc.gsi.de svc.a.k8s.acc.gsi.de k8s.acc.gsi.de acc.gsi.de
If you want to access a service inside the same namespace, always use the short name; then your configuration is portable to other namespaces.
Let's check the name resolution
kubeshell # nslookup -type=a my-service
Server: 10.253.128.10
Address 1: 10.253.128.10 kube-dns.kube-system.svc.a.k8s.acc.gsi.de
Name: my-service
Address 1: 10.253.1.29 10-253-1-29.my-service.acoap.svc.a.k8s.acc.gsi.de
kubeshell # nslookup -type=a my-service.acoapp
Server: 10.253.128.10
Address 1: 10.253.128.10 kube-dns.kube-system.svc.a.k8s.acc.gsi.de
Name: my-service.acoapp
Address 1: 10.253.1.29 10-253-1-29.my-service.acoapp.svc.a.k8s.acc.gsi.de
kubeshell # nslookup -type=a my-service.acoapp.svc.a.k8s.acc.gsi.de
Server: 10.253.128.10
Address 1: 10.253.128.10 kube-dns.kube-system.svc.a.k8s.acc.gsi.de
Name: my-service.acoapp.svc.k8s.acc.gsi.de
Address 1: 10.253.1.29 10-253-1-29.my-service.acoapp.svc.a.k8s.acc.gsi.de
All three names work and always return 10.253.1.29. The busybox implementation of nslookup also shows the reverse (ip-to-name) lookup for 10.253.1.29, so we also get the name 10-253-1-29.my-service.acoapp.svc.a.k8s.acc.gsi.de
Clusterip
Our http echo was a single pod. Let's transform it into a deployment with three replicas, to show the differences between the access methods. Our main usage will be headless (see above) or a loadbalancer with a single pod (see below).
my-echo-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: my-echo-deployment
labels:
app: my-tutorial
spec:
replicas: 3
revisionHistoryLimit: 0
selector:
matchLabels:
app: my-tutorial
svc: my-echo
template:
metadata:
labels:
app: my-tutorial
svc: my-echo
spec:
containers:
- name: my-echo
image: aco/http-echo:1.31.1
Roll it out
[handel@vmlb016 ~]$ kubectl apply -f my-echo-deployment.yaml
deployment.apps/my-echo-deployment created
check what is running
[handel@vmlb016 ~]$ kubectl get deployment my-echo-deployment -o wide
NAME READY UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR
my-echo-deployment 3/3 3 3 6s my-echo aco/http-echo:1.31.1 app=my-tutorial,svc=my-echo
[handel@vmlb016 ~]$ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
my-echo 1/1 Running 0 3h46m 10.253.1.29 vmlb042
my-echo-deployment-577848b898-6zzrz 1/1 Running 0 12s 10.253.3.29 vmlb044
my-echo-deployment-577848b898-cqrr9 1/1 Running 0 12s 10.253.1.31 vmlb042
my-echo-deployment-577848b898-m56ds 1/1 Running 0 12s 10.253.2.13 vmlb043
my-shell 1/1 Running 0 3h57m 10.253.3.27 vmlb044
Three instances, and our old pod.
Our service will now use all four pods, as they all carry the label svc=my-echo.
Check the dns records
kubeshell # nslookup -type=a my-service
Server: 10.253.128.10
Address 1: 10.253.128.10 kube-dns.kube-system.svc.a.k8s.acc.gsi.de
Name: my-service
Address 1: 10.253.1.29 10-253-1-29.my-service.acoapp.svc.a.k8s.acc.gsi.de
Address 2: 10.253.2.13 10-253-2-13.my-service.acoapp.svc.a.k8s.acc.gsi.de
Address 3: 10.253.1.31 10-253-1-31.my-service.acoapp.svc.a.k8s.acc.gsi.de
Address 4: 10.253.3.29 10-253-3-29.my-service.acoapp.svc.a.k8s.acc.gsi.de
Remove the leftover pod (
kubectl delete pod my-echo
) so now we have three running pods providing my-echo.
Let's fire ten requests
kubeshell # for i in $(seq 10); do wget -q -O - http://my-service/cgi-bin/index.cgi | grep HOSTNAME; done
HOSTNAME=my-echo-deployment-577848b898-m56ds
HOSTNAME=my-echo-deployment-577848b898-6zzrz
HOSTNAME=my-echo-deployment-577848b898-6zzrz
HOSTNAME=my-echo-deployment-577848b898-cqrr9
HOSTNAME=my-echo-deployment-577848b898-6zzrz
HOSTNAME=my-echo-deployment-577848b898-6zzrz
HOSTNAME=my-echo-deployment-577848b898-6zzrz
HOSTNAME=my-echo-deployment-577848b898-6zzrz
HOSTNAME=my-echo-deployment-577848b898-6zzrz
HOSTNAME=my-echo-deployment-577848b898-6zzrz
Requests are distributed; not evenly, but distributed. The dns lookup returns the records in random order and wget picks the first one.
DNS loadbalancing is ok. It might even be the right choice, and in our gsi main usecase without replicas it often will be, as there is no need to balance at all.
Advantages of DNS based loadbalancing: fewer ips, no network address translation, visible and easier to debug. Disadvantages: loadbalancing is a client-side choice, no sticky loadbalancing, slower change propagation (worst case 30 sec).
But we might want more control over the balancing: a more even distribution, or sticky sessions (once a client is connected, always return the same pod). For this we can use a single clusterIP (where cluster means inside-a-kubernetes-cluster). Actually we will simply not specify one (which is different from specifying None) and automatically get one assigned. Update
my-service.yaml
and remove the clusterIP line.
apiVersion: v1
kind: Service
metadata:
name: my-service
labels:
app: my-tutorial
spec:
type: ClusterIP
selector:
svc: my-echo
ports:
- protocol: TCP
port: 80
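For the sticky-session case mentioned above, a ClusterIP service can pin clients to a pod by source IP. A sketch of the same service with session affinity enabled; the timeout value is an assumption (the cluster default is three hours):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-service
  labels:
    app: my-tutorial
spec:
  type: ClusterIP
  selector:
    svc: my-echo
  sessionAffinity: ClientIP      # the same client ip always hits the same pod
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 3600       # assumption: forget the pinning after 1h idle
  ports:
    - protocol: TCP
      port: 80
```

Note that affinity is by client IP, so many clients behind one NAT would all stick to the same pod.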
We could specify a clusterIP, if we knew which ip ranges are allowed for the cluster. Or we can leave it out and let the cluster handle it by itself, choosing a free ip from the right range and assigning it. As we are not interested in the actual IP (that's for computers; we will use names and rely on dns) and don't want to adapt if the cluster config changes, we leave it out.
[handel@vmlb016 ~]$ kubectl apply -f my-service.yaml
The Service "my-service" is invalid: spec.clusterIP: Invalid value: "": field is immutable
[handel@vmlb016 ~]$ kubectl delete service my-service
service "my-service" deleted
[handel@vmlb016 ~]$ kubectl apply -f my-service.yaml
service/my-service created
[handel@vmlb016 ~]$ kubectl get services -o wide
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
my-service ClusterIP 10.253.171.21 80/TCP 5s svc=my-echo
We could not directly modify the existing service, because the clusterIP field is immutable. So we deleted it and created a new one.
We now have an IP address. Check how kubeshell resolves it
kubeshell # nslookup -type=a my-service
Server: 10.253.128.10
Address 1: 10.253.128.10 kube-dns.kube-system.svc.a.k8s.acc.gsi.de
Name: my-service
Address 1: 10.253.171.21 my-service.acoap.svc.a.k8s.acc.gsi.de
One address only. Now send ten requests and check the balancing
kubeshell # for i in $(seq 10); do wget -q -O - http://my-service/cgi-bin/index.cgi | grep HOSTNAME; done
HOSTNAME=my-echo-deployment-577848b898-6zzrz
HOSTNAME=my-echo-deployment-577848b898-m56ds
HOSTNAME=my-echo-deployment-577848b898-cqrr9
HOSTNAME=my-echo-deployment-577848b898-6zzrz
HOSTNAME=my-echo-deployment-577848b898-m56ds
HOSTNAME=my-echo-deployment-577848b898-cqrr9
HOSTNAME=my-echo-deployment-577848b898-6zzrz
HOSTNAME=my-echo-deployment-577848b898-m56ds
HOSTNAME=my-echo-deployment-577848b898-cqrr9
HOSTNAME=my-echo-deployment-577848b898-6zzrz
As we are the only client we can easily see the nice round-robin distribution of the requests.
Note: for a deployment kubernetes will create a replicaset. Updating a deployment creates a new replicaset and keeps a history of old ones. Setting revisionHistoryLimit to 0 (the default is 10) will automatically remove all old replicasets.
Note: besides starting multiple instances, deployments can be used to seamlessly restart services with
kubectl rollout restart deployment my-echo-deployment
. Kubernetes will first create new pods, wait until they are online (healthy), and put them into balancing. It then removes the old pods from balancing, waits until all connections to them finish, and terminates the old pods.
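A seamless restart only works if kubernetes can tell when a new pod is healthy. A sketch of a readiness probe for the deployment's pod template; the probe path reuses this tutorial's cgi script, and the timing values are assumptions:

```yaml
# fragment of my-echo-deployment.yaml (spec.template.spec.containers)
containers:
  - name: my-echo
    image: aco/http-echo:1.31.1
    readinessProbe:               # pod only receives traffic once this succeeds
      httpGet:
        path: /cgi-bin/index.cgi  # assumption: reuse the echo cgi as health check
        port: 80
      initialDelaySeconds: 2
      periodSeconds: 5
```

Without a readiness probe, kubernetes considers a pod ready as soon as its containers have started, which may be before the http server accepts connections.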
Loadbalancer
Providing services to the outside (non-kubernetes) network requires the use of a loadbalancer. That's right, kubernetes is thinking big, scaling to cloud sized installations. There are other options, but most cases will end up with some sort of loadbalancer pretty soon. Even services that only use one pod are more practical to expose directly via a loadbalancer.
Let's create a loadbalanced service
my-loadbalancer.yaml
apiVersion: v1
kind: Service
metadata:
name: my-loadbalancer
labels:
app: my-tutorial
spec:
type: LoadBalancer
selector:
svc: my-echo
ports:
- protocol: TCP
port: 80
Same as the service above, only the type changed.
[handel@vmlb016 ~]$ kubectl apply -f my-loadbalancer.yaml
service/my-loadbalancer created
[handel@vmlb016 ~]$ kubectl get services -o wide
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
my-loadbalancer LoadBalancer 10.253.164.185 10.253.192.1 80:31161/TCP 9s svc=my-echo
...
we have an internal ip, an external ip, and some strange port mapping which we can mostly ignore. The colon separates the internal port (80) from something called the nodeport (31161). Loadbalancers would access the host running kubernetes on this port and then redirect to the internal port. It is a "feature" of kubernetes: loadbalancer services inherit from nodeport services even if the loadbalancer does not use the nodeport. This limits the total number of loadbalancers (external services) in the kubernetes cluster. See k8s_bug#69845 and k8s_enhance#1514 .
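As an aside: newer kubernetes versions (1.20 and later) can skip the nodeport allocation entirely via a service field. Whether our cluster supports this field is an assumption; the sketch is otherwise identical to the descriptor above:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-loadbalancer
  labels:
    app: my-tutorial
spec:
  type: LoadBalancer
  allocateLoadBalancerNodePorts: false   # no 3xxxx nodeport is reserved
  selector:
    svc: my-echo
  ports:
    - protocol: TCP
      port: 80
```

This avoids the nodeport limit on the number of external services, provided the loadbalancer implementation routes directly to the pods.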
So let's access it
kubeshell # wget -q -O - http://10.253.164.185/cgi-bin/index.cgi | grep HOSTNAME
HOSTNAME=my-echo-deployment-577848b898-6zzrz
kubeshell # wget -q -O - http://10.253.192.1/cgi-bin/index.cgi | grep HOSTNAME
HOSTNAME=my-echo-deployment-577848b898-6zzrz
kubeshell # wget -q -O - http://my-loadbalancer/cgi-bin/index.cgi | grep HOSTNAME
HOSTNAME=my-echo-deployment-577848b898-6zzrz
Internally it works: we can access the internal ip, the external ip, and the name.
And from outside of kubernetes, e.g. from the acc7 centos cluster?
[handel@asl740]$ wget -q -O - http://10.253.192.1/cgi-bin/index.cgi | grep HOSTNAME
HOSTNAME=my-echo-deployment-577848b898-cqrr9
[handel@asl740]$ wget -q -O - http://my-loadbalancer/cgi-bin/index.cgi | grep HOSTNAME
[handel@asl740]$ wget -q -O - http://my-loadbalancer.acoap.lb.a.k8s.acc.gsi.de/cgi-bin/index.cgi | grep HOSTNAME
HOSTNAME=my-echo-deployment-577848b898-6zzrz
No shortnames outside of kubernetes. But there is an external naming scheme, servicename.namespace.lb.a.k8s.acc.gsi.de, which can be resolved.
Note: currently the internal names are also accessible from outside. This will change without notice!
Masquerading
The loadbalancer adds source address translation (SNAT) and hides the ip address of the client connecting to it
kubeshell # wget -q -O - http://my-loadbalancer/cgi-bin/index.cgi | grep REMOTE_ADDR
REMOTE_ADDR=[::ffff:10.253.3.27]
[handel@asl740]$ wget -q -O - http://my-loadbalancer.acoapp.lb.a.acc.gsi.de/cgi-bin/index.cgi|grep REMOTE_ADDR
REMOTE_ADDR=[::ffff:10.252.254.41]
internally everything is ok, but externally we see the ip address of one kubernetes node, in this case vmlb041.
Traffic Policy
We can influence external traffic routing. This is dependent on the network setup of the kubernetes cluster.
apiVersion: v1
kind: Service
metadata:
name: my-loadbalancer
labels:
app: my-tutorial
spec:
type: LoadBalancer
externalTrafficPolicy: Local
selector:
svc: my-echo
ports:
- protocol: TCP
port: 80
Advantages: less masquerading, fewer BGP routes in the infrastructure switches (yeah! do it, it saves resources).
Disadvantages: none, if your service balances across one or two pods. Starting with three pods the load balancing can be uneven: traffic is distributed by node first and then per pod. For three pods this could result in one pod getting 50% of the traffic and the others 25% each. This can be avoided by spreading pods evenly over the available nodes (or placing them all on the same node). This is called the pod anti-affinity pattern. Wiki documentation might follow later.
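Such spreading can be sketched with an anti-affinity rule in the deployment's pod template. This is illustration, not cluster-specific guidance; preferredDuringScheduling keeps it a soft rule, and kubernetes.io/hostname is the standard node label:

```yaml
# fragment of my-echo-deployment.yaml (spec.template.spec)
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchLabels:
              svc: my-echo                     # avoid co-locating my-echo pods
          topologyKey: kubernetes.io/hostname  # spread across nodes
```

Using required instead of preferred would forbid co-location entirely, but then a replica stays Pending when no free node is left.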
Limitations
you should not access a service or loadbalancer IP from its own pod. There were some issues with that. As of
kubernetes#97081 this should now be working, but it's probably better to avoid it, and the performance is better (no round trip to proxy/loadbalancer/etc).
You can use a maximum of eight nodes to schedule your pods, but you can have more than one pod per node. Pods on other nodes won't receive traffic.
Cleanup
we created a pod for the shell, a deployment (which spawns three pods), a service, and a loadbalancer. All of this needs to be cleaned up.
We used the label app=my-tutorial on all of these objects. We can use it to find them
[handel@vmlb016 ~]$ kubectl get all -l app=my-tutorial
NAME READY STATUS RESTARTS AGE
pod/my-echo-deployment-577848b898-6zzrz 1/1 Running 0 28m
pod/my-echo-deployment-577848b898-cqrr9 1/1 Running 0 28m
pod/my-echo-deployment-577848b898-m56ds 1/1 Running 0 28m
pod/my-shell 1/1 Running 0 4h25m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/my-loadbalancer LoadBalancer 10.253.164.185 10.253.192.1 80:31161/TCP 17m
service/my-service ClusterIP 10.253.171.21 80/TCP 20m
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/my-echo-deployment 3/3 3 3 28m
NAME DESIRED CURRENT READY AGE
replicaset.apps/my-echo-deployment-577848b898 3 3 3 28m
The replicaset was automatically created by the deployment.
Now wipe them all at once, including our kubeshell
[handel@vmlb016 ~]$ kubectl delete all -l app=my-tutorial
pod "my-echo-deployment-577848b898-6zzrz" deleted
pod "my-echo-deployment-577848b898-cqrr9" deleted
pod "my-echo-deployment-577848b898-m56ds" deleted
pod "my-shell" deleted
service "my-loadbalancer" deleted
service "my-service" deleted
deployment.apps "my-echo-deployment" deleted
replicaset.apps "my-echo-deployment-577848b898" deleted
--
ChristophHandel - 17 Apr 2020