Converting a non-HA kubeadm kubernetes setup to HA

When I set up my kubernetes cluster using kubeadm some years ago, I decided on a simple non-HA setup, because (1) it simplifies the setup and (2) the cluster runs on a single server anyway. In fact, I use kubernetes mainly for deployment flexibility and not so much for high availability. Now I am running a mix of CentOS 8 Stream and CentOS 7 nodes, which will all be end of life in June 2024. With kubernetes, such an upgrade should be easy:

  • add a new controller node on a newer OS and join it to the cluster
  • remove the old controller
  • replace worker nodes one by one

However, the first two steps are problematic: in a non-HA setup, all components connect to the API server by its IP address, which makes it impossible to switch over to the new controller node in a transparent way.

With a HA setup of kubernetes this is much easier, since a hostname is used instead of an IP address to connect to the API server, which makes the migration above straightforward. In its most basic form, an /etc/hosts entry on all nodes resolves the hostname to the IP address of one of the masters. In a more advanced setup, the hostname could resolve to a load balancer. Therefore, to make the upgrade possible, it is a good idea to migrate from a non-HA kubeadm setup to a HA setup.

Differences between a HA setup and non-HA setup

The main difference between a non-HA setup and a HA setup is just two command line options to kubeadm init:

--upload-certs --control-plane-endpoint=master:6443
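
For reference, a fresh HA cluster using the same endpoint as in this post would therefore be created with something like:

kubeadm init --control-plane-endpoint=master:6443 --upload-certs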

The main effect of these flags is that the hostname master is used instead of its IP address. Also, the certificates for the API server must support master as a requested host, either by setting it as the common name (CN) attribute or by adding it as one of the Subject Alternative Names (SANs) in the certificate.

In addition, kubeadm stores information about the cluster in a number of config maps:

Namespace     Name             Difference non-HA/HA
kube-system   kubeadm-config   controlPlaneEndpoint: master:6443 in the ClusterConfiguration
kube-system   kube-proxy       server field in kubeconfig.conf
kube-system   kubelet-config   no changes
kube-public   cluster-info     server field in the kubeconfig

The above differences were found by setting up two clusters, one non-HA and one HA, and comparing the end results. Making sure that these ConfigMaps are updated ensures that future kubernetes upgrades using the standard kubeadm upgrade procedure keep working.
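
If you want to reproduce that comparison, the relevant ConfigMaps can simply be dumped on both clusters and diffed; for example:

kubectl -n kube-system get configmap kubeadm-config -o yaml
kubectl -n kube-system get configmap kube-proxy -o yaml
kubectl -n kube-system get configmap kubelet-config -o yaml
kubectl -n kube-public get configmap cluster-info -o yaml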

The basic procedure is to first update the certificates of the API server so that it can also be accessed using the hostname master in the URL instead of its IP address. Then, the other components should use master to connect to the API server. Throughout the procedure we will use kubeadm commands as much as possible so that in the end we get a setup that is identical to a standard HA kubeadm setup.

The procedure that will be discussed is inspired by this blog post, but it stays closer to the standard kubeadm HA setup and achieves an end result that is closer to a real HA setup, since all configuration is regenerated by kubeadm. For instance, we update the certificates by adding the controlPlaneEndpoint setting to the kubeadm-config instead of adding extra SANs. Before doing any of the procedures that follow, make sure to back up your /etc/kubernetes and /var/lib/kubelet directories. The procedures all apply to kubernetes 1.26.4 with all nodes running Ubuntu 22.04.2 LTS. It was verified that after this migration, a standard upgrade to kubernetes 1.27 was still possible.

Step 1: adding host entries for the control plane

In this step, identify the IP address of the single controller node and add a host entry on the controller and all worker nodes that points to this IP, e.g.:

192.168.121.247 master

Step 2: Regenerate certificates

Get the ClusterConfiguration and save it to a file:

kubectl -n kube-system get configmap kubeadm-config -o jsonpath='{.data.ClusterConfiguration}' > kubeadm.yaml

and add

controlPlaneEndpoint: master:6443

at the top level of the file.
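
After the edit, the top of kubeadm.yaml should look roughly like this (all other fields from the dumped ClusterConfiguration stay as they are; the version shown is the one used in this post):

apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
controlPlaneEndpoint: master:6443
kubernetesVersion: v1.26.4
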
Now regenerate the certificates:

rm -f /etc/kubernetes/pki/apiserver.* 
kubeadm init phase certs apiserver --config kubeadm.yaml

The output of the above command should already show master being added as one of the SANs.

Now verify the generated certificate:

openssl x509 -in /etc/kubernetes/pki/apiserver.crt  -text -noout

You should see DNS:master appear as one of the SANs.

Next, restart the API server. This can be done by killing the API server container manually, or by temporarily moving kube-apiserver.yaml out of the /etc/kubernetes/manifests directory and then back.
To verify that it works, edit the .kube/config file in your home directory and modify the server URL to use master instead of its IP. Then try some kubectl commands to check access.
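
A minimal sketch of the restart via the manifests directory, followed by the verification (the IP is the example address from step 1):

mv /etc/kubernetes/manifests/kube-apiserver.yaml /tmp/
sleep 30   # give the kubelet time to stop the old API server
mv /tmp/kube-apiserver.yaml /etc/kubernetes/manifests/

sed -i 's#https://192.168.121.247:6443#https://master:6443#' ~/.kube/config
kubectl get nodes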

Finally, upload the modified cluster configuration:

kubeadm init phase upload-config --config kubeadm.yaml

Step 3: edit config maps

Update the remaining ConfigMaps so that the server URL uses master instead of the IP address:

kubectl edit cm -n kube-public cluster-info
kubectl edit cm -n kube-system kube-proxy
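
In both cases the goal is the same: the server line of the embedded kubeconfig should end up pointing at the control plane endpoint, i.e.

server: https://master:6443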

Step 4: Update scheduler, controller manager, and kubelet config files

Next update the configuration files for the other components:

rm -f /etc/kubernetes/*.conf
kubeadm init phase kubeconfig all --config kubeadm.yaml

It is expected that after this, the scheduler and controller manager will still use the local API server instance. However, admin.conf and kubelet.conf should now use master instead of the controller's IP address. This is the same as what you get in a standard HA kubeadm cluster setup.
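
A quick way to confirm which kubeconfig files were switched over:

grep 'server:' /etc/kubernetes/*.conf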

Now restart these components,

systemctl daemon-reload
systemctl restart kubelet 
kubectl delete pod -n kube-system -l component=kube-scheduler
kubectl delete pod -n kube-system -l component=kube-controller-manager
kubectl delete pod -n kube-system -l k8s-app=kube-proxy

and wait for all these to be running again.

Next, verify in the etcd setup that the peer URL uses the public IP of the controller node:

root@master1:/etc/kubernetes/pki/etcd# export ETCDCTL_API=3
root@master1:/etc/kubernetes/pki/etcd# etcdctl --cacert ca.crt --cert server.crt --key server.key member list --write-out table
+------------------+---------+---------+------------------------------+------------------------------+
|        ID        | STATUS  |  NAME   |          PEER ADDRS          |         CLIENT ADDRS         |
+------------------+---------+---------+------------------------------+------------------------------+
| 356b794c90ad4dca | started | master1 | https://192.168.121.247:2380 | https://192.168.121.247:2379 |
+------------------+---------+---------+------------------------------+------------------------------+

I have seen cases where joining a controller node failed because the peer URL was using localhost; this causes problems once a second node is joined, since that node will use the peer URL advertised by the etcd server. If the peer address is wrong, then use

etcdctl member update MEMBERID --peer-urls=https://EXTERNAL_IP:2380

to fix it.

Step 5: update the kubelet on worker nodes

To update the kubelet on the worker nodes, edit /etc/kubernetes/kubelet.conf to use master
instead of the IP address of the API server, and after that restart the kubelet.
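
For example, on each worker the edit can be made non-interactively (the IP being the example address from step 1):

sed -i 's#https://192.168.121.247:6443#https://master:6443#' /etc/kubernetes/kubelet.conf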

systemctl daemon-reload 
systemctl restart kubelet 

Step 6: set up new entries for all users in .kube/config

As a result of this procedure, the client certificates used to identify users will no longer work. Therefore, you must create new certificates, and matching .kube/config entries, for all users.
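
One way to do this, assuming you keep using certificate-based users, is to let kubeadm generate a fresh kubeconfig per user and hand that out (the user and group names below are just examples):

kubeadm kubeconfig user --config kubeadm.yaml --client-name=jane --org=developers > jane.conf

The resulting file can then replace the user's existing entry in .kube/config.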

Final thoughts

By setting up a non-HA and a HA cluster side by side, it becomes easy to identify the precise differences between the two. Based on this, and by using kubeadm tools as much as possible for updating the configuration files, it is possible to migrate a non-HA cluster to a HA cluster, which can then be used to upgrade all nodes to newer OS versions. The resulting cluster can still be updated to the next kubernetes version using the standard kubeadm upgrade procedure.

One major consequence is that you must regenerate all user certificates after upgrading the cluster to HA. This would normally be too disruptive for a production setup, but in production you would probably be using a cloud provider and would never have run into the issue of having a non-HA cluster in the first place.
