{"id":2790,"date":"2023-06-04T18:30:25","date_gmt":"2023-06-04T18:30:25","guid":{"rendered":"https:\/\/brakkee.org\/site\/?p=2790"},"modified":"2023-06-17T10:14:41","modified_gmt":"2023-06-17T10:14:41","slug":"converting-to-a-non-ha-kubeadm-kubernetes-setup-to-ha","status":"publish","type":"post","link":"https:\/\/brakkee.org\/site\/2023\/06\/04\/converting-to-a-non-ha-kubeadm-kubernetes-setup-to-ha\/","title":{"rendered":"Converting a non-HA kubeadm kubernetes setup to HA"},"content":{"rendered":"<p>When I set up my kubernetes cluster using kubeadm some years ago, I decided to use a simple non-HA setup of kubernetes, because (1) it simplifies the setup and (2) the cluster will be running on a single server anyway. In fact, I am using kubernetes mainly for deployment flexibility and not so much for high availability. Currently I am running a mix of CentOS 8 Stream and CentOS 7 nodes, which will all be end of life in June 2024. With kubernetes, such an OS upgrade should be easy:<\/p>\n<ul>\n<li>add a new controller node on a newer OS and join it to the cluster<\/li>\n<li>remove the old controller<\/li>\n<li>replace worker nodes one by one<\/li>\n<\/ul>\n<p>However, the first two steps cause problems, because in a non-HA setup the IP address of the API server is used by all components that connect to it. This makes it impossible to switch over to the new controller node transparently.<\/p>\n<p><!--more--><\/p>\n<p>With a HA setup of kubernetes this is much easier, since a hostname can be used instead of an IP address to connect to the API server, which makes the above migration straightforward. In its most basic form, an <em>\/etc\/hosts<\/em> entry can be used on all nodes that resolves the hostname to the IP address of one of the masters. In a more advanced setup, the hostname could resolve to a load balancer. 
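<\/p>\n<p>As an illustration of the load balancer variant (a sketch only; the backend address is the controller IP used later in this post, and the names are hypothetical), a minimal haproxy TCP configuration could look like this:<\/p>\n<pre>frontend kube-apiserver\r\n    bind *:6443\r\n    mode tcp\r\n    default_backend apiservers\r\n\r\nbackend apiservers\r\n    mode tcp\r\n    # one line per controller node\r\n    server master1 192.168.121.247:6443 check\r\n<\/pre>\n<p>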
Therefore, to make the upgrade possible, it is a good idea to first migrate from a non-HA kubeadm setup to a HA setup.<\/p>\n<h2>Differences between a HA setup and non-HA setup<\/h2>\n<p>The main difference between a non-HA setup and a HA setup is just two command line options to <em>kubeadm init<\/em>:<\/p>\n<div>\n<pre>--upload-certs --control-plane-endpoint=master:6443<\/pre>\n<\/div>\n<p>The main effect of these flags is that the hostname <em>master<\/em> will be used instead of its IP address. Also, the certificates for the API server must allow <em>master<\/em> as a requested host, by setting it as the common name (CN) attribute or by adding it as one of the Subject Alternative Names (SANs) in the certificate.<\/p>\n<p>In addition, kubeadm stores information about the cluster in a number of config maps:<\/p>\n<table style=\"border-collapse: collapse; width: 100%;\" border=\"1\">\n<tbody>\n<tr style=\"height: 24px;\">\n<td style=\"width: 33.3333%; height: 24px;\"><strong>Namespace<\/strong><\/td>\n<td style=\"width: 33.3333%; height: 24px;\"><strong>Name<\/strong><\/td>\n<td style=\"width: 33.3333%; height: 24px;\"><strong>Difference non-HA\/HA<\/strong><\/td>\n<\/tr>\n<tr style=\"height: 72px;\">\n<td style=\"width: 33.3333%; height: 72px;\">kube-system<\/td>\n<td style=\"width: 33.3333%; height: 72px;\">kubeadm-config<\/td>\n<td style=\"width: 33.3333%; height: 72px;\"><em>controlPlaneEndpoint: master:6443<\/em> in the <em>ClusterConfiguration<\/em><\/td>\n<\/tr>\n<tr style=\"height: 48px;\">\n<td style=\"width: 33.3333%; height: 48px;\">kube-system<\/td>\n<td style=\"width: 33.3333%; height: 48px;\">kube-proxy<\/td>\n<td style=\"width: 33.3333%; height: 48px;\"><em>server<\/em> field in <em>kubeconfig.conf<\/em><\/td>\n<\/tr>\n<tr style=\"height: 24px;\">\n<td style=\"width: 33.3333%; height: 24px;\">kube-system<\/td>\n<td style=\"width: 33.3333%; height: 24px;\">kubelet-config<\/td>\n<td style=\"width: 33.3333%; height: 24px;\">no changes<\/td>\n<\/tr>\n<tr style=\"height: 
48px;\">\n<td style=\"width: 33.3333%; height: 48px;\">kube-public<\/td>\n<td style=\"width: 33.3333%; height: 48px;\">cluster-info<\/td>\n<td style=\"width: 33.3333%; height: 48px;\"><em>server<\/em> field in the <em>kubeconfig<\/em><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>The above differences were found by setting up both a non-HA and a HA cluster and comparing the end results. Making sure that these ConfigMaps are updated will support future upgrades of kubernetes using the standard <a href=\"https:\/\/kubernetes.io\/docs\/tasks\/administer-cluster\/kubeadm\/kubeadm-upgrade\/\">kubeadm upgrade procedure<\/a>.<\/p>\n<p>The basic procedure is to first update the certificates of the API server so that it can also be accessed using the hostname <em>master<\/em> in the URL instead of its IP address. Then, the other components are updated to use <em>master<\/em> to connect to the API server. Throughout the procedure we will use kubeadm commands as much as possible, so that in the end we get a setup that is identical to a standard HA kubeadm setup.<\/p>\n<p>The procedure discussed here is inspired by this <a href=\"https:\/\/blog.scottlowe.org\/2019\/08\/12\/converting-kubernetes-to-ha-control-plane\/\">blog post<\/a>, but it stays closer to the standard kubeadm HA setup and achieves an end result that more closely matches one, since all configuration is regenerated by kubeadm. For instance, we will update the certificates by adding the <em>controlPlaneEndpoint<\/em> setting to the kubeadm-config instead of adding extra SANs. Before doing any of the procedures that follow, make sure to back up your \/etc\/kubernetes and \/var\/lib\/kubelet directories. The procedures all apply to kubernetes 1.26.4 with all nodes running Ubuntu 22.04.2 LTS. 
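<\/p>\n<p>As a sketch, such a backup could be made on the controller node as follows (the archive location and name are just examples):<\/p>\n<pre>tar czf \/root\/k8s-backup-$(date +%F).tar.gz \/etc\/kubernetes \/var\/lib\/kubelet\r\n<\/pre>\n<p>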
It was verified that after this migration, a standard upgrade to kubernetes 1.27 was still possible.<\/p>\n<h2>Step 1: adding host entries for the control plane<\/h2>\n<p>In this step, identify the IP address of the single controller node and add a host entry that points to this IP on the controller and all worker nodes, e.g.:<\/p>\n<pre>192.168.121.247 master\r\n<\/pre>\n<h2>Step 2: Regenerate certificates<\/h2>\n<p>Get the <em>ClusterConfiguration<\/em> from the kubeadm-config ConfigMap:<\/p>\n<pre>kubectl -n kube-system get configmap kubeadm-config -o jsonpath='{.data.ClusterConfiguration}' &gt; kubeadm.yaml\r\n<\/pre>\n<p>and add<\/p>\n<pre>controlPlaneEndpoint: master:6443\r\n<\/pre>\n<p>at top-level.<br \/>\nNow regenerate the certificates:<\/p>\n<pre>rm -f \/etc\/kubernetes\/pki\/apiserver.*\r\nkubeadm init phase certs apiserver --config kubeadm.yaml\r\n<\/pre>\n<p>The output of the above command should already show <em>master<\/em> being added as one of the SANs.<\/p>\n<p>Now verify the generated certificate:<\/p>\n<pre>openssl x509 -in \/etc\/kubernetes\/pki\/apiserver.crt -text -noout\r\n<\/pre>\n<p>You should see <em>DNS:master<\/em> appear as one of the SANs.<\/p>\n<p>Next, restart the API server. This can be done by killing the API server process manually, or by temporarily moving <em>kube-apiserver.yaml<\/em> out of the <em>\/etc\/kubernetes\/manifests<\/em> directory.<br \/>\nTo verify that it works, edit the <em>.kube\/config<\/em> file in your home directory and modify the server URL to use <em>master<\/em> instead of its IP. 
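<\/p>\n<p>After that edit, the <em>server<\/em> field in the cluster section of the kubeconfig should look like this:<\/p>\n<pre>server: https:\/\/master:6443\r\n<\/pre>\n<p>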
Then try some kubectl commands to check access.<\/p>\n<p>Finally, upload the modified cluster configuration:<\/p>\n<pre>kubeadm init phase upload-config --config kubeadm.yaml\r\n<\/pre>\n<h2>Step 3: edit config maps<\/h2>\n<p>Update the remaining ConfigMaps so that the server URL uses <em>master<\/em> instead of the IP address:<\/p>\n<pre>kubectl edit cm -n kube-public cluster-info\r\nkubectl edit cm -n kube-system kube-proxy\r\n<\/pre>\n<h2>Step 4: Update scheduler, controller manager, and kubelet config files<\/h2>\n<p>Next, regenerate the configuration files for the other components:<\/p>\n<pre>rm -f \/etc\/kubernetes\/*.conf\r\nkubeadm init phase kubeconfig all --config kubeadm.yaml\r\n<\/pre>\n<p>It is expected that after this, the scheduler and controller manager will still use the local API server instance. However, <em>admin.conf<\/em> and <em>kubelet.conf<\/em> should now be using <em>master<\/em> instead of the IP address of the master. This is the same as what you get in a standard HA kubeadm cluster setup.<\/p>\n<p>Now restart these components,<\/p>\n<pre>systemctl daemon-reload\r\nsystemctl restart kubelet\r\nkubectl delete pod -n kube-system -l component=kube-scheduler\r\nkubectl delete pod -n kube-system -l component=kube-controller-manager\r\nkubectl delete pod -n kube-system -l k8s-app=kube-proxy\r\n<\/pre>\n<p>and wait for all of them to be running again.<\/p>\n<p>Next, verify in the etcd setup that the peer URL uses the public IP of the controller node:<\/p>\n<pre>root@master1:\/etc\/kubernetes\/pki\/etcd# export ETCDCTL_API=3\r\nroot@master1:\/etc\/kubernetes\/pki\/etcd# etcdctl --cacert ca.crt --cert server.crt --key server.key member list --write-out table\r\n+------------------+---------+---------+------------------------------+------------------------------+\r\n|        ID        | STATUS  |  NAME   |          PEER ADDRS          |         CLIENT ADDRS         
|\r\n+------------------+---------+---------+------------------------------+------------------------------+\r\n| 356b794c90ad4dca | started | master1 | https:\/\/192.168.121.247:2380 | https:\/\/192.168.121.247:2379 |\r\n+------------------+---------+---------+------------------------------+------------------------------+\r\n<\/pre>\n<p>I have seen cases where joining a controller node failed because the peer URL was using localhost. This causes problems when a second node is joined, since the joining node will use the peer URL advertised by the etcd server. If the peer address is wrong, then use<\/p>\n<pre>etcdctl member update MEMBERID --peer-urls=https:\/\/EXTERNAL_IP:2380\r\n<\/pre>\n<p>to fix it.<\/p>\n<h2>Step 5: update the kubelet on worker nodes<\/h2>\n<p>To update the kubelet on the worker nodes, edit <em>\/etc\/kubernetes\/kubelet.conf<\/em> to use <em>master<\/em> instead of the IP address of the API server, and after that restart the kubelet:<\/p>\n<pre>systemctl daemon-reload\r\nsystemctl restart kubelet\r\n<\/pre>\n<h2>Step 6: setup new entries for all users in .kube\/config<\/h2>\n<p>As a result of this procedure, the client certificates used to identify users will no longer work. Therefore, you must create new certificates for all users.<\/p>\n<h2>Final thoughts<\/h2>\n<p>By setting up both a non-HA and a HA cluster and comparing them, it becomes easy to identify the precise differences between the two setups. Based on this, and by using kubeadm tools as much as possible for updating configuration files, it is possible to migrate a non-HA cluster to a HA cluster, which can then be used to upgrade all nodes to newer OS versions. 
The final cluster could still be upgraded to the next kubernetes version using the standard <a href=\"https:\/\/kubernetes.io\/docs\/tasks\/administer-cluster\/kubeadm\/kubeadm-upgrade\/\">kubeadm upgrade procedure<\/a>.<\/p>\n<p>One major consequence is that you must regenerate all user certificates after upgrading the cluster to HA. This would normally be unacceptable for a production setup, but in production you would probably be using a cloud provider and would never run into the issue of having a non-HA cluster in the first place.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>When I set up my kubernetes cluster using kubeadm some years ago, I decided to use a simple non-HA setup of kubernetes, because (1) it simplifies the setup and (2) the cluster will be running on a single server anyway. In &hellip; <a href=\"https:\/\/brakkee.org\/site\/2023\/06\/04\/converting-to-a-non-ha-kubeadm-kubernetes-setup-to-ha\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"inline_featured_image":false,"footnotes":""},"categories":[10],"tags":[],"_links":{"self":[{"href":"https:\/\/brakkee.org\/site\/wp-json\/wp\/v2\/posts\/2790"}],"collection":[{"href":"https:\/\/brakkee.org\/site\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/brakkee.org\/site\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/brakkee.org\/site\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/brakkee.org\/site\/wp-json\/wp\/v2\/comments?post=2790"}],"version-history":[{"count":57,"href":"https:\/\/brakkee.org\/site\/wp-json\/wp\/v2\/posts\/2790\/revisions"}],"predecessor-version":[{"id":2848,"href":"https:\/\/brakkee.org\/site\/wp-json\/wp\/v2\/posts\/2790\/revisions\/2848"}],"wp:attachment":[{"href":"https:\/\/brakkee.org\/site\/wp-json\/wp\/v2\/media?parent=2790"}],"wp:term":[{"t
axonomy":"category","embeddable":true,"href":"https:\/\/brakkee.org\/site\/wp-json\/wp\/v2\/categories?post=2790"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/brakkee.org\/site\/wp-json\/wp\/v2\/tags?post=2790"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}