{"id":2688,"date":"2023-02-14T19:46:27","date_gmt":"2023-02-14T19:46:27","guid":{"rendered":"https:\/\/brakkee.org\/site\/?p=2688"},"modified":"2023-02-26T12:37:32","modified_gmt":"2023-02-26T12:37:32","slug":"silencing-kubernetes-at-home","status":"publish","type":"post","link":"https:\/\/brakkee.org\/site\/2023\/02\/14\/silencing-kubernetes-at-home\/","title":{"rendered":"Running kubernetes etcd in-memory"},"content":{"rendered":"<p style=\"text-align: left;\">After setting up my kubernetes cluster at home back in June 2021, one of the first things I noticed was a lot more noise from the server. Apparently, it was just a lot of disk IO coming from kubernetes and in particular etcd. Therefore, I decided to fix this problem.<br \/>\n<!--more--><\/p>\n<p>Before we proceed it is important to note that my cluster is running on a single server, so high availability (apart from restarting failed containers) is not my aim. Of course, I will experiment with replicated storage such as <a href=\"https:\/\/longhorn.io\/\">longhorn<\/a> and perhaps <a href=\"https:\/\/rook.io\/\">ceph\/rook<\/a> in the future, but at the end of the day it is still a single server with a single kubernetes cluster and a single controller node. This means that running etcd from a ramdisk is an option. 
I am using a <a href=\"https:\/\/kubernetes.io\/docs\/setup\/production-environment\/tools\/kubeadm\/create-cluster-kubeadm\/\">kubeadm<\/a> setup of kubernetes so <em>etcd<\/em> is running as a pod inside my cluster.<\/p>\n<p>Running <em>etcd<\/em> from a ramdisk requires the following:<\/p>\n<ul>\n<li>regular backups of <em>etcd<\/em> state<\/li>\n<li>an additional backup just after the kubelet has stopped but containers (such as <em>etcd<\/em>) are still running<\/li>\n<li>an additional restore of <em>etcd<\/em> before starting the kubelet<\/li>\n<li>mounting the storage directory <em>\/var\/lib\/etcd<\/em> as a ramdisk (tmpfs)<\/li>\n<\/ul>\n<p><a href=\"https:\/\/en.wikipedia.org\/wiki\/Systemd\">Systemd<\/a> is a replacement for init that has been in use for several years now. A useful feature of systemd is that the behavior of an existing unit can be extended with drop-in files, without editing the unit file itself. I use this to add a pre-start hook and a post-stop hook to the kubelet service. These hooks are created by placing .conf files in the drop-in directory <em>\/usr\/lib\/systemd\/system\/kubelet.service.d<\/em>.<\/p>\n<p>The pre-start hook is as follows:<\/p>\n<pre>[Unit]\r\nAfter=containerd.service\r\n\r\n[Service]\r\nExecStartPre=-\/opt\/wamblee\/etcd\/bin\/etcd-restore-to-tmpfs\r\n<\/pre>\n<p>In this configuration, I have added an ordering dependency on <em>containerd<\/em>, since the pre-start hook requires the container runtime to be started. The hook also runs the restore script before the kubelet starts. Because the command is prefixed with &#8216;-&#8217;, systemd ignores a failure of the restore, so the kubelet still starts when the restore fails.<\/p>\n<p>The post-stop hook is similar:<\/p>\n<pre>[Service]\r\nExecStop=\/opt\/wamblee\/etcd\/bin\/etcd-cron\r\n<\/pre>\n<p>The scripts here contain all the intelligence for backing up and restoring <em>etcd<\/em> data. In particular, the backup script requires a running <em>etcd<\/em> and runs a backup command in a container. It keeps the last 10 backups taken and one backup per day for the last 31 days. 
It also backs up the name of the docker image for etcd, since that may vary depending on the kubernetes version. That <em>etcd<\/em> image is then used by the restore script, so backup and restore always use the same version of <em>etcd<\/em> as the kubernetes cluster.<\/p>\n<p>Mounting the storage directory \/var\/lib\/etcd as a ramdisk requires adding a single line to \/etc\/fstab:<\/p>\n<pre>tmpfs  \/var\/lib\/etcd   tmpfs   defaults,noatime,size=2g  0 0\r\n<\/pre>\n<p>The final step is to run the periodic backup using a cron script placed in <em>\/etc\/cron.d<\/em>:<\/p>\n<pre>*\/15 * * * * root \/opt\/wamblee\/etcd\/bin\/etcd-cron &gt; \/var\/log\/wamblee-etcd-backup 2&gt;&amp;1\r\n30 0 * * * root \/opt\/wamblee\/etcd\/bin\/etcdctl defrag --cluster &gt; \/var\/log\/wamblee-etcd-defrag 2&gt;&amp;1\r\n<\/pre>\n<p>Note that the cron script above also contains a daily defragmentation task, because shortly after setting up monitoring with prometheus, I got messages about <em>etcd<\/em> fragmentation.<\/p>\n<p>Initially, the scripts were based on docker, but kubernetes has since stopped using docker by default, and I have switched to containerd. To do this transparently, I added a docker script that simply delegates to <em>containerd<\/em> using <em><a href=\"https:\/\/github.com\/containerd\/nerdctl\">nerdctl<\/a><\/em>. In the same way, the backup solution can be adapted to other container runtimes.<\/p>\n<p>See the full source code together with setup instructions <a href=\"https:\/\/git.wamblee.org\/blog\/code\/src\/branch\/main\/etcd-inmemory\">here<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>After setting up my kubernetes cluster at home back in June 2021, one of the first things I noticed was a lot more noise from the server. 
Apparently, it was just a lot of disk IO coming from kubernetes and &hellip; <a href=\"https:\/\/brakkee.org\/site\/2023\/02\/14\/silencing-kubernetes-at-home\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"inline_featured_image":false,"footnotes":""},"categories":[10],"tags":[],"_links":{"self":[{"href":"https:\/\/brakkee.org\/site\/wp-json\/wp\/v2\/posts\/2688"}],"collection":[{"href":"https:\/\/brakkee.org\/site\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/brakkee.org\/site\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/brakkee.org\/site\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/brakkee.org\/site\/wp-json\/wp\/v2\/comments?post=2688"}],"version-history":[{"count":21,"href":"https:\/\/brakkee.org\/site\/wp-json\/wp\/v2\/posts\/2688\/revisions"}],"predecessor-version":[{"id":2757,"href":"https:\/\/brakkee.org\/site\/wp-json\/wp\/v2\/posts\/2688\/revisions\/2757"}],"wp:attachment":[{"href":"https:\/\/brakkee.org\/site\/wp-json\/wp\/v2\/media?parent=2688"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/brakkee.org\/site\/wp-json\/wp\/v2\/categories?post=2688"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/brakkee.org\/site\/wp-json\/wp\/v2\/tags?post=2688"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}