{"id":2909,"date":"2024-04-16T20:29:33","date_gmt":"2024-04-16T20:29:33","guid":{"rendered":"https:\/\/brakkee.org\/site\/?p=2909"},"modified":"2024-04-20T11:14:10","modified_gmt":"2024-04-20T11:14:10","slug":"my-approach-to-the-certified-kubernetes-security-specialist-certification","status":"publish","type":"post","link":"https:\/\/brakkee.org\/site\/2024\/04\/16\/my-approach-to-the-certified-kubernetes-security-specialist-certification\/","title":{"rendered":"My approach to the Certified Kubernetes Security Specialist certification"},"content":{"rendered":"<p>In April 2024 I successfully passed the <a href=\"https:\/\/brakkee.org\/site\/wp-content\/uploads\/2024\/04\/cks-20240407-1.pdf\">CKS certification exam<\/a>, but compared to the <a href=\"https:\/\/brakkee.org\/site\/wp-content\/uploads\/2024\/04\/ckad-20230102.pdf\">Certified Kubernetes Application Developer<\/a> and <a href=\"https:\/\/brakkee.org\/site\/wp-content\/uploads\/2024\/04\/cka-20230523.pdf\">Certified Kubernetes Administrator<\/a> this was the toughest one yet. Not because the exam is particularly hard. The questions were all in closed form, I guess so that automatic grading is possible. The main difficulties were that:<\/p>\n<ul>\n<li>there are many new topics such as runtime security with Falco, Seccomp, Apparmor, and gvisor, as well as security scanning tools such as trivy and kube-bench.<\/li>\n<li>the time pressure during the exam is high. I experienced that firsthand by not being able to finish one of the questions. To go fast you need to prepare things very well. In particular, I set up my own single-node kubernetes cluster using Vagrant so I could test out many topics using a standard kubeadm setup.<\/li>\n<li>the proctoring during the exam was the worst experience yet. The proctor took more than 30 minutes to release the exam so I could start. 
Additionally, the proctor interrupted me twice, for no good reason I think, which badly broke my flow.<\/li>\n<\/ul>\n<h2>Why I took the exam<\/h2>\n<p><!--more--><\/p>\n<p>I already knew that, as a result of studying for the CKAD and CKA exams, I had become much quicker in using the command line and online docs, which to this day helps me get things done more quickly. Also, if there is something to be configured, like roles, role bindings, service accounts, or other things, it just seems easy now. In a way, it shifts boundaries. Having a good overview of what kubernetes really allows helps me make better design decisions, and things that seemed daunting before have become easy.<\/p>\n<p>With the CKA in particular I got a much better understanding of how the different components of kubernetes work together. As a result, it became a lot easier to rescue my home cluster in case of problems, and I became a lot more confident and successful in fixing things. For both CKAD and CKA there was time pressure, but not as much as for CKS. With CKA for instance, I was finished in 1.5 hours, leaving 30 minutes for troubleshooting and fixing questions where I had doubts.<\/p>\n<p>Then after finishing the CKA exam, I was so happy that I immediately bought the CKS exam, especially since a huge discount brought the price down to just 150 USD. Then nothing happened: I did not study for it at all until I met some people at a CNCF kubernetes day in December and talked about CKS. That reminded me to take the exam, so slowly over the course of January I started to study for it. The kodekloud course wasn&#8217;t that good in my opinion so I watched another course on <a href=\"https:\/\/www.youtube.com\/watch?v=d9xfB5qaOfg\">youtube<\/a> to get a more complete picture. After that came a lot of practice using questions from kodekloud, killer shell exam preparation, and some exercises that I defined myself. 
All in all, a lot of preparation went into this.<\/p>\n<p>Also, the CKS exam was my goal from the start because of the subjects covered, and CKA is a requirement for CKS. I am happy I achieved this goal now.<\/p>\n<h2>Tips and tricks<\/h2>\n<p>Below I will list my own tips and tricks. Many of them are about validation, plus some essential checks so you do not get blocked right at the beginning. There are also some speed tips that can be really useful. The tips and tricks are focused on the tools, not on the questions you may get on the exam.<\/p>\n<h3>Basic approach<\/h3>\n<p>My standard settings in .bashrc were:<\/p>\n<p><span style=\"font-size: 10pt; font-family: 'courier new', courier, monospace;\">export DRYRUN=&quot;--dry-run=client -o yaml&quot;<\/span><br \/>\n<span style=\"font-size: 10pt; font-family: 'courier new', courier, monospace;\">alias kls='kubectl config get-contexts'<\/span><br \/>\n<span style=\"font-size: 10pt; font-family: 'courier new', courier, monospace;\">alias kns='kubectl config set-context --current --namespace'<\/span><br \/>\n<span style=\"font-size: 10pt; font-family: 'courier new', courier, monospace;\">alias kctx='kubectl config use-context'<\/span><\/p>\n<p>I used the <span style=\"font-family: 'courier new', courier, monospace; font-size: 10pt;\">kns<\/span> alias in particular a lot.<\/p>\n<p>I created a separate directory for every question, named q1, q2, etc. If there was a question I needed to get back to I simply touched a file ~\/q1.checkfinalresult (or something more specific). Make sure to back up input files so you can always go back if needed.<\/p>\n<p>Be really quick in the use of the command line. Use kubectl as much as possible. 
Make use of <span style=\"font-family: 'courier new', courier, monospace; font-size: 10pt;\">kubectl api-resources<\/span> and <span style=\"font-size: 10pt; font-family: 'courier new', courier, monospace;\">kubectl explain<\/span> where needed. This is always faster than the online docs.<\/p>\n<p>Copy snippets from the kubernetes documentation website to a local templates directory so you can reuse them. This is faster than looking them up a second time.<\/p>\n<p>Use the killer shell practice exams provided with the CKS.<\/p>\n<p>Use these exams to get to know the exam environment. In particular, cut and paste is important. In my case it meant selecting items in the question using the left mouse button (weird and unnatural), then pasting in a terminal using right-click paste. Cutting and pasting from firefox running in the virtual desktop works the standard linux way, using the middle mouse button for paste. Also, it cannot hurt to use firefox while preparing so that you are used to firefox at the exam.<\/p>\n<p>Also check how you can reduce the font size because the default size is too big.<\/p>\n<h3>Use yamllint<\/h3>\n<p>Use yamllint to check yaml after modifying files, especially in \/etc\/kubernetes\/manifests. Install it using <span style=\"font-family: 'courier new', courier, monospace; font-size: 10pt;\">apt install yamllint -y<\/span>. All errors of yamllint about spaces can be ignored. But <span style=\"font-family: 'courier new', courier, monospace; font-size: 10pt;\">yamllint<\/span> reporting duplicate keys can point to a real problem since a second key overrides the first one. 
I am sure that this cost me some points at the CKA and CKAD exams.<\/p>\n<h3>Quick restarts<\/h3>\n<p>Use <span style=\"font-family: 'courier new', courier, monospace; font-size: 10pt;\">kubectl delete pod <\/span><em>&lt;pod&gt;<\/em> <span style=\"font-family: 'courier new', courier, monospace; font-size: 10pt;\">--now<\/span> or <span style=\"font-family: 'courier new', courier, monospace; font-size: 10pt;\">kubectl delete pod <em><span style=\"font-family: georgia, palatino, serif; font-size: 12pt;\">&lt;pod&gt;<\/span><\/em> --force --grace-period=0<\/span> to quickly delete a pod. No sense waiting for too long.<\/p>\n<p>To restart the apiserver after a config change, you can either wait until it is restarted automatically or kill the apiserver process on the master and do a <span style=\"font-family: 'courier new', courier, monospace; font-size: 10pt;\">systemctl restart kubelet<\/span>. Guess which one is (a lot) faster?<\/p>\n<h3>Investigating container processes on the host<\/h3>\n<p>To find out the pod name of a container process from the host, use either<\/p>\n<p><span style=\"font-family: 'courier new', courier, monospace; font-size: 10pt;\">\u00a0 nsenter -t &lt;pid&gt; -u hostname<\/span><\/p>\n<p>or<\/p>\n<p><span style=\"font-family: 'courier new', courier, monospace; font-size: 10pt;\">\u00a0 cat \/proc\/&lt;pid&gt;\/environ | strings | grep HOSTNAME<\/span><\/p>\n<p>This identifies the hostname, which is (usually) identical to the pod name.<\/p>\n<p>Given a container, use<\/p>\n<p><span style=\"font-family: 'courier new', courier, monospace; font-size: 10pt;\">\u00a0 crictl inspect &lt;containerid&gt; | grep pid<\/span><\/p>\n<p>to identify the pid of the main container process on the host. This is the first pid in the output.<\/p>\n<p>The entire file system as a process sees it is at <span style=\"font-family: 'courier new', courier, monospace; font-size: 10pt;\">\/proc\/&lt;pid&gt;\/root<\/span>. 
This can be used to quickly check whether a given volume mount is already working, that is, whether my config file is already visible at the correct location to the process that needs it.<\/p>\n<p>For apiserver troubleshooting, use <span style=\"font-family: 'courier new', courier, monospace; font-size: 10pt;\">crictl ps -a | grep apiserver<\/span> to get the container id of the failed process. Use <span style=\"font-family: 'courier new', courier, monospace; font-size: 10pt;\">crictl logs &lt;containerid&gt;<\/span> to get the logs of the failed startup.<\/p>\n<h3>Falco<\/h3>\n<p>The first thing to find out with falco is how it is running. In most cases it will be running as a systemd service named falco, at least in all the courses I have seen. However, installing falco yourself on a single-node cluster reveals that there are many ways to run falco using different services.<\/p>\n<p style=\"padding-left: 40px;\"><strong>Tip 1<\/strong>: Identify what systemd service falco is using.<br \/>\n<span style=\"font-size: 10pt;\"><span style=\"font-family: 'courier new', courier, monospace;\">systemctl list-unit-files | grep falco<\/span><br \/>\n<span style=\"font-size: 12pt;\">And identify the command line used to run falco.\u00a0<\/span><\/span><\/p>\n<p>This will identify the falco service that is actually in use (the enabled one). Now use <span style=\"font-family: 'courier new', courier, monospace; font-size: 10pt;\">systemctl status falco-bpf<\/span> (or whatever service was used) to find the path to the service file. 
From that service file get the command that is used to run falco, which can be useful later.<\/p>\n<p style=\"padding-left: 40px;\"><strong>Tip 2<\/strong>: If you adapt rules, use <span style=\"font-family: 'courier new', courier, monospace; font-size: 10pt;\">falco -V rulesfile.yaml<\/span> to validate them.<\/p>\n<p>Here, the rules file can be any of the rules files in <span style=\"font-family: 'courier new', courier, monospace; font-size: 10pt;\">\/etc\/falco<\/span> or <span style=\"font-size: 10pt; font-family: 'courier new', courier, monospace;\">\/etc\/falco\/rules.d<\/span>. This matters because systemd somehow does not show errors at falco startup in a consistent way.<\/p>\n<p style=\"padding-left: 40px;\"><strong>Tip 3<\/strong>: If you are asked to quickly identify containers, pods, or kubernetes namespaces, add the <span style=\"font-family: 'courier new', courier, monospace; font-size: 10pt;\">-pk<\/span> flag to the falco startup.<\/p>\n<p>This allows you to use <span style=\"font-family: 'courier new', courier, monospace; font-size: 10pt;\">%container.info<\/span> in output formatting, which prints out a lot of details about a container including kubernetes namespace and pod.<\/p>\n<p style=\"padding-left: 40px;\"><strong>Tip 4<\/strong>: Use <span style=\"font-family: 'courier new', courier, monospace; font-size: 10pt;\">falco --list<\/span> and don&#8217;t use the online documentation at falco.org\/docs.<\/p>\n<p>This is easy, and using the command line makes it fast. Also remember some of the important categories such as evt, proc, and k8s. 
Use <span style=\"font-family: 'courier new', courier, monospace; font-size: 10pt;\">falco --list | grep '^proc'<\/span> for instance to see all process formatting options.<\/p>\n<p style=\"padding-left: 40px;\"><strong>Tip 5<\/strong>: When you need output in a certain format, add <span style=\"font-family: 'courier new', courier, monospace; font-size: 10pt;\">-p &quot;:RULE %evt.time,%proc.cmdline,&#8230;&quot;<\/span> to the options.<\/p>\n<p>Using the -p option allows you to append the given format to the output of every rule. This is fast since it allows you to avoid editing rule files. The best approach for production would be to identify the rules that require modification, copy them into <span style=\"font-family: 'courier new', courier, monospace; font-size: 10pt;\">falco_rules.local.yaml<\/span> and then edit their output fields. However, this is slow. An advantage of prefixing the additional output with &#8220;<span style=\"font-family: 'courier new', courier, monospace; font-size: 10pt;\">:RULE<\/span> &#8221; is that it allows you to quickly filter out the existing rule text when finally saving the required output to a file. With tip 5 you do not edit rule files, so tip 2 is of course no longer needed.<\/p>\n<p style=\"padding-left: 40px;\"><strong>Tip 6<\/strong>: Run falco in the foreground instead of as a service.<\/p>\n<p>Stopping the falco service and running it by hand, based on the command line identified in tip 1 and extended as in tip 5, has many advantages: troubleshooting is quick since errors are logged to the terminal alongside the rule output. Also, it becomes easy to just let it run for a given amount of time, after which you copy\/paste the output into a file. Then filter it to remove the original rule text. 
Note that you can also script running falco for some time, but then output buffering can be an issue and you need to use <span style=\"font-family: 'courier new', courier, monospace; font-size: 10pt;\">stdbuf -oL<\/span> to force line buffering.<\/p>\n<p>All in all, these tips can save you a lot of time with this task. I went down from 18 minutes to around 5 when using tips 5 and 6.<\/p>\n<h3>Seccomp<\/h3>\n<p>Seccomp is relatively easy. The syntax for seccomp in a pod yaml is simple, just memorize it. Also memorize the base path where the kubelet looks for seccomp profiles, which is <span style=\"font-family: 'courier new', courier, monospace; font-size: 10pt;\">\/var\/lib\/kubelet\/seccomp<\/span>.<\/p>\n<p>In addition, verify that seccomp is being used by a process using <span style=\"font-family: 'courier new', courier, monospace; font-size: 10pt;\">grep -i seccomp \/proc\/&lt;pid&gt;\/status<\/span>. This should show 2 when a process is configured with a specific json profile.<\/p>\n<p>Even better, use <span style=\"font-family: 'courier new', courier, monospace; font-size: 10pt;\">crictl inspect &lt;container&gt; | jq '.. | objects | .seccomp \/\/ empty' <\/span>to show the actual json profile in use by the container. Or use <span style=\"font-family: 'courier new', courier, monospace; font-size: 10pt;\">crictl inspect &lt;container&gt; | jq '.. | objects | .seccomp' <span style=\"font-family: georgia, palatino, serif; font-size: 12pt;\">and ignore the null values. 
<\/span><\/span>This provides a deeper validation than just using <span style=\"font-size: 10pt; font-family: 'courier new', courier, monospace;\">\/proc\/&lt;pid&gt;\/status<\/span>.<\/p>\n<p>When looking at the logs issued by seccomp using<span style=\"font-family: 'courier new', courier, monospace; font-size: 10pt;\"> journalctl -x | grep -i seccomp<\/span>, map between system call codes and names using <span style=\"font-family: 'courier new', courier, monospace; font-size: 10pt;\">ausyscall<\/span>.<\/p>\n<h3>Apparmor<\/h3>\n<p>Remember: for <strong>S<\/strong>eccomp we use the <strong>s<\/strong>ecurityContext to configure it and for <strong>a<\/strong>pparmor we use <strong>a<\/strong>nnotations. Apparmor is not that hard so memorize the annotation. I memorized it in parts, namely <span style=\"font-family: 'courier new', courier, monospace; font-size: 10pt;\">container.apparmor.security<\/span> followed by <span style=\"font-size: 10pt; font-family: 'courier new', courier, monospace;\">beta.kubernetes.io<\/span> followed by <span style=\"font-family: 'courier new', courier, monospace; font-size: 10pt;\">\/<em>&lt;CONTAINER&gt;<\/em><\/span>, with value of either <span style=\"font-family: 'courier new', courier, monospace;\">localhost\/<span style=\"font-size: 10pt;\"><em>&lt;PROFILE&gt;<\/em><\/span><\/span>, <span style=\"font-family: 'courier new', courier, monospace; font-size: 10pt;\">runtime\/default<\/span>, or <span style=\"font-family: 'courier new', courier, monospace; font-size: 10pt;\">unconfined<\/span>. 
Note that <span style=\"font-family: 'courier new', courier, monospace; font-size: 10pt;\"><em>&lt;PROFILE&gt;<\/em><\/span> is the name of the profile as defined inside the apparmor file, not the name of the file.<\/p>\n<p>To check your work use <span style=\"font-family: 'courier new', courier, monospace; font-size: 10pt;\">ps auxZ<\/span> or for a process tree <span style=\"font-size: 10pt; font-family: 'courier new', courier, monospace;\">ps auxZ --forest<\/span>. The <span style=\"font-family: 'courier new', courier, monospace; font-size: 10pt;\">Z<\/span>\u00a0flag causes the apparmor profile to be listed. Also know your tools such as <span style=\"font-family: 'courier new', courier, monospace; font-size: 10pt;\">apparmor_parser <\/span>to load profiles, and the various <span style=\"font-family: 'courier new', courier, monospace; font-size: 10pt;\">aa-*<\/span> commands.<\/p>\n<p>If all else fails, then consult the documentation (at work I usually do it the other way around). Practice this a number of times in your own environment.<\/p>\n<h3>Image policy webhook<\/h3>\n<p>For the image policy webhook, test out how kubernetes behaves when you make mistakes in the configuration. 
That way, you know that when the apiserver comes back up, a number of things are already ok.<\/p>\n<p>Here is the error behavior I found in kubernetes 1.29 when the ImagePolicyWebhook is added to the enabled admission plugins:<\/p>\n<ul>\n<li>image policy webhook configured: apiserver does not start and logs error<\/li>\n<li>image policy absent in config file: same<\/li>\n<li>no kube config: same<\/li>\n<li>no URL in kube config: same<\/li>\n<li>host not found in kube config: kubectl error &#8216;no such host&#8217; when defaultAllow false, silent otherwise<\/li>\n<li>wrong URL to existing host: kubectl error &#8216;the server could not find the requested resource&#8217;<\/li>\n<\/ul>\n<p>So when the apiserver runs, you already know that the first four problems are ruled out. The final thing to check is whether the apiserver logs that the image policy webhook is enabled.<\/p>\n<p>For troubleshooting, add the <span style=\"font-family: 'courier new', courier, monospace; font-size: 10pt;\">--v=8 <\/span>flag to the apiserver. Then restart the apiserver and grep the logs (using crictl as before) using <span style=\"font-family: 'courier new', courier, monospace; font-size: 10pt;\">egrep -i 'imagepolicy|<em><span style=\"font-family: georgia, palatino, serif; font-size: 12pt;\">&lt;HOST&gt;<\/span><\/em>'<\/span> where <em>&lt;HOST&gt;<\/em> is the hostname specified in the kube config used by the webhook. You should see the admission review, URL of the image policy webhook, and admission review response in the logs.<\/p>\n<h3>Use your own single-node kubeadm cluster with vagrant<\/h3>\n<p>Using vagrant it is easy to set up a cluster. Using a single-node kubeadm cluster allows you to try out anything you want.<\/p>\n<p>See <a href=\"https:\/\/git.wamblee.org\/blog\/code\/src\/branch\/main\/single-node-kubernetes\">here<\/a> for the vagrant setup. I used this mostly on linux with libvirt but also on windows with virtualbox. 
I used <span style=\"font-family: 'courier new', courier, monospace; font-size: 10pt;\">vagrant snapshot save base<\/span> to create a snapshot once the cluster was running and restored it using <span style=\"font-family: 'courier new', courier, monospace; font-size: 10pt;\">vagrant snapshot restore base<\/span>.<\/p>\n<p>The setup uses ubuntu 20.04, similar to the exam, but I also used it with debian 12.<\/p>\n<h2>Final thoughts<\/h2>\n<p>I hope these tips will help someone during the exam. Because of the time pressure in the exam, I recommend validating your work if it can be done quickly and otherwise moving on to the next question. This is particularly an issue with network policies, which can require more time to validate depending on the circumstances. If you are confident, then move on and validate later. Some people recommend doing the questions with the most points first, which can also work, but I opted for just going ahead and doing them one by one since investigating which questions to do first also takes time.<\/p>\n<p>The exam experience itself was horrible, including the long intake procedure, interruptions by the proctor during the exam, and the time pressure. This was definitely the worst exam experience yet. I am not going to renew these kubernetes certifications in the future since I am working full time in this area now, renewal costs are just as high as the initial certification, and the exam experience was bad.<\/p>\n<p>However, I still think it was very useful to do all these certifications and I learned a lot preparing for them. In particular, I think I have become more security aware now as a result of the CKS and I have a better overview of the types of security measures that can be taken. 
It will definitely help me in the future.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In April 2024 I successfully passed the CKS certification exam, but compared to the Certified Kubernetes Application Developer and Certified Kubernetes Administrator this was the toughest one yet. Not because the exam is particularly hard. The questions were all in &hellip; <a href=\"https:\/\/brakkee.org\/site\/2024\/04\/16\/my-approach-to-the-certified-kubernetes-security-specialist-certification\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"inline_featured_image":false,"footnotes":""},"categories":[10],"tags":[],"_links":{"self":[{"href":"https:\/\/brakkee.org\/site\/wp-json\/wp\/v2\/posts\/2909"}],"collection":[{"href":"https:\/\/brakkee.org\/site\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/brakkee.org\/site\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/brakkee.org\/site\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/brakkee.org\/site\/wp-json\/wp\/v2\/comments?post=2909"}],"version-history":[{"count":58,"href":"https:\/\/brakkee.org\/site\/wp-json\/wp\/v2\/posts\/2909\/revisions"}],"predecessor-version":[{"id":2972,"href":"https:\/\/brakkee.org\/site\/wp-json\/wp\/v2\/posts\/2909\/revisions\/2972"}],"wp:attachment":[{"href":"https:\/\/brakkee.org\/site\/wp-json\/wp\/v2\/media?parent=2909"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/brakkee.org\/site\/wp-json\/wp\/v2\/categories?post=2909"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/brakkee.org\/site\/wp-json\/wp\/v2\/tags?post=2909"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}