Using argocd with k3d to manage another k3d cluster

I am currently experimenting with argocd, with the aim of (almost) fully automated bootstrapping of my kubernetes cluster at home. One of the first things to set up when experimenting is a test environment. There are different deployment options for argocd to consider:

  • deploy argocd in the cluster that it is managing
  • deploy argocd in another cluster

Both have their advantages and disadvantages. With the first option, the advantage is that remote access is not required, but the secrets to access the git repository are then available on the target cluster, which might not be what you want. Also, polling the various git repositories that contain application definitions puts some additional load on the target kubernetes cluster.

The second option is the more natural one: the cluster is not managing itself but is managed from the outside. The git repository secrets are not required on the target cluster, and it also supports the case where multiple kubernetes clusters must share the same configuration, through the argocd ApplicationSet concept.

Environment setup

In this post I am investigating the second option for a development environment. There is one cluster, k3d-xyz, that contains the argocd deployment, and another one, k3d-abc, that must be managed. Both clusters are created using k3d cluster create. Argocd is installed on the k3d-xyz cluster using helm as follows:

helm repo add argo https://argoproj.github.io/argo-helm
helm install argo argo/argo-cd --namespace argocd --create-namespace --version 5.20.4

The argocd command-line client is also installed on the host using:

curl -sSL -o argocd-linux-amd64 https://github.com/argoproj/argo-cd/releases/latest/download/argocd-linux-amd64
install -m 555 argocd-linux-amd64 ~/bin/argocd
rm argocd-linux-amd64
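
To use the CLI against this installation, you can port-forward the argocd server service and log in with the initial admin password. A minimal sketch; note that the service name argo-argocd-server follows from the helm release name argo above and may differ in your setup:

kubectl port-forward svc/argo-argocd-server -n argocd 8080:443 &
argocd login localhost:8080 --username admin --insecure \
  --password "$( kubectl -n argocd get secret argocd-initial-admin-secret \
                   -o jsonpath='{.data.password}' | base64 -d )"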

Managing k3d-abc through k3d-xyz

The standard approach to take is simply to use

argocd cluster add k3d-abc

when the active context is k3d-xyz. This will fail, however, since argocd takes the server configuration for k3d-abc from the .kube/config file on the host, and that contains a URL that is only reachable from the host, not from the docker container in which k3d-xyz is running.
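
You can see the problem by inspecting the server URL that the host kubeconfig uses for k3d-abc; it is a host-only address, something like the example below (the port will differ):

kubectl config view -o jsonpath='{.clusters[?(@.name=="k3d-abc")].cluster.server}'
# prints e.g. https://0.0.0.0:36245 - only reachable from the host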

To deal with this we must configure the connection from k3d-xyz to k3d-abc manually using a secret.

The first step of the failed argocd cluster add command already created an argocd-manager service account on k3d-abc, so we can reuse that.
Next, the server container of the k3d-abc cluster must be connected to the docker network of the k3d-xyz cluster:

docker network connect k3d-xyz k3d-abc-server-0

This allows the k3d-xyz cluster to access the API server of k3d-abc on k3d-abc-server-0:6443. You can verify this by exec-ing into the server container of the k3d-xyz cluster and using telnet.
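
Alternatively, the same check can be done from a throwaway container attached to the k3d-xyz network. A sketch using the curlimages/curl image; the /version endpoint should be served to anonymous clients by default, so this prints the kubernetes version info of k3d-abc:

docker run --rm --network k3d-xyz curlimages/curl -k https://k3d-abc-server-0:6443/version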

Next up is to obtain the bearer token of the argocd-manager service account on the k3d-abc cluster:

kubectx k3d-abc
kubectl get sa -n kube-system argocd-manager
TOKEN="$( kubectl get secret -n kube-system \
            argocd-manager-token-cww95  -o json  | 
          jq  -r .data.token | base64 -d )"

Note that the last part of the secret name above will differ in your case (just use autocomplete). Also, do not use the

  kubectl create token  -n kube-system argocd-manager

command since that will create a time-limited token, and we want to use a token in this setup that does not expire.
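
Note that on Kubernetes 1.24 and later, a token secret is no longer created automatically alongside the service account. If no argocd-manager-token-* secret exists on k3d-abc, you can create a long-lived one yourself. A sketch, reusing the argocd-manager service account from above:

apiVersion: v1
kind: Secret
metadata:
  name: argocd-manager-token
  namespace: kube-system
  annotations:
    kubernetes.io/service-account.name: argocd-manager
type: kubernetes.io/service-account-token

After applying this, kubernetes fills in the token field, and the TOKEN extraction above works with this secret name.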

The next step is to define the cluster secret:

apiVersion: v1
kind: Secret
metadata:
  namespace: argocd
  name: k3d-abc-cluster-secret
  labels:
    argocd.argoproj.io/secret-type: cluster
type: Opaque
stringData:
  name: k3d-abc
  server: "https://k3d-abc-server-0:6443"
  config: |
    {
      "bearerToken": "TOKEN",
      "tlsClientConfig": {
        "insecure": false,
        "caData": "CADATA"
      }
    }

Here, TOKEN is the value of the TOKEN variable above. CADATA is the CA data for the k3d-abc cluster, found in the certificate-authority-data field of the .kube/config file on the host.
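
Both values can be filled in from the command line. A sketch, assuming the manifest above is saved as cluster-secret.yaml (the file name is just an example):

CADATA="$( kubectl config view --raw \
             -o jsonpath='{.clusters[?(@.name=="k3d-abc")].cluster.certificate-authority-data}' )"
# '|' is used as the sed delimiter because base64 data can contain '/'
sed -e "s|TOKEN|$TOKEN|" -e "s|CADATA|$CADATA|" cluster-secret.yaml |
  kubectl apply --context k3d-xyz -f -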

After this, you might need to stop and start the k3d-xyz cluster. This refreshes the DNS entries for the coredns server inside the k3d-xyz cluster so that it can resolve k3d-abc-server-0.

With this approach, I can add an application on k3d-xyz to deploy it on k3d-abc:

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: directory-app
  namespace: argocd
spec:
  destination:
    namespace: directory-app
    server: "https://k3d-abc-server-0:6443"
  project: default
  source:
    path: guestbook-with-sub-directories
    repoURL: "https://github.com/mabusaa/argocd-example-apps.git"
    targetRevision: master
  syncPolicy:
    syncOptions:
      - CreateNamespace=true
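
After applying this manifest on k3d-xyz (kubectl apply -f on a file containing it, with the k3d-xyz context active), the new cluster and the application status can be checked with the argocd CLI, assuming you are logged in:

argocd cluster list
argocd app get directory-app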

Gotchas

The above approach is not perfect. There are issues with it when you restart your machine: in that case, the custom coredns configuration set up at initialization of the k3d cluster is lost.

You can verify this by looking at the coredns config map on the k3d-xyz cluster using

$ kubectl get cm -n kube-system coredns -o json | jq -r .data.NodeHosts
172.23.0.1 host.k3d.internal
172.23.0.3 registry.localhost
172.23.0.2 k3d-xyz-server-0
172.23.0.4 k3d-xyz-serverlb
172.23.0.5 k3d-abc-server-0

Here you should see the k3d-abc-server-0 host. If you don’t see this, then simply stopping and starting the k3d-xyz cluster will provide a fix:

k3d cluster stop k3d-xyz
k3d cluster start k3d-xyz

An alternative is to create a copy of the coredns configmap and reapply it on startup, then do a

kubectl rollout restart deploy -n kube-system coredns
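
Spelled out, that alternative looks like this (the backup file name is arbitrary):

# while the NodeHosts entries are still correct:
kubectl get cm -n kube-system coredns -o yaml > coredns-backup.yaml

# after a restart of the machine or of docker:
kubectl apply -f coredns-backup.yaml
kubectl rollout restart deploy -n kube-system coredns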

However, simply restarting the k3d-xyz cluster is just as easy.

Final thoughts

The issue with k3d appears to be that upon a restart (even restarting docker will do), the coredns configmap is initialized without the entries from the docker network. For now, the workaround with the restart is the best I can get.

I also tried host mode networking for the k3d-xyz cluster, which should fix the issue, but for some reason that did not work either. Also, I got some weird messages when getting the initial password for argocd, like this:

$ kubectl -n argocd get secret argocd-initial-admin-secret -o jsonpath="{.data.password}" | base64 -d
E0220 16:29:03.172722   28050 memcache.go:255] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request

Surprisingly, argocd cluster add also did not work out of the box. And even if it had worked, there would have been the limitation that at most one cluster can run at a time. For these reasons, I did not pursue this option any further.

This post is based on an answer that I gave recently on a github discussion.
