Basic kubernetes infrastructure: RPM and container repo

As part of migrating all the stuff I have from virtual machines to kubernetes, some important pieces of infrastructure are needed. These are:

  • RPM repository: I use custom RPM repositories for setting up virtual machines. These same RPMs are used for building container images that are required by kubernetes
  • Docker repository: Custom docker images may be required for kubernetes

Since I want to run everything at home and make minimal use of internet services for my setup, I need to deploy solutions for this on my kubernetes cluster. Currently I already use an RPM repository based on Nexus 2. In the meantime a lot has happened: Nexus 3 now natively supports RPM repositories and it also supports docker repositories. Therefore, as part of the setup, I need to run Nexus 3 and move all my RPM artifacts over from Nexus 2 to Nexus 3.

Deploying Nexus

Nexus will be running in the wamblee-org namespace on kubernetes, and an apache reverse proxy in the exposure namespace will be used to expose it. See my earlier post for details about this setup. That post is specifically about GKE, but the main ideas apply here as well.

For deploying Nexus 3, the instructions for the Nexus 3 docker image can be used. The deployment consists of a Service and a StatefulSet. A StatefulSet is more appropriate here than a Deployment since Nexus is a stateful service: each pod of the StatefulSet gets its own unique storage. I will use a replica count of 1 since I am running Nexus 3 OSS and high availability is not a concern for my home kubernetes setup.

First of all, let’s look at the StatefulSet:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: nexus
  namespace: wamblee-org
spec:
  serviceName: nexus
  replicas: 1
  selector:
    matchLabels:
      app: nexus-server
  template:
    metadata:
      labels:
        app: nexus-server
    spec:
      containers:
        - name: nexus
          image: sonatype/nexus3:3.40.1
          resources:
            limits:
              memory: "4Gi"
              cpu: "10000m"
            requests:
              memory: "2Gi"
              cpu: "500m"
          ports:
            - containerPort: 8081
            - containerPort: 8082
          volumeMounts:
            - name: nexus-data
              mountPath: /nexus-data
  volumeClaimTemplates:
    - metadata:
        name: nexus-data
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 10Gi

There are several things to note:

  • Replica count is 1. This is not a highly available Nexus 3 deployment
  • It uses the standard Nexus 3 image with a fixed version so that we are guaranteed not to get surprise upgrades.
  • Two container ports are exposed: 8081 and 8082. The first port serves the web interface and is also used for maven artifacts and RPMs. The second port is specifically for Docker: each hosted docker repository must have its own unique port, which is required by Nexus. When exposing these repositories externally, different host names will be used for port 8081 and port 8082 respectively.
  • I am running in the wamblee-org namespace which is the namespace where I am hosting everything for the wamblee.org domain.
  • A separate PersistentVolumeClaim is used for Nexus 3. This is better than an emptyDir because it will allow us to delete and reinstall Nexus 3 without losing data.
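
After applying this, a quick sanity check is to verify that the StatefulSet, its pod, and the generated PersistentVolumeClaim exist (resource names follow from the definitions above). Note that the pod will stay Pending until the persistent volume from the next section has been created:

kubectl -n wamblee-org get statefulset nexus
kubectl -n wamblee-org get pod nexus-0
kubectl -n wamblee-org get pvc nexus-data-nexus-0
kubectl -n wamblee-org logs nexus-0 --tail=20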

Volumes

The principle I am trying to follow here is to know exactly where my data is, so that I can lose a kubernetes cluster due to (my own) error but never lose the data, and can always set everything up again from scratch. I even go so far as not to use storage classes and provisioners in my setup; in practice I use labeled nodes and host path volumes, tying the storage explicitly to a specific directory on a specific node.

First of all I am labeling one node where I want the volume to be:

kubectl label node weasel wamblee/type=production
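
A quick check that the label is present:

kubectl get nodes -l wamblee/type=production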

Next, I am using the following volume definitions:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: nexus-data
  labels:
    type: local
    app: nexus
spec:
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: wamblee/type
          operator: In
          values:
          - production
  claimRef:
    name: nexus-data-nexus-0
    namespace: wamblee-org
  persistentVolumeReclaimPolicy: Retain
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/data/nexus"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nexus-data-nexus-0
  namespace: wamblee-org
spec:
  storageClassName: ""
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi

Through the nodeAffinity construct I am tying the volume to a specific node. Also note the name of the PersistentVolumeClaim: for a StatefulSet it follows the pattern <volumeClaimTemplate name>-<StatefulSet name>-<pod ordinal>, which for this setup gives nexus-data-nexus-0.
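
Once the volume and the claim are applied, the claim should bind to the volume, which can be checked as follows (both should report status Bound):

kubectl get pv nexus-data
kubectl -n wamblee-org get pvc nexus-data-nexus-0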

Services

To expose Nexus 3 we need two services:

apiVersion: v1
kind: Service
metadata:
  name: nexus
  namespace: wamblee-org
spec:
  selector: 
    app: nexus-server
  type: ClusterIP
  ports:
    - port: 8081
      targetPort: 8081
---
apiVersion: v1
kind: Service
metadata:
  name: nexus-docker
  namespace: wamblee-org
spec:
  selector: 
    app: nexus-server
  type: ClusterIP
  ports:
    - port: 8082
      targetPort: 8082

Here I am using one service per port. The services are of type ClusterIP to avoid direct access from outside the cluster.
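
As a quick test that the services resolve and respond inside the cluster, a throwaway curl pod can be used (the image is just an example; port 8082 will only respond once the docker repository and its HTTP connector have been configured):

kubectl -n wamblee-org run curl-test -i --rm --restart=Never \
  --image=curlimages/curl --command -- \
  curl -s -o /dev/null -w "%{http_code}\n" http://nexus:8081/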

Ingress

The service is finally exposed through an Apache server running in the exposure namespace using the following apache config:

<VirtualHost *:80>
  # actual host name is different
  ServerName mynexus.wamblee.org

  ProxyRequests off
  ProxyPreserveHost on
  AllowEncodedSlashes on

  ProxyPass / http://nexus.wamblee-org.svc.cluster.local:8081/ disablereuse=On
  ProxyPassReverse / http://nexus.wamblee-org.svc.cluster.local:8081/
</VirtualHost>

<VirtualHost *:80>
  # actual host name is different
  ServerName mydockerrepo.wamblee.org

  ProxyRequests off
  ProxyPreserveHost on
  AllowEncodedSlashes on

  ProxyPass / http://nexus-docker.wamblee-org.svc.cluster.local:8082/ disablereuse=On
  ProxyPassReverse / http://nexus-docker.wamblee-org.svc.cluster.local:8082/
</VirtualHost>

Note that the apache configuration uses the cluster-local DNS service names of Nexus. The Ingress rule is already defined (see the earlier post) and provides SSL termination with automatic certificate management.

One thing that is important to know is that if you delete the Nexus services and deploy them again, the backend services will get new IP addresses. Apache, however, by default caches the DNS lookup of the services for the lifetime of a worker, so it may never pick up the change. One quick fix is to kubectl exec into the httpd container and run apachectl graceful, forcing a reload of the workers. Another option is to set disablereuse=On, which disables caching of connections to the backend services so that changes are picked up immediately. I wouldn't use that in a serious production setup, but for home use it is fine. For production, a setting like MaxConnectionsPerChild 100 would be better, forcing workers to be recycled after a small number of requests, or alternatively triggering the apache reload.
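
For reference, the graceful reload can be triggered from outside the pod roughly like this (the deployment name httpd in the exposure namespace is an assumption, as is the availability of apachectl in the image; adjust to your own apache deployment):

kubectl -n exposure exec deploy/httpd -- apachectl graceful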

Nexus docker setup

There are some details for setting up a hosted docker registry on Nexus that are important to configure in the Nexus admin interface:

  • In Security/Realms, enable the “Docker bearer token realm”. Without this you can never authenticate
  • For configuring read/write access, you must create a user with the nx-repository-view-*-* role.
  • A separate port (in this example 8082) must be configured for the hosted docker repository to listen on. Each hosted docker repository on Nexus must have its own unique port.

After this setup, you can check the installation by doing a docker login, tagging an image with your docker repo hostname, and pushing it. Also verify pulling an image on Kubernetes by configuring a registry credentials secret and using that in a pod definition to run a pod.
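
As an illustration, such a check could look like this; the image, the secret name, and USER/PWD are placeholders for your own image and the Nexus credentials created earlier:

docker pull alpine:3.16
docker tag alpine:3.16 mydockerrepo.wamblee.org/alpine:3.16
docker login mydockerrepo.wamblee.org
docker push mydockerrepo.wamblee.org/alpine:3.16

kubectl -n wamblee-org create secret docker-registry nexus-docker-creds \
  --docker-server=mydockerrepo.wamblee.org \
  --docker-username=USER \
  --docker-password=PWD

The secret is then referenced from the pod definition:

apiVersion: v1
kind: Pod
metadata:
  name: pull-test
  namespace: wamblee-org
spec:
  imagePullSecrets:
    - name: nexus-docker-creds
  containers:
    - name: test
      image: mydockerrepo.wamblee.org/alpine:3.16
      command: ["sh", "-c", "echo pull succeeded; sleep 3600"]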

Nexus RPM setup

To set up RPM repositories, create a hosted yum repository. I am using repodata depth 0, layout policy permissive, and strict content type validation off. This way I can continue to use the rpm-maven-plugin to build RPMs, together with the maven-release-plugin to publish them to Nexus. Deployed RPMs become available for use within minutes after deployment. With a repodata depth of 0, there are no restrictions on path names and I effectively host a single yum repository inside the hosted yum repository.
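
A simple way to see that the repository metadata has been generated is to fetch the standard repomd.xml from the repository root (host name as used above; the credentials are only needed if anonymous read access is disabled):

curl -s --user "USER:PWD" https://mynexus.wamblee.org/repository/rpms/repodata/repomd.xml | head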

To migrate RPM artifacts from the old Nexus 2 to Nexus 3, I simply change directory to the RPM repository on the webserver where it is currently hosted and push each RPM to Nexus 3 using a simple script executed from the directory of the RPM repo:

#!/bin/bash

nexushost=mynexus.wamblee.org   # Nexus 3 host
repo=rpms                       # hosted yum repository name
subdir=nexus2                   # subdirectory in the repository for the migrated RPMs
userpwd="USER:PWD"              # Nexus credentials

# Find all RPMs below the current directory and upload each one to Nexus 3.
while read rpm
do
  echo "$rpm"
  name="$( basename "$rpm" .rpm )"
  echo "  $name"
  echo "Deploying $rpm"
  curl -v --user "$userpwd" --upload-file "$rpm" \
    "https://$nexushost/repository/$repo/$subdir/$name.rpm"
done < <( find . -name '*.rpm' )

I remember using a more advanced migration from Nexus 2 to Nexus 3 at work, where we had a lot more data, downtime requirements, and also a lot of java artifacts in Nexus 2. That procedure was quite complex. In the current case however, where the amount of data is small and downtime requirements are non-existent, this simple approach is the way to go.

For the client-side yum repo configuration, username and password can be encoded in the repo file as follows:

baseurl=https://USER:PWD@mynexus.wamblee.org/repository/rpms

or on more modern systems using

username=USER
password=PWD
baseurl=https://mynexus.wamblee.org/repository/rpms
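
Put together, a complete client-side repo file could look something like this (the repo id, the file name, and gpgcheck=0 are just examples for illustration):

# /etc/yum.repos.d/wamblee.repo
[wamblee]
name=wamblee RPM repository
baseurl=https://mynexus.wamblee.org/repository/rpms
username=USER
password=PWD
enabled=1
gpgcheck=0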