As part of migrating all the stuff I have from virtual machines to a Kubernetes infrastructure, some important supporting services are needed. These are:
- RPM repository: I use custom RPM repositories for setting up virtual machines. The same RPMs are used for building container images required by Kubernetes.
- Docker repository: custom Docker images may be required for Kubernetes.
Since I want to run everything at home and make minimal use of internet services for my setup, I need to deploy solutions for this on my Kubernetes cluster. Currently I already use an RPM repository based on Nexus 2, but a lot has happened in the meantime: Nexus 3 now natively supports RPM repositories and also supports Docker repositories. Therefore, as part of the setup, I need to run Nexus 3 and move all my RPM artifacts over from Nexus 2 to Nexus 3.
Deploying Nexus
Nexus will be running in the wamblee-org namespace on Kubernetes, and an Apache reverse proxy in the exposure namespace will be used to expose it. See my earlier post for details about this setup. Note that that post is specifically about GKE, but the main ideas apply here as well.
For deploying Nexus 3, the instructions for the Nexus 3 Docker image can be used. The deployment requires a Service and a StatefulSet. A StatefulSet is more appropriate here than a Deployment since Nexus is a stateful service: each pod of the StatefulSet gets its own unique storage. In my deployment I will use a replica count of 1, since I am running Nexus 3 OSS and high availability is not a concern for my home Kubernetes setup.
First of all, let’s look at the StatefulSet:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: nexus
  namespace: wamblee-org
spec:
  serviceName: nexus
  replicas: 1
  selector:
    matchLabels:
      app: nexus-server
  template:
    metadata:
      labels:
        app: nexus-server
    spec:
      containers:
        - name: nexus
          image: sonatype/nexus3:3.40.1
          resources:
            limits:
              memory: "4Gi"
              cpu: "10000m"
            requests:
              memory: "2Gi"
              cpu: "500m"
          ports:
            - containerPort: 8081
            - containerPort: 8082
          volumeMounts:
            - name: nexus-data
              mountPath: /nexus-data
  volumeClaimTemplates:
    - metadata:
        name: nexus-data
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 10Gi
There are several things to note:
- The replica count is 1; this is not a highly available Nexus 3 deployment.
- It uses the standard Nexus 3 image with a fixed version so that we are guaranteed not to get surprise upgrades.
- Two container ports are exposed: 8081 and 8082. The first port serves the web interface and is also used for Maven artifacts and RPMs. The second port is specifically for Docker: each hosted Docker repository must have its own unique port, which is required by Nexus. When exposing these repositories externally, different host names will be used for port 8081 and port 8082 respectively.
- I am running in the wamblee-org namespace, which is where I am hosting everything for the wamblee.org domain.
- A separate PersistentVolumeClaim is used for Nexus 3. This is better than an emptyDir because it will allow us to delete and reinstall Nexus 3 without losing data.
Volumes
The principle I am trying to follow here is to know exactly where my data is, so that I can lose a Kubernetes cluster due to (my own) error, but never lose the data, and always be able to set up everything from scratch again. I even go so far as not to use storage classes and provisioners in my setup; in practice I use labeled nodes and hostPath volumes, tying the storage explicitly to a specific directory on a specific node.
First of all, I label the node where I want the volume to be:
kubectl label node weasel wamblee/type=production
Next, I am using the following volume definitions:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nexus-data
  labels:
    type: local
    app: nexus
spec:
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: wamblee/type
              operator: In
              values:
                - production
  claimRef:
    name: nexus-data-nexus-0
    namespace: wamblee-org
  persistentVolumeReclaimPolicy: Retain
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/data/nexus"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nexus-data-nexus-0
  namespace: wamblee-org
spec:
  storageClassName: ""
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
Through the nodeAffinity construct I am tying the volume to a specific node. Note the name of the PersistentVolumeClaim: it is composed of the name of the volumeClaimTemplate in the StatefulSet, the StatefulSet name, and the sequence number of the pod in the StatefulSet, giving nexus-data-nexus-0 here.
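As a quick sanity check after applying these definitions, the claim created by the StatefulSet should end up bound to this volume. Something along these lines (using the names from the definitions above) shows the binding:

# Check that the PVC generated from the volumeClaimTemplate is bound to our PV.
kubectl get pvc -n wamblee-org nexus-data-nexus-0
kubectl get pv nexus-data
# The PVC should report STATUS "Bound" with VOLUME "nexus-data", and the PV
# should show CLAIM "wamblee-org/nexus-data-nexus-0".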
Services
To expose Nexus 3 we need two services:
apiVersion: v1
kind: Service
metadata:
  name: nexus
  namespace: wamblee-org
spec:
  selector:
    app: nexus-server
  type: ClusterIP
  ports:
    - port: 8081
      targetPort: 8081
---
apiVersion: v1
kind: Service
metadata:
  name: nexus-docker
  namespace: wamblee-org
spec:
  selector:
    app: nexus-server
  type: ClusterIP
  ports:
    - port: 8082
      targetPort: 8082
Here I am using one Service per port. The Services are of type ClusterIP to avoid direct access from outside the cluster.
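Before wiring up the reverse proxy, the services can be smoke-tested from a workstation using a temporary port forward. A minimal sketch (the status endpoint is part of the standard Nexus 3 REST API):

# Temporarily forward the Nexus web UI port of the service to localhost.
kubectl port-forward -n wamblee-org svc/nexus 8081:8081 &
PF_PID=$!
sleep 2

# The status endpoint returns HTTP 200 once Nexus has finished starting up.
curl -i http://localhost:8081/service/rest/v1/status

kill $PF_PID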
Ingress
The services are finally exposed through an Apache server running in the exposure namespace, using the following Apache config:
<VirtualHost *:80>
  # actual host name is different
  ServerName mynexus.wamblee.org
  ProxyRequests off
  ProxyPreserveHost on
  AllowEncodedSlashes on

  ProxyPass        / http://nexus.wamblee-org.svc.cluster.local:8081/ disablereuse=On
  ProxyPassReverse / http://nexus.wamblee-org.svc.cluster.local:8081/
</VirtualHost>

<VirtualHost *:80>
  # actual host name is different
  ServerName mydockerrepo.wamblee.org
  ProxyRequests off
  ProxyPreserveHost on
  AllowEncodedSlashes on

  ProxyPass        / http://nexus-docker.wamblee-org.svc.cluster.local:8082/ disablereuse=On
  ProxyPassReverse / http://nexus-docker.wamblee-org.svc.cluster.local:8082/
</VirtualHost>
Note that the Apache configuration uses the cluster-local DNS names of the Nexus services. The Ingress rule is already defined (see the earlier post) and provides SSL termination with automatic certificate management.
One thing that is important to know is that if you delete the Nexus services and deploy them again, the backend services will get new IP addresses. Meanwhile, Apache by default caches the DNS lookup of a backend for the lifetime of a worker, so it may never pick up the change. One quick fix is to kubectl exec into the httpd container and run apachectl graceful, forcing a reload of the workers. Another option is to set disablereuse=On, which disables caching of connections to the backend services; that way, changes are picked up immediately. I wouldn't use that in any serious production setup, but for home use it is ok. For production, a different setting like MaxConnectionsPerChild 100 would be better, forcing the recycling of workers after a small number of requests, or simply triggering the Apache reload.
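For reference, that quick fix does not even require an interactive shell. A sketch, assuming the Apache deployment in the exposure namespace is named httpd (substitute your own deployment or pod name):

# Gracefully restart the Apache workers so stale DNS lookups of the Nexus
# services are refreshed; "deploy/httpd" is an assumed name.
kubectl exec -n exposure deploy/httpd -- apachectl -k graceful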
Nexus Docker setup
There are some details for setting up a hosted Docker registry that need to be configured in the Nexus admin interface:
- In Security/Realms, enable the “Docker bearer token realm”. Without this, you can never authenticate.
- For configuring read/write access, you must create a user with the nx-repository-view-*-* role.
- A separate port (in this example 8082) must be configured for the hosted Docker repository to listen on. Each hosted Docker repository on Nexus must have its own unique port.
After this setup, you can check the installation using docker login, tagging an image with your Docker repository host name, and pushing it. Also, verify pulling an image on Kubernetes by configuring a registry credentials secret and using that in a pod definition to run a pod, as sketched below.
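A sketch of such a round trip (image path, secret name, and credentials are placeholders; the host name follows the Apache configuration above):

# Push a test image to the hosted Docker repository.
docker login mydockerrepo.wamblee.org
docker pull alpine:3.16
docker tag alpine:3.16 mydockerrepo.wamblee.org/test/alpine:3.16
docker push mydockerrepo.wamblee.org/test/alpine:3.16

# Make the registry credentials available to Kubernetes as an image pull secret.
kubectl create secret docker-registry dockerrepo-credentials \
  --namespace wamblee-org \
  --docker-server=mydockerrepo.wamblee.org \
  --docker-username=USER \
  --docker-password=PWD

# Run a throwaway pod that pulls the image through Nexus.
kubectl run nexus-pull-test \
  --namespace wamblee-org \
  --image=mydockerrepo.wamblee.org/test/alpine:3.16 \
  --overrides='{"spec":{"imagePullSecrets":[{"name":"dockerrepo-credentials"}]}}' \
  --restart=Never -- sleep 60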
Nexus RPM setup
To set up RPM repositories, create a hosted Yum repository. I am using repodata depth 0, layout policy permissive, and strict content type validation off. This way, I can continue to use the rpm-maven-plugin to build RPMs together with the maven-release-plugin to publish them to Nexus. Deployed RPMs become available for use within minutes after deployment. With a repodata depth of 0 there are no restrictions on path names, and I effectively host a single YUM repository in the hosted Yum repository.
To migrate RPM artifacts from the old Nexus 2 to Nexus 3, I simply change directory to the RPM repository on the web server where it is currently hosted and push each RPM to Nexus 3 using a simple script, executed from the directory of the RPM repo:
#!/bin/bash
# Pushes every RPM found under the current directory to the hosted Yum
# repository on Nexus 3.

nexushost=mynexus.wamblee.org
repo=rpms
subdir=nexus2
userpwd="USER:PWD"

while read rpm
do
  echo "$rpm"
  name="$( basename "$rpm" .rpm )"
  echo "  $name"
  echo "Deploying $rpm"
  curl -v --user "$userpwd" --upload-file "$rpm" \
    "https://$nexushost/repository/$repo/$subdir/$name.rpm"
done < <( find . -name '*.rpm' )
I remember using a more advanced migration from Nexus 2 to Nexus 3 at work, where we had a lot more data, downtime requirements, and also a lot of Java artifacts in Nexus 2. That procedure was quite complex. In the current case however, where the amount of data is small and downtime requirements are non-existent, this simple approach is the way to go.
For the client-side yum repository configuration, the username and password can be encoded in the repo file as follows:
baseurl=https://USER:PWD@mynexus.wamblee.org/repository/rpms
or, on more modern systems, using:
username=USER
password=PWD
baseurl=https://mynexus.wamblee.org/repository/rpms
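Putting this together, a complete client-side repo file could look like this (repository id, file name, and credentials are placeholders; adjust gpgcheck to your own signing setup):

# Write the yum/dnf repository definition on the client machine (as root).
cat > /etc/yum.repos.d/wamblee.repo <<'EOF'
[wamblee]
name=Wamblee RPM repository (Nexus 3)
baseurl=https://mynexus.wamblee.org/repository/rpms
username=USER
password=PWD
enabled=1
gpgcheck=0
EOF

# Verify that the repository shows up and its metadata can be downloaded.
dnf repolist | grep wamblee
dnf makecache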