KVM Setup Overview

The server has been running stably for quite some time now in the new setup, with several virtual machines based on the Kernel Virtual Machine (KVM) providing the actual functionality.

The setup is as follows. The host (falcon) runs a linux server installation and three virtual machines: shikra, sparrow, and windowsxp, all running under KVM. The windowsxp VM is switched off most of the time and only runs when I need it; it exists mainly because it contains some software whose license does not allow it to be moved to another windows installation.

The shikra image is basically the old server minus the continuous integration and maven functionality. Every linux virtual machine has two network interfaces: a bridged interface for the outside world and a NAT interface for pure host-VM and VM-VM communication. The latter interface is mainly used for backups, where it is useful to keep the load off the external network interfaces. Sparrow is dedicated to automated builds and provides the nexus repository for RPM generation. Keeping this functionality separate from the core server (shikra) is desirable so that automated builds cannot impact shikra.
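
For illustration: assuming libvirt is used to manage the VMs, such a host-only NAT network can be defined roughly as follows. This is a sketch, not the actual configuration; the network name, bridge name, and addresses are placeholders.

cat > internal-net.xml <<'EOF'
<network>
  <!-- NAT network for host-VM and VM-VM traffic; all values are examples -->
  <name>internal</name>
  <forward mode='nat'/>
  <bridge name='virbr1' stp='on' delay='0'/>
  <ip address='192.168.100.1' netmask='255.255.255.0'>
    <dhcp>
      <range start='192.168.100.10' end='192.168.100.99'/>
    </dhcp>
  </ip>
</network>
EOF
virsh net-define internal-net.xml
virsh net-autostart internal
virsh net-start internal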

From the internet, all SSH traffic is forwarded to the host so I can always get into the server, even if a VM is having problems, and HTTP, HTTPS, IMAPS, and SMTP traffic is routed directly to shikra.
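
As an impression of what this amounts to on the host (a sketch only; eth0 and the address of shikra are placeholders):

# forward web and mail ports on the external interface to shikra;
# SSH (port 22) is deliberately not forwarded so it ends up at the host
for port in 80 443 993 25
do
  iptables -t nat -A PREROUTING -i eth0 -p tcp --dport $port \
      -j DNAT --to-destination 192.168.1.10
  iptables -A FORWARD -i eth0 -d 192.168.1.10 -p tcp --dport $port -j ACCEPT
done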

In the future I want to generalize this setup a bit more by creating a separate VM for the mythtv functionality. I am also considering creating a separate, very small VM for just the reverse proxy.

As part of this setup I had to automate some tasks for starting up and shutting down VMs. This is provided by the kvmcustom package (see the yum repository). Also see the post about automated management of this yum repo.
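
To give an impression of what such automation involves: gracefully stopping a guest with virsh boils down to something like the following sketch (this is not the actual kvmcustom code; the VM name and timeout are examples).

#!/bin/bash
# Ask the guest to shut down via ACPI and wait until it is gone.
VM=shikra
TIMEOUT=120

virsh shutdown "$VM"
for (( i = 0; i < TIMEOUT; i++ ))
do
  if [ "$(virsh domstate "$VM")" = "shut off" ]
  then
    exit 0
  fi
  sleep 1
done
# The guest did not shut down in time: force it off as a last resort.
virsh destroy "$VM"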

Posted in Devops/Linux | 1 Comment

Two worlds meet (1): Automated creation of Yum Repos with Maven, Nexus, and Hudson

This is the first in a series of blogs titled ‘Two worlds meet’ about how two technologies can be used together to solve a problem. Mostly, one world will be linux, or more specifically a virtualized linux setup using the kernel virtual machine, and the other world will be java.

In this blog I will be looking at the automated creation of a yum repository using maven, nexus, and hudson. First, however, some background is needed. Some time ago I bought a new server with more than sufficient resources to run multiple virtual machines. The aim was separation of concerns: virtual machines with different purposes, and the ability to run conflicting software side by side. Doing that introduces a whole new problem of maintaining the software on all these virtual machines. Of course, I use a standard linux distribution such as opensuse, but I still have some custom scripts that I need and want to have available on all VMs.

With the standard linux tooling, an obvious method is to create my own Yum repository to publish my RPMs in, and then add that Yum repo as a channel on all of my VMs. Of course, the challenge is then to easily and automatically create such a Yum repository. Fortunately, since I am working quite a lot with Java and Maven (earning a living with it, basically), there is quite an easy solution with a nice separation of concerns.
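
Adding such a repo as a channel on a VM is then a one-liner (the URL below is a placeholder for wherever the repo is served):

# add the repo with autorefresh enabled and trust its signing key
zypper addrepo -f http://yum.example.org/public my-rpms
rpm --import http://yum.example.org/public/repodata/repomd.xml.key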

The ingredients of this solution are:

  • maven for building the rpm using the maven rpm plugin
  • the maven release plugin for tagging a released version of the RPM, stepping its version, and publishing it into a maven repository
  • a maven repository such as nexus for publishing RPMs into
  • hudson for detecting changes and automatically updating/building the Yum repository upon changes to the RPMs.

In addition, some basic infrastructure is needed such as:

  • a version control system such as subversion
  • apache for providing access to subversion and for serving the Yum repo to all VMs
  • an application server such as glassfish for running hudson and nexus

This may seem like a lot of infrastructure, but before I started I already had most of this except for the nexus maven repository, so all in all the effort for this solution was quite limited.

The main new ingredient of the solution is the script to create the Yum repository from the nexus repository. This script exploits the fact that nexus stores its repositories in a standard maven directory structure (an approach using REST web services is also possible):

#!/bin/bash

REPO=/usr/java/nexus/nexus/storage/rpms

# Create the repo directory
YUM=$HOME/yum
rm -rf "$YUM"
mkdir -p "$YUM/noarch"

# Find the RPMs in the nexus repository and use hardlinks
# to preserve modification times and avoid the overhead of
# copying (note: hardlinks require that $YUM is on the same
# filesystem as the nexus storage)
for rpm in $( find "$REPO" -name '*.rpm' )
do
  echo "RPM $rpm"
  ln "$rpm" "$YUM/noarch"
done

# createrepo is a standard command available on opensuse
# to create a Yum repository
createrepo "$YUM"

# sign it
gpg -a --detach-sign "$YUM/repodata/repomd.xml"
gpg -a --export F0ABC836 > "$YUM/repodata/repomd.xml.key"

# sync the results to their final destination to make them
# available
rsync --delete -avz "$YUM/" /data/www/http.wamblee.org_yum/public

Using this approach it is really easy to update an RPM and make it available on all my VMs. The procedure is basically as follows:

  • Edit the source of the RPM and check it in.
  • Tag it and step the versions using:
    mvn release:prepare
  • Deploy the just-tagged version to the nexus repository:
    mvn release:perform
  • Some time later, the Yum repository will have been automatically updated by hudson based on the contents of the nexus repository.
  • On a specific VM, simply do a standard update using
    zypper up

    Note that this may require an explicit

    zypper refresh

    to make sure that zypper sees the latest versions of all RPMs. Autorefresh also works but may take some more time before zypper sees the latest versions.

In the end, this is a really simple procedure to quickly make RPMs available on all VMs while also making sure that each version is properly tagged in subversion. The only issue is that hudson runs on every SCM change, not only when an RPM is released, but I consider that a minor issue.

The YUM repo is here.

An example pom is below:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
    
    <parent>
        <groupId>org.wamblee.server</groupId>
        <artifactId>root</artifactId>
        <version>1.0.1</version>
    </parent>
    
    <modelVersion>4.0.0</modelVersion>
    
    <packaging>rpm</packaging>
    <groupId>org.wamblee.server</groupId>
    <artifactId>kvmguest</artifactId>
    <version>1.0.3-SNAPSHOT</version>
    <name>kvmguest</name>
    <description>KVM guest support</description>
    <organization>
        <name>org.wamblee</name>
    </organization>
    
    <build>
        <plugins>
            <plugin>
                <groupId>org.codehaus.mojo</groupId>
                <artifactId>rpm-maven-plugin</artifactId>
                <version>2.0.1</version>
                <extensions>true</extensions>
                <configuration>
                    <changelogFile>CHANGELOG</changelogFile>
                    <copyright>Apache License 2.0, 2010</copyright>
                    <group>org.wamblee.server</group>
                    <packager>Erik Brakkee</packager>
                    
                    <mappings>
                        <mapping>
                            <directory>/usr</directory>
                            <filemode>755</filemode>
                            <username>root</username>
                            <groupname>root</groupname>
                            <sources>
                                <source>
                                    <location>files/usr</location>
                                </source>
                            </sources>
                        </mapping>
                        <mapping>
                            <directory>/etc/kvmguest.d</directory>
                            <filemode>755</filemode>
                            <username>root</username>
                            <groupname>root</groupname>
                            <sources>
                                <source>
                                    <location>files/etc/kvmguest.d</location>
                                </source>
                            </sources>
                        </mapping>
                        <mapping>
                            <directory>/usr/share/doc/packages</directory>
                            <filemode>444</filemode>
                            <username>root</username>
                            <groupname>root</groupname>
                            <sources>
                                <source>
                                    <location>files/usr/share/doc/packages</location>
                                </source>
                            </sources>
                        </mapping>
                    </mappings>
                    <provides>
                       <provide>kvmguest</provide>
                    </provides>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>

Posted in Devops/Linux, Java, Software | 10 Comments

Flexible JDBC Realm for Glassfish

Approximately three years ago I started the development of a simple JDBC based security realm for Glassfish. The reason was that I was migrating from JBoss to Glassfish and was running into problems with one application. That application simply stored authentication data (user, group, password digests) in a database. I had been relying on a simple configuration for this on JBoss but ran into limitations of JDBCRealm on Glassfish. Therefore, I wrote my own realm. It is now being used in several places already. With version 1.1 I consider this security realm feature complete. More info is at the web site.

Posted in Java, Software | 4 Comments

Move to Maven 3

Today I moved all my projects to maven 3. The claim is that maven 3 is backwards compatible with maven 2, but have a look at the compatibility notes. The issues I ran into were:

  • Parent resolution: Parents are no longer resolved as part of the build cycle. Instead, a parent POM must either be in the parent directory or <relativePath> must be specified for the parent POM. In one project I was referring to a parent two levels up; I simply corrected this by referring to the direct parent. The advantage of this new Maven rule is that it allows module builds without having to always build the parent first.
  • Maven site support: This was a bigger change. I had to follow the instructions at the plugin site to get it working again. This change also breaks compatibility with maven 2, so projects that have a maven site will now only build with maven 3.
  • Legacy-style repo support: I had one dependency, toplink-essentials, that was resolved from a maven 1 repository. This dependency was located in a specific test project. To solve this, I excluded this project from the build. The result is that I still have toplink-essentials support for my JPA testing framework, but the tests are no longer run automatically. Is there anyone still using toplink essentials? Anyway, I intend to run a local artifactory repo in the near future; then I can add this test project again, because artifactory can provide a maven 2 interface for a legacy repo.

Another issue I had was that the maven site plugin with maven 3 requires a lot more memory on a multi-module project than with maven 2. I worked around this by invoking the separate maven site generations in the multi-module project individually. This not only reduced memory requirements but also sped up the build considerably.
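
The workaround is nothing more than a loop along these lines (the module names are of course project specific):

# generate the site per module instead of in one big reactor run
for module in support system test
do
  ( cd "$module" && mvn site ) || exit 1
done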

It is important for me to keep memory requirements low because I have to take into account that I might someday need to run the server's virtual domains on a system with much less memory, for instance in case of hardware problems.

There definitely needs to be a redesign of the maven site plugin to make it more usable for multi-module builds.

Posted in Java, Software | Leave a comment

New Server Setup is Complete!

Over the past weeks I have spent huge amounts of time setting up the new server, making sure that I preserved the complete functionality that I had before. My aim was to take my old server (i.e. the server installation), which is 32 bit, and run it as one of the guests on the new server (64 bit), with the guest upgraded to 64 bit as well.

Experimentation with KVM

This turned out to be no easy task. For one, I had to do a lot of experimentation with KVM to get to know it, and benchmarking to determine what the server setup should look like. This was done on the laptop. Then the parts for the server arrived and I had to assemble it. This was in fact easier than I thought, but it also involved a lot of stability testing, in particular to determine whether or not I had a cooling problem (it turned out everything was cool from the start).

PCI Passthrough of the TV card

Then the next phase of the server setup started and I tried to use PCI passthrough of my wintv PVR-500 card to a VM, which failed horribly because of shared IRQs. I even ordered and tried out other TV cards that were based on USB. The first card I ordered did not work because it turned out it had hardware that was not supported (this was unexpected since the card was listed as supported, but the manufacturer had changed the hardware without changing the product number). Then I got one that actually worked, including USB passthrough to the guest. However, that resulted in jerky video and audio because, as it turns out, KVM only supports USB 1.1, which is too slow for this. Looking around a little, it turns out that USB 2.0 support in virtualization is still a hot item, usually found only in commercial offerings. I even tried Xen again but quickly gave up on that because of its huge complexity.

Stability Problems

To make progress I decided I might just as well run the old server non-virtualized and use KVM to add other guests on top of that. So I set up the server in this way and installed it in the server rack. Then the next problems occurred. The server didn’t even survive one night of running a backup and was dead in the morning. The next attempt at a backup also failed. Then I looked into the BIOS event log and saw loads of messages occurring (at least 20 per hour). This was no good, but with the help of customer support from supermicro, a new firmware install for the IPMI and BIOS removed most of the events.

Still I got a lot of events, mostly about fans, which was no good. Then I remembered that I had used linux fancontrol to control fan speed. As it turned out, this conflicted with the IPMI, which was also monitoring fan speed. Disabling the fan control provided a great improvement, but events still occurred a couple of times per day. A closer look revealed that these occurred at exactly 10 minute boundaries, which was exactly the interval with which I was monitoring the system using linux sensors with the w83795 driver. As it turns out, it is impossible to read sensor settings concurrently, and this resulted in strange fan readings at some points in time. Disabling the sensors-style monitoring fixed that problem as well. So after all of this I had a stable server.

PCI Passthrough Solved

Then, still not satisfied with the crummy server setup with the old server running as host and not as a guest, I discussed my issue with people on the KVM mailing list. Luckily that provided some suggestions on how to solve it. It turns out that shared IRQs between the ivtv driver and USB and serial ATA were the cause of the problem. However, it is possible to unbind USB PCI devices in linux, and using this I could remove the shared IRQ conflicts between ivtv and USB. This resulted in a successful PCI passthrough of one of the tuners on the PVR-500 card, but not the other one because of a shared IRQ with ATA.
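
Unbinding works through sysfs. As a sketch (the PCI address and driver name are examples and differ per system):

# find out which devices share an IRQ with the ivtv tuners
grep -E 'ivtv|usb' /proc/interrupts

# detach a USB controller from its driver so its IRQ is no longer shared
echo -n 0000:00:1d.0 > /sys/bus/pci/drivers/uhci_hcd/unbind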

A number of days later I got the idea to look at the SATA configuration in the BIOS. It turns out I could configure AHCI or Intel RAID instead of the default SATA, and that effectively removed the conflicts with ATA, resulting in a successful PCI passthrough of both tuners on the TV card. So, after all of this, I could run the old 32 bit server purely virtualized.

Other Challenges

However, running a virtualized server provides some challenges, such as automated start and stop, network configuration, firewalling, and backup. Backup in particular was a challenge because it is impossible to add a new disk to a running virtual server and I wanted to reuse my existing backup solution. USB was not a solution for this because of the limitations of 1.1. Luckily, however, iSCSI works quite well and provided exactly what I needed. The only thing was that the linux community apparently changed their minds on the iscsi target implementation, so I had to get to know tgtd instead of iscsi-target. Even though tgtd was designed to be easier to configure, the command line was still challenging enough that I wrote some scripts to make it even easier for myself.
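
To give an idea of what those scripts wrap (the target name and backing device below are examples, not my actual configuration):

# create an iSCSI target and attach the backup disk to it as a LUN
tgtadm --lld iscsi --op new --mode target --tid 1 \
    --targetname iqn.2010-11.org.example:backup
tgtadm --lld iscsi --op new --mode logicalunit --tid 1 --lun 1 \
    --backing-store /dev/sdb
# allow any initiator to connect (acceptable on the internal NAT network)
tgtadm --lld iscsi --op bind --mode target --tid 1 --initiator-address ALL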

The End

So now I have everything running the way I wanted to from the start. Feels good!

Posted in Devops/Linux, Software | Leave a comment

Sticky 911! Making it easy to quickly and reliably boot your linux OS from USB or CD/DVD

I have been using isolinux in the past to boot my server. In fact, it used to be the only way to boot it because somehow my BIOS did not recognize the RAID card. That problem was later solved by a newer BIOS, but there was still a need to be able to boot the system when the boot sector was lost, or to repair or restore things.

To solve this I had a boot cd under version control, but maintenance of the boot cd was still a nightmare. One problem was the short names required by isolinux, which resulted in initrd names such as ‘rd111p’ and linux kernel names such as ‘l111p’. Not really convenient if you just want to be sure that the version you have on your isolinux CD/DVD is the same as the one you have installed. A typical vmlinuz name is much more descriptive, such as vmlinuz-2.6.34.7-0.5-default.

In addition, it was quite a hassle to burn a new CD/DVD every time there was a kernel update. So I decided to kill two birds with one stone: automate the whole process and also add support for booting from USB (the new server supports it).

This resulted in a new ‘project’ which I call sticky911. The idea is simple: provide an XML-based configuration of your system, eliminate duplication as much as possible, and check the hell out of it, so that in the end you have a ‘first time right’ bootable USB stick or CD/DVD based on isolinux.

Posted in Devops/Linux, Software | 2 Comments

Java? Java bien, merci!

This is how anyone’s first French lesson should start!

Posted in Java, Software | Leave a comment

Kernel Virtual Machine (KVM) Benchmark Results

General Approach

Over the past week, I have been doing benchmarking on KVM performance. The reason for this is that I want to use KVM on my new server and need to know what the impact of KVM is on performance and how significant the effect of certain optimizations is.

Googling about KVM performance, I found out that a number of optimizations can be useful for this:

  • huge pages: By default, linux uses 4K memory pages, but it is possible to use much larger pages (e.g. 2MB) for virtual machines, which increases performance; see for instance here for a more elaborate explanation.
  • para-virtualized drivers (virtio): Linux includes virtio, a standardized API for virtualized drivers. Using this API, it becomes possible to (re)use the same para-virtualized drivers for IO across different virtualization mechanisms or different versions of the same virtualization mechanism. Para-virtualized drivers are attractive because they eliminate the overhead of device emulation, the approach whereby the host emulates real existing hardware (e.g. an RTL8139 network chipset) so that the guest can run unmodified. Para-virtualized drivers can be used for disk and/or network.
  • IO Scheduler (elevator): Linux provides the completely fair queueing (CFQ), deadline, and noop schedulers. The question is what the optimal scheduler for the host is in combination with that for the guest (a sketch of how these settings are applied follows this list).
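
To give an idea of the host-side knobs involved, a rough sketch (sizes and device names are examples, not the benchmark configuration):

# reserve 2048 huge pages of 2MB each (= 4GB) and mount hugetlbfs;
# qemu-kvm can then be started with -mem-path /dev/hugepages
echo 2048 > /proc/sys/vm/nr_hugepages
mkdir -p /dev/hugepages
mount -t hugetlbfs hugetlbfs /dev/hugepages

# show and change the IO scheduler of a disk at runtime
cat /sys/block/sda/queue/scheduler    # prints e.g. "noop deadline [cfq]"
echo deadline > /sys/block/sda/queue/scheduler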

The tests will include specific benchmarks focused on disk and network, as well as more general benchmarks such as unixbench and a kernel compilation.

Posted in Devops/Linux | 5 Comments

Benchmarking KVM continues…

After running quite a few tests with different configurations and also doing some manual testing, I am finding that the main factor in the performance of KVM versus native is disk IO. In particular, the IO scheduling on both host and guest seems to have a significant effect. It even looks like, contrary to the common wisdom and rationale on the internet, it is not more efficient to use noop as the IO scheduler in the guest.

Unfortunately, I had to fix an issue in the VMs related to disk partitioning, and I am now no longer 100% sure of the settings that I used to run the tests. So I need to rerun the tests, and it is also time for a more systematic approach now that I have preliminary results. In fact, I will now automate the entire testing even further so I can run multiple configurations sequentially and also verify a larger number of combinations.

To be continued.

Posted in Devops/Linux | 1 Comment

Oh no, I’ve created a monster!

I have started to do benchmarking with virtual machines under opensuse 11.3 with KVM and had set up a number of different domains to run the same OS under different settings. Just one minor issue occurred: some domains with device emulation for the disk wouldn’t start up anymore. Very strange, since I had tested this before, starting from a working setup with disk emulation and then switching to the virtio drivers.

After some troubleshooting it appears this has to do with the drivers that are loaded at boot time through the init ram disk. Apparently, when the init ram disk is created on an OS running with virtio drivers, the drivers for IDE and SCSI are not automatically included, so disk emulation does not work.

The problem is easily illustrated by the output of mkinitrd on the guest using IDE emulation:

Kernel image:   /boot/vmlinuz-2.6.34.7-0.5-default
Initrd image:   /boot/initrd-2.6.34.7-0.5-default
Kernel Modules: thermal_sys thermal scsi_mod libata ata_piix ata_generic processor fan virtio virtio_pci virtio_ring virtio_net virtio_blk dm-mod dm-snapshot crc16 jbd2 ext4 pata_sl82c105 pata_hpt3x2n sata_mv pata_sch pata_netcell pata_acpi pata_sc1200 pata_it8213 sata_vsc pata_serverworks sata_via pata_ns87415 ahci pcmcia_core pcmcia pata_pcmcia pata_mpiix pata_jmicron pata_piccolo sata_svw sata_inic162x pdc_adma pata_atp867x pata_ali pata_hpt3x3 pata_efar pata_marvell pata_sil680 pata_cs5530 pata_pdc202xx_old sata_sil pata_it821x sata_sil24 pata_cypress pata_opti sata_promise sata_nv pata_optidma pata_sis pata_hpt37x pata_cmd640 pata_artop pata_amd sata_qstor sata_uli pata_cs5520 sata_sis pata_radisys pata_rz1000 sata_sx4 pata_cmd64x pata_ns87410 pata_triflex pata_hpt366 pata_ninja32 pata_via pata_rdc pata_atiixp pata_pdc2027x pata_oldpiix sd_mod usbcore mmc_core ssb ohci-hcd ehci-hcd uhci-hcd usbhid linear
Features:       dm block usb lvm2 resume.userspace resume.kernel
Bootsplash:     openSUSE (800×600)

compared to the output on the same guest using virtio drivers:

Kernel image:   /boot/vmlinuz-2.6.34.7-0.5-default
Initrd image:   /boot/initrd-2.6.34.7-0.5-default
Kernel Modules: thermal_sys thermal scsi_mod libata ata_piix ata_generic processor fan virtio virtio_pci virtio_ring virtio_net virtio_blk dm-mod dm-snapshot crc16 jbd2 ext4 usbcore pcmcia_core pcmcia mmc_core ssb ohci-hcd ehci-hcd uhci-hcd usbhid linear
Features:       dm block usb lvm2 resume.userspace resume.kernel
Bootsplash:     openSUSE (800×600)

I have configured the system to always include the virtio drivers by adding “virtio_blk” and “virtio_net” to /etc/sysconfig/kernel, so the initrd created on the guest with IDE emulation also works on the guest with virtio but not the other way around.
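
For reference, on openSUSE this configuration boils down to the following (the exact module list differs per system):

# /etc/sysconfig/kernel: make sure the virtio drivers always end up
# in the init ram disk, e.g.:
#   INITRD_MODULES="ata_piix virtio_blk virtio_net"
# afterwards, rebuild the init ram disk:
mkinitrd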

The solution/workaround here is simply to create a so-called monster initrd (mkinitrd -A) that includes all available drivers in the ram disk. Perhaps it doesn’t seem that nice, but the initrd is still only 20MB, compared to 6MB for a ‘normal’ initrd. So I created a monster!

Posted in Devops/Linux | Leave a comment