Author Archives: erik

Moving

The last time I moved to a different city was 13 years ago, and before that I had been moving every two years or so. So when I finally settled in 1998, I decided I was going to stay in one place for a much longer time. Now, however, it is time to move again: I got a new job in a new location and it makes a lot of sense to move. For one, I will have a much better house (bought in the middle of the credit crunch), with a very nice garden, and the move will reduce my traveling time to and from work considerably. The surroundings are also quite nice: my favorite mountainbiking locations are closer and there are many more mountainbiking opportunities nearby.

One of the most important things when moving is of course… my server. I depend on it a lot: it runs my mail server, handles a number of mailing lists, hosts 4 web sites, and also serves as my VCR (MythTV).

Therefore, it is important to me to minimize downtime of the server during the move. Luckily, I am prepared for this since I am already running the server as a virtual machine. So as part of the move I will run this virtual machine on my laptop, which gives me plenty of time to disassemble the server rack and set it all up again at my new location. In fact, as I am writing this, I am already running the server from my laptop. This is easy for me to do because my regular server backups are bootable, see here.

Because of this setup, I can keep the total downtime of my web sites in the order of minutes and limit mail downtime to less than a day in total (and no-one will even notice that, because mail servers retry sending mail).

Interestingly, I had quite a fight today to get things working again with my TVIX M-6500, which allows me to play movies hosted on the server (through NFS) on my TV. As it turns out, there are subtle issues with network bridges on linux dropping UDP packets in some cases, see here. The TVIX uses UDP for NFS, which can cause problems with bridged network interfaces on virtual machines. Luckily, I managed to solve this by replacing the virtio network model of the machine with device emulation of an RTL8139 chipset. Anyway, all is good now. The server VM is fully functional again and I can watch movies, send/receive mail, and all my websites are up. The only thing I cannot do at this time is record TV, but that is only for the next 10 days or so. On the 16th of February I hope to be able to start the server again at its new location.
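
To give an idea of what that change amounts to, here is a rough sketch of the relevant part of a KVM command line (the memory size, disk path and tap/bridge setup are illustrative assumptions, not my exact configuration):

  # before: paravirtualized NIC (virtio) attached to the host bridge via a tap device
  #   qemu-kvm ... -net nic,model=virtio -net tap,ifname=tap0,script=/etc/qemu-ifup
  # after: fall back to full device emulation of an RTL8139 NIC
  qemu-kvm -m 1024 \
    -drive file=/dev/vg_host/server_disk,if=virtio \
    -net nic,model=rtl8139 \
    -net tap,ifname=tap0,script=/etc/qemu-ifup

For a libvirt-managed guest, the same change boils down to editing the interface's <model type='virtio'/> element to <model type='rtl8139'/> with virsh edit.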

Posted in Uncategorized | 1 Comment

Nested Logical Volume Management for VMs

As I blogged earlier, I have replaced my original server setup with a virtualized one. This introduces the concept of a “hardware independent server” and makes it easy to run the server on any hardware without modification. More concretely, it allows me to keep running until the hardware fails. Previously I used to replace the server hardware before it really broke, but with this setup I can run it until it breaks. Should I have a serious hardware failure, I can simply run the server(s) from any other hardware, such as a laptop, because I have “bootable backups”: if the server breaks, I can either run a replacement server based on the same data or simply use a laptop and run the backup in a virtualized manner.

As part of the original migration from running natively to running virtualized I used an identical setup, which meant passing physical hardware partitions to the virtual machine. The virtual machine then used Linux Logical Volume Management on top of these hardware partitions. For new virtual machines I used another approach: allocating a “disk” logical volume on the host, partitioning it in the guest, and using LVM again to manage storage within the guest. This results in nested logical volume management and, as I have seen with one of the new virtual machines, it works like a charm. It provides a nice separation of concerns where the host simply assigns storage to guests and the guests decide how to use this storage.
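
In command form, the nested approach is roughly the following (a sketch only; the volume group names are made up and /dev/vda is assumed to be how the guest sees its virtual disk):

  # on the host: carve a "disk" logical volume out of the host volume group
  # and attach it to the guest as its virtual disk
  lvcreate -L 100G -n vm1_disk vg_host

  # in the guest: partition the virtual disk and build a nested LVM stack on top of it
  parted /dev/vda mklabel msdos
  parted /dev/vda mkpart primary 1MiB 100%
  pvcreate /dev/vda1
  vgcreate vg_vm1 /dev/vda1
  lvcreate -L 20G -n root vg_vm1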

However, there was still one virtual machine (the original hardware based server) that was being passed physical disk partitions. This introduced the problem that both the host and the virtual machine see the same logical volumes, with the associated risk of administrative errors and data corruption if multiple OSes were to access the same logical volumes concurrently.

To remedy this, I used the following procedure (a command-level sketch follows the list):

  • Allocate a physical volume on the host and a “disk” logical volume on it big enough to contain all logical volumes from the VM
  • Stop the VM
  • Add this virtual disk to the VM.
  • Start the VM
  • Partition the new disk on the VM and extend existing volume groups to use physical partitions on this disk.
  • Use pvmove to move data to the disk and remove the old unused physical partitions from the volume groups afterwards.
  • Stop the VM
  • Remove old physical partitions from the VM, leaving only the new “disk” logical volume
  • Start the VM
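
For the guest-side steps, the commands look roughly as follows (a sketch with assumed names: vg_vm1 is the guest volume group, /dev/vdb the newly attached virtual disk, and /dev/sda2 the old physical partition):

  # partition the new virtual disk and add it to the existing volume group
  parted /dev/vdb mklabel msdos
  parted /dev/vdb mkpart primary 1MiB 100%
  pvcreate /dev/vdb1
  vgextend vg_vm1 /dev/vdb1

  # migrate all data off the old physical partition, then drop it from the volume group
  pvmove /dev/sda2
  vgreduce vg_vm1 /dev/sda2
  pvremove /dev/sda2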

In executing this procedure I ran into the basic problem that I did not have enough storage, so I used a separate disk that was connected to the server temporarily. After the procedure, the physical storage that backed the old setup (the RAID array) was no longer in use, so on the host I extended the volume group holding the “disk” logical volume with the freed RAID storage, used pvmove again to move the data from the temporary disk onto the RAID array, and afterwards removed the now-unused physical volumes on the temporary disk from the volume group. All of this was done while the virtual machine was up and running (no-one likes downtime).
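
On the host, that final rebalancing step looks roughly like this (again with assumed names: vg_host is the host volume group, /dev/md0 the freed RAID device, and /dev/sdc1 the temporary disk):

  # add the freed RAID storage to the host volume group,
  # move the extents off the temporary disk, then retire it
  pvcreate /dev/md0
  vgextend vg_host /dev/md0
  pvmove /dev/sdc1 /dev/md0
  vgreduce vg_host /dev/sdc1
  pvremove /dev/sdc1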

The new setup reduces the chance of administrative error considerably and allows me to move storage for virtual machines to other locations without even having to shutdown a virtual machine. It also nicely separates the allocation of storage to VMs on the host from how each VM uses its allocated storage.

Posted in Server/LAN | Leave a comment

Improvements to Snapshot Backup Scripts

The snapshot scripts that I blogged about earlier have undergone a number of important changes. I had been having a lot of problems with the cleanup of snapshot volumes and with the deletion of the old backup logical volumes. This was all related to this bug. After applying the workarounds there, the backup procedure is completely robust again.

Also, I have added improved logging, together with scripts to check the result of a backup. Additionally, the software is now available from an RPM repository (it works at least on openSUSE 11.3).

For more information, have a look at the snapshot website.

Posted in Uncategorized | Leave a comment

Git server setup on linux using smart HTTP

After seeing a presentation from Linus Torvalds I decided to read more about git. After looking into it further, I have decided to slowly move to git: new projects will use git instead of subversion and I will move some existing projects over when I get the chance.

The first question that arises in such a case is how to deploy it. Of course, there are free solutions available such as github, but these have some disadvantages for me. First of all, access will be slower compared to a solution that I host at home, and second, I also have private repositories that are really private, so I don’t want them to be hosted on github (even if github protects the data). Apart from this, the distributed nature of git would allow me to easily put the source code of an open source project on github anyway, should one of my projects ever become popular.

So the question remains how to host it at home. Of course, I already have an existing infrastructure consisting of a linux server running apache. Looking at the options for exposing git, there are several solutions:

ssh — remote access through ssh.

  Pros:
  • zero setup time because ssh is already running

  Cons:
  • requires complete trust of the client; possible version incompatibilities
  • requires a system account for every user and additional configuration to prevent logins and other types of access
  • (corporate) firewalls can block SSH, making it inaccessible from there

apache webdav — remote access through apache using webdav.

  Pros:
  • easy to set up, simple apache configuration
  • uses proven apache stability and security

  Cons:
  • additional configuration required in git to make this work (git update-server-info)
  • requires complete trust of the client, with the same risk of version incompatibilities
  • definite performance impact

apache smart http — remote access using apache with a CGI based solution (basically using HTTP as a transport for git).

  Pros:
  • easy to set up
  • uses proven apache stability and security
  • does not require trusting a particular client

  Cons:
  • some overhead of HTTP (although much less than with webdav)

git native — remote access using the native git protocol (git daemon).

  Pros:
  • doesn’t require trusting a client
  • most efficient solution

  Cons:
  • does not easily pass through firewalls
  • server code maturity
  • lack of authentication

In the comparison above, the phrase “trusting a client” means trusting the client software. Allowing a client full control over the modification of the repository files is risky: there could be clients with bugs, clients using different versions of git, or even malicious clients that could corrupt a repository. This risk is not present with the native git and smart http approaches.

As is clear from the above, ssh is the most problematic of all. At the other end, native git is the most efficient but lacks the required security and would require me to open up yet another port on the firewall and run yet another service. For these reasons I decided on an HTTP based setup. In fact, I experimented early on with the webdav based approach, simply because I had not yet found the smart http approach, which is relatively new. That experiment did show that webdav is much slower than the smart http setup. In fact, I think smart http is even faster than subversion when pushing changes.

The setup of smart HTTP is quite easy: basically it is a CGI based approach where HTTP requests are delegated to a CGI helper program that does the work. In effect, this is the git protocol over HTTP. Standard apache features are used to implement authentication and authorization. The smart HTTP approach is already described quite well here and here, but I encountered some issues and would like to clarify what I did to get it working.

These are the steps I took to get it working on openSUSE 11.3 (a consolidated configuration snippet follows the steps):

  • Set up the user accounts and groups that you need to authenticate against using htpasswd and put them in the file /etc/apache2/conf.d/git.passwd.
  • Make sure that the cgi, alias, and env modules are enabled by checking the APACHE_MODULES setting in /etc/sysconfig/apache2.
  • Now we are going to edit the apache configuration file for the (virtual) domain we are using. In this example, I assume we have /data/git/public hosting public repositories (anonymous read and authenticated write) and /data/git/private hosting private repositories (authenticated read and write). Also the git repositories are going to be exposed under a /git context root.
    • By default, export all repositories that are found:
      SetEnv GIT_HTTP_EXPORT_ALL

      This can also be configured on a per-repository basis; see the git-http-backend page for details.

    • Configure the CGI program used to handle requests for git.
      ScriptAlias /git/ /usr/lib/git/git-http-backend/

      This directive had me quite puzzled because the apache documentation mentions that the second ScriptAlias argument should be a directory, but in this case it is an executable and it works.

    • Set the root directory where git repositories reside
      SetEnv GIT_PROJECT_ROOT /data/git
    • By default, the git-http-backend allows push for authenticated users, and this directive tells the backend when a user is authenticated.
      SetEnv REMOTE_USER=$REDIRECT_REMOTE_USER

      I had to google a lot to find this one because it is not mentioned in the documentation. Without this, I had to configure “http.receivepack” to “true” for every repository to allow “git push”.

    • General CGI configuration allowing the execution of the CGI programs. This is more or less self-explanatory:
      <Directory "/usr/lib/git/">
        AllowOverride None
        Options +ExecCGI -Includes
        Order allow,deny
        Allow from all
      </Directory>
    • Next is the configuration of the public repositories
      <LocationMatch "^/git/public/.*/git-receive-pack$">
        AuthType Basic
        AuthName "Public Git Repositories on wamblee.org"
        AuthUserFile /etc/apache2/conf.d/git.passwd
        Require valid-user
      </LocationMatch>

      This requires an authenticated user for every push request. See the apache documentation for the various options such as requiring the user to belong to a group. In my setup, I simply use one global git.passwd file and any authenticated user has access to any repository.

    • Finally, there is the setup of the private repositories, which requires a valid user for any URL.
      <LocationMatch "^/git/private/.*$">
        AuthType Basic
        AuthName "Private Git Repositories on wamblee.org"
        AuthUserFile /etc/apache2/conf.d/git.passwd
        Require valid-user
      </LocationMatch>

      In this case I could have also used a “Location” element instead of “LocationMatch”.

    • To finish, restart apache using
      /etc/init.d/apache2 restart

      (or use “force-reload”) and try it out.
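
For convenience, this is what the pieces above look like pulled together into a single snippet inside the virtual host configuration (it simply mirrors the fragments listed in the steps; the paths and the wamblee.org realm names are specific to my server and purely illustrative elsewhere):

  # password file created earlier with: htpasswd -c /etc/apache2/conf.d/git.passwd <user>
  SetEnv GIT_PROJECT_ROOT /data/git
  SetEnv GIT_HTTP_EXPORT_ALL
  SetEnv REMOTE_USER=$REDIRECT_REMOTE_USER
  ScriptAlias /git/ /usr/lib/git/git-http-backend/

  <Directory "/usr/lib/git/">
    AllowOverride None
    Options +ExecCGI -Includes
    Order allow,deny
    Allow from all
  </Directory>

  # public repositories: anonymous read, authentication required for push
  <LocationMatch "^/git/public/.*/git-receive-pack$">
    AuthType Basic
    AuthName "Public Git Repositories on wamblee.org"
    AuthUserFile /etc/apache2/conf.d/git.passwd
    Require valid-user
  </LocationMatch>

  # private repositories: authentication required for all access
  <LocationMatch "^/git/private/.*$">
    AuthType Basic
    AuthName "Private Git Repositories on wamblee.org"
    AuthUserFile /etc/apache2/conf.d/git.passwd
    Require valid-user
  </LocationMatch>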

Hope this helps others setting up their git servers on linux. In my experience this setup is quite fast for both push and pull. I am currently working on one project that you can access by doing

  git clone https://wamblee.org/git/public/xmlrouter

A gitweb interface for browsing the public repositories is here.

Have a lot of fun!

Posted in Server/LAN | 9 Comments

Initial experiences with the Samsung Galaxy S II and Android

A few days ago, on May 11th, I received my new phone, the Samsung Galaxy S II. This is one of the first dual core phones running Gingerbread. After a few days of working with it, I must say I am truly impressed. On the software side, the phone is rock-solid, loads better than my previous Nokia N97 and (the absolutely terrible) Sony Ericsson P990i (which, in its initial software version, used to reset ‘to improve system performance’ while in standby mode). It is nice to use a phone that just works. I haven’t even discovered a single glitch. Nokia and Sony Ericsson should take note here.

Even making calls is better than on the N97. On that phone you lost control completely every time the other party hung up: the screen would go black and you could not do anything with it for the next 10 seconds. It is also nice to have a music application that actually performs.

It’s too early to say anything about battery life as my use of the phone has been extreme for the past days, which included continuous downloading over wifi for hours on end and almost continuous use.

The phone feels really solid and looks great. Performance is excellent. With this phone I have the feeling that we finally have performance and usability comparable again to the good old Palm from approximately six years ago. It is also nice to know that the phone has Gorilla Glass and uses the latest SiRF Star IV chip for GPS. All in all a quality product.

Looking at Android, and in particular the Android Market, I am also impressed. The quality of the applications that I tried is quite good. One such application is a tuner (for tuning musical instruments). In the past it was difficult to find good applications for this: on Palm I used phontuner, for instance, but all the other applications sucked, and on Symbian I was never able to find a suitable application at all. On Android I have tried two, which both worked quite well. The review system on the Market makes it relatively easy to find good applications and saves a lot of time dealing with the bad ones. Buying stuff is also easy and fast.

Posted in Misc, Software | 2 Comments