{"id":522,"date":"2010-11-28T20:00:05","date_gmt":"2010-11-28T18:00:05","guid":{"rendered":"https:\/\/brakkee.org\/site\/?p=522"},"modified":"2010-11-28T20:00:05","modified_gmt":"2010-11-28T18:00:05","slug":"kernel-virtual-machine-kvm-benchmark-results","status":"publish","type":"post","link":"https:\/\/brakkee.org\/site\/2010\/11\/28\/kernel-virtual-machine-kvm-benchmark-results\/","title":{"rendered":"Kernel Virtual Machine (KVM) Benchmark Results"},"content":{"rendered":"<p><strong>General Approach<\/strong><\/p>\n<p>Over the past week, I have been doing benchmarking on KVM performance. The reason for this is that I want to use KVM on my new server and need to know what the impact of KVM is on performance and how significant the effect of certain optimizations is.<\/p>\n<p>Googling about KVM performance, I found out that a number of optimizations can be useful for this:<\/p>\n<ul>\n<li><strong>huge pages<\/strong>: By default, linux uses 4K memory pages but it is possible to use much larger memory pages as well (i.e. 2MB) for virtual machines, increasing performance, see for instance <a href=\"http:\/\/fedoraproject.org\/wiki\/Features\/KVM_Huge_Page_Backed_Memory\">here<\/a> for a more elaborate explanation.<\/li>\n<li><strong>para-virtualized drivers (virtio)<\/strong>: Linux includes <a href=\"http:\/\/lwn.net\/Articles\/239238\/\">virtio <\/a>which is a standardized API for virtualized drivers. Using this API, it becomes possible to (re)use the same para-virtualized drivers for IO for different virtualization mechanism or different versions of the same vritualization mechanism. Para-virtualized drivers are attractive because they eliminate the overhead of device emulation. Device emulation is the approach whereby the guest emulates real existing hardware\u00a0 (e.g. RTL8139 network chipset) so that a guest can run unmodified. 
Para-virtualized drivers can be used for disk and\/or network.<\/li>\n<li><strong>IO Scheduler (elevator)<\/strong>: Linux provides the completely fair queueing (CFQ) scheduler, the deadline scheduler, and the noop scheduler. The question is which scheduler on the host is optimal in combination with which scheduler on the guest.<\/li>\n<\/ul>\n<p>The tests include specific benchmarks focused on disk and network, as well as more general benchmarks such as unixbench and a kernel compilation.<\/p>\n<p><!--nextpage--><\/p>\n<p><strong>Test Setup<\/strong><\/p>\n<p>All tests are executed on a Sony Vaio F11 laptop with 8GB of memory and a Core i7 Q720 processor (default clock 1.6GHz). The hard disk is a 500GB 7200 RPM Seagate ST9500420AS. Tests were run at night or during the day (while I was at work). Runlevel 3 (multi-user with network) is used on both host and guest. This eliminates the overhead of the desktop environment on the host and the effects of user interaction. Each virtual machine was given 4GB of memory.<\/p>\n<p>To compare results on the host with those on the guest, tests on the host are carried out with half the memory reserved for huge pages. This effectively reduces the available memory of the host to 4GB and makes it more comparable to a guest, because normal applications cannot use the huge pages.<\/p>\n<p>Tests were done on openSUSE 11.3 with kernel version 2.6.34.7-0.5-default, both for host and guest. The only difference was that (obviously) the host has more software installed because it is used as a full-featured desktop.<\/p>\n<p>By default, huge pages support was switched on and, unless mentioned otherwise, the noop IO scheduler is used for the guest. The reason for this is that these are the optimal settings as reported in various places on the internet. The noop scheduler seems to be a good choice as default for guests because the guest is unaware of the physical disks (it gets its storage from the host), and the host can do the scheduling more efficiently. 
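<\/p>\n<p>For reference, the elevator can be inspected and changed at runtime through sysfs; a small sketch, assuming the disk is \/dev\/sda on the host and \/dev\/vda (virtio) in the guest:<br \/>\n<code><br \/>\ncat \/sys\/block\/sda\/queue\/scheduler<br \/>\necho deadline &gt; \/sys\/block\/sda\/queue\/scheduler<br \/>\necho noop &gt; \/sys\/block\/vda\/queue\/scheduler<br \/>\n<\/code><br \/>\nThe currently active scheduler is shown in square brackets in the output of the first command.<\/p>\n<p>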
The tests focus on showing the effect of individual tuning parameters on performance by deviating from this baseline.<\/p>\n<p>Disk-based tests were carried out on an ext4 filesystem. Each guest gets a hard disk from the host which is a logical volume (as in LVM). The guest in turn uses a 512MB \/boot partition (non-LVM) and uses LVM for the root partition. Therefore, nested logical volume management is used.<\/p>\n<p>The laptop was connected to a 100Mbps network using a wired connection. All VMs were configured to use a bridging setup. For the network tests, a different Linux server on the local network was used for running the netperf server.<\/p>\n<p>Before each test is executed, the caches are flushed on the host (and on the guest if a guest is involved). This is done using:<br \/>\n<code><br \/>\nsync<br \/>\necho 3 &gt; \/proc\/sys\/vm\/drop_caches<br \/>\n<\/code><br \/>\nThis makes the tests more independent because it eliminates any reuse of caches from a previous test.<\/p>\n<p><!--nextpage--><\/p>\n<p><strong>Disk Performance<\/strong><\/p>\n<p>The disk performance test is done using bonnie++ (1.03d-6.1), passing it the option -r 4096 (which is the actual available RAM size, in MB, of host and guest).<\/p>\n<p>The chart below shows write performance for character-based writes, block-based writes, and random access writes. The schedulers for host and guest are given in the form &#8216;&lt;host&gt;-&lt;guest&gt;&#8217;. 
For example, deadline-cfq denotes\u00a0 that the host uses the deadline scheduler and the guest uses the cfq scheduler.<\/p>\n<script type='text\/javascript' src='https:\/\/brakkee.org\/site\/wp-content\/plugins\/easy-chart-builder\/js\/easy-chart-builder.js'><\/script>\n\n<div id=\"easyChartDiv480c8001\" style='width:100%;'  style='text-align:center;' align='center'>\n<!-- Easy Chart Builder by dyerware -->\n<img id=\"easyChartDiv480c8001_img\" style='text-align:center;float:center;' alt='dyerware.com' title=''  align='center' border='0' \/>\n<br\/><br\/><INPUT type='button' value='Show\/Hide Table Data' onclick='wpEasyChartToggle(\"easyChartDiv480c8001_data\");' style='text-align:center;' align='center' ><br\/><div class='easyChartBuilder' id=\"easyChartDiv480c8001_data\" style='text-align:center;display:none;' align='center'><\/div>\n<\/div>\n<script type=\"text\/javascript\">\n\/\/<![CDATA[\nwpEasyChart.wpNewChart(\"easyChartDiv480c8001\", {\"type\":\"vertbar\",\"width\":\"400\",\"height\":200,\"title\":\"Effect of IO scheduler on host and guest\\\/Writing\",\"minaxis\":\"\",\"groupnames\":\"cfq-cfq,cfq-deadline,cfq-noop,deadline-cfq,deadline-deadline,deadline-noop\",\"groupcolors\":\"0070C0,FFFF00,FF0000,00CC00,A3A3A3,007070,00FFFF,CC7000,00CC70,CC0070,7000CC,A370CC\",\"valuenames\":\"putc (KB\\\/s), write (KB\\\/s), random write 
(KB\\\/s)\",\"group1values\":\"78,76,31\",\"group2values\":\"76,76,29\",\"group3values\":\"72,76,31\",\"group4values\":\"79,78,31\",\"group5values\":\"79,76,33\",\"group6values\":\"82,85,30\",\"group7values\":\"0,0,0\",\"group8values\":\"0,0,0\",\"group9values\":\"0,0,0\",\"group10values\":\"0,0,0\",\"group11values\":\"0,0,0\",\"group12values\":\"0,0,0\",\"group1markers\":\"\",\"group2markers\":\"\",\"group3markers\":\"\",\"group4markers\":\"\",\"group5markers\":\"\",\"group6markers\":\"\",\"group7markers\":\"\",\"group8markers\":\"\",\"group9markers\":\"\",\"group10markers\":\"\",\"group11markers\":\"\",\"group12markers\":\"\",\"markercolor\":\"FFFF00\",\"imagealtattr\":\"dyerware.com\",\"imagetitleattr\":\"\",\"hidechartdata\":false,\"chartcolor\":\"FFFFFF\",\"chartfadecolor\":\"DDDDDD\",\"datatablecss\":\"hentry easyChartDataTable\",\"imgstyle\":\"text-align:center;float:center;\",\"watermark\":\"\",\"watermarkvert\":\"\",\"watermarkcolor\":\"A0BAE9\",\"currency\":\"\",\"currencyright\":false,\"precision\":\"\",\"grid\":false,\"axis\":\"both\"});\n\/\/]]>\n<\/script>\n<p>The results from the reading test are shown below:<\/p>\n\n<div id=\"easyChartDiv558c8002\" style='width:100%;'  style='text-align:center;' align='center'>\n<!-- Easy Chart Builder by dyerware -->\n<img id=\"easyChartDiv558c8002_img\" style='text-align:center;float:center;' alt='dyerware.com' title=''  align='center' border='0' \/>\n<br\/><br\/><INPUT type='button' value='Show\/Hide Table Data' onclick='wpEasyChartToggle(\"easyChartDiv558c8002_data\");' style='text-align:center;' align='center' ><br\/><div class='easyChartBuilder' id=\"easyChartDiv558c8002_data\" style='text-align:center;display:none;' align='center'><\/div>\n<\/div>\n<script type=\"text\/javascript\">\n\/\/<![CDATA[\nwpEasyChart.wpNewChart(\"easyChartDiv558c8002\", {\"type\":\"vertbar\",\"width\":\"400\",\"height\":200,\"title\":\"Effect of IO scheduler on host and 
guest\\\/Reading\",\"minaxis\":\"\",\"groupnames\":\"cfq-cfq,cfq-deadline,cfq-noop,deadline-cfq,deadline-deadline,deadline-noop\",\"groupcolors\":\"0070C0,FFFF00,FF0000,00CC00,A3A3A3,007070,00FFFF,CC7000,00CC70,CC0070,7000CC,A370CC\",\"valuenames\":\"getc (KB\\\/s), read (KB\\\/s), random seek (KB\\\/s)\",\"group1values\":\"72,92,0.3\",\"group2values\":\"75,92,0.3\",\"group3values\":\"75,93,0.3\",\"group4values\":\"72,92,0.3\",\"group5values\":\"68,95,0.3\",\"group6values\":\"79,93,0.3\",\"group7values\":\"0,0,0\",\"group8values\":\"0,0,0\",\"group9values\":\"0,0,0\",\"group10values\":\"0,0,0\",\"group11values\":\"0,0,0\",\"group12values\":\"0,0,0\",\"group1markers\":\"\",\"group2markers\":\"\",\"group3markers\":\"\",\"group4markers\":\"\",\"group5markers\":\"\",\"group6markers\":\"\",\"group7markers\":\"\",\"group8markers\":\"\",\"group9markers\":\"\",\"group10markers\":\"\",\"group11markers\":\"\",\"group12markers\":\"\",\"markercolor\":\"FFFF00\",\"imagealtattr\":\"dyerware.com\",\"imagetitleattr\":\"\",\"hidechartdata\":false,\"chartcolor\":\"FFFFFF\",\"chartfadecolor\":\"DDDDDD\",\"datatablecss\":\"hentry easyChartDataTable\",\"imgstyle\":\"text-align:center;float:center;\",\"watermark\":\"\",\"watermarkvert\":\"\",\"watermarkcolor\":\"A0BAE9\",\"currency\":\"\",\"currencyright\":false,\"precision\":\"\",\"grid\":false,\"axis\":\"both\"});\n\/\/]]>\n<\/script>\n<p>From the tests it is clear that the choice of scheduler does not have a very significant effect on performance, although the combination deadline-noop seems to have a slight advantage. The results for random seeks are also comparable for all configurations, at around 0.3KB\/s.<\/p>\n<p>Next we take a look at the effect of para-virtualized drivers (virtio), and the use of both IDE and SCSI emulation. 
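<\/p>\n<p>To illustrate, in a libvirt guest definition the choice between virtio and an emulated disk comes down to the bus of the disk; a minimal sketch, where the volume path and device names are just examples:<br \/>\n<code><br \/>\n&lt;disk type='block' device='disk'&gt;<br \/>\n&nbsp;&nbsp;&lt;source dev='\/dev\/system\/guestdisk'\/&gt;<br \/>\n&nbsp;&nbsp;&lt;target dev='vda' bus='virtio'\/&gt;<br \/>\n&lt;\/disk&gt;<br \/>\n<\/code><br \/>\nFor IDE emulation the target would instead be something like dev='hda' bus='ide', and for SCSI emulation dev='sda' bus='scsi'.<\/p>\n<p>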
Also, the performance of the host is measured.<\/p>\n\n<div id=\"easyChartDiv1fab8003\" style='width:100%;'  style='text-align:center;' align='center'>\n<!-- Easy Chart Builder by dyerware -->\n<img id=\"easyChartDiv1fab8003_img\" style='text-align:center;float:center;' alt='dyerware.com' title=''  align='center' border='0' \/>\n<br\/><br\/><INPUT type='button' value='Show\/Hide Table Data' onclick='wpEasyChartToggle(\"easyChartDiv1fab8003_data\");' style='text-align:center;' align='center' ><br\/><div class='easyChartBuilder' id=\"easyChartDiv1fab8003_data\" style='text-align:center;display:none;' align='center'><\/div>\n<\/div>\n<script type=\"text\/javascript\">\n\/\/<![CDATA[\nwpEasyChart.wpNewChart(\"easyChartDiv1fab8003\", {\"type\":\"vertbar\",\"width\":\"400\",\"height\":200,\"title\":\"Native versus para-virtualized (virtio) versus emulated disk IO\",\"minaxis\":\"\",\"groupnames\":\"host,virtio (deadline-cfq), ide emulation,scsi emulation\",\"groupcolors\":\"0070C0,FFFF00,FF0000,00CC00,A3A3A3,007070,00FFFF,CC7000,00CC70,CC0070,7000CC,A370CC\",\"valuenames\":\"putc (KB\\\/s), write (KB\\\/s), random write (KB\\\/s), getc (KB\\\/s), read (KB\\\/s), random seek 
(KB\\\/s)\",\"group1values\":\"69,71,22,49,69,0.2\",\"group2values\":\"82,85,30,79,93,0.3\",\"group3values\":\"24,21,15,58,83,0.2\",\"group4values\":\"9,9,10,67,85,0.2\",\"group5values\":\"0,0,0\",\"group6values\":\"0,0,0\",\"group7values\":\"0,0,0\",\"group8values\":\"0,0,0\",\"group9values\":\"0,0,0\",\"group10values\":\"0,0,0\",\"group11values\":\"0,0,0\",\"group12values\":\"0,0,0\",\"group1markers\":\"\",\"group2markers\":\"\",\"group3markers\":\"\",\"group4markers\":\"\",\"group5markers\":\"\",\"group6markers\":\"\",\"group7markers\":\"\",\"group8markers\":\"\",\"group9markers\":\"\",\"group10markers\":\"\",\"group11markers\":\"\",\"group12markers\":\"\",\"markercolor\":\"FFFF00\",\"imagealtattr\":\"dyerware.com\",\"imagetitleattr\":\"\",\"hidechartdata\":false,\"chartcolor\":\"FFFFFF\",\"chartfadecolor\":\"DDDDDD\",\"datatablecss\":\"hentry easyChartDataTable\",\"imgstyle\":\"text-align:center;float:center;\",\"watermark\":\"\",\"watermarkvert\":\"\",\"watermarkcolor\":\"A0BAE9\",\"currency\":\"\",\"currencyright\":false,\"precision\":\"\",\"grid\":false,\"axis\":\"both\"});\n\/\/]]>\n<\/script>\n<p>These results show that the effects of disk emulation are really big, especially when it comes to write performance. In addition, SCSI emulation performs significantly worse than IDE emulation for writes. What is also striking is that the host appears to perform slightly worse than the guest using virtio. One explanation for this is that the host performs caching for the guest so that additional optimizations can take place that are unavailable on the host. This is in effect comparable to giving the guest some extra memory for caching. A similar effect can be seen by reducing the ram size used by bonnie++ (-r option) further which will also show an increased performance.<\/p>\n<p><!--nextpage--><\/p>\n<p><strong>Network Performance<\/strong><\/p>\n<p><a href=\"http:\/\/www.netperf.org\/netperf\/\">Netperf<\/a> is used for testing network performance. 
The chart below shows the results for TCP stream performance.<\/p>\n\n<div id=\"easyChartDiv604c8004\" style='width:100%;'  style='text-align:center;' align='center'>\n<!-- Easy Chart Builder by dyerware -->\n<img id=\"easyChartDiv604c8004_img\" style='text-align:center;float:center;' alt='dyerware.com' title=''  align='center' border='0' \/>\n<br\/><br\/><INPUT type='button' value='Show\/Hide Table Data' onclick='wpEasyChartToggle(\"easyChartDiv604c8004_data\");' style='text-align:center;' align='center' ><br\/><div class='easyChartBuilder' id=\"easyChartDiv604c8004_data\" style='text-align:center;display:none;' align='center'><\/div>\n<\/div>\n<script type=\"text\/javascript\">\n\/\/<![CDATA[\nwpEasyChart.wpNewChart(\"easyChartDiv604c8004\", {\"type\":\"vertbar\",\"width\":200,\"height\":200,\"title\":\"Network Performance\\\/TCP Stream\",\"minaxis\":\"\",\"groupnames\":\"host,virtio, emulated (RTL8139)\",\"groupcolors\":\"0070C0,FFFF00,FF0000,00CC00,A3A3A3,007070,00FFFF,CC7000,00CC70,CC0070,7000CC,A370CC\",\"valuenames\":\"TCP (KB\\\/s)\",\"group1values\":\"93.94\",\"group2values\":\"93.74\",\"group3values\":\"93.93\",\"group4values\":\"0,0,0\",\"group5values\":\"0,0,0\",\"group6values\":\"0,0,0\",\"group7values\":\"0,0,0\",\"group8values\":\"0,0,0\",\"group9values\":\"0,0,0\",\"group10values\":\"0,0,0\",\"group11values\":\"0,0,0\",\"group12values\":\"0,0,0\",\"group1markers\":\"\",\"group2markers\":\"\",\"group3markers\":\"\",\"group4markers\":\"\",\"group5markers\":\"\",\"group6markers\":\"\",\"group7markers\":\"\",\"group8markers\":\"\",\"group9markers\":\"\",\"group10markers\":\"\",\"group11markers\":\"\",\"group12markers\":\"\",\"markercolor\":\"FFFF00\",\"imagealtattr\":\"dyerware.com\",\"imagetitleattr\":\"\",\"hidechartdata\":false,\"chartcolor\":\"FFFFFF\",\"chartfadecolor\":\"DDDDDD\",\"datatablecss\":\"hentry 
easyChartDataTable\",\"imgstyle\":\"text-align:center;float:center;\",\"watermark\":\"\",\"watermarkvert\":\"\",\"watermarkcolor\":\"A0BAE9\",\"currency\":\"\",\"currencyright\":false,\"precision\":\"\",\"grid\":false,\"axis\":\"both\"});\n\/\/]]>\n<\/script>\n<p>What is apparent here is that the performance differences in the 100Mbps setup are negligible. From googling around, it seems to be more challenging to achieve similar results with 1Gbps and 10Gbps networks so the results can be different there, but I don&#8217;t have the network to test this.<\/p>\n<p>Up next is a request\/response test measuring how many requests\/responses can be handled per second.<\/p>\n\n<div id=\"easyChartDiv22058005\" style='width:100%;'  style='text-align:center;' align='center'>\n<!-- Easy Chart Builder by dyerware -->\n<img id=\"easyChartDiv22058005_img\" style='text-align:center;float:center;' alt='dyerware.com' title=''  align='center' border='0' \/>\n<br\/><br\/><INPUT type='button' value='Show\/Hide Table Data' onclick='wpEasyChartToggle(\"easyChartDiv22058005_data\");' style='text-align:center;' align='center' ><br\/><div class='easyChartBuilder' id=\"easyChartDiv22058005_data\" style='text-align:center;display:none;' align='center'><\/div>\n<\/div>\n<script type=\"text\/javascript\">\n\/\/<![CDATA[\nwpEasyChart.wpNewChart(\"easyChartDiv22058005\", {\"type\":\"vertbar\",\"width\":200,\"height\":200,\"title\":\"Network Performance\\\/Connections\",\"minaxis\":\"\",\"groupnames\":\"host,virtio, emulated (RTL8139)\",\"groupcolors\":\"0070C0,FFFF00,FF0000,00CC00,A3A3A3,007070,00FFFF,CC7000,00CC70,CC0070,7000CC,A370CC\",\"valuenames\":\"TCP (1\\\/s), UDP 
(1\\\/s)\",\"group1values\":\"987,999\",\"group2values\":\"996,998\",\"group3values\":\"990,999\",\"group4values\":\"0,0,0\",\"group5values\":\"0,0,0\",\"group6values\":\"0,0,0\",\"group7values\":\"0,0,0\",\"group8values\":\"0,0,0\",\"group9values\":\"0,0,0\",\"group10values\":\"0,0,0\",\"group11values\":\"0,0,0\",\"group12values\":\"0,0,0\",\"group1markers\":\"\",\"group2markers\":\"\",\"group3markers\":\"\",\"group4markers\":\"\",\"group5markers\":\"\",\"group6markers\":\"\",\"group7markers\":\"\",\"group8markers\":\"\",\"group9markers\":\"\",\"group10markers\":\"\",\"group11markers\":\"\",\"group12markers\":\"\",\"markercolor\":\"FFFF00\",\"imagealtattr\":\"dyerware.com\",\"imagetitleattr\":\"\",\"hidechartdata\":false,\"chartcolor\":\"FFFFFF\",\"chartfadecolor\":\"DDDDDD\",\"datatablecss\":\"hentry easyChartDataTable\",\"imgstyle\":\"text-align:center;float:center;\",\"watermark\":\"\",\"watermarkvert\":\"\",\"watermarkcolor\":\"A0BAE9\",\"currency\":\"\",\"currencyright\":false,\"precision\":\"\",\"grid\":false,\"axis\":\"both\"});\n\/\/]]>\n<\/script>\n<p>Again the results show that on my 100Mbps network there is practically no difference between the virtio network driver and an emulated RTL8139 driver.<\/p>\n<p><!--nextpage--><strong>Other Benchmarks<\/strong><\/p>\n<p>Finally we compare the result for the standard <a href=\"http:\/\/code.google.com\/p\/byte-unixbench\/\">unixbench<\/a> benchmark in different setups.<\/p>\n\n<div id=\"easyChartDiv9388006\" style='width:100%;'  style='text-align:center;' align='center'>\n<!-- Easy Chart Builder by dyerware -->\n<img id=\"easyChartDiv9388006_img\" style='text-align:center;float:center;' alt='dyerware.com' title=''  align='center' border='0' \/>\n<br\/><br\/><INPUT type='button' value='Show\/Hide Table Data' onclick='wpEasyChartToggle(\"easyChartDiv9388006_data\");' style='text-align:center;' align='center' ><br\/><div class='easyChartBuilder' id=\"easyChartDiv9388006_data\" 
style='text-align:center;display:none;' align='center'><\/div>\n<\/div>\n<script type=\"text\/javascript\">\n\/\/<![CDATA[\nwpEasyChart.wpNewChart(\"easyChartDiv9388006\", {\"type\":\"vertbar\",\"width\":\"400\",\"height\":200,\"title\":\"Unixbench Scores\",\"minaxis\":\"\",\"groupnames\":\"host,guest, guest without hugepages,guest with IDE emulation\",\"groupcolors\":\"0070C0,FFFF00,FF0000,00CC00,A3A3A3,007070,00FFFF,CC7000,00CC70,CC0070,7000CC,A370CC\",\"valuenames\":\"Unixbench score\",\"group1values\":\"4143\",\"group2values\":\"2912\",\"group3values\":\"2631\",\"group4values\":\"2853\",\"group5values\":\"0,0,0\",\"group6values\":\"0,0,0\",\"group7values\":\"0,0,0\",\"group8values\":\"0,0,0\",\"group9values\":\"0,0,0\",\"group10values\":\"0,0,0\",\"group11values\":\"0,0,0\",\"group12values\":\"0,0,0\",\"group1markers\":\"\",\"group2markers\":\"\",\"group3markers\":\"\",\"group4markers\":\"\",\"group5markers\":\"\",\"group6markers\":\"\",\"group7markers\":\"\",\"group8markers\":\"\",\"group9markers\":\"\",\"group10markers\":\"\",\"group11markers\":\"\",\"group12markers\":\"\",\"markercolor\":\"FFFF00\",\"imagealtattr\":\"dyerware.com\",\"imagetitleattr\":\"\",\"hidechartdata\":false,\"chartcolor\":\"FFFFFF\",\"chartfadecolor\":\"DDDDDD\",\"datatablecss\":\"hentry easyChartDataTable\",\"imgstyle\":\"text-align:center;float:center;\",\"watermark\":\"\",\"watermarkvert\":\"\",\"watermarkcolor\":\"A0BAE9\",\"currency\":\"\",\"currencyright\":false,\"precision\":\"\",\"grid\":false,\"axis\":\"both\"});\n\/\/]]>\n<\/script>\n<p>In this case, the host is a clear winner, with the guest with hugepages support and the virtio driver coming in second. Looking at the details of the unixbench results shows that the host is especially strong in the various file copying tests. This is in contrast with the bonnie++ tests, where the host wasn&#8217;t even faster than the guests.<\/p>\n<p>The effect of disabling hugepages for the guest is approximately 10% in this benchmark. 
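<\/p>\n<p>For completeness, backing guests with huge pages on the host boils down to reserving the pages, mounting hugetlbfs, and pointing the VM at it; this is a sketch, where the page count (2048 pages of 2MB, i.e. 4GB) and the mount point are examples:<br \/>\n<code><br \/>\necho 2048 &gt; \/proc\/sys\/vm\/nr_hugepages<br \/>\nmkdir -p \/dev\/hugepages<br \/>\nmount -t hugetlbfs hugetlbfs \/dev\/hugepages<br \/>\nqemu-kvm -mem-path \/dev\/hugepages ...<br \/>\n<\/code><br \/>\nThe number of free huge pages can be checked in \/proc\/meminfo (HugePages_Free).<\/p>\n<p>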
Surprisingly, the IDE disk emulation also comes relatively close in this benchmark.<\/p>\n<p>Next up is a kernel compilation with 8 threads. I am using 8 threads here to make sure that the processor is completely busy and because there is a slight performance increase compared to using only 4 threads on my quad core hyperthreading CPU.<\/p>\n\n<div id=\"easyChartDiv28cf8007\" style='width:100%;'  style='text-align:center;' align='center'>\n<!-- Easy Chart Builder by dyerware -->\n<img id=\"easyChartDiv28cf8007_img\" style='text-align:center;float:center;' alt='dyerware.com' title=''  align='center' border='0' \/>\n<br\/><br\/><INPUT type='button' value='Show\/Hide Table Data' onclick='wpEasyChartToggle(\"easyChartDiv28cf8007_data\");' style='text-align:center;' align='center' ><br\/><div class='easyChartBuilder' id=\"easyChartDiv28cf8007_data\" style='text-align:center;display:none;' align='center'><\/div>\n<\/div>\n<script type=\"text\/javascript\">\n\/\/<![CDATA[\nwpEasyChart.wpNewChart(\"easyChartDiv28cf8007\", {\"type\":\"vertbar\",\"width\":\"400\",\"height\":200,\"title\":\"Kernel Compilation (lower is better)\",\"minaxis\":\"\",\"groupnames\":\"host,guest, guest without hugepages,guest with IDE emulation\",\"groupcolors\":\"0070C0,FFFF00,FF0000,00CC00,A3A3A3,007070,00FFFF,CC7000,00CC70,CC0070,7000CC,A370CC\",\"valuenames\":\"Kernel Compilation 
(s)\",\"group1values\":\"1453\",\"group2values\":\"1276\",\"group3values\":\"1366\",\"group4values\":\"1446\",\"group5values\":\"0,0,0\",\"group6values\":\"0,0,0\",\"group7values\":\"0,0,0\",\"group8values\":\"0,0,0\",\"group9values\":\"0,0,0\",\"group10values\":\"0,0,0\",\"group11values\":\"0,0,0\",\"group12values\":\"0,0,0\",\"group1markers\":\"\",\"group2markers\":\"\",\"group3markers\":\"\",\"group4markers\":\"\",\"group5markers\":\"\",\"group6markers\":\"\",\"group7markers\":\"\",\"group8markers\":\"\",\"group9markers\":\"\",\"group10markers\":\"\",\"group11markers\":\"\",\"group12markers\":\"\",\"markercolor\":\"FFFF00\",\"imagealtattr\":\"dyerware.com\",\"imagetitleattr\":\"\",\"hidechartdata\":false,\"chartcolor\":\"FFFFFF\",\"chartfadecolor\":\"DDDDDD\",\"datatablecss\":\"hentry easyChartDataTable\",\"imgstyle\":\"text-align:center;float:center;\",\"watermark\":\"\",\"watermarkvert\":\"\",\"watermarkcolor\":\"A0BAE9\",\"currency\":\"\",\"currencyright\":false,\"precision\":\"\",\"grid\":false,\"axis\":\"both\"});\n\/\/]]>\n<\/script>\n<p>In this case the host loses. This is probably due again to the additional caching provided by the host for each guest which effectively speeds up the guests. The guest with hugepages is in this case again the fastests of all guest configuration and IDE emulation is again not that bad compared to the bonnie++ benachmarks. Apparently, write performance is not a dominating factor for a kernel compile. In this test also, hugepages provide a performance benefit (in this case 7%).<strong><!--nextpage--><\/strong><\/p>\n<p><strong>Conclusions<\/strong><\/p>\n<p>As with all benchmarking one has to be really careful in interpreting the results and there are indeed many objections one can have with the current setup. One of them is for instance that I considered running only one VM at a time. Another is that I did not measure the load on the host while testing a VM.\u00a0 Also, the tests are done on a laptop. 
Results could vary on different hardware. Nevertheless, the purpose of this benchmarking was just to get some feeling for the effects of various tuning parameters.<\/p>\n<p>The conclusions that I am drawing from this are:<\/p>\n<ul>\n<li>The differences in performance between the various IO schedulers on host and guest are not that significant. There is, however, a slight tendency for deadline on the host and noop on the guest to be the best combination. This seems to be in line with what some vendors also recommend; see for instance <a href=\"http:\/\/www.redhat.com\/f\/pdf\/rhel\/HP-DL785-KVM-scaling-v1.pdf\">here<\/a>.<\/li>\n<li>The difference in network performance between para-virtualized drivers (virtio) and emulated drivers is negligible on a 100Mbps network. Also, the performance of the guests is practically identical to that of the host.<\/li>\n<li>Hugepages on the host can result in a small speedup of a guest.<\/li>\n<li>Additional caching by the host probably helps the performance of the guests.<\/li>\n<li>Use of para-virtualized drivers for disk IO does help performance a lot compared to emulated drivers, in particular when it comes to write performance. Nevertheless, unixbench and bonnie++ seem to contradict each other when it comes to comparing disk performance of the host with that of the guest.<\/li>\n<\/ul>\n<p>Based on these benchmarking results, I am going to use the following settings for KVM:<\/p>\n<ul>\n<li>Use para-virtualized disk IO drivers. 
This is the most essential optimization to make.<\/li>\n<li>Use hugepages on the host, to be used by the guests.<\/li>\n<li>Use the deadline scheduler on the host.<\/li>\n<li>Use the noop scheduler on the guest.<\/li>\n<li>Use para-virtualized network drivers: Even though they do not show a performance advantage in the benchmarks, they should be more efficient, so I am including this for &#8216;theoretical&#8217; reasons.<\/li>\n<\/ul>\n<p>Use of para-virtualized drivers can cause problems during initial installation and upgrades because it requires special drivers. Fortunately, it is easy to start up a VM with emulated drivers even if it was previously started with para-virtualized drivers. In fact, this is what I did for these tests. In any case, the choices for these tuning parameters can easily be changed later, either in the VM configuration on the host or with a simple change inside the guest.<\/p>\n<p>Finally, a big thank you to the KVM community for making such an excellent virtualization solution. In all these tests it held up fine and worked without a glitch. After this, I am completely convinced that I will use KVM on the new server.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>General Approach Over the past week, I have been doing benchmarking on KVM performance. 
The reason for this is that I want to use KVM on my new server and need to know what the impact of KVM is on &hellip; <a href=\"https:\/\/brakkee.org\/site\/2010\/11\/28\/kernel-virtual-machine-kvm-benchmark-results\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"inline_featured_image":false,"footnotes":""},"categories":[10],"tags":[],"_links":{"self":[{"href":"https:\/\/brakkee.org\/site\/wp-json\/wp\/v2\/posts\/522"}],"collection":[{"href":"https:\/\/brakkee.org\/site\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/brakkee.org\/site\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/brakkee.org\/site\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/brakkee.org\/site\/wp-json\/wp\/v2\/comments?post=522"}],"version-history":[{"count":0,"href":"https:\/\/brakkee.org\/site\/wp-json\/wp\/v2\/posts\/522\/revisions"}],"wp:attachment":[{"href":"https:\/\/brakkee.org\/site\/wp-json\/wp\/v2\/media?parent=522"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/brakkee.org\/site\/wp-json\/wp\/v2\/categories?post=522"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/brakkee.org\/site\/wp-json\/wp\/v2\/tags?post=522"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}