Subject: [Qemu-devel] Re: Luvalley project: enable Qemu to utilize hardware virtualization extensions on arbitrary operating system
From: Xiaodong Yi
Date: Fri, 17 Apr 2009 17:12:41 +0800
Hi,
I've tested the guest Linux using UnixBench 5.1.2. "Guest Linux" here
means the Linux running on the virtual machine provided by Luvalley
and Qemu. The test platform is:
* Intel Core Duo CPU with 2 cores, 2 GB RAM
* CentOS 5.2 as the dom0 Linux, i.e., the host Linux for KVM
* CentOS 5.2 as the guest Linux, i.e., the Linux running on the
virtual machine provided by Qemu
The first set of results is for Luvalley, and the second one is for
KVM. According to the results, Luvalley's guest Linux is 20% ~ 30%
faster than KVM's guest! This is very surprising to me; I had thought
Luvalley's guest would perform the same as KVM's.
Luvalley's result:
========================================================================
BYTE UNIX Benchmarks (Version 5.1.2)
System: localhost.localdomain: GNU/Linux
OS: GNU/Linux -- 2.6.18-53.el5 -- #1 SMP Mon Nov 12 02:22:48 EST 2007
Machine: i686 (i386)
Language: en_US.utf8 (charmap="UTF-8", collate="UTF-8")
CPU 0: QEMU Virtual CPU version 0.9.1 (5326.4 bogomips)
x86-64, MMX, Physical Address Ext
CPU 1: QEMU Virtual CPU version 0.9.1 (5319.9 bogomips)
x86-64, MMX, Physical Address Ext
11:32:22 up 1 min, 1 user, load average: 2.39, 1.07, 0.39; runlevel 5
------------------------------------------------------------------------
Benchmark Run: Fri Apr 17 2009 11:32:22 - 11:44:22
2 CPUs in system; running 1 parallel copy of tests
Dhrystone 2 using register variables 10802400.5 lps (10.0 s, 2 samples)
Double-Precision Whetstone 12287.7 MWIPS (10.0 s, 2 samples)
Execl Throughput 1044.6 lps (29.2 s, 1 samples)
File Copy 1024 bufsize 2000 maxblocks 429860.0 KBps (30.0 s, 1 samples)
File Copy 256 bufsize 500 maxblocks 125357.0 KBps (30.0 s, 1 samples)
File Copy 4096 bufsize 8000 maxblocks 990103.0 KBps (30.0 s, 1 samples)
Pipe Throughput 803044.2 lps (10.0 s, 2 samples)
Pipe-based Context Switching 124785.9 lps (10.0 s, 2 samples)
Process Creation 1861.8 lps (30.0 s, 1 samples)
Shell Scripts (1 concurrent) 2338.6 lpm (60.0 s, 1 samples)
Shell Scripts (8 concurrent) 438.3 lpm (60.1 s, 1 samples)
System Call Overhead 709335.4 lps (10.0 s, 2 samples)
System Benchmarks Index Values BASELINE RESULT INDEX
Dhrystone 2 using register variables 116700.0 10802400.5 925.7
Double-Precision Whetstone 55.0 12287.7 2234.1
Execl Throughput 43.0 1044.6 242.9
File Copy 1024 bufsize 2000 maxblocks 3960.0 429860.0 1085.5
File Copy 256 bufsize 500 maxblocks 1655.0 125357.0 757.4
File Copy 4096 bufsize 8000 maxblocks 5800.0 990103.0 1707.1
Pipe Throughput 12440.0 803044.2 645.5
Pipe-based Context Switching 4000.0 124785.9 312.0
Process Creation 126.0 1861.8 147.8
Shell Scripts (1 concurrent) 42.4 2338.6 551.6
Shell Scripts (8 concurrent) 6.0 438.3 730.5
System Call Overhead 15000.0 709335.4 472.9
========
System Benchmarks Index Score 631.2
------------------------------------------------------------------------
Benchmark Run: Fri Apr 17 2009 11:44:22 - 11:56:05
2 CPUs in system; running 2 parallel copies of tests
Dhrystone 2 using register variables 22031241.9 lps (10.0 s, 2 samples)
Double-Precision Whetstone 23862.7 MWIPS (10.0 s, 2 samples)
Execl Throughput 1691.9 lps (29.8 s, 1 samples)
File Copy 1024 bufsize 2000 maxblocks 153766.0 KBps (30.1 s, 1 samples)
File Copy 256 bufsize 500 maxblocks 45848.0 KBps (30.0 s, 1 samples)
File Copy 4096 bufsize 8000 maxblocks 432211.0 KBps (30.0 s, 1 samples)
Pipe Throughput 1617636.7 lps (10.0 s, 2 samples)
Pipe-based Context Switching 233890.1 lps (10.0 s, 2 samples)
Process Creation 3207.9 lps (30.0 s, 1 samples)
Shell Scripts (1 concurrent) 3151.4 lpm (60.0 s, 1 samples)
Shell Scripts (8 concurrent) 437.8 lpm (60.2 s, 1 samples)
System Call Overhead 1386223.1 lps (10.0 s, 2 samples)
System Benchmarks Index Values BASELINE RESULT INDEX
Dhrystone 2 using register variables 116700.0 22031241.9 1887.9
Double-Precision Whetstone 55.0 23862.7 4338.7
Execl Throughput 43.0 1691.9 393.5
File Copy 1024 bufsize 2000 maxblocks 3960.0 153766.0 388.3
File Copy 256 bufsize 500 maxblocks 1655.0 45848.0 277.0
File Copy 4096 bufsize 8000 maxblocks 5800.0 432211.0 745.2
Pipe Throughput 12440.0 1617636.7 1300.4
Pipe-based Context Switching 4000.0 233890.1 584.7
Process Creation 126.0 3207.9 254.6
Shell Scripts (1 concurrent) 42.4 3151.4 743.3
Shell Scripts (8 concurrent) 6.0 437.8 729.7
System Call Overhead 15000.0 1386223.1 924.1
========
System Benchmarks Index Score 735.5
KVM's results:
========================================================================
BYTE UNIX Benchmarks (Version 5.1.2)
System: localhost.localdomain: GNU/Linux
OS: GNU/Linux -- 2.6.18-53.el5 -- #1 SMP Mon Nov 12 02:22:48 EST 2007
Machine: i686 (i386)
Language: en_US.utf8 (charmap="UTF-8", collate="UTF-8")
CPU 0: QEMU Virtual CPU version 0.9.1 (5325.7 bogomips)
x86-64, MMX, Physical Address Ext
CPU 1: QEMU Virtual CPU version 0.9.1 (5319.6 bogomips)
x86-64, MMX, Physical Address Ext
12:02:30 up 1 min, 1 user, load average: 2.37, 0.87, 0.31; runlevel 5
------------------------------------------------------------------------
Benchmark Run: Fri Apr 17 2009 12:02:30 - 12:11:33
2 CPUs in system; running 1 parallel copy of tests
Dhrystone 2 using register variables 10599139.8 lps (10.0 s, 2 samples)
Double-Precision Whetstone 2166.3 MWIPS (10.2 s, 2 samples)
Execl Throughput 598.3 lps (29.9 s, 1 samples)
File Copy 1024 bufsize 2000 maxblocks 458264.0 KBps (30.0 s, 1 samples)
File Copy 256 bufsize 500 maxblocks 125402.0 KBps (30.0 s, 1 samples)
File Copy 4096 bufsize 8000 maxblocks 1122309.0 KBps (30.0 s, 1 samples)
Pipe Throughput 811955.6 lps (10.0 s, 2 samples)
Pipe-based Context Switching 116759.0 lps (10.0 s, 2 samples)
Process Creation 1503.8 lps (30.0 s, 1 samples)
Shell Scripts (1 concurrent) 1942.2 lpm (60.0 s, 1 samples)
Shell Scripts (8 concurrent) 374.9 lpm (60.0 s, 1 samples)
System Call Overhead 712668.8 lps (10.0 s, 2 samples)
System Benchmarks Index Values BASELINE RESULT INDEX
Dhrystone 2 using register variables 116700.0 10599139.8 908.2
Double-Precision Whetstone 55.0 2166.3 393.9
Execl Throughput 43.0 598.3 139.1
File Copy 1024 bufsize 2000 maxblocks 3960.0 458264.0 1157.2
File Copy 256 bufsize 500 maxblocks 1655.0 125402.0 757.7
File Copy 4096 bufsize 8000 maxblocks 5800.0 1122309.0 1935.0
Pipe Throughput 12440.0 811955.6 652.7
Pipe-based Context Switching 4000.0 116759.0 291.9
Process Creation 126.0 1503.8 119.4
Shell Scripts (1 concurrent) 42.4 1942.2 458.1
Shell Scripts (8 concurrent) 6.0 374.9 624.9
System Call Overhead 15000.0 712668.8 475.1
========
System Benchmarks Index Score 502.8
------------------------------------------------------------------------
Benchmark Run: Fri Apr 17 2009 12:11:33 - 12:20:43
2 CPUs in system; running 2 parallel copies of tests
Dhrystone 2 using register variables 21416721.5 lps (10.0 s, 2 samples)
Double-Precision Whetstone 4928.6 MWIPS (10.1 s, 2 samples)
Execl Throughput 1438.2 lps (29.6 s, 1 samples)
File Copy 1024 bufsize 2000 maxblocks 122731.0 KBps (30.1 s, 1 samples)
File Copy 256 bufsize 500 maxblocks 32222.0 KBps (30.0 s, 1 samples)
File Copy 4096 bufsize 8000 maxblocks 308986.0 KBps (30.0 s, 1 samples)
Pipe Throughput 1617083.9 lps (10.0 s, 2 samples)
Pipe-based Context Switching 230390.0 lps (10.0 s, 2 samples)
Process Creation 2373.6 lps (30.0 s, 1 samples)
Shell Scripts (1 concurrent) 2732.8 lpm (60.0 s, 1 samples)
Shell Scripts (8 concurrent) 373.4 lpm (60.1 s, 1 samples)
System Call Overhead 1393848.5 lps (10.0 s, 2 samples)
System Benchmarks Index Values BASELINE RESULT INDEX
Dhrystone 2 using register variables 116700.0 21416721.5 1835.2
Double-Precision Whetstone 55.0 4928.6 896.1
Execl Throughput 43.0 1438.2 334.5
File Copy 1024 bufsize 2000 maxblocks 3960.0 122731.0 309.9
File Copy 256 bufsize 500 maxblocks 1655.0 32222.0 194.7
File Copy 4096 bufsize 8000 maxblocks 5800.0 308986.0 532.7
Pipe Throughput 12440.0 1617083.9 1299.9
Pipe-based Context Switching 4000.0 230390.0 576.0
Process Creation 126.0 2373.6 188.4
Shell Scripts (1 concurrent) 42.4 2732.8 644.5
Shell Scripts (8 concurrent) 6.0 373.4 622.4
System Call Overhead 15000.0 1393848.5 929.2
========
System Benchmarks Index Score 558.9
Thanks for your attention; further feedback is welcome.
Regards,
Xiaodong Yi
2009/3/26 Xiaodong Yi <address@hidden>:
> Luvalley is a Virtual Machine Monitor (VMM) spawned from the KVM
> project. Part of its source code is derived from KVM, namely the code
> that virtualizes CPU instructions and the memory management unit
> (MMU). However, its overall architecture is completely different from
> KVM's and is somewhat like Xen's. Luvalley runs outside of Linux,
> just like Xen, but it still uses Linux as its scheduler, memory
> manager, physical device driver provider and virtual IO device
> emulator. Moreover, Luvalley may run WITHOUT Linux: in theory, any
> operating system could take the place of Linux to provide the above
> services. Currently, Luvalley supports Linux and Windows. That is to
> say, one may run Luvalley to boot a Linux or Windows, and then run
> multiple virtualized operating systems on that Linux or Windows.
>
> In KVM, Qemu is adopted as the IO device emulator. From Qemu's point
> of view, KVM enables Qemu to utilize hardware virtualization
> extensions, such as Intel VT, on Linux. Luvalley also adopts Qemu as
> its IO device emulator. However, Luvalley can enable Qemu to utilize
> hardware virtualization extensions on ANY operating system.
>
> If you are interested in the Luvalley project, you may download
> Luvalley's source code from
> http://sourceforge.net/projects/luvalley/
>
> The following gives more details about Luvalley.
>
> Luvalley is an external hypervisor, just like Xen
> (http://www.xen.org). It boots and controls the x86 machine before
> starting up any operating system. However, Luvalley is much smaller
> and simpler than Xen. Most of Xen's jobs, such as scheduling, memory
> management and interrupt management, are shifted to Linux (or any
> other OS) running on top of Luvalley.
>
> Luvalley is booted first when the x86 machine is powered on. It boots
> up all CPUs in an SMP system and enables their virtualization
> extensions. Then the MBR (Master Boot Record) is read out and
> executed in the CPU's virtualization mode. In this way, a Linux (or
> any other OS) is eventually booted. Luvalley assigns all physical
> memory, the programmable interrupt controller (PIC) and the IO
> devices to this privileged OS. Following Xen, we call this OS the
> "domain 0" (dom0) OS.
>
> Like KVM, a modified Qemu runs on dom0 Linux to provide virtual IO
> devices for the other operating systems running on top of Luvalley.
> We also follow Xen in calling these operating systems "domain user"
> (domU). That is to say, there must be exactly one dom0 OS, and there
> may be several domU OSs, running on top of Luvalley. Each domU OS
> corresponds to a Qemu process in the dom0 OS. The memory of a domU is
> allocated from dom0 by Qemu, and when Qemu is scheduled to run by the
> dom0 scheduler, it calls Luvalley to run the corresponding domU.
>
> Moreover, as Luvalley requires nothing from the dom0 Linux kernel,
> other operating systems such as Windows, FreeBSD, etc. can also serve
> as the dom0 OS, provided that Qemu can be ported to them. Since Qemu
> is a userland application and is cross-platform, such porting is
> feasible. We have already added Luvalley support to Qemu-0.10.0,
> which can be compiled and run on Windows. With the help of Luvalley,
> Qemu-0.10.0 runs much faster because it can utilize the VT support
> provided by Intel CPUs.
>
> In summary, Luvalley inherits all the merits of KVM. In particular,
> Luvalley is very small and simple. It is even easier to use than KVM
> because it does not depend on a specific Linux kernel version: any
> version of Linux on which Qemu can run can serve as Luvalley's dom0
> OS.
>
> In addition, we think Luvalley's architecture meets the demands of
> both the desktop and the server operating system areas:
>
> 1. In the desktop area, there are many kinds of operating systems
> running on various hardware and devices. In theory, it is rather easy
> to add virtualization capability to all of them, without sacrificing
> hardware compatibility or the user experience. Moreover, Luvalley is
> very easy to install. It requires only a boot loader that supports
> the Multiboot Specification, e.g., Grub, WinGrub
> (http://sourceforge.net/projects/grub4dos), etc.
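Concretely, installing via Grub would then amount to a Multiboot entry in menu.lst along these lines. The kernel and module file names here are illustrative guesses, not the actual names shipped by the project:

```
title Luvalley (dom0: CentOS 5.2)
root (hd0,0)
# Load the Luvalley hypervisor as the Multiboot kernel ...
kernel /boot/luvalley
# ... and the dom0 Linux kernel and initrd as Multiboot modules.
module /boot/vmlinuz-2.6.18-53.el5 ro root=LABEL=/
module /boot/initrd-2.6.18-53.el5.img
```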
>
> 2. In the server area, especially for large-scale server systems
> (for example, thousands of CPUs), a single Linux is not suitable for
> managing the whole system, so KVM cannot be used properly there.
> Luvalley's architecture is more suitable for such servers. For
> example, it can be used to divide the physical resources into
> partitions and run a Linux in each partition. In addition, Luvalley
> is very small and may be put into the BIOS to serve as
> virtualization firmware.
>