From: Alexander Graf
Subject: Re: [Qemu-devel] Slow kernel/initrd loading via fw_cfg; Was Re: Hack integrating SeaBios / LinuxBoot option rom with QEMU trace backends
Date: Tue, 11 Oct 2011 15:14:52 +0200

On 11.10.2011, at 15:12, Anthony Liguori wrote:

> On 10/11/2011 04:38 AM, Alexander Graf wrote:
>> 
>> On 11.10.2011, at 11:26, Avi Kivity wrote:
>> 
>>> On 10/11/2011 11:19 AM, Alexander Graf wrote:
>>>>>> 
>>>>>>  Of this, 1.4 seconds is the time required by LinuxBoot to copy the
>>>>>>  kernel+initrd. If I used an uncompressed initrd, which I really want
>>>>>>  to do, to avoid decompression overhead, this increases to ~1.7 seconds.
>>>>>>  So the LinuxBoot ROM is ~60% of total QEMU execution time, or 40%
>>>>>>  of total sandbox execution overhead.
>>>>> 
>>>>>  One thing we can do is boot a guest and immediately snapshot it, before 
>>>>> it runs any application specific code.  Subsequent invocations will 
>>>>> MAP_PRIVATE the memory image and COW their way.  This avoids the kernel 
>>>>> initialization time as well.
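A minimal sketch of the MAP_PRIVATE/COW idea described above, assuming the snapshotted guest RAM was saved as a raw image file; the function name, path handling and error handling are illustrative assumptions, not QEMU code:

/* Sketch: map a previously saved guest RAM image privately so each new
 * instance gets copy-on-write pages; the on-disk snapshot is never modified.
 * Not QEMU code; path and sizing are hypothetical. */
#include <fcntl.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

static void *map_ram_image(const char *path, size_t *len_out)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        perror("open");
        return NULL;
    }

    struct stat st;
    if (fstat(fd, &st) < 0) {
        perror("fstat");
        close(fd);
        return NULL;
    }

    /* MAP_PRIVATE: writes stay private to this process (COW); untouched
     * pages are shared with the page cache across all instances. */
    void *ram = mmap(NULL, st.st_size, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE, fd, 0);
    close(fd);
    if (ram == MAP_FAILED) {
        perror("mmap");
        return NULL;
    }

    *len_out = st.st_size;
    return ram;
}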
>>>> 
>>>> That doesn't allow modification of -append
>>> 
>>> Is it really needed?
>> 
>> For our use case for example yes. We pass the cifs user/pass using the 
>> kernel cmdline, so we can reuse existing initrd code and just mount it as 
>> root.
>> 
>>> 
>>>> and gets you in a pretty bizarre state when doing updates of your host 
>>>> files, since then you have 2 different paths: full boot and restore. 
>>>> That's yet another potential source for bugs.
>>> 
>>> Typically you'd check the timestamps to make sure you're running an 
>>> up-to-date version.
>> 
>> Yes. That's why I said you end up with 2 different boot cases. Now imagine 
>> you get a bug once every 10000 bootups and try to trace down something that 
>> only happens when running in the non-resume case.
>> 
>>> 
>>>> 
>>>>> 
>>>>>> 
>>>>>>  For comparison I also did a test building a bootable ISO using ISOLinux.
>>>>>>  This required 700 ms for the boot time, which is approximately 1/2 the
>>>>>>  time required for direct kernel/initrd boot. But you have to then add
>>>>>>  on time required to build the ISO on every boot, to add custom kernel
>>>>>>  command line args. So while ISO is faster than LinuxBoot currently
>>>>>>  there is still non-negligible overhead here that I want to avoid.
>>>>> 
>>>>>  You can accept parameters from virtio-serial or some other channel.  Is 
>>>>> there any reason you need them specifically as *kernel* command line 
>>>>> parameters?
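For context on the virtio-serial alternative mentioned above, a minimal guest-side sketch; the port name is a hypothetical example, assuming QEMU exposes it with -device virtserialport,name=...:

/* Hypothetical guest-side read of parameters over virtio-serial instead of
 * the kernel command line. The port name "org.example.sandbox.params" is an
 * illustrative assumption, not an existing convention. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    char params[4096];
    int fd = open("/dev/virtio-ports/org.example.sandbox.params", O_RDONLY);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    ssize_t n = read(fd, params, sizeof(params) - 1);
    if (n < 0) {
        perror("read");
        close(fd);
        return 1;
    }
    params[n] = '\0';
    close(fd);

    printf("guest parameters: %s\n", params);
    return 0;
}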
>>>> 
>>>> That doesn't work for kernel parameters. It also means things would have 
>>>> to be rewritten needlessly. Sometimes we can't easily change the way 
>>>> parameters are passed into the guest either, for example when running a 
>>>> random (read: old, think of RHEL5) distro installation initrd.
>>> 
>>> This use case is not installation, it's for app sandboxing.
>> 
>> I thought we were talking about plenty different use cases here? I'm pretty 
>> sure there are even more out there that we haven't even thought about.
>> 
>>> 
>>>> And I don't see why we would have to shoot yet another hole into the 
>>>> guest just because we're unwilling to fix an interface that's perfectly 
>>>> valid but horribly slow.
>>> 
>>> rep/ins is exactly like dma+wait for this use case: provide an address, get 
>>> a memory image in return.  There's no need to add another interface, we 
>>> should just optimize the existing one.
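For a concrete picture of the existing interface, a minimal guest-side sketch of pulling a fw_cfg item with a single rep insb, assuming the standard x86 fw_cfg ports (selector 0x510, data 0x511); the item key and buffer handling are illustrative:

/* Sketch of a guest-side fw_cfg read: select an item, then pull its payload
 * with one "rep insb" from the data port. How many kvm/user exits that
 * single instruction costs is exactly the optimization question here. */
#include <stddef.h>
#include <stdint.h>

#define FW_CFG_CTL_PORT    0x510   /* 16-bit selector port */
#define FW_CFG_DATA_PORT   0x511   /* 8-bit data port */
#define FW_CFG_KERNEL_DATA 0x11    /* item key, shown for illustration */

static inline void outw(uint16_t val, uint16_t port)
{
    __asm__ volatile("outw %0, %1" : : "a"(val), "Nd"(port));
}

static void fw_cfg_read(uint16_t key, void *buf, size_t len)
{
    outw(key, FW_CFG_CTL_PORT);
    /* One rep insb covers the whole transfer from the guest's point of
     * view; DI is the destination buffer, CX the byte count, DX the port. */
    __asm__ volatile("rep insb"
                     : "+D"(buf), "+c"(len)
                     : "d"((uint16_t)FW_CFG_DATA_PORT)
                     : "memory");
}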
>> 
>> Whatever we do, the interface will never be as fast as DMA. We will always 
>> have to do sanity/permission checks for every I/O operation, can batch up 
>> only so many I/O requests, and in QEMU we again have to call our callbacks 
>> in a loop.
> 
> rep/ins is effectively equivalent to DMA except in how it's handled within 
> QEMU.

No, DMA has much coarser granularity in kvm/user interaction. We can easily 
DMA a 50MB region with a single kvm/user exit. For PIO we can at most do page 
granularity.
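To put rough numbers on that granularity argument, a back-of-the-envelope sketch; the per-exit cost is an assumed figure for illustration, not a measurement:

/* Back-of-the-envelope: exits needed to move a 50MB kernel+initrd image.
 * The per-exit cost is an illustrative assumption. */
#include <stdio.h>

int main(void)
{
    const long image_bytes     = 50L * 1024 * 1024;  /* kernel + initrd */
    const long page_bytes      = 4096;               /* PIO batching limit */
    const double usec_per_exit = 2.0;                /* assumed round trip */

    long pio_exits = (image_bytes + page_bytes - 1) / page_bytes;
    long dma_exits = 1;

    printf("page-granular PIO: %ld exits (~%.1f ms of exit overhead)\n",
           pio_exits, pio_exits * usec_per_exit / 1000.0);
    printf("DMA-style:         %ld exit\n", dma_exits);
    return 0;
}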


Alex



