From: Stefano Garzarella
Subject: Re: [Qemu-devel] QEMU/NEMU boot time with several x86 firmwares
Date: Mon, 10 Dec 2018 14:46:52 +0100

Hi Maran,

On Wed, Dec 5, 2018 at 7:04 PM Maran Wilson <address@hidden> wrote:
>
> On 12/5/2018 5:20 AM, Stefan Hajnoczi wrote:
> > On Tue, Dec 04, 2018 at 02:44:33PM -0800, Maran Wilson wrote:
> >> On 12/3/2018 8:35 AM, Stefano Garzarella wrote:
> >>> On Mon, Dec 3, 2018 at 4:44 PM Rob Bradford <address@hidden> wrote:
> >>>> Hi Stefano, thanks for capturing all these numbers,
> >>>>
> >>>> On Mon, 2018-12-03 at 15:27 +0100, Stefano Garzarella wrote:
> >>>>> Hi Rob,
> >>>>> I continued to investigate the boot time and, as you suggested, I
> >>>>> also looked at qemu-lite 2.11.2
> >>>>> (https://github.com/kata-containers/qemu) and NEMU "virt" machine. I
> >>>>> did the following tests using the Kata kernel configuration
> >>>>> (https://github.com/kata-containers/packaging/blob/master/kernel/configs/x86_64_kata_kvm_4.14.x)
> >>>>>
> >>>>> To compare the results with qemu-lite direct kernel load, I added
> >>>>> another tracepoint:
> >>>>> - linux_start_kernel: first entry of the Linux kernel
> >>>>> (start_kernel())
> >>>>>
> >>>> Great, do you have a set of patches available that add all these
> >>>> trace points? It would be great for reproduction.
> >>> For sure! I'm attaching a set of patches for qboot, seabios, ovmf,
> >>> nemu/qemu/qemu-lite and linux 4.14 with the tracepoints.
> >>> I'm also sharing a python script that I'm using with perf to extract
> >>> the numbers in this way:
> >>>
> >>> $ perf record -a -e kvm:kvm_entry -e kvm:kvm_pio -e
> >>> sched:sched_process_exec -o /tmp/qemu_perf.data &
> >>> $ # start qemu/nemu multiple times
> >>> $ killall perf
> >>> $ perf script -s qemu-perf-script.py -i /tmp/qemu_perf.data
> >>>
> >>>>> As you can see, NEMU is faster to jump to the kernel
> >>>>> (linux_start_kernel) than qemu-lite when it uses qboot or seabios
> >>>>> with virt support, but the time to reach user space is strangely
> >>>>> high; maybe the kernel configuration that I used is not the best one.
> >>>>> Do you suggest another kernel configuration?
> >>>>>
> >>>> This looks very bad. This isn't the kernel configuration we normally
> >>>> test with in our automated test system but is definitely one we support
> >>>> as part of our partnership with the Kata team. It's a high priority
> >>>> for me to try and investigate that. Have you saved the kernel messages
> >>>> as they might be helpful?
> >>> Yes, I'm attaching the dmesg output with nemu and qemu.
> >>>
> >>>>> Anyway, I obtained the best boot time with qemu-lite and direct
> >>>>> kernel
> >>>>> load (vmlinux ELF image). I think because the kernel was not
> >>>>> compressed. Indeed, looking at the other tests, the kernel
> >>>>> decompression (bzImage) takes about 80 ms (linux_start_kernel -
> >>>>> linux_start_boot). (I'll investigate further.)
> >>>>>
> >>>> Yup, being able to load an uncompressed kernel is one of the big
> >>>> advantages of qemu-lite. I wonder if we could bring that feature into
> >>>> qemu itself to supplement the existing firmware based kernel loading.
> >>> I think so, I'll try to understand if we can merge the qemu-lite
> >>> direct kernel loading into qemu.
> >> An attempt was made a long time ago to push the qemu-lite stuff (from
> >> the Intel Clear Containers project) upstream. As I understand it, the
> >> main stumbling block that seemed to derail the effort was that it
> >> involved adding Linux OS specific code to Qemu so that Qemu could do
> >> things like create and populate the zero page that Linux expects when
> >> entering startup_64().
> >>
> >> That ends up being a lot of very low-level, operating-system-specific
> >> knowledge about Linux that ends up getting baked into Qemu code. And
> >> understandably, a number of folks saw problems with going down a path
> >> like that.
> >>
> >> Since then, we have put together an alternative solution that would
> >> allow Qemu to boot an uncompressed Linux binary via the x86/HVM direct
> >> boot ABI (https://xenbits.xen.org/docs/unstable/misc/pvh.html). The
> >> solution involves first making changes to both the ABI as well as
> >> Linux, and then updating Qemu to take advantage of the updated ABI,
> >> which is already supported by both Linux and FreeBSD for booting VMs.
> >> As such, Qemu can remain OS agnostic, and just be programmed to the
> >> published ABI.
> >>
> >> The canonical definition for the HVM direct boot ABI is in the Xen tree and
> >> we needed to make some minor changes to the ABI definition to allow KVM
> >> guests to also use the same structure and entry point. Those changes were
> >> accepted to the Xen tree already:
> >> https://lists.xenproject.org/archives/html/xen-devel/2018-04/msg00057.html
> >>
> >> The corresponding Linux changes that would allow KVM guests to be
> >> booted via this PVH entry point have already been posted and reviewed:
> >> https://lkml.org/lkml/2018/4/16/1002
> >>
> >> The final part is the set of Qemu changes to take advantage of the
> >> above and boot a KVM guest via an uncompressed kernel binary using the
> >> entry point defined by the ABI. Liam Merwick will be posting some RFC
> >> patches very soon to allow this.
> > Cool, thanks for doing this work!
> >
> > How do the boot times compare to qemu-lite and Firecracker's
> > (https://github.com/firecracker-microvm/firecracker/) direct vmlinux ELF
> > boot?
>
> Boot times compare very favorably to qemu-lite, since the end result is
> basically doing a very similar thing. For now, we are going with a QEMU
> + qboot solution to introduce the PVH entry support in Qemu (meaning we
> will be posting Qemu and qboot patches and you will need both to boot an
> uncompressed kernel binary). As such we have numbers that Liam will
> include in the cover letter showing significant boot time improvement
> over existing QEMU + qboot approaches involving a compressed kernel
> binary. And as we all know, the existing qboot approach already gets
> boot times down pretty low.
>
> Once the patches have been posted (soon) it would be great if some other
> folks could pick them up and run your own numbers on various test setups
> and comparisons you already have.
>
> I haven't tried Firecracker, specifically. It would be good to see a
> comparison just so we know where we stand, but it's not terribly
> relevant to folks who want to continue using Qemu, right? Meaning Qemu
> (and all solutions built on it like kata) still needs a solution for
> improving boot time regardless of what NEMU and Firecracker are doing.
>
> And from what I've read so far, Firecracker only supports Linux guests.
> So one could arguably just bake all sorts of Linux-specific knowledge
> into it and have it lay out things like the zero page right in the VMM
> code, right?

Yes, you are right!

>
> I don't know off-hand, but is that how Firecracker boots an uncompressed
> Linux kernel? Anyone know?

I'm looking into Firecracker and they use the same approach as qemu-lite
to load the Linux kernel:
1. load the ELF image (vmlinux)
2. set up the zero page in VMM code (e.g. command line)
3. set up the VM registers (e.g. RSI = zero page address, RIP = ELF entry_point, etc.)
4. start the VM (ELF entry_point = phys_startup_64)

Cheers,
Stefano

>
> Thanks,
> -Maran
>
> > I'm asking because there are several custom approaches to fast kernel
> > boot and we should make sure that whatever Linux and QEMU end up
> > natively supporting is likely to work for all projects (NEMU, qemu-lite,
> > Firecracker) and operating systems (Linux distros, other OSes).
> >
> > Stefan
>


-- 
Stefano Garzarella
Red Hat


