qemu-devel

Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type


From: Stefano Garzarella
Subject: Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type
Date: Tue, 23 Jul 2019 13:30:09 +0200
User-agent: NeoMutt/20180716

On Tue, Jul 23, 2019 at 10:47:39AM +0100, Stefan Hajnoczi wrote:
> On Tue, Jul 23, 2019 at 9:43 AM Sergio Lopez <address@hidden> wrote:
> > Montes, Julio <address@hidden> writes:
> >
> > > On Fri, 2019-07-19 at 16:09 +0100, Stefan Hajnoczi wrote:
> > >> On Fri, Jul 19, 2019 at 2:48 PM Sergio Lopez <address@hidden> wrote:
> > >> > Stefan Hajnoczi <address@hidden> writes:
> > >> > > On Thu, Jul 18, 2019 at 05:21:46PM +0200, Sergio Lopez wrote:
> > >> > > > Stefan Hajnoczi <address@hidden> writes:
> > >> > > >
> > >> > > > > On Tue, Jul 02, 2019 at 02:11:02PM +0200, Sergio Lopez wrote:
> > >> > > >  --------------
> > >> > > >  | Conclusion |
> > >> > > >  --------------
> > >> > > >
> > >> > > > The average boot time of microvm is a third of Q35's (115ms vs. 363ms),
> > >> > > > and is smaller on all sections (QEMU initialization, firmware overhead
> > >> > > > and kernel start-to-user).
> > >> > > >
> > >> > > > Microvm's memory tree is also visibly simpler, significantly reducing
> > >> > > > the exposed surface to the guest.
> > >> > > >
> > >> > > > While we can certainly work on making Q35 smaller, I definitely think
> > >> > > > it's better (and way safer!) having a specialized machine type for a
> > >> > > > specific use case, than a minimal Q35 whose behavior significantly
> > >> > > > diverges from a conventional Q35.
> > >> > >
> > >> > > Interesting, so not a 10x difference!  This might be amenable to
> > >> > > optimization.
> > >> > >
> > >> > > My concern with microvm is that it's so limited that few users will be
> > >> > > able to benefit from the reduced attack surface and faster startup time.
> > >> > > I think it's worth investigating slimming down Q35 further first.
> > >> > >
> > >> > > In terms of startup time the first step would be profiling Q35 kernel
> > >> > > startup to find out what's taking so long (firmware initialization, PCI
> > >> > > probing, etc)?
> > >> >
> > >> > Some findings:
> > >> >
> > >> >  1. Exposing the TSC_DEADLINE CPU flag (i.e. using "-cpu host") saves a
> > >> >     whopping 120ms by avoiding the APIC timer calibration at
> > >> >     arch/x86/kernel/apic/apic.c:calibrate_APIC_clock
> > >> >
> > >> > Average boot time with "-cpu host"
> > >> >  qemu_init_end: 76.408950
> > >> >  linux_start_kernel: 116.166142 (+39.757192)
> > >> >  linux_start_user: 242.954347 (+126.788205)
> > >> >
> > >> > Average boot time with default "cpu"
> > >> >  qemu_init_end: 77.467852
> > >> >  linux_start_kernel: 116.688472 (+39.22062)
> > >> >  linux_start_user: 363.033365 (+246.344893)
> > >>
> > >> \o/
> > >>
> > >> >  2. The other 130ms are a direct result of PCI and ACPI presence (tested
> > >> >     with a kernel without support for those elements). I'll publish some
> > >> >     detailed numbers next week.
> > >>
> > >> Here are the Kata Containers kernel parameters:
> > >>
> > >> var kernelParams = []Param{
> > >>         {"tsc", "reliable"},
> > >>         {"no_timer_check", ""},
> > >>         {"rcupdate.rcu_expedited", "1"},
> > >>         {"i8042.direct", "1"},
> > >>         {"i8042.dumbkbd", "1"},
> > >>         {"i8042.nopnp", "1"},
> > >>         {"i8042.noaux", "1"},
> > >>         {"noreplace-smp", ""},
> > >>         {"reboot", "k"},
> > >>         {"console", "hvc0"},
> > >>         {"console", "hvc1"},
> > >>         {"iommu", "off"},
> > >>         {"cryptomgr.notests", ""},
> > >>         {"net.ifnames", "0"},
> > >>         {"pci", "lastbus=0"},
> > >> }
> > >>
> > >> pci=lastbus=0 looks interesting and so do some of the others :).
> > >>
> > >
> > > Yeah, pci=lastbus=0 is very helpful for reducing the boot time on Q35:
> > > the kernel won't scan the 255 buses :)
> >
> > I can confirm that adding pci=lastbus=0 makes a significant
> > improvement. In fact, it is the only option from Kata's kernel parameter
> > list that has an impact, probably because the kernel is already quite
> > minimalistic.
> >
> > Average boot time with "-cpu host" and "pci=lastbus=0"
> >  qemu_init_end: 73.711569
> >  linux_start_kernel: 113.414311 (+39.702742)
> >  linux_start_user: 190.949939 (+77.535628)
> >
> > That's still ~66% slower than microvm (~191ms vs. ~115ms), and the gap
> > quickly widens when adding more PCI devices (each one adds 10-15ms), but
> > it's certainly an improvement over the original numbers.
> >
> > On the other hand, there isn't much we can do here from QEMU's
> > perspective, as this is basically Guest OS tuning.
> 
> fw_cfg could expose this information so guest kernels know when to
> stop enumerating the PCI bus.  This would make all PCI guests with new
> kernels boot ~50 ms faster, regardless of machine type.
> 
> The difference between microvm and tuned Q35 is 76 ms now.
> 
> microvm:
> qemu_init_end: 64.043264
> linux_start_kernel: 65.481782 (+1.438518)
> linux_start_user: 114.938353 (+49.456571)
> 
> Q35 with -cpu host and pci=lastbus=0:
> qemu_init_end: 73.711569
> linux_start_kernel: 113.414311 (+39.702742)
> linux_start_user: 190.949939 (+77.535628)
> 
> There is a ~39 ms difference before linux_start_kernel.  SeaBIOS is
> loading the PVH Option ROM.
> 
> Stefano: any recommendations for profiling or tuning SeaBIOS?

As I said on IRC, the SeaBIOS image shipped in QEMU is version 1.12.1, which
doesn't include this patch (already in upstream SeaBIOS) that saves ~10ms:

    commit 75b42835134553c96f113e5014072c0caf99d092
    Author: Stefano Garzarella <address@hidden>
    Date:   Sun Dec 2 14:10:13 2018 +0100

        qemu: avoid debug prints if debugcon is not enabled

        In order to speed up the boot phase, we can check the QEMU
        debugcon device, and disable the writes if it is not recognized.

        This patch allow us to save around 10 msec (time measured
        between SeaBIOS entry point and "linuxboot" entry point)
        when CONFIG_DEBUG_LEVEL=1 and debugcon is not enabled.

        Signed-off-by: Stefano Garzarella <address@hidden>
        Signed-off-by: Kevin O'Connor <address@hidden>
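
To illustrate the idea behind that commit, here is a toy model in Python. It
is not the SeaBIOS code: the class and function names are made up for this
sketch, and the port number (0x402) and readback value (0xe9) are assumptions
about QEMU's debugcon defaults. The point is only that a single probe at init
time lets every later debug print skip its per-character port writes when no
debugcon device answers.

```python
# Toy model only: none of these names come from SeaBIOS.
DEBUGCON_PORT = 0x402       # assumed default debugcon I/O port
DEBUGCON_READBACK = 0xE9    # assumed value a debugcon device returns on read


class FakePortBus:
    """Stand-in for x86 port I/O; counts writes so the saving is visible."""

    def __init__(self, debugcon_present):
        self.debugcon_present = debugcon_present
        self.writes = 0

    def inb(self, port):
        if self.debugcon_present and port == DEBUGCON_PORT:
            return DEBUGCON_READBACK
        return 0xFF  # assumption: reads from an unclaimed port return all ones

    def outb(self, port, value):
        self.writes += 1  # in a real guest each write would trap to the VMM


class DebugOutput:
    def __init__(self, bus):
        self.bus = bus
        # Probe once at init; keep the port only if the device answers.
        probe = bus.inb(DEBUGCON_PORT)
        self.port = DEBUGCON_PORT if probe == DEBUGCON_READBACK else 0

    def puts(self, text):
        if not self.port:
            return  # debugcon absent: skip the per-character writes
        for ch in text.encode("ascii"):
            self.bus.outb(self.port, ch)


def count_writes(debugcon_present, message="SeaBIOS debug line\n"):
    """Return how many port writes one debug message costs."""
    bus = FakePortBus(debugcon_present)
    DebugOutput(bus).puts(message)
    return bus.writes
```

With the device absent, count_writes(False) performs no port writes at all;
with it present, count_writes(True) performs one write per character.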

As you said, we should update SeaBIOS for the next QEMU release.

For profiling, I have some patches that I used to put trace points in
the SeaBIOS code. I'll put them in this repository ASAP:
    https://github.com/stefano-garzarella/qemu-boot-time
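
In the meantime, the averages quoted above can be split per phase to see where
the remaining gap comes from. This is a small Python sketch, not one of the
scripts from that repository; the function names are made up here and the
numbers are copied from the two listings in Stefan's mail.

```python
# Cumulative timestamps in ms, copied from the averages quoted in this thread.
MICROVM = {
    "qemu_init_end": 64.043264,
    "linux_start_kernel": 65.481782,
    "linux_start_user": 114.938353,
}
Q35_TUNED = {  # Q35 with -cpu host and pci=lastbus=0
    "qemu_init_end": 73.711569,
    "linux_start_kernel": 113.414311,
    "linux_start_user": 190.949939,
}


def phase_durations(stamps):
    """Turn cumulative timestamps into per-phase durations, in listed order."""
    durations = {}
    previous = 0.0
    for name, value in stamps.items():
        durations[name] = value - previous
        previous = value
    return durations


def phase_gap(slow, fast):
    """Per-phase difference (slow minus fast) between two boot profiles."""
    slow_d = phase_durations(slow)
    fast_d = phase_durations(fast)
    return {name: slow_d[name] - fast_d[name] for name in slow_d}


if __name__ == "__main__":
    for name, gap in phase_gap(Q35_TUNED, MICROVM).items():
        print(f"{name}: +{gap:.3f} ms")
```

On these numbers the ~76 ms total splits into roughly 9.7 ms of QEMU
initialization, 38.3 ms between qemu_init_end and linux_start_kernel (the
firmware phase), and 28.1 ms between linux_start_kernel and linux_start_user.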


