qemu-devel


From: Daniel P. Berrange
Subject: Re: [Qemu-devel] [PATCH v2] mem-prealloc: reduce large guest start-up and migration time.
Date: Mon, 13 Feb 2017 11:23:17 +0000
User-agent: Mutt/1.7.1 (2016-10-04)

On Mon, Feb 13, 2017 at 11:45:46AM +0100, Igor Mammedov wrote:
> On Mon, 13 Feb 2017 14:30:56 +0530
> Jitendra Kolhe <address@hidden> wrote:
> 
> > Using "-mem-prealloc" option for a large guest leads to higher guest
> > start-up and migration time. This is because with "-mem-prealloc" option
> > qemu tries to map every guest page (create address translations), and
> > make sure the pages are available during runtime. virsh/libvirt by
> > default, seems to use "-mem-prealloc" option in case the guest is
> > configured to use huge pages. The patch tries to map all guest pages
> > simultaneously by spawning multiple threads. Currently limiting the
> > change to QEMU library functions on POSIX compliant host only, as we are
> > not sure if the problem exists on win32. Below are some stats with
> > "-mem-prealloc" option for guest configured to use huge pages.
> > 
> > ------------------------------------------------------------------------
> > Idle Guest      | Start-up time | Migration time
> > ------------------------------------------------------------------------
> > Guest stats with 2M HugePage usage - single threaded (existing code)
> > ------------------------------------------------------------------------
> > 64 Core - 4TB   | 54m11.796s    | 75m43.843s
> > 64 Core - 1TB   | 8m56.576s     | 14m29.049s
> > 64 Core - 256GB | 2m11.245s     | 3m26.598s
> > ------------------------------------------------------------------------
> > Guest stats with 2M HugePage usage - map guest pages using 8 threads
> > ------------------------------------------------------------------------
> > 64 Core - 4TB   | 5m1.027s      | 34m10.565s
> > 64 Core - 1TB   | 1m10.366s     | 8m28.188s
> > 64 Core - 256GB | 0m19.040s     | 2m10.148s
> > ------------------------------------------------------------------------
> > Guest stats with 2M HugePage usage - map guest pages using 16 threads
> > ------------------------------------------------------------------------
> > 64 Core - 4TB   | 1m58.970s     | 31m43.400s
> > 64 Core - 1TB   | 0m39.885s     | 7m55.289s
> > 64 Core - 256GB | 0m11.960s     | 2m0.135s
> > ------------------------------------------------------------------------
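
As an aside for readers skimming the thread: the core of the approach is
small enough to sketch. The following is illustrative only; the names
(TouchArgs, touch_pages, prealloc_mem_parallel, MAX_THREADS) are made up
for this sketch and the chunking is simplified relative to the actual
patch. The idea is to split the region into per-thread chunks and have
each thread fault its pages in by touching one byte per page.

/* Illustrative sketch, not the patch's actual code. */
#include <pthread.h>
#include <stddef.h>

#define MAX_THREADS 16  /* mirrors MAX_MEM_PREALLOC_THREAD_COUNT */
#define MIN(a, b) ((a) < (b) ? (a) : (b))

typedef struct {
    char *addr;       /* start of this thread's chunk */
    size_t numpages;  /* number of pages in this chunk */
    size_t pagesize;  /* (huge)page size in bytes */
} TouchArgs;

static void *touch_pages(void *opaque)
{
    TouchArgs *args = opaque;
    size_t i;

    for (i = 0; i < args->numpages; i++) {
        volatile char *p = args->addr + i * args->pagesize;
        /* a read-modify-write of one byte per page forces the kernel
         * to allocate the page without clobbering its contents */
        *p = *p;
    }
    return NULL;
}

static void prealloc_mem_parallel(char *area, size_t sz,
                                  size_t pagesize, int smp_cpus)
{
    int nthreads = MIN(smp_cpus, MAX_THREADS);
    size_t numpages = sz / pagesize;
    size_t per_thread = numpages / nthreads;
    pthread_t threads[MAX_THREADS];
    TouchArgs args[MAX_THREADS];
    int i;

    for (i = 0; i < nthreads; i++) {
        args[i].addr = area + (size_t)i * per_thread * pagesize;
        args[i].pagesize = pagesize;
        /* the last thread also picks up any remainder pages */
        args[i].numpages = (i == nthreads - 1)
                           ? numpages - (size_t)i * per_thread
                           : per_thread;
        pthread_create(&threads[i], NULL, touch_pages, &args[i]);
    }
    for (i = 0; i < nthreads; i++) {
        pthread_join(threads[i], NULL);
    }
}

With 2M huge pages every touch is a huge-page fault (a 4TB guest means
roughly two million of them), and serializing those faults in a single
thread is where the start-up time in the tables above goes; spreading
them across threads is the whole win.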
> > 
> > Changed in v2:
> >  - modified the number of memset threads spawned to min(smp_cpus, 16).
> >  - removed the 64GB memory restriction for spawning memset threads.
> > 
> > Signed-off-by: Jitendra Kolhe <address@hidden>
> > ---
> >  backends/hostmem.c   |  4 ++--
> >  exec.c               |  2 +-
> >  include/qemu/osdep.h |  3 ++-
> >  util/oslib-posix.c   | 68 +++++++++++++++++++++++++++++++++++++++++++++++-----
> >  util/oslib-win32.c   |  3 ++-
> >  5 files changed, 69 insertions(+), 11 deletions(-)
> > 
> > diff --git a/backends/hostmem.c b/backends/hostmem.c
> > index 7f5de70..162c218 100644
> > --- a/backends/hostmem.c
> > +++ b/backends/hostmem.c
> > @@ -224,7 +224,7 @@ static void host_memory_backend_set_prealloc(Object *obj, bool value,
> >          void *ptr = memory_region_get_ram_ptr(&backend->mr);
> >          uint64_t sz = memory_region_size(&backend->mr);
> >  
> > -        os_mem_prealloc(fd, ptr, sz, &local_err);
> > +        os_mem_prealloc(fd, ptr, sz, smp_cpus, &local_err);
> >          if (local_err) {
> >              error_propagate(errp, local_err);
> >              return;
> > @@ -328,7 +328,7 @@ host_memory_backend_memory_complete(UserCreatable *uc, Error **errp)
> >           */
> >          if (backend->prealloc) {
> >              os_mem_prealloc(memory_region_get_fd(&backend->mr), ptr, sz,
> > -                            &local_err);
> > +                            smp_cpus, &local_err);
> >              if (local_err) {
> >                  goto out;
> >              }
> > diff --git a/exec.c b/exec.c
> > index 8b9ed73..53afcd2 100644
> > --- a/exec.c
> > +++ b/exec.c
> > @@ -1379,7 +1379,7 @@ static void *file_ram_alloc(RAMBlock *block,
> >      }
> >  
> >      if (mem_prealloc) {
> > -        os_mem_prealloc(fd, area, memory, errp);
> > +        os_mem_prealloc(fd, area, memory, smp_cpus, errp);
> >          if (errp && *errp) {
> >              goto error;
> >          }
> > diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h
> > index 56c9e22..fb1d22b 100644
> > --- a/include/qemu/osdep.h
> > +++ b/include/qemu/osdep.h
> > @@ -401,7 +401,8 @@ unsigned long qemu_getauxval(unsigned long type);
> >  
> >  void qemu_set_tty_echo(int fd, bool echo);
> >  
> > -void os_mem_prealloc(int fd, char *area, size_t sz, Error **errp);
> > +void os_mem_prealloc(int fd, char *area, size_t sz, int smp_cpus,
> > +                     Error **errp);
> >  
> >  int qemu_read_password(char *buf, int buf_size);
> >  
> > diff --git a/util/oslib-posix.c b/util/oslib-posix.c
> > index f631464..17da029 100644
> > --- a/util/oslib-posix.c
> > +++ b/util/oslib-posix.c
> > @@ -55,6 +55,16 @@
> >  #include "qemu/error-report.h"
> >  #endif
> >  
> > +#define MAX_MEM_PREALLOC_THREAD_COUNT 16
> Running with -smp 16 or bigger on a host with fewer than 16 CPUs
> would not be quite optimal.
> Why not change the MAX_MEM_PREALLOC_THREAD_COUNT constant to
> something like sysconf(_SC_NPROCESSORS_ONLN)?

The point is to not consume more host resources than would otherwise
be consumed by running the guest CPUs. I.e., if running a KVM guest
with -smp 4 on a 16 CPU host, QEMU should not consume more than
4 pCPUs worth of resources on the host. Using sysconf would cause
QEMU to consume all host resources, likely harming other guests'
workloads.

If the person launching QEMU gives a -smp value that's larger than
the host CPU count, then they've already accepted that they're
asking QEMU to do more than the host is really capable of. IOW, I
don't think we need to special-case memsetting for that, since
vCPU execution itself is already going to overcommit the host.
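
For concreteness, the two policies being compared differ only in the
cap (a hedged sketch with illustrative variable names; sysconf() and
_SC_NPROCESSORS_ONLN are POSIX, declared in <unistd.h>):

/* patch v2: fixed cap, never above the guest's own vCPU count */
nthreads = MIN(smp_cpus, MAX_MEM_PREALLOC_THREAD_COUNT);

/* suggested alternative: cap at the host's online CPU count instead */
nthreads = MIN(smp_cpus, (int)sysconf(_SC_NPROCESSORS_ONLN));

The difference shows up when smp_cpus is large: with the fixed cap a
very wide guest never spawns more than 16 memset threads, while the
sysconf() variant would let a guest as wide as the host briefly occupy
every online host CPU, which is the overcommit concern above.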

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://entangle-photo.org       -o-    http://search.cpan.org/~danberr/ :|


