qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [External] Re: [PATCH] hostmem: change default prealloc threads numb


From: Igor Mammedov
Subject: Re: [External] Re: [PATCH] hostmem: change default prealloc threads number
Date: Wed, 29 Sep 2021 16:46:45 +0200

On Wed, 29 Sep 2021 10:22:39 +0100
Daniel P. Berrangé <berrange@redhat.com> wrote:

> On Wed, Sep 29, 2021 at 11:05:31AM +0200, Igor Mammedov wrote:
> > On Tue, 28 Sep 2021 00:47:01 +0800
> > Nan Wang <wangnan.light@bytedance.com> wrote:
> >   
> > > On 2021/9/27 11:16, David Hildenbrand wrote:  
> > > > On 27.09.21 15:19, Nan Wang wrote:    
> > > >> From: "wangnan.light" <wangnan.light@bytedance.com>
> > > >>
> > > >> the default number of prealloc threads is 1, for huge memory backend
> > > >> file, single thread touch page is really slow.
> > > >> We can adjust thread number by prealloc-threads property, but if the
> > > >> default value updated to MachineState::smp::cpus may be better.
> > > >> For example, old version of qemu(prealloc-threads have not been
> > > >> introduced yet), the value of threads num is MachineState::smp::cpus,
> > > >> so if someone use the same commandline to start current verion of qemu
> > > >> and old version of qemu which will lead to different behaviors.    
> > > > 
> > > > The introducing patch mentions:
> > > > 
> > > > commit ffac16fab33bb42f17e47624985220c1fd864e9d
> > > > Author: Igor Mammedov <imammedo@redhat.com>
> > > > Date:   Wed Feb 19 11:09:50 2020 -0500
> > > > 
> > > >      hostmem: introduce "prealloc-threads" property
> > > > 
> > > >      the property will allow user to specify number of threads to use
> > > >      in pre-allocation stage. It also will allow to reduce implicit
> > > >      hostmem dependency on current_machine.
> > > >      On object creation it will default to 1, but via machine
> > > >      compat property it will be updated to MachineState::smp::cpus
> > > >      to keep current behavior for hostmem and main RAM (which is
> > > >      now also hostmem based).
> > > > 
> > > > So it looks like we want to do the latter via compat properties 
> > > > eventually.
> > > > 
> > > > However, I'd like to note that more prealloc threads might be good for 
> > > > large backends, and might be bad for small backends. To me, it feels 
> > > > like a workload that relies on this should really do this manually. So 
> > > > I 
> > > > am still not sure if this is the right thing to do.    
> > > Yes, I agree with you "more prealloc threas are good for large backends, 
> > > and bad for small backends". But I think most situation large backends 
> > > always with large vcpu numbers and small backens always with small vcpu 
> > > numbers, because most users will not create a vm with large vcpu numbers 
> > > with small memory.
> > > 
> > >   
> > > > 
> > > > Note that qapi/qom.json:
> > > > 
> > > > "@prealloc-threads: number of CPU threads to use for prealloc (default: 
> > > > 1", so that doc would be wrong now.
> > > > 
> > > > Why exactly can't workload that cares not simply set this manually? 
> > > > Performance tuning smells like something to be done manually for a 
> > > > specific workload.
> > > >    
> > > It is a simply way that let workload set the prealloc threads manually. 
> > > For example, for large backends it set many prealloc threads, and set 1 
> > > prealloc threads manually for small backends. Yes, workload can 
> > > `maunally` set prealloc thread to 1, rather than use `default` value 1.
> > > So when workload want to(or maybe just forget specify the 
> > > prealloc-threads property) use the default value, I think the 
> > > MachineState::smp::cpus maybe better than 1.  
> > 
> > as commit mentioned by David states, it creates implicit dependency
> > on Machine and we were working getting rid of such dependencies
> > where it's possible.
> > 
> > So if you have to change prealloc-threads to a larger number,
> > you can try to use specific machine compat properties to do it,
> > instead of pushing machine to generic backend code. But 'good'
> > default for your workload doesn't guaranties it's a good one
> > another.
> > 
> > My preference is that user (mgmt layer) should set property
> > explicitly if it cares. It's leads to more stable VM config,
> > as opposed to using defaults which could change over time and
> > unexpectedly 'regress' such VMs, and can factor in host/workload
> > specific nuances without need to change QEMU.  
> 
> Setting prealloc_threads to match vCPUs count feels like it is making
> an assumption that if we've allowed 4 vCPUs, it is OK for the prealloc
> to consume 4 host CPUs. This assumption could be tricky when QEMU is
> strictly pinned to host CPUs, as vCPU threads are pinned to some pCPUs
> but emulator threads might be pinned differently.
> 
> Would there still be a performance advantage to prealloc_threads > 1,
> if all non-vCPU threads are pinned to the same single host CPU ?

I'd imagine it will only introduce unnecessary task contention.
Current conservative default (1) is the best we can do without
knowing workload/host configuration, since it affects host and
already running VMs less than higher number of pre-allocation
threads.
 

> 
> 
> Regards,
> Daniel




reply via email to

[Prev in Thread] Current Thread [Next in Thread]