Re: [Qemu-devel] [RFC] QMP: add query-hotpluggable-cpus

qemu-devel
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [RFC] QMP: add query-hotpluggable-cpus

From:	Igor Mammedov
Subject:	Re: [Qemu-devel] [RFC] QMP: add query-hotpluggable-cpus
Date:	Thu, 25 Feb 2016 13:43:05 +0100
On Thu, 25 Feb 2016 12:25:43 +1100
David Gibson <address@hidden> wrote:

> On Wed, Feb 24, 2016 at 03:17:54PM +0100, Igor Mammedov wrote:
> > On Wed, 24 Feb 2016 12:54:17 +1100
> > David Gibson <address@hidden> wrote:
> >   
> > > On Tue, Feb 23, 2016 at 10:46:45AM +0100, Igor Mammedov wrote:  
> > > > On Mon, 22 Feb 2016 13:54:32 +1100
> > > > David Gibson <address@hidden> wrote:
> > > >     
> > > > > On Fri, Feb 19, 2016 at 04:49:11PM +0100, Igor Mammedov wrote:    
> > > > > > On Fri, 19 Feb 2016 15:38:48 +1100
> > > > > > David Gibson <address@hidden> wrote:
> > > > > > 
> > > > > > CCing thread a couple of libvirt guys.
> > > > > >       
> > > > > > > On Thu, Feb 18, 2016 at 11:37:39AM +0100, Igor Mammedov wrote:    
> > > > > > >   
> > > > > > > > On Thu, 18 Feb 2016 14:39:52 +1100
> > > > > > > > David Gibson <address@hidden> wrote:
> > > > > > > >         
> > > > > > > > > On Tue, Feb 16, 2016 at 11:36:55AM +0100, Igor Mammedov 
> > > > > > > > > wrote:        
> > > > > > > > > > On Mon, 15 Feb 2016 20:43:41 +0100
> > > > > > > > > > Markus Armbruster <address@hidden> wrote:
> > > > > > > > > >           
> > > > > > > > > > > Igor Mammedov <address@hidden> writes:
> > > > > > > > > > >           
> > > > > > > > > > > > it will allow mgmt to query present and possible to 
> > > > > > > > > > > > hotplug CPUs
> > > > > > > > > > > > it is required from a target platform that wish to 
> > > > > > > > > > > > support
> > > > > > > > > > > > command to set board specific 
> > > > > > > > > > > > MachineClass.possible_cpus() hook,
> > > > > > > > > > > > which will return a list of possible CPUs with options
> > > > > > > > > > > > that would be needed for hotplugging possible CPUs.
> > > > > > > > > > > >
> > > > > > > > > > > > For RFC there are:
> > > > > > > > > > > >    'arch_id': 'int' - mandatory unique CPU number,
> > > > > > > > > > > >                       for x86 it's APIC ID for ARM it's 
> > > > > > > > > > > > MPIDR
> > > > > > > > > > > >    'type': 'str' - CPU object type for usage with 
> > > > > > > > > > > > device_add
> > > > > > > > > > > >
> > > > > > > > > > > > and a set of optional fields that would allows mgmt 
> > > > > > > > > > > > tools
> > > > > > > > > > > > to know at what granularity and where a new CPU could be
> > > > > > > > > > > > hotplugged;
> > > > > > > > > > > > [node],[socket],[core],[thread]
> > > > > > > > > > > > Hopefully that should cover needs for CPU hotplug 
> > > > > > > > > > > > porposes for
> > > > > > > > > > > > magor targets and we can extend structure in future 
> > > > > > > > > > > > adding
> > > > > > > > > > > > more fields if it will be needed.
> > > > > > > > > > > >
> > > > > > > > > > > > also for present CPUs there is a 'cpu_link' field which
> > > > > > > > > > > > would allow mgmt inspect whatever object/abstraction
> > > > > > > > > > > > the target platform considers as CPU object.
> > > > > > > > > > > >
> > > > > > > > > > > > For RFC purposes implements only for x86 target so far. 
> > > > > > > > > > > >            
> > > > > > > > > > > 
> > > > > > > > > > > Adding ad hoc queries as we go won't scale.  Could this 
> > > > > > > > > > > be solved by a
> > > > > > > > > > > generic introspection interface?          
> > > > > > > > > > Do you mean generic QOM introspection?
> > > > > > > > > > 
> > > > > > > > > > Using QOM we could have '/cpus' container and create QOM 
> > > > > > > > > > links
> > > > > > > > > > for exiting (populated links) and possible (empty links) 
> > > > > > > > > > CPUs.
> > > > > > > > > > However in that case link's name will need have a special 
> > > > > > > > > > format
> > > > > > > > > > that will convey an information necessary for mgmt to 
> > > > > > > > > > hotplug
> > > > > > > > > > a CPU object, at least:
> > > > > > > > > >   - where: [node],[socket],[core],[thread] options
> > > > > > > > > >   - optionally what CPU object to use with device_add 
> > > > > > > > > > command          
> > > > > > > > > 
> > > > > > > > > Hmm.. is it not enough to follow the link and get the topology
> > > > > > > > > information by examining the target?        
> > > > > > > > One can't follow a link if it's an empty one, hence
> > > > > > > > CPU placement information should be provided somehow,
> > > > > > > > either:        
> > > > > > > 
> > > > > > > Ah, right, so the issue is determining the socket/core/thread
> > > > > > > addresses that cpus which aren't yet present will have.
> > > > > > >       
> > > > > > > >  * by precreating cpu-package objects with properties that
> > > > > > > >    would describe it /could be inspected via OQM/        
> > > > > > > 
> > > > > > > So, we could do this, but I think the natural way would be to 
> > > > > > > have the
> > > > > > > information for each potential thread in the package.  Just 
> > > > > > > putting
> > > > > > > say "core number" in the package itself assumes more than I'd like
> > > > > > > about how packages sit in the heirarchy.  Plus, it means that
> > > > > > > management has a bunch of cases to deal with: package has all the
> > > > > > > information, package has just a core id, package has just a 
> > > > > > > socket id,
> > > > > > > and so forth.
> > > > > > > 
> > > > > > > It is a but clunky that when the package is plugged, this 
> > > > > > > information
> > > > > > > will have to sit parallel to the array of actual thread links.
> > > > > > >
> > > > > > > Markus or Andreas is there a natural way to present a list of 
> > > > > > > (node,
> > > > > > > socket, core, thread) tuples in the package object?  Preferably
> > > > > > > without having to create a whole bunch of "potential thread" 
> > > > > > > objects
> > > > > > > just for the purpose.      
> > > > > > I'm sorry but I couldn't parse above 2 paragraphs. The way I see
> > > > > > whatever placement info QEMU will provide to mgmt, mgmt will have
> > > > > > to deal with it in one way or another.
> > > > > > Perhaps rephrasing and adding some examples might help to explain
> > > > > > suggestion a bit better?      
> > > > > 
> > > > > Ok, so what I'm saying is that I think describing a location for the
> > > > > package itself could be problematic.  For some cases it will be ok,
> > > > > but depending on exactly what the package represents on a particular
> > > > > platform there could be a lot of options for how to represent it.
> > > > > 
> > > > > What I'm suggesting instead is that instead of giving a location for
> > > > > itself, the package lists the locations of all the threads it will
> > > > > contain when it is enabled/present/whatever.  Those locations can be
> > > > > given as node/socket/core/thread tuples - which are properties that
> > > > > cpu threads already need to have, so we're not making the possible
> > > > > inadequacy of that information any worse than it already was.
> > > > > 
> > > > > Examples.. so I'm not really sure how to write QOM objects, but I hope
> > > > > this is clear enough:
> > > > > 
> > > > > On x86
> > > > >       .../cpu-package[0]              (type 'acpi-thread')
> > > > >              present = true
> > > > >              location[0] = (node 0, socket 0, core 0, thread 0)
> > > > >              thread[0] = <link to cpu thread object>
> > > > >       .../cpu-package[1]              (type 'acpi-thread')
> > > > >              present = false
> > > > >              location[0] = (node 0, socket 0, core 0, thread 1)
> > > > > 
> > > > > On Power
> > > > >       .../cpu-package[0]              (type 'spapr-core')
> > > > >              present = true
> > > > >              location[0] = (node 0, socket 0, core 0, thread 0)
> > > > >              location[1] = (node 0, socket 0, core 0, thread 1)
> > > > >              ...
> > > > >              location[7] = (node 0, socket 0, core 0, thread 7)
> > > > >              thread[0] = <link...>
> > > > >              ...
> > > > >              thread[7] = >link...>
> > > > >       .../cpu-package[1]              (type 'spapr-core')
> > > > >              present = false
> > > > >              location[0] = (node 0, socket 0, core 0, thread 0)
> > > > >              location[1] = (node 0, socket 0, core 0, thread 1)
> > > > >              ...
> > > > >              location[7] = (node 0, socket 0, core 0, thread 7)
> > > > > 
> > > > > Does that make sense?
> > > > >     
> > > > > > > > or
> > > > > > > >  * via QMP/HMP command that would provide the same information
> > > > > > > >    only without need to precreate anything. The only difference
> > > > > > > >    is that it allows to use -device/device_add for new CPUs.    
> > > > > > > >     
> > > > > > > 
> > > > > > > I'd be ok with that option as well.  I'd be thinking it would be
> > > > > > > implemented via a class method on the package object which 
> > > > > > > returns the
> > > > > > > addresses that its contained threads will have, whether or not 
> > > > > > > they're
> > > > > > > present right now.  Does that make sense?      
> > > > > > In this RFC it's MachineClass.possible_cpus method which is a bit 
> > > > > > more
> > > > > > flexible as it allows a board to describe possible CPU devices 
> > > > > > (whatever
> > > > > > they might be: sockets|cores|threads|some_chip_module) and their 
> > > > > > properties
> > > > > > without forcing board to precreate cpu_package objects which should 
> > > > > > convey
> > > > > > the same info one way or another.      
> > > > > 
> > > > > Hmm.. so my RFC so far (at least the revised version based on
> > > > > Eduardo's comments) is that the cpu_package objects are always
> > > > > precreated.  In future we might allow dynamic construction, but that
> > > > > will require a bunch more thinking to designt the right interfaces.
> > > > >     
> > > > > > > > Considering that we would need to create HMP command so user 
> > > > > > > > could
> > > > > > > > inspect possible CPUs from monitor, it would need to do the 
> > > > > > > > same as
> > > > > > > > QMP command regardless of whether it's cpu-package objects or
> > > > > > > > just board calculated info a runtime.
> > > > > > > >          
> > > > > > > > > In the design Eduardo and I have been discussing we're 
> > > > > > > > > actually not
> > > > > > > > > planning to allow device_add to construct CPU packages - at 
> > > > > > > > > least, not
> > > > > > > > > for the time being.  The idea is that the machine type will 
> > > > > > > > > construct
> > > > > > > > > enough packages for maxcpus, and management just toggles them 
> > > > > > > > > on and
> > > > > > > > > off.        
> > > > > > > > Another question is how it would work wrt migration?        
> > > > > > > 
> > > > > > > I'm assuming the "present" bits would be added to the migration
> > > > > > > stream; seems straightforward enough to me.  Is there some
> > > > > > > consideration I'm missing?      
> > > > > > It's hard to estimate how cpu-package objects might complicate
> > > > > > migration. It should not break migration for old machine types
> > > > > > and if possible it should work for backwards migration to older
> > > > > > QEMU versions (to be downstream friendly).      
> > > > > 
> > > > > So, the simple way to achieve that is to only instantiate the
> > > > > cpu-package objects on newer machine types.  Older machine types will
> > > > > instatiate the cpu threads directly from the machine type in the old
> > > > > way, and (except for x86) won't allow cpu hotplug.
> > > > > 
> > > > > I think that's a reasonable first approach.  Later we can look at
> > > > > migrating a non-package setup to a package setup, if it looks like
> > > > > that will be useful.
> > > > >     
> > > > > > If we go typical '-device/device_add 
> > > > > > whatever_cpu_device,foo_options_list'
> > > > > > route then it would allow us to replicate older device models 
> > > > > > without
> > > > > > issues (I don't expect any in x86 case) as it's what CPUs are now 
> > > > > > under the hood.
> > > > > > This RFC doesn't force us to re-factor device models in order to use
> > > > > > hotplug (where CPU objects are already self-sufficient 
> > > > > > devices/hotplug capable).
> > > > > > 
> > > > > > It rather tries completely split interface aspect from how we are
> > > > > > internally model CPU hotplug, and tries to solve issue with
> > > > > > 
> > > > > >  -device/device_add for which we need to provide
> > > > > >    'what type to plug' and 'where to plug, which options to set to 
> > > > > > what'
> > > > > > 
> > > > > > It's 1st level per you proposal, later we can do 2nd level on top 
> > > > > > of it
> > > > > > using cpu-packages(flip present property) to simplify mgmt's job
> > > > > > if it still would really needed (i.e. mgmt won't be able to cope 
> > > > > > with
> > > > > > -device, which it already has support for).      
> > > > > 
> > > > > Yeah.. so the thing is, in the short term I'm really more interested
> > > > > in the 2nd layer interface.  It's something we can actually use,
> > > > > whereas the 1st layer interfaace still has a lot of potential
> > > > > complications.    
> > > > What complications do you see from POWER point if view?    
> > > 
> > > I don't relaly see any complications specific to Power.  But the
> > > biggest issue, as far as I can tell is how do we advertise to the user
> > > / management layer what sorts of CPUs can be hotplugged - how many,
> > > what types are possible and so forth.  The constraints here could in
> > > theory be pretty complex.  
> > that's what query-hotpluggable-cpus does, but not for theoretical
> > set of platforms but rather a practical set that we a wanting
> > CPU hotplug for.
> >  i.e. board returns a fixed board layout describing what cpu types
> >  could be hotplugged and where at in terms of [socket/core/thread]
> >  tuples, which maps well to current targets which need CPU hotplug
> >  (power/s390/x86/ARM).
> > 
> > The rest of interface (i.e.) add/remove actions are handled by
> > reused -device/device_add - that mgmt has already support for and
> > works pretty well for migration as well
> > (no need to maintain machine version-ed compat glue is plus).
> > 
> > So any suggestions how to improve layout description returned
> > by query-hotpluggable-cpus command are welcome.
> > Even if we end up using QOM interface, suggestions will still
> > be useful as the other interface will need to convey the same info
> > just via other means.  
> 
> Yeah, as I mentioned elsewhere, I'm starting to come around to this
> basic approach, although I'm still a bit dubious about the specific
> format suggested.  I don't have specific suggestions to improve it
> yet, but I'm working on it :).
> 
> 
> > > > > This is why Eduardo suggested - and I agreed - that it's probably
> > > > > better to implement the "1st layer" as an internal structure/interface
> > > > > only, and implement the 2nd layer on top of that.  When/if we need to
> > > > > we can revisit a user-accessible interface to the 1st layer.    
> > > > We are going around QOM based CPU introspecting interface for
> > > > years now and that's exactly what 2nd layer is, just another
> > > > implementation. I've just lost hope in this approach.
> > > > 
> > > > What I'm suggesting in this RFC is to forget controversial
> > > > QOM approach for now and use -device/device_add + QMP introspection,
> > > > i.e. completely split interface from how boards internally implement
> > > > CPU hotplug.    
> > > 
> > > I can see the appeal of that approach at this juncture.  Hmm..  
> > A lot of work has been done to make CPUs device_add compatible.  
> 
> So... it's been much discussed, but I'm still pretty unclear on how
> the device_add interface is supposed to work; at least in the context
> of non thread-granularity hotplug.
> 
> Basically, is it acceptable for:
>       device_add vendor-model-cpu-core
> 
> to create, in addition to the core device, a bunch of additional
> devices (the individual threads), or is that the "object mutating its
> own topology" that Andreas objects to violently?
I think it's acceptable to have vendor-model-cpu-core device
considering it's platform limitation or socket if device model calls for it.
I'm not sure that mutating applies to all objects but for Device
inherited classes there shouldn't be any.
i.e.
 1. create Device with instance_init - constructor that shouldn't fail ever
 2. set properties -
      done by -device/device_add and also by device_post_init() for globals
 3. set 'realize' property to ON - allowed to fail, completes device 
initialization
    realize() hook must validate set earlier properties if it hasn't been
    done earlier and complete all child objects initialization,
    children are should be at 'realized' state when parent's realize()
    hook finishes without error. No further children are allowed to be
    created and not properties are allowed to be set after Device is realized.
 4. Once realize() hook is executed, Device core code calls
    plug hook if it supported hotplug_handler_plug() which usually
    does the job of wiring Device to board. For more details see
    device_set_realized().

On top of that Andreas would like that children weren't dynamically
allocated but embedded into parent, included in parent's
instance_size if possible i.e. children count is known at
instance_init() time.

> If that is acceptable, where exactly should it be done?  In the
> device's instance_init? in realize? somewhere else?
Not sure what question is about, does above answer it?
 
> > The missing piece is letting mgmt to know what CPUs and with
> > which options could be plugged in.  
> 
> Well, that's *a* missing piece, certainly..
> 
> > And adding a query-hotpluggable-cpus QMP command looks like
> > a path of the least resistance that would work for power/s390/x86/ARM.
> >   
>
[Prev in Thread]
Current Thread
[Next in Thread]
Re: [Qemu-devel] [RFC] QMP: add query-hotpluggable-cpus, (continued)
Prev by Date: [Qemu-devel] [PATCH] MAINTAINERS: Add entry for include/sysemu/kvm*.h
Next by Date: [Qemu-devel] [PATCH] MAINTAINERS: Add an entry for the include/sysemu/rng*.h files
Previous by thread: Re: [Qemu-devel] [RFC] QMP: add query-hotpluggable-cpus
Next by thread: Re: [Qemu-devel] [RFC] QMP: add query-hotpluggable-cpus
Index(es):
- Date
- Thread