qemu-devel

Re: [RFC 00/52] Introduce hybrid CPU topology


From: Daniel P . Berrangé
Subject: Re: [RFC 00/52] Introduce hybrid CPU topology
Date: Mon, 13 Feb 2023 13:38:28 +0000
User-agent: Mutt/2.2.9 (2022-11-12)

On Mon, Feb 13, 2023 at 05:49:43PM +0800, Zhao Liu wrote:
> From: Zhao Liu <zhao1.liu@intel.com>
> ## 3.3. "-hybrid" command
> 
> For hybrid cpu topology configuration, the original "-smp" lacks the
> flexibility to expand, and is unable to customize different cores.
> 
> So we introduce the "-hybrid" command and implement it as a multi-line
> command. The multi-line command format is more complex than the
> single-line one, but it brings better scalability and
> intuitiveness. In the future, it can also be easily extended to more
> heterogeneous topology levels.
> 
> "-hybrid" command is as follows:
> 
> -hybrid socket,sockets=n
> -hybrid die,dies=n
> -hybrid cluster,clusters=n
> -hybrid core,cores=n,type=core_type[,threads=threads]
>         [,clusterid=cluster]
> 
> Here, we first define the corresponding qapi options for these 4
> topology levels: core, cluster, die and socket.
> 
> We don't need a thread level since threads don't have different
> types.
> 
> For example:
> 
> -hybrid socket,sockets=1
> -hybrid die,dies=1
> -hybrid cluster,clusters=4
> -hybrid core,cores=1,coretype="core",threads=2,clusterid=0-2
> -hybrid core,cores=4,coretype="atom",threads=1
> 
> Here we can build a hybrid cpu topology, which has 1 socket, 1 die per
> socket, 4 clusters per die. And in each die, every cluster has 4 "atom"
> cores with 1 thread each, and cluster0, cluster1 and cluster2 each have
> 1 "core" core with 2 threads.

How will this interact with the -cpu parameter? Presumably we now
need two distinct -cpu parameters, one for the 'core' CPU model and
one for the 'atom' CPU model?

> Please note this example is not an actual machine topology, but it shows
> the powerful flexibility of "hybrid" command.

IIUC the functionality offered by -hybrid should be a superset
of the -smp functionality. IOW, it ought to be possible to
re-implement -smp as an alias for -hybrid, such that internally
code only ever has to deal with the modern approach. Having to
keep support for both -smp and -hybrid throughout the code is
undesirable IMHO. Keeping the compat at the CLI parsing level
limits the burden.
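
As a rough illustration of keeping the compat at the CLI parsing level, the
sketch below rewrites a homogeneous -smp string into the equivalent -hybrid
lines. It is only a sketch under stated assumptions: the function name
smp_to_hybrid is hypothetical, missing levels simply default to 1 (real -smp
parsing also derives missing values from the CPU count), and the "coretype"
key and its default value follow the quoted example rather than any settled
syntax.

```python
def smp_to_hybrid(smp, coretype="core"):
    """Rewrite an -smp property string as a list of -hybrid arguments.

    Hypothetical helper: handles only explicit sockets/dies/clusters/
    cores/threads properties and defaults any missing level to 1.
    """
    levels = {"sockets": 1, "dies": 1, "clusters": 1, "cores": 1, "threads": 1}
    for prop in smp.split(","):
        key, sep, value = prop.partition("=")
        if not sep or key not in levels:
            raise ValueError("unsupported -smp property in this sketch: %r" % prop)
        levels[key] = int(value)
    return [
        "-hybrid socket,sockets={}".format(levels["sockets"]),
        "-hybrid die,dies={}".format(levels["dies"]),
        "-hybrid cluster,clusters={}".format(levels["clusters"]),
        # Assumption: a homogeneous machine uses one core type everywhere.
        '-hybrid core,cores={},coretype="{}",threads={}'.format(
            levels["cores"], coretype, levels["threads"]
        ),
    ]


if __name__ == "__main__":
    for line in smp_to_hybrid("sockets=1,dies=1,clusters=4,cores=4,threads=1", "atom"):
        print(line)
```

With a translation of this kind done once at parse time, the rest of the code
would only ever see the -hybrid form.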


As a more general thought, rather than introducing a new top level
command line argument -hybrid, I'm thinking we should possibly just
define this all using QOM and thus re-use the existing -object
argument. 

I'm also finding the above example command lines quite difficult
to understand, as there is a lot of implicit linkage and expansion
between the different levels. With devices we're much more
explicit about the parent/child relationships, and have to
express everything with no automatic expansion, linking it all
together via the id=/bus= properties.  This is quite a bit more
verbose, but it is also very effective at letting us express
arbitrarily complex relationships.

I think it would be worth exploring that approach for the CPU
topology expression too.

If we followed the more explicit device approach to modelling
then instead of:

 -cpu core,...
 -cpu atom,...
 -hybrid socket,sockets=1
 -hybrid die,dies=1
 -hybrid cluster,clusters=4
 -hybrid core,cores=1,coretype="core",threads=2,clusterid=0-2
 -hybrid core,cores=4,coretype="atom",threads=1

we would end up with something like

  -object cpu-socket,id=sock0
  -object cpu-die,id=die0,parent=sock0
  -object cpu-cluster,id=cluster0,parent=die0
  -object cpu-cluster,id=cluster1,parent=die0
  -object cpu-cluster,id=cluster2,parent=die0
  -object cpu-cluster,id=cluster3,parent=die0
  -object x86-cpu-model-atom,id=cpu0,parent=cluster0
  -object x86-cpu-model-atom,id=cpu1,parent=cluster0
  -object x86-cpu-model-atom,id=cpu2,parent=cluster0
  -object x86-cpu-model-atom,id=cpu3,parent=cluster0
  -object x86-cpu-model-core,id=cpu4,parent=cluster0,threads=2
  -object x86-cpu-model-atom,id=cpu5,parent=cluster1
  -object x86-cpu-model-atom,id=cpu6,parent=cluster1
  -object x86-cpu-model-atom,id=cpu7,parent=cluster1
  -object x86-cpu-model-atom,id=cpu8,parent=cluster1
  -object x86-cpu-model-core,id=cpu9,parent=cluster1,threads=2
  -object x86-cpu-model-atom,id=cpu10,parent=cluster2
  -object x86-cpu-model-atom,id=cpu11,parent=cluster2
  -object x86-cpu-model-atom,id=cpu12,parent=cluster2
  -object x86-cpu-model-atom,id=cpu13,parent=cluster2
  -object x86-cpu-model-core,id=cpu14,parent=cluster2,threads=2
  -object x86-cpu-model-atom,id=cpu15,parent=cluster3
  -object x86-cpu-model-atom,id=cpu16,parent=cluster3
  -object x86-cpu-model-atom,id=cpu17,parent=cluster3
  -object x86-cpu-model-atom,id=cpu18,parent=cluster3
  -object x86-cpu-model-core,id=cpu19,parent=cluster3,threads=2

The really obvious downside is that it is much more verbose.

This example only has 20 CPUs. For a VM with say 1000 CPUs
this will be very big, but that doesn't necessarily make it
wrong.
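
A listing of this size would presumably be generated rather than typed by
hand. The following is a minimal sketch of such a generator, reproducing the
26 lines above. The function name explicit_topology and its parameters are
made up for illustration, and the object type names (cpu-socket, cpu-die,
cpu-cluster, x86-cpu-model-atom, x86-cpu-model-core) are the proposed ones
from this mail, not existing QOM types.

```python
def explicit_topology(clusters=4, atoms_per_cluster=4, cores_per_cluster=1):
    """Return the explicit -object arguments for one socket with one die."""
    args = [
        "-object cpu-socket,id=sock0",
        "-object cpu-die,id=die0,parent=sock0",
    ]
    # Every cluster hangs off the single die.
    for c in range(clusters):
        args.append("-object cpu-cluster,id=cluster{},parent=die0".format(c))
    cpu = 0  # running CPU index, so that every id= is unique
    for c in range(clusters):
        for _ in range(atoms_per_cluster):
            args.append(
                "-object x86-cpu-model-atom,id=cpu{},parent=cluster{}".format(cpu, c)
            )
            cpu += 1
        for _ in range(cores_per_cluster):
            args.append(
                "-object x86-cpu-model-core,id=cpu{},parent=cluster{},threads=2".format(cpu, c)
            )
            cpu += 1
    return args


if __name__ == "__main__":
    print("\n".join(explicit_topology()))
```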

On the flipside

 * It is really clear exactly how many CPUs I've added

 * The relationship between the topology levels is clear

 * Every CPU is given a unique ID that can be used in
   later QMP commands

 * Whether or not 'threads' are permitted is now a property
   of the specific CPU model implementation, not the global
   config. IOW we can express that some CPU models allow
   for threads, and some don't.

 * The -cpu arg is also obsoleted, replaced by the
   -object x86-cpu-model-core. This might facilitate the
   modelling of machines with CPUs from different architectures.


We could potentially compress the leaf node level by expressing
how many instances of an object we want, i.e. define
a more convenient shorthand syntax for creating many instances of
an object. So e.g.

  -object-set $TYPE,$PROPS,idbase=foo,count=4

would be functionally identical to

  -object $TYPE,$PROPS,id=foo.0
  -object $TYPE,$PROPS,id=foo.1
  -object $TYPE,$PROPS,id=foo.2
  -object $TYPE,$PROPS,id=foo.3

QEMU just expands it and creates all the objects internally.
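
The expansion itself is mechanical. A minimal sketch of it in Python, assuming
the proposed "idbase" and "count" keys and a made-up helper name
expand_object_set (neither -object-set nor this helper exists in QEMU today):

```python
def expand_object_set(arg):
    """Expand '$TYPE,$PROPS,idbase=foo,count=N' into N '-object' arguments."""
    type_name, *props = arg.split(",")
    kept = []        # properties copied unchanged onto every instance
    idbase = None
    count = None
    for prop in props:
        key, _, value = prop.partition("=")
        if key == "idbase":
            idbase = value
        elif key == "count":
            count = int(value)
        else:
            kept.append(prop)
    if idbase is None or count is None or count < 1:
        raise ValueError("object-set needs idbase= and a positive count=")
    # Instance i gets the id '<idbase>.<i>', matching the listing above.
    return [
        "-object " + ",".join([type_name] + kept + ["id={}.{}".format(idbase, i)])
        for i in range(count)
    ]


if __name__ == "__main__":
    for line in expand_object_set("x86-cpu-core-atom,parent=cluster0,idbase=cpu0,count=4"):
        print(line)
```

Keeping the expansion as a pure rewrite step means everything after CLI
parsing only ever sees plain -object arguments.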

So the huge example I have above for 20 cpus would become much
shorter: e.g.

  -object cpu-socket,id=sock0
  -object cpu-die,id=die0,parent=sock0
  -object cpu-cluster,id=cluster0,parent=die0
  -object cpu-cluster,id=cluster1,parent=die0
  -object cpu-cluster,id=cluster2,parent=die0
  -object cpu-cluster,id=cluster3,parent=die0
  -object-set x86-cpu-core-atom,idbase=cpu0,parent=cluster0,count=4
  -object-set x86-cpu-core-core,idbase=cpu1,parent=cluster0,threads=2,count=1
  -object-set x86-cpu-core-atom,idbase=cpu2,parent=cluster1,count=4
  -object-set x86-cpu-core-core,idbase=cpu3,parent=cluster1,threads=2,count=1
  -object-set x86-cpu-core-atom,idbase=cpu4,parent=cluster2,count=4
  -object-set x86-cpu-core-core,idbase=cpu5,parent=cluster2,threads=2,count=1
  -object-set x86-cpu-core-atom,idbase=cpu6,parent=cluster3,count=4
  -object-set x86-cpu-core-core,idbase=cpu7,parent=cluster3,threads=2,count=1

IOW, the size of the CLI config depends only on the number of elements
in the hierarchy, and is independent of the number of leaf CPU cores.

Obviously in describing all of the above, I've ignored any complexity
of dealing with our existing code implementation and pain of getting
it converted to the new model.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



