qemu-arm
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-arm] [Qemu-devel] [PATCH v3] numa: improve cpu hotplug error m


From: Igor Mammedov
Subject: Re: [Qemu-arm] [Qemu-devel] [PATCH v3] numa: improve cpu hotplug error message with a wrong node-id
Date: Tue, 28 May 2019 15:59:35 +0200

On Mon, 27 May 2019 15:52:30 +0200
Laurent Vivier <address@hidden> wrote:

> On 27/05/2019 14:50, Igor Mammedov wrote:
> > On Mon, 27 May 2019 08:55:49 +0200
> > Laurent Vivier <address@hidden> wrote:
> >   
> >> On 24/05/2019 22:14, Eduardo Habkost wrote:  
> >>> On Fri, May 24, 2019 at 04:39:12PM +0200, Laurent Vivier wrote:  
> >>>> On 24/05/2019 16:10, Igor Mammedov wrote:  
> >>>>> On Fri, 24 May 2019 12:35:21 +0200
> >>>>> Laurent Vivier <address@hidden> wrote:
> >>>>>     
> >>>>>> On pseries, core-ids are strongly binded to a node-id by the command
> >>>>>> line option. If an user tries to add a CPU to the wrong node, he has
> >>>>>> an error but it is not really helpful:
> >>>>>>
> >>>>>>      qemu-system-ppc64 ... -smp 
> >>>>>> 1,maxcpus=64,cores=1,threads=1,sockets=1 \
> >>>>>>                            -numa node,nodeid=0 -numa node,nodeid=1 ...
> >>>>>>
> >>>>>>      (qemu) device_add power9_v2.0-spapr-cpu-core,core-id=30,node-id=1
> >>>>>>      Error: node-id=1 must match numa node specified with -numa option
> >>>>>>
> >>>>>> This patch improves this error message by giving to the user the good
> >>>>>> topology information (node-id, socket-id and thread-id if they are
> >>>>>> available) to use with the core-id he's providing:
> >>>>>>
> >>>>>>      Error: node-id=1 must match numa node specified with -numa option 
> >>>>>> 'node-id 0'
> >>>>>>
> >>>>>> Signed-off-by: Laurent Vivier <address@hidden>
> >>>>>> ---
> >>>>>>
> >>>>>> Notes:
> >>>>>>        v3: only add the topology to the existing message
> >>>>>>            As suggested by Igor replace
> >>>>>>              Error: core-id 30 can only be plugged into node-id 0
> >>>>>>            by
> >>>>>>              Error: node-id=1 must match numa node specified with 
> >>>>>> -numa option 'node-id 0'
> >>>>>>        v2: display full topology in the error message  
> >>>>>>>>>     numa.c | 25 ++++++++++++++++++++++++-  
> >>>>>>     1 file changed, 24 insertions(+), 1 deletion(-)
> >>>>>>
> >>>>>> diff --git a/numa.c b/numa.c
> >>>>>> index 3875e1efda3a..7882ec294be4 100644
> >>>>>> --- a/numa.c
> >>>>>> +++ b/numa.c
> >>>>>> @@ -458,6 +458,27 @@ void qmp_set_numa_node(NumaOptions *cmd, Error 
> >>>>>> **errp)
> >>>>>>         set_numa_options(MACHINE(qdev_get_machine()), cmd, errp);
> >>>>>>     }
> >>>>>> +static char *cpu_topology_to_string(const CPUArchId *cpu)
> >>>>>> +{
> >>>>>> +    GString *s = g_string_new(NULL);
> >>>>>> +    if (cpu->props.has_socket_id) {
> >>>>>> +        g_string_append_printf(s, "socket-id %"PRId64, 
> >>>>>> cpu->props.socket_id);
> >>>>>> +    }
> >>>>>> +    if (cpu->props.has_node_id) {
> >>>>>> +        if (s->len) {
> >>>>>> +            g_string_append_printf(s, ", ");
> >>>>>> +        }
> >>>>>> +        g_string_append_printf(s, "node-id %"PRId64, 
> >>>>>> cpu->props.node_id);
> >>>>>> +    }
> >>>>>> +    if (cpu->props.has_thread_id) {
> >>>>>> +        if (s->len) {
> >>>>>> +            g_string_append_printf(s, ", ");
> >>>>>> +        }
> >>>>>> +        g_string_append_printf(s, "thread-id %"PRId64, 
> >>>>>> cpu->props.thread_id);
> >>>>>> +    }
> >>>>>> +    return g_string_free(s, false);
> >>>>>> +}  
> >>>>>
> >>>>> turns out we already have such helper: cpu_slot_to_string()  
> >>>>
> >>>> It doesn't display the node-id but the core-id. And node-id is what we 
> >>>> need
> >>>> to know.  
> >>>
> >>> I'm confused about what you are trying to do here.
> >>>
> >>> On v1, the message looked like:
> >>>     Error: core-id 30 can only be plugged into node-id 0
> >>>
> >>> which is probably good for spapr.
> >>>
> >>>
> >>> Then I suggested you added the other cpu->props fields.  e.g. on
> >>> PC the message would look like:
> >>>     Error: socket-id 20, core-id 30, thread-id 40 can only be plugged 
> >>> into node-id 0
> >>>
> >>>
> >>> But you sent a v2 patch that would print this on PC:
> >>>     Error: core-id 30 can only be plugged into socket-id 20, node-id 0, 
> >>> thread-id 40
> >>>
> >>> which doesn't make sense to me.
> >>>
> >>>
> >>> Then in a reply to v2, Igor suggested:
> >>>
> >>>    error_setg(errp, "node-id=%d must match numa node specified "
> >>>                      "with -numa option '%s'", node_id, topology);
> >>>
> >>>
> >>> Igor suggest would address the problem above.  I expected it to become:
> >>>     node-id=0 must match numa node specified with -numa option core-id=30
> >>> and on PC:
> >>>     node-id=0 must match numa node specified with -numa option 
> >>> socket-id=20,core-id=30,thread-id=40
> >>>
> >>> Or maybe it could include the input node-id too:
> >>>     node-id=0 must match numa node specified with -numa option 
> >>> node-id=1,core-id=30
> >>> and on PC:
> >>>     node-id=0 must match numa node specified with -numa option 
> >>> node-id=1,socket-id=20,core-id=30,thread-id=40
> >>>
> >>> Both options would work.
> >>>
> >>>
> >>> But you implemented code that would print:
> >>>     Error: node-id=0 must match numa node specified with -numa option 
> >>> 'node-id 1'
> >>> and on PC it would print:
> >>>     Error: node-id=0 must match numa node specified with -numa option 
> >>> 'socket-id 20 node-id 1 thread-id=40'
> >>>
> >>> which doesn't make sense to me.
> >>>
> >>>
> >>> I was expecting something like:
> >>>     Error: CPU slot core-id=30 is bound to node-id 0, but node-id 1 was 
> >>> specified
> >>> and on PC:
> >>>     Error: CPU slot socket-id=20,core-id=30,thread-id=40 is bound to 
> >>> node-id 0, but node-id 1 was specified
> >>>
> >>>      
> >>
> >> The idea is to provide the information to the user to help him to know
> >> where the cpu can be plugged when it cannot on the node-id he originally
> >> provided.
> >>
> >> So all the solutions you propose sounds good to me.
> >>
> >> I only need you and Igor agree on the same one.  
> > 
> > We with Eduardo basically agree on contents/set of properties to print,
> > it is only different phrasing (Eduardo's suggestion is better than what we 
> > have now).
> > But lets get to what problem you are going to fix/improve. SO I've went 
> > ahead and tried
> > with following CLI:
> > 
> >     qemu-system-x86_64 -smp 1,maxcpus=4 -numa node,cpus=0-1 -numa 
> > node,cpus=2-3 -monitor stdio -device 
> > qemu64-x86_64-cpu,socket-id=1,core-id=0,thread-id=0,node-id=1
> > 
> > end it errored out with:
> > 
> >    qemu-system-x86_64: -device 
> > qemu64-x86_64-cpu,socket-id=1,core-id=0,thread-id=0,node-id=1: node-id=1 
> > must match numa node specified with -numa option
> > 
> > As you see we already have all user provide properties for cpu (including 
> > invalid ones) reported,
> > what we are missing is suggestion for valid node-id. How about following 
> > error message:
> > 
> > qemu-system-x86_64: -device 
> > qemu64-x86_64-cpu,socket-id=1,core-id=0,thread-id=0,node-id=1: invalid 
> > node-id, must be 0  
> 
> The case I'm worrying about is when the cpu is hotplugged: we don't have the 
> "-device ..." information.
> 
>   $ qemu-system-ppc64 -nodefaults -nographic -monitor stdio -m 1G -smp 
> 1,maxcpus=64,cores=1,threads=1,sockets=1 -numa node,nodeid=0 -numa 
> node,nodeid=1
>   QEMU 3.0.1 monitor - type 'help' for more information
>   (qemu) device_add power8_v2.0-spapr-cpu-core,core-id=30,node-id=1
>   node-id=1 must match numa node specified with -numa option
> 
> So you can see the needed information is missing.

device-add is synchronous command so user (monitor) has a invalid properties 
command right above error, similar thing applies to QMP where user can match 
command with reply.
Repeating device properties looks to me like unnecessary date duplication.

> 
> Thanks,
> Laurent




reply via email to

[Prev in Thread] Current Thread [Next in Thread]