qemu-ppc
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH for-6.2 v6 6/7] spapr: use DEVICE_UNPLUG_ERROR to report unpl


From: Markus Armbruster
Subject: Re: [PATCH for-6.2 v6 6/7] spapr: use DEVICE_UNPLUG_ERROR to report unplug errors
Date: Mon, 23 Aug 2021 15:33:19 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux)

Daniel Henrique Barboza <danielhb413@gmail.com> writes:

> On 8/7/21 11:06 AM, Markus Armbruster wrote:
>> Daniel Henrique Barboza <danielhb413@gmail.com> writes:
>> 
>>> Linux Kernel 5.12 is now unisolating CPU DRCs in the device_removal
>>> error path, signalling that the hotunplug process wasn't successful.
>>> This allow us to send a DEVICE_UNPLUG_ERROR in drc_unisolate_logical()
>>> to signal this error to the management layer.
>>>
>>> We also have another error path in spapr_memory_unplug_rollback() for
>>> configured LMB DRCs. Kernels older than 5.13 will not unisolate the LMBs
>>> in the hotunplug error path, but it will reconfigure them. Let's send
>>> the DEVICE_UNPLUG_ERROR event in that code path as well to cover the
>>> case of older kernels.
>>>
>>> Reviewed-by: Greg Kurz <groug@kaod.org>
>>> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
>>> ---
>>>   hw/ppc/spapr.c     |  9 ++++++++-
>>>   hw/ppc/spapr_drc.c | 18 ++++++++++++------
>>>   2 files changed, 20 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>>> index 1611d7ab05..5459f9a7e9 100644
>>> --- a/hw/ppc/spapr.c
>>> +++ b/hw/ppc/spapr.c
>>> @@ -29,6 +29,7 @@
>>>   #include "qemu/datadir.h"
>>>   #include "qapi/error.h"
>>>   #include "qapi/qapi-events-machine.h"
>>> +#include "qapi/qapi-events-qdev.h"
>>>   #include "qapi/visitor.h"
>>>   #include "sysemu/sysemu.h"
>>>   #include "sysemu/hostmem.h"
>>> @@ -3686,13 +3687,19 @@ void spapr_memory_unplug_rollback(SpaprMachineState 
>>> *spapr, DeviceState *dev)
>>>         /*
>>>        * Tell QAPI that something happened and the memory
>>> -     * hotunplug wasn't successful.
>>> +     * hotunplug wasn't successful. Keep sending
>>> +     * MEM_UNPLUG_ERROR even while sending DEVICE_UNPLUG_ERROR
>>> +     * until the deprecation MEM_UNPLUG_ERROR is due.
>>>        */
>>>       if (dev->id) {
>>>           qapi_error = g_strdup_printf("Memory hotunplug rejected by the 
>>> guest "
>>>                                        "for device %s", dev->id);
>>>           qapi_event_send_mem_unplug_error(dev->id, qapi_error);
>>>       }
>>> +
>>> +    qapi_event_send_device_unplug_error(!!dev->id, dev->id,
>>> +                                        dev->canonical_path,
>>> +                                        qapi_error != NULL, qapi_error);
>>>   }
>>>   
>> When dev->id is null, we send something like
>>
>>      {"event": "DEVICE_UNPLUG_ERROR",
>>       "data": {"path": "/machine/..."},
>>       "timestamp": ...}
>>
>> Unless I'm missing something, this is all the information the management
>> application really needs.
>>
>> When dev->id is non-null, we add to "data":
>>
>>                "device": "dev123",
>>                "msg": "Memory hotunplug rejected by the guest for device 
>> dev123",
>>
>> I'm fine with emitting the device ID when we have it.
>>
>> What's the intended use of "msg"?
>>
>> Could DEVICE_UNPLUG_ERROR ever be emitted for this device with a
>> different "msg"?
>
>
> It won't have a different 'msg' for the current use of the event in both ppc64
> and x86. It'll always be the same '<dev> hotunplug rejected by the guest'
> message.
>
> The idea is that a future caller might want to insert a more informative
> message, such as "hotunplug failed: memory is being used by kernel space"
> or any other more specific condition. But then I guess we can argue that,
> if that time comes, one can just add this new optional 'msg' member in this
> event, and for now we can live without it.
>
> Would you oppose to renaming this new event to "DEVICE_UNPLUG_GUEST_ERROR"
> and then remove the 'msg' member? I guess this rename would make it clearer
> for management that we're reporting a guest side error, making any further
> clarifications via 'msg' unneeded.

No objection.




reply via email to

[Prev in Thread] Current Thread [Next in Thread]