qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2 1/2] qapi/machine.json: add DEVICE_UNPLUG_ERROR QAPI event


From: Markus Armbruster
Subject: Re: [PATCH v2 1/2] qapi/machine.json: add DEVICE_UNPLUG_ERROR QAPI event
Date: Thu, 17 Jun 2021 07:59:07 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux)

Daniel Henrique Barboza <danielhb413@gmail.com> writes:

> On 6/11/21 9:12 AM, Markus Armbruster wrote:
>> Daniel Henrique Barboza <danielhb413@gmail.com> writes:
>> 
>>> At this moment we only provide one event to report a hotunplug error,
>>> MEM_UNPLUG_ERROR. As of Linux kernel 5.12 and QEMU 6.0.0, the pseries
>>> machine is now able to report unplug errors for other device types, such
>>> as CPUs.
>>>
>>> Instead of creating a (device_type)_UNPLUG_ERROR for each new device,
>>> create a generic DEVICE_UNPLUG_ERROR event that can be used by all
>>> unplug errors in the future.
>>>
>>> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
>>> ---
>>>   qapi/machine.json | 23 +++++++++++++++++++++++
>>>   1 file changed, 23 insertions(+)
>>>
>>> diff --git a/qapi/machine.json b/qapi/machine.json
>>> index 58a9c86b36..f0c7e56be0 100644
>>> --- a/qapi/machine.json
>>> +++ b/qapi/machine.json
>>> @@ -1274,3 +1274,26 @@
>>>   ##
>>>   { 'event': 'MEM_UNPLUG_ERROR',
>>>     'data': { 'device': 'str', 'msg': 'str' } }
>>> +
>>> +##
>>> +# @DEVICE_UNPLUG_ERROR:
>>> +#
>>> +# Emitted when a device hot unplug error occurs.
>>> +#
>>> +# @device: device name
>>> +#
>>> +# @msg: Informative message
>>> +#
>>> +# Since: 6.1
>>> +#
>>> +# Example:
>>> +#
>>> +# <- { "event": "DEVICE_UNPLUG_ERROR"
>>> +#      "data": { "device": "dimm1",
>>> +#                "msg": "Memory hotunplug rejected by the guest for device 
>>> dimm1"
>>> +#      },
>>> +#      "timestamp": { "seconds": 1615570772, "microseconds": 202844 } }
>>> +#
>>> +##
>>> +{ 'event': 'DEVICE_UNPLUG_ERROR',
>>> +  'data': { 'device': 'str', 'msg': 'str' } }
>> 
>> Missing: update of device_add's doc comment in qdev.json:
>> 
>>      # Notes: When this command completes, the device may not be removed 
>> from the
>>      #        guest.  Hot removal is an operation that requires guest 
>> cooperation.
>>      #        This command merely requests that the guest begin the hot 
>> removal
>>      #        process.  Completion of the device removal process is signaled 
>> with a
>>      #        DEVICE_DELETED event. Guest reset will automatically complete 
>> removal
>>      #        for all devices.
>
> Ok
>
>> 
>> This sure could use some polish.
>> 
>> If I understand things correctly, we're aiming for the following device
>> unplug protocol:
>
> One thing to note is that DEVICE_UNPLUG_ERROR isn't guaranteed to be send for
> every hotunplug error. The event depends on machine/architecture support to
> detect a guest side error.

Yes.  I tried to provide for that in my description of the protocol.

>> 
>>     Unplug the device with device_del (or possibly equivalent)
>> 
>>     If we know we can't unplug the device, fail immediately.  Also emit
>>     DEVICE_UNPLUG_ERROR.
>
>
> I haven't predicted to use this event in those cases as well, although it
> seems reasonable to do so now that you mentioned it.

I think this is a matter of taste.  For what it's worth, we do emit
DEVICE_DELETED on immediate success (see next paragraph).

Emitting DEVICE_DELETED always lets QMP clients track device deletion
even when it's triggered by something else.  Feature.

No such tracking is needed for unplug failure.

>>     If possible, unplug the device synchronously and succeed.  Also emit
>>     DEVICE_DELETED.
>> 
>>     Else, initiate unplug and succeed.
>> 
>>     When unplug finishes, emit either DEVICE_DELETED or
>>     DEVICE_UNPLUG_ERROR.
>
> Since there's no 100% guarantee that DEVICE_UNPLUG_ERROR will be emitted for
> guest side errors, the wording here would be
>
> "When unplug finishes, emit DEVICE_DELETED. A DEVICE_UNPLUG_ERROR can be
> emitted if a guest side error was detected"

My assumption is that QEMU can detect asynchronous success reliably, but
not asynchronous failure.

>From QEMU's point of view, "asynchronous unplug has not finished" is
indistinguishable from "asynchronous unplug finished unsuccessfully, but
we can't detect it".  So I lumped these two cases together in the next
paragraph:

>>     For some machines and devices, unplug may never finish.

My goal is a clear description of the state machine as it can be
observed in QMP.

>> Correct?
>> 
>> Any particular reason for not putting event DEVICE_UNPLUG_ERROR next to
>> DEVICE_DELETED in qdev.json?
>
>
> Not really. I looked where MEM_UNPLUG_ERROR was declared and put it right
> after it. I can change it to qdev.json near DEVICE_DELETED.

Yes, please.




reply via email to

[Prev in Thread] Current Thread [Next in Thread]