Re: [Qemu-devel] [PATCH v2 2.1 1/3] blockjob: Fix recent BLOCK_JOB_READY regression


From: Markus Armbruster
Subject: Re: [Qemu-devel] [PATCH v2 2.1 1/3] blockjob: Fix recent BLOCK_JOB_READY regression
Date: Wed, 02 Jul 2014 08:55:49 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux)

Paolo Bonzini <address@hidden> writes:

> On 01/07/2014 19:08, Eric Blake wrote:
>> On 06/27/2014 11:24 AM, Markus Armbruster wrote:
>>> Commit bcada37 dropped the (up to now undocumented) members type, len,
>>> offset, speed, breaking tests/qemu-iotests/040 and 041.
>>>
>>> Restore and document them.  This fixes 040, and partially fixes 041.
>>>
>>> Signed-off-by: Markus Armbruster <address@hidden>
>>> Tested-By: Benoit Canet <address@hidden>
>>> ---
>>>  blockjob.c           |  6 +++++-
>>>  qapi/block-core.json | 15 ++++++++++++++-
>>>  2 files changed, 19 insertions(+), 2 deletions(-)
>>
>> Nothing wrong with this commit, but a design issue that I've recently
>> run into:
>>
>> what happens if management misses the BLOCK_JOB_COMPLETED event?  How is
>> it supposed to learn whether the job succeeded or failed?
>> 'query-block-jobs' no longer reports the job (because it is completed),
>> so all information about the job is lost.  Normally, we've tried hard to
>> make sure that all information learned from an event can also be polled

Yes.  Every time we neglect that, we find out it's a design bug later.

We should review all events for pollability, and add "how to poll"
information to their documentation.  Then enforce presence of "how to
poll" information in review.
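
To make the gap concrete, here is a rough sketch of what a management
application ends up doing.  It is only an illustration: JobTracker, its
methods and the state names are made up for this example and are not
libvirt or QEMU code.  The one thing it takes from QMP is the shape of
the data: BLOCK_JOB_COMPLETED carries an "error" member only on failure,
and the job list (as returned by query-block-jobs) contains only jobs
that still exist.

```python
from dataclasses import dataclass, field


@dataclass
class JobTracker:
    """Hypothetical management-side view of block jobs, keyed by device."""
    status: dict = field(default_factory=dict)

    def on_event(self, event: dict) -> None:
        # Normal path: state changes arrive as events.
        name, data = event["event"], event["data"]
        if name == "BLOCK_JOB_READY":
            self.status[data["device"]] = "ready"
        elif name == "BLOCK_JOB_COMPLETED":
            failed = "error" in data
            self.status[data["device"]] = "failed" if failed else "completed"

    def resync(self, polled_jobs: list) -> None:
        # Recovery path: events may have been lost (for example across a
        # restart), so rebuild the state from a poll of the job list.
        still_there = {job["device"] for job in polled_jobs}
        for device, state in self.status.items():
            if state in ("running", "ready") and device not in still_there:
                # The job is gone from the list, and the list cannot say
                # whether it succeeded or failed.
                self.status[device] = "unknown"
        for device in still_there:
            self.status.setdefault(device, "running")
```

A job that was "ready" before the events were lost and is absent from
the polled list ends up as "unknown" after resync().  That is exactly
the information an event without a poll counterpart loses.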

>> (the ideal is use of events to minimize cpu overhead, but to rely on the
>> poll in situations where events may have been lost such as on a libvirtd
>> restart).
>>
>> Should we enhance job failure to be sticky, in that it not only causes
>> an event, but also remains around so that it can be reported in the next
>> 'query-block-jobs'?
>
> I think this fixes itself automatically if you use
> rerror=stop/werror=stop on block jobs.  At least that was part of the
> design; whether the implementation gets it right I cannot say without
> looking at the code more carefully.

What if an underlying device doesn't support [rw]error=stop?  Not all
do...
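
For devices where stopping on error is not an option, the "sticky"
result suggested above could look roughly like the toy model below.
Everything in it is an assumption made for illustration: JobTable, the
status strings and in particular the dismiss() step are invented here
and do not describe existing QEMU behaviour.  The idea is just that a
finished job stays in the query result until the client says it has
seen it.

```python
class JobTable:
    """Toy model of a job list whose finished entries stay queryable."""

    def __init__(self):
        self._jobs = {}

    def start(self, device, job_type):
        self._jobs[device] = {
            "device": device,
            "type": job_type,
            "status": "running",
            "error": None,
        }

    def finish(self, device, error=None):
        # Record the outcome instead of dropping the entry.
        job = self._jobs[device]
        job["status"] = "failed" if error else "completed"
        job["error"] = error

    def query(self):
        # Finished jobs are still reported, so a client that missed the
        # completion event can learn the outcome by polling.
        return [dict(job) for job in self._jobs.values()]

    def dismiss(self, device):
        # Hypothetical acknowledgement step: only now is the entry dropped.
        if self._jobs[device]["status"] == "running":
            raise ValueError("job still running")
        del self._jobs[device]
```

With this shape, a client that reconnects after finish() still sees the
failed job and its error in query(), and the entry only goes away once
dismiss() is called.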


