qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH 13/47] block: introduce block job error


From: Paolo Bonzini
Subject: Re: [Qemu-devel] [PATCH 13/47] block: introduce block job error
Date: Wed, 01 Aug 2012 15:21:13 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:13.0) Gecko/20120615 Thunderbird/13.0.1

Il 01/08/2012 15:09, Kevin Wolf ha scritto:
>> It's not about source vs. target, it's about what block device the _job_
>> is attached too.  The source is always going to be special because you
>> must do "block-job-resume source".
> 
> This whole concept of a block job being attached to a single bs is
> flawed. Because in the general case it isn't. And probably it should
> have been a BackgroundJob rather than a BlockJob from the beginning,
> because I'm sure there are use cases outside the block layer as well.
> 
> We really manage to get every single point messed up. :-(

The balance between overengineering and messing up is unfortunately very
difficult to strike.  Yes, migration is a BackgroundJob in itself.  But
how far would we want to go specializing the monitor interface for
BackgroundJob subclasses, considering after all this is C?

If we had a tiny C core and everything above it written in {dynamic
language of the day} (I vote for Smalltalk, FYI) perhaps we could be
more cavalier in creating new classes and management abstractions, but
keeping it simple has its advantages.

No matter how much we messed things up, so far we decently managed to
refactor our way out (and screw up even more the next thing).

>> is_read tells you where the error was.
> 
> Rather indirect, but yes, for the mirror it works.
> 
>>> Also, if you miss the event (e.g. libvirt crashed), can you still tell
>>> which bs caused the error?
>>
>> No, but how is it different from "can you still tell which bs in the
>> chain caused the error"?  Which you cannot tell anyway even from the
>> event parameters we have had so far.
> 
> It isn't different. We really should report the exact BDS. In practice,
> management tools care about ENOSPC which is always the top-level BDS,
> and call the admin for help in other cases. The admin can try manually
> which file is at fault, but I suppose he would be very glad to be told
> by the error message which file it is.

Yes, and we can add it in the future as an additional argument to the
event and of query-{block,block-jobs} output.  At which point we are ok
with having the target iostatus in BlockJob.

>>> Do we need another BlockJobInfo field for the
>>> name (*cough* ;-)) of the failed bs?
>>>
>>> If I understand it right, error reporting on the source and on the
>>> target would be completely symmetrical then, which I think is a good thing.
>>
>> Yeah, it would, but job->bs _is_ special anyway.
>>
>>> job->bs makes one image special, which isn't really useful for a generic
>>> interface. So we should keep the fact that it exists internal and get
>>> rid of it sometime.
>>>
>>>> +    if (action == BDRV_ACTION_STOP) {
>>>> +        block_job_pause(job);
>>>> +        block_job_iostatus_set_err(job, error);
>>>> +        if (bs != job->bs) {
>>>> +            bdrv_iostatus_set_err(bs, error);
>>>> +        }
>>>> +    }
>>>>
>>>> where the bdrv_iostatus_set_err is mostly to "prepare for the future"
>>>> usage of named block devices.
>>>
>>> Again, I'd make it unconditional to get symmetric behaviour.
>>
>> Not possible, because existing clients may expect "iostatus:
>> {nospace,failed}" to imply a runstate of "not running (i/o error)".
> 
> Did we make such guarantees? Does libvirt actually make use of it?

I'm not sure libvirt relies on it, but I think it's a reasonable
expectation.

>>>> Hmm, but I'm a bit wary of introducing such a big change now.  We know
>>>> what it makes nicer, but we don't know of anything irremediably broken
>>>> without them, and we haven't thought enough of any warts it introduces.
>>>
>>> On the one hand, I can understand your concerns, but on the other hand,
>>> introducing an API now and then making such a big change afterwards
>>> scares me much more.
>>
>> One example of the doubts I have: is iostatus a BlockBackend or a
>> BlockDriverState thing?  If it a BlockBackend thing, does the target
>> device have an iostatus at all?
> 
> I think it's better to have it in BlockDriverState, but in my
> imagination the target is a BlockBackend anyway.

Great, 0/2. :)  My rhetorical questions didn't have the outcome I hoped for.

Paolo



reply via email to

[Prev in Thread] Current Thread [Next in Thread]