From: Alberto Garcia
Subject: Re: [Qemu-devel] [PATCH v2 1/1] quorum: Change vote rules for 64 bits hash
Date: Fri, 19 Feb 2016 12:24:41 +0100
User-agent: Notmuch/0.13.2 (http://notmuchmail.org) Emacs/23.2.1 (i486-pc-linux-gnu)

On Fri 19 Feb 2016 09:26:53 AM CET, Wen Congyang <address@hidden> wrote:

>>> If quorum has two children (A, B), and A flushes successfully but
>>> B's flush fails, we MUST choose A as the winner rather than just
>>> picking either of them. Otherwise the guest's filesystem will become
>>> read-only with the following errors:
>>>
>>> end_request: I/O error, dev vda, sector 11159960
>>> Aborting journal on device vda3-8
>>> EXT4-fs error (device vda3): ext4_journal_start_sb:327: Detected abort journal
>>> EXT4-fs (vda3): Remounting filesystem read-only
>> 
>> Hi Xie,
>> 
>> Let's see if I'm getting this right:
>> 
>> - When Quorum flushes to disk, there's a vote among the return values of
>>   the flush operations of its members, and the one that wins is the one
>>   that Quorum returns.
>> 
>> - If there's a tie then Quorum chooses the first result from the list of
>>   winners.
>> 
>> - With your patch you want to give priority to the vote with result == 0
>>   if there's any, so Quorum would return 0 (and succeed).
>> 
>> This seems to me like an ad-hoc fix for a particular use case. What
>> if you have 3 members and two of them fail with the same error code?
>> Would you still return 0 or the error code from the other two?
>
> For example:
> children.0 returns 0
> children.1 returns -EIO
> children.2 returns -EPIPE
>
> In this case, quorum returns -EPIPE now (without this patch).
>
> For example:
> children.0 returns -EPIPE
> children.1 returns -EIO
> children.2 returns 0
> In this case, quorum returns 0 now.

My question is: what's the rationale for returning 0 in case a) but not
in case b)?

  a)
    children.0 returns -EPIPE
    children.1 returns -EIO
    children.2 returns 0

  b)
    children.0 returns -EIO
    children.1 returns -EIO
    children.2 returns 0

In both cases you have one successful flush and two errors. You want to
always return 0 in case a) and always return -EIO in case b). But the
only difference is that in case b) the two errors happen to be the same,
so why does that matter?
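
To make the asymmetry concrete, here is a minimal sketch of the rule as I
understand it from the patch description: a plurality vote over the flush
results, where a tie for first place is broken in favour of 0. The function
name vote_prefer_success() is hypothetical and this is not the actual Quorum
code, only a model of the behaviour discussed above.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical helper, not QEMU code: plurality vote over the flush
 * results, with a tie for first place broken in favour of 0 (success).
 */
static int vote_prefer_success(const int *results, size_t n)
{
    int winner = 0;
    size_t best = 0;

    for (size_t i = 0; i < n; i++) {
        /* Count how many children returned the same value as child i. */
        size_t count = 0;
        for (size_t j = 0; j < n; j++) {
            if (results[j] == results[i]) {
                count++;
            }
        }
        /* A strictly higher count wins; on a tie, 0 beats any error. */
        if (count > best || (count == best && results[i] == 0)) {
            best = count;
            winner = results[i];
        }
    }
    return winner;
}
```

Under this model case a) yields 0 because all three results tie with one
vote each, while case b) yields -EIO because the two identical errors
outvote the single success.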

That said, I'm not very convinced by the current logic of the Quorum
flush code either, so it's not even a problem with your patch... it
seems to me that the code should follow the same logic as in the
read/write case: if the number of successful flushes >= threshold then
return 0, else select the most common error code.
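
A rough sketch of what I mean, assuming the per-child return values are
already collected in an array. The name flush_result() and its parameters
are made up for illustration, and breaking ties between equally common
errors by taking the first one seen is my own assumption:

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical helper, not QEMU code: return 0 if at least `threshold`
 * children flushed successfully, otherwise the most common error code.
 * Assumes threshold <= n, so at least one error exists whenever the
 * threshold is not reached.
 */
static int flush_result(const int *results, size_t n, size_t threshold)
{
    size_t ok = 0;
    size_t best = 0;
    int winner = 0;

    for (size_t i = 0; i < n; i++) {
        if (results[i] == 0) {
            ok++;
        }
    }
    if (ok >= threshold) {
        return 0;
    }

    /* Not enough successes: pick the most frequent error code.
     * Assumption: on a tie, the first error seen is kept. */
    for (size_t i = 0; i < n; i++) {
        size_t count = 0;

        if (results[i] == 0) {
            continue;
        }
        for (size_t j = 0; j < n; j++) {
            if (results[j] == results[i]) {
                count++;
            }
        }
        if (count > best) {
            best = count;
            winner = results[i];
        }
    }
    return winner;
}
```

With a threshold of 2, both case a) and case b) would return an error;
with a threshold of 1, both would return 0. Either way the two cases are
treated alike, which is the point.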

Berto


