qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v2 1/1] quorum: Change vote rules for 64 bits ha


From: Dr. David Alan Gilbert
Subject: Re: [Qemu-devel] [PATCH v2 1/1] quorum: Change vote rules for 64 bits hash
Date: Mon, 22 Feb 2016 09:02:55 +0000
User-agent: Mutt/1.5.24 (2015-08-30)

* Changlong Xie (address@hidden) wrote:
> On 02/20/2016 10:28 PM, Max Reitz wrote:
> >On 19.02.2016 12:24, Alberto Garcia wrote:
> >>On Fri 19 Feb 2016 09:26:53 AM CET, Wen Congyang <address@hidden> wrote:
> >>
> >>>>>If quorum has two children(A, B). A do flush sucessfully, but B
> >>>>>flush failed.  We MUST choice A as winner rather than just pick
> >>>>>anyone of them. Otherwise the filesystem of guest will become
> >>>>>read-only with following errors:
> >>>>>
> >>>>>end_request: I/O error, dev vda, sector 11159960
> >>>>>Aborting journal on device vda3-8
> >>>>>EXT4-fs error (device vda3): ext4_journal_start_sb:327: Detected abort 
> >>>>>journal
> >>>>>EXT4-fs (vda3): Remounting filesystem read-only
> >>>>
> >>>>Hi Xie,
> >>>>
> >>>>Let's see if I'm getting this right:
> >>>>
> >>>>- When Quorum flushes to disk, there's a vote among the return values of
> >>>>   the flush operations of its members, and the one that wins is the one
> >>>>   that Quorum returns.
> >>>>
> >>>>- If there's a tie then Quorum choses the first result from the list of
> >>>>   winners.
> >>>>
> >>>>- With your patch you want to give priority to the vote with result == 0
> >>>>   if there's any, so Quorum would return 0 (and succeed).
> >>>>
> >>>>This seems to me like an ad-hoc fix for a particular use case. What
> >>>>if you have 3 members and two of them fail with the same error code?
> >>>>Would you still return 0 or the error code from the other two?
> >>>
> >>>For example:
> >>>children.0 returns 0
> >>>children.1 returns -EIO
> >>>children.2 returns -EPIPE
> >>>
> >>>In this case, quorum returns -EPIPE now(without this patch).
> >>>
> >>>For example:
> >>>children.0 returns -EPIPE
> >>>children.1 returns -EIO
> >>>children.2 returns 0
> >>>In this case, quorum returns 0 now.
> >>
> >>My question is: what's the rationale for returning 0 in case a) but not
> >>in case b)?
> >>
> >>   a)
> >>     children.0 returns -EPIPE
> >>     children.1 returns -EIO
> >>     children.2 returns 0
> >>
> >>   b)
> >>     children.0 returns -EIO
> >>     children.1 returns -EIO
> >>     children.2 returns 0
> >>
> >>In both cases you have one successful flush and two errors. You want to
> >>return always 0 in case a) and always -EIO in case b). But the only
> >>difference is that in case b) the errors happen to be the same, so why
> >>does that matter?
> >>
> >>That said, I'm not very convinced of the current logics of the Quorum
> >>flush code either, so it's not even a problem with your patch... it
> >>seems to me that the code should follow the same logics as in the
> >>read/write case: if the number of correct flushes >= threshold then
> >>return 0, else select the most common error code.
> >
> >I'm not convinced of the logic either, which is why I waited for you to
> >respond to this patch. :-)
> >
> >Intuitively, I'd expect Quorum to return an error if flushing failed for
> >any of the children, because, well, flushing failed. I somehow feel like
> >flushing is different from a read or write operation and therefore
> >ignoring the threshold would be fine here. However, maybe my intuition
> >is just off.
> >
> >Anyway, regardless of that, if we do take the threshold into account, we
> >should not use the exact error value for voting but just whether an
> >error occurred or not. If all but one children fail to flush (all for
> >different reasons), I find it totally wrong to return success. We should
> >then just return -EIO or something.
> >
> Hi Berto & Max
> 
> Thanks for your comments, i'd like to have a summary here. For flush cases:
> 
> 1) if flush successfully(result >= 0), result = 0; else if result < 0,
> result = -EIO. then invoke quorum_count_vote
> 2) if correct flushes >= threshold, mark correct flushes as winner directly.

I find it difficult to understand how this corresponds to the behaviour needed
in COLO, where we have the NBD and the real storage on the primary; in that
case the failure of the real storage should give an error to the guest, but the
failure of the NBD shouldn't produce a guest visible failure.

Dave

> Will fix in next version.
> 
> Thanks
>       -Xie
> >Max
> >
> 
> 
--
Dr. David Alan Gilbert / address@hidden / Manchester, UK



reply via email to

[Prev in Thread] Current Thread [Next in Thread]