[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH 06/16] quorum: allow ignoring child errors
From: |
Dr. David Alan Gilbert |
Subject: |
Re: [Qemu-devel] [PATCH 06/16] quorum: allow ignoring child errors |
Date: |
Mon, 7 Sep 2015 17:56:39 +0100 |
User-agent: |
Mutt/1.5.23 (2014-03-12) |
* Eric Blake (address@hidden) wrote:
> On 09/02/2015 02:51 AM, Wen Congyang wrote:
> > If the child is not ready, read/write/getlength/flush will
> > return -errno. It is not critical error, and can be ignored:
> > 1. read/write:
> > Just not report the error event.
>
> What happens if all the children report an error? Or is the threshold
> at play here?
I think it's interesting because in the COLO case the intention isn't
really about a threshold (in the way you might use for RAID or mirroring),
it's that one of the stores is local (and not expected to error) and one
is somewhere over a network, so if it fails you don't want to stop
the local VM working.
However, if it fails we do need to know about it; if any write to
the secondary stops then the fault-tolerance has failed (at least for
that drive); so we should do *something* - I'm not sure what though.
Dave
> For example, if you have a threshold of 3/5, then I'm assuming that if
> up to two children return an errno, then it is okay to ignore; but if
> three or more return an errno, you haven't met threshold, so the I/O
> must fail.
>
> Are you ignoring ALL errors (including things like EACCES), or just EIO
> errors?
>
>
> > 2. getlength:
> > just ignore it. If all children's getlength return -errno,
> > and be ignored, return -EIO.
> > 3. flush:
> > Just ignore it. If all children's getlength return -errno,
>
> s/getlength/flush/
>
> > and be ignored, return 0.
>
> Yuck - claiming success when all of the children fail feels dangerous.
>
> >
> > Usage: children.x.ignore-errors=true
> >
> > Signed-off-by: Wen Congyang <address@hidden>
> > Signed-off-by: zhanghailiang <address@hidden>
> > Signed-off-by: Gonglei <address@hidden>
> > Cc: Alberto Garcia <address@hidden>
> > ---
> > block/quorum.c | 94
> > ++++++++++++++++++++++++++++++++++++++++++++++++----
>
> Interface review only:
>
> > +++ b/qapi/block-core.json
> > @@ -1411,6 +1411,8 @@
> > # @allow-write-backing-file: #optional whether the backing file is opened
> > in
> > # read-write mode. It is only for backing file
> > # (Since 2.5 default: false)
> > +# @ignore-errors: #options whether the child's I/O error should be ignored.
>
> s/options/optional/
> s/error/errors/
>
> > +# it is only for quorum's child.(Since 2.5 default: false)
>
> Space after '.' in English sentences.
>
> The fact that you are documenting that this option can only be specified
> for quorum children makes me wonder if it belongs better as an option in
> BlockdevOptionsQuorum rather than BlockdevOptionsBase.
>
> Semantically, it sounds like you are trying to allow for a per-child
> decision of whether this particular child's errors matter to the overall
> quorum. So, if we have a 3/5 quorum, we can decide that for children A,
> B, C, and D, errors cannot be ignored, but for child E, errors are not a
> problem.
>
> As written, you are tying the semantics to each child BDS, and requiring
> special code to double-check that the property is only ever set if the
> BDS is used as the child of a quorum. Furthermore, if the property is
> set, you are then changing what the child does in response to various
> operations.
>
> What if you instead create a list property in the quorum parent? Maybe
> along the lines of:
>
> # @child-errors-okay: #optional an array of named-node children where
> errors will be ignored (Since 2.5, default empty)
>
> { 'struct': 'BlockdevOptionsQuorum',
> 'data': { '*blkverify': 'bool',
> 'children': [ 'BlockdevRef' ],
> 'vote-threshold': 'int',
> '*rewrite-corrupted': 'bool',
> '*read-pattern': 'QuorumReadPattern',
> '*child-errors-okay': ['str'] } }
>
> The above example of a 3/5 quorum, where only child E can ignore errors,
> would then be represented as:
>
> { "children": [ "A", "B", "C", "D", "E" ], 'vote-threshold':3,
> 'child-errors-okay': [ "E" ] }
>
> The code to ignore the errors is then done in the quorum itself (the BDS
> for E does not have to care about a special ignore-errors property, but
> just always returns the error as usual; and then the quorum is deciding
> how to handle the error), and you are not polluting the BDS state for
> something that is quorum-specific, because it is now the quorum itself
> that tracks the special casing.
>
> Finally, why can't hot-plug/unplug of quorum members work? If you are
> going to always ignore errors from a particular child, then why is that
> child even part of the quorum? Isn't a better design to just not add
> the child to the quorum until it is ready and won't be reporting errors?
>
> --
> Eric Blake eblake redhat com +1-919-301-3266
> Libvirt virtualization library http://libvirt.org
>
--
Dr. David Alan Gilbert / address@hidden / Manchester, UK