Re: [Qemu-devel] [PATCH for-2.6 v2 3/3] block/gluster: prevent data loss
From: Jeff Cody
Subject: Re: [Qemu-devel] [PATCH for-2.6 v2 3/3] block/gluster: prevent data loss after i/o error
Date: Tue, 19 Apr 2016 08:29:27 -0400
User-agent: Mutt/1.5.21 (2010-09-15)
On Tue, Apr 19, 2016 at 02:27:56PM +0200, Kevin Wolf wrote:
> Am 19.04.2016 um 14:07 hat Jeff Cody geschrieben:
> > Upon receiving an I/O error after an fsync, by default gluster will
> > dump its cache. However, QEMU will retry the fsync, which is especially
> > useful when encountering errors such as ENOSPC when using the werror=stop
> > option. When using caching with gluster, however, the last written data
> > will be lost upon encountering ENOSPC. Using the write-behind-cache
> > xlator option of 'resync-failed-syncs-after-fsync' should cause gluster
> > to retain the cached data after a failed fsync, so that ENOSPC and other
> > transient errors are recoverable.
> >
> > Unfortunately, we have no way of knowing if the
> > 'resync-failed-syncs-after-fsync' xlator option is supported, so for now
> > close the fd and set the BDS driver to NULL upon fsync error.
> >
> > Signed-off-by: Jeff Cody <address@hidden>
> > ---
> > block/gluster.c | 42 ++++++++++++++++++++++++++++++++++++++++++
> > configure | 8 ++++++++
> > 2 files changed, 50 insertions(+)
> >
> > diff --git a/block/gluster.c b/block/gluster.c
> > index d9aace6..ba33488 100644
> > --- a/block/gluster.c
> > +++ b/block/gluster.c
> > @@ -314,6 +314,23 @@ static int qemu_gluster_open(BlockDriverState *bs,  QDict *options,
> >          goto out;
> >      }
> >  
> > +#ifdef CONFIG_GLUSTERFS_XLATOR_OPT
> > +    /* Without this, if fsync fails for a recoverable reason (for instance,
> > +     * ENOSPC), gluster will dump its cache, preventing retries.  This means
> > +     * almost certain data loss.  Not all gluster versions support the
> > +     * 'resync-failed-syncs-after-fsync' key value, but there is no way to
> > +     * discover during runtime if it is supported (this api returns success
> > +     * for unknown key/value pairs) */
> > +    ret = glfs_set_xlator_option(s->glfs, "*-write-behind",
> > +                                 "resync-failed-syncs-after-fsync",
> > +                                 "on");
> > +    if (ret < 0) {
> > +        error_setg_errno(errp, errno, "Unable to set xlator key/value pair");
> > +        ret = -errno;
> > +        goto out;
> > +    }
> > +#endif
> > +
> >      qemu_gluster_parse_flags(bdrv_flags, &open_flags);
> >  
> >      s->fd = glfs_open(s->glfs, gconf->image, open_flags);
> > @@ -366,6 +383,16 @@ static int qemu_gluster_reopen_prepare(BDRVReopenState *state,
> >          goto exit;
> >      }
> >  
> > +#ifdef CONFIG_GLUSTERFS_XLATOR_OPT
> > +    ret = glfs_set_xlator_option(reop_s->glfs, "*-write-behind",
> > +                                 "resync-failed-syncs-after-fsync", "on");
> > +    if (ret < 0) {
> > +        error_setg_errno(errp, errno, "Unable to set xlator key/value pair");
> > +        ret = -errno;
> > +        goto exit;
> > +    }
> > +#endif
> > +
> >      reop_s->fd = glfs_open(reop_s->glfs, gconf->image, open_flags);
> >      if (reop_s->fd == NULL) {
> >          /* reops->glfs will be cleaned up in _abort */
> > @@ -613,6 +640,21 @@ static coroutine_fn int qemu_gluster_co_flush_to_disk(BlockDriverState *bs)
> >  
> >      ret = glfs_fsync_async(s->fd, gluster_finish_aiocb, &acb);
> >      if (ret < 0) {
> > +        /* Some versions of Gluster (3.5.6 -> 3.5.8?) will not retain its
> > +         * cache after a fsync failure, so we have no way of allowing the
> > +         * guest to safely continue.  Gluster versions prior to 3.5.6 don't
> > +         * retain the cache either, but will invalidate the fd on error, so
> > +         * this is again our only option.
> > +         *
> > +         * The 'resync-failed-syncs-after-fsync' xlator option for the
> > +         * write-behind cache will cause later gluster versions to retain
> > +         * its cache after error, so long as the fd remains open.  However,
> > +         * we currently have no way of knowing if this option is supported.
> > +         *
> > +         * TODO: Once gluster provides a way for us to determine if the
> > +         * option is supported, bypass the closure and setting drv to NULL.
> > +         */
> > +        qemu_gluster_close(bs);
> > +        bs->drv = NULL;
> >          return -errno;
> >      }
>
> More context:
>
> qemu_coroutine_yield();
> return acb.ret;
> }
>
> I would guess that acb.ret containing an error is the more common case.
> We should probably invalidate the BDS in both cases (immediate failure
> and callback with error code).
>
Ah yes, indeed. I'll do that now.
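
The agreed-upon change can be sketched as follows. This is an illustrative stand-in, not the actual QEMU code: `BlockDriverState`, `qemu_gluster_close()`, and the coroutine machinery are reduced to minimal stubs, and the two parameters model the two error sources (the immediate `glfs_fsync_async()` return value and the error later delivered in `acb.ret`). The point is that both paths invalidate the BDS:

```c
#include <stddef.h>

/* Minimal stand-in for the QEMU type -- hypothetical, for illustration. */
typedef struct BlockDriverState {
    int drv_valid;              /* stands in for bs->drv != NULL */
} BlockDriverState;

static void qemu_gluster_close(BlockDriverState *bs)
{
    /* the real function flushes and closes the gluster fd */
    (void)bs;
}

/* Sketch of qemu_gluster_co_flush_to_disk() with the BDS invalidated on
 * BOTH failure paths: the immediate glfs_fsync_async() submission error
 * (submit_ret) and an error reported via the completion callback (cb_ret,
 * modeling acb.ret after the coroutine yield). */
static int co_flush_to_disk(BlockDriverState *bs, int submit_ret, int cb_ret)
{
    if (submit_ret < 0) {
        /* immediate submission failure */
        qemu_gluster_close(bs);
        bs->drv_valid = 0;      /* bs->drv = NULL in the real code */
        return submit_ret;
    }

    /* ... qemu_coroutine_yield() would run here ... */

    if (cb_ret < 0) {
        /* the (likely more common) callback-reported error */
        qemu_gluster_close(bs);
        bs->drv_valid = 0;
        return cb_ret;
    }
    return 0;
}
```

With both branches covered, a transient fsync failure can no longer leave a usable-looking BDS sitting on top of a gluster cache that has silently dropped the unsynced writes.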