qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] block/rbd: fix write zeroes with growing images


From: Stefano Garzarella
Subject: Re: [PATCH] block/rbd: fix write zeroes with growing images
Date: Mon, 21 Mar 2022 09:17:35 +0100

On Sat, Mar 19, 2022 at 02:23:18PM +0100, Ilya Dryomov wrote:
On Sat, Mar 19, 2022 at 1:40 PM Ilya Dryomov <idryomov@gmail.com> wrote:

On Fri, Mar 18, 2022 at 9:25 AM Stefano Garzarella <sgarzare@redhat.com> wrote:
>
> On Thu, Mar 17, 2022 at 07:27:05PM +0100, Peter Lieven wrote:
> >
> >
> >> Am 17.03.2022 um 17:26 schrieb Stefano Garzarella <sgarzare@redhat.com>:
> >>
> >> Commit d24f80234b ("block/rbd: increase dynamically the image size")
> >> added a workaround to support growing images (eg. qcow2), resizing
> >> the image before write operations that exceed the current size.
> >>
> >> We recently added support for write zeroes and without the
> >> workaround we can have problems with qcow2.
> >>
> >> So let's move the resize into qemu_rbd_start_co() and do it when
> >> the command is RBD_AIO_WRITE or RBD_AIO_WRITE_ZEROES.
> >>
> >> Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2020993
> >> Fixes: c56ac27d2a ("block/rbd: add write zeroes support")
> >> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
> >> ---
> >> block/rbd.c | 26 ++++++++++++++------------
> >> 1 file changed, 14 insertions(+), 12 deletions(-)
> >>
> >> diff --git a/block/rbd.c b/block/rbd.c
> >> index 8f183eba2a..6caf35cbba 100644
> >> --- a/block/rbd.c
> >> +++ b/block/rbd.c
> >> @@ -1107,6 +1107,20 @@ static int coroutine_fn 
qemu_rbd_start_co(BlockDriverState *bs,
> >>
> >>     assert(!qiov || qiov->size == bytes);
> >>
> >> +    if (cmd == RBD_AIO_WRITE || cmd == RBD_AIO_WRITE_ZEROES) {
> >> +        /*
> >> +         * RBD APIs don't allow us to write more than actual size, so in 
order
> >> +         * to support growing images, we resize the image before write
> >> +         * operations that exceed the current size.
> >> +         */
> >> +        if (offset + bytes > s->image_size) {
> >> +            int r = qemu_rbd_resize(bs, offset + bytes);
> >> +            if (r < 0) {
> >> +                return r;
> >> +            }
> >> +        }
> >> +    }
> >> +
> >>     r = rbd_aio_create_completion(&task,
> >>                                   (rbd_callback_t) qemu_rbd_completion_cb, 
&c);
> >>     if (r < 0) {
> >> @@ -1182,18 +1196,6 @@ coroutine_fn qemu_rbd_co_pwritev(BlockDriverState 
*bs, int64_t offset,
> >>                                  int64_t bytes, QEMUIOVector *qiov,
> >>                                  BdrvRequestFlags flags)
> >> {
> >> -    BDRVRBDState *s = bs->opaque;
> >> -    /*
> >> -     * RBD APIs don't allow us to write more than actual size, so in order
> >> -     * to support growing images, we resize the image before write
> >> -     * operations that exceed the current size.
> >> -     */
> >> -    if (offset + bytes > s->image_size) {
> >> -        int r = qemu_rbd_resize(bs, offset + bytes);
> >> -        if (r < 0) {
> >> -            return r;
> >> -        }
> >> -    }
> >>     return qemu_rbd_start_co(bs, offset, bytes, qiov, flags, 
RBD_AIO_WRITE);
> >> }
> >>
> >> --
> >> 2.35.1
> >>
> >
> >Do we really have a use case for growing rbd images?
>
> The use case is to have a qcow2 image on rbd.
> I don't think it's very common, but some people use it and here [1] we
> had a little discussion about features that could be interesting (e.g.
> persistent dirty bitmaps for incremental backup).

RBD supports that natively (object-map and fast-diff image features).
The granularity is higher than a typical QCOW2 granularity (64K vs 4M)
but I have never heard of that being a concern.

Sorry, I meant to say lower (more coarse grained) above.

RBD's default object size is 4M and that is the granularity at which
dirtiness is typically tracked.  It is possible to ask for a byte-level
incremental diff but it is obviously slow.

I didn't know that, thanks for letting me know!

Stefano




reply via email to

[Prev in Thread] Current Thread [Next in Thread]