qemu-block

Re: [PATCH] block/rbd: fix write zeroes with growing images


From: Hanna Reitz
Subject: Re: [PATCH] block/rbd: fix write zeroes with growing images
Date: Thu, 24 Mar 2022 12:06:36 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.5.0

On 24.03.22 11:42, Peter Lieven wrote:
On 24.03.22 at 11:40, Stefano Garzarella wrote:
On Thu, Mar 24, 2022 at 10:52:04AM +0100, Peter Lieven wrote:
On 22.03.22 at 10:38, Hanna Reitz wrote:
On 21.03.22 09:31, Stefano Garzarella wrote:
On Sat, Mar 19, 2022 at 04:15:33PM +0100, Peter Lieven wrote:


On 18.03.2022 at 17:47, Stefano Garzarella <sgarzare@redhat.com> wrote:

On Fri, Mar 18, 2022 at 04:48:18PM +0100, Peter Lieven wrote:


On 18.03.2022 at 09:25, Stefano Garzarella <sgarzare@redhat.com> wrote:

On Thu, Mar 17, 2022 at 07:27:05PM +0100, Peter Lieven wrote:


On 17.03.2022 at 17:26, Stefano Garzarella <sgarzare@redhat.com> wrote:

Commit d24f80234b ("block/rbd: increase dynamically the image size") added a workaround to support growing images (e.g. qcow2), resizing
the image before write operations that exceed the current size.

We recently added support for write zeroes, and without the
workaround we can have problems with qcow2.

So let's move the resize into qemu_rbd_start_co() and do it when
the command is RBD_AIO_WRITE or RBD_AIO_WRITE_ZEROES.

Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2020993
Fixes: c56ac27d2a ("block/rbd: add write zeroes support")
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
---
block/rbd.c | 26 ++++++++++++++------------
1 file changed, 14 insertions(+), 12 deletions(-)

diff --git a/block/rbd.c b/block/rbd.c
index 8f183eba2a..6caf35cbba 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -1107,6 +1107,20 @@ static int coroutine_fn qemu_rbd_start_co(BlockDriverState *bs,

  assert(!qiov || qiov->size == bytes);

+    if (cmd == RBD_AIO_WRITE || cmd == RBD_AIO_WRITE_ZEROES) {
+        /*
+         * RBD APIs don't allow us to write more than actual size, so in order
+         * to support growing images, we resize the image before write
+         * operations that exceed the current size.
+         */
+        if (offset + bytes > s->image_size) {
+            int r = qemu_rbd_resize(bs, offset + bytes);
+            if (r < 0) {
+                return r;
+            }
+        }
+    }
+
     r = rbd_aio_create_completion(&task,
                                   (rbd_callback_t) qemu_rbd_completion_cb, &c);
     if (r < 0) {
@@ -1182,18 +1196,6 @@ coroutine_fn qemu_rbd_co_pwritev(BlockDriverState *bs, int64_t offset,
                                  int64_t bytes, QEMUIOVector *qiov,
                                  BdrvRequestFlags flags)
 {
-    BDRVRBDState *s = bs->opaque;
-    /*
-     * RBD APIs don't allow us to write more than actual size, so in order
-     * to support growing images, we resize the image before write
-     * operations that exceed the current size.
-     */
-    if (offset + bytes > s->image_size) {
-        int r = qemu_rbd_resize(bs, offset + bytes);
-        if (r < 0) {
-            return r;
-        }
-    }
  return qemu_rbd_start_co(bs, offset, bytes, qiov, flags, RBD_AIO_WRITE);
}

--
2.35.1


Do we really have a use case for growing rbd images?

The use case is to have a qcow2 image on rbd.
I don't think it's very common, but some people use it and here [1] we had a little discussion about features that could be interesting (e.g. persistent dirty bitmaps for incremental backup).

In any case the support is quite simple and does not affect other use cases since we only increase the size when we go beyond the current size.

IMHO we can have it in :-)


QCOW2 alone doesn’t make much sense, but additional metadata might be a use case.

Yep.

Be aware that the current approach will serialize requests. If there is a real use case, we might think of a better solution.

Good point, but it only happens when we have to resize, so maybe it's okay for now, but I agree we could do better ;-)

There might also be a problem if a write to a higher offset past eof is executed shortly before a write to a slightly lower offset past eof. The second resize will fail, as it would shrink the image. We would need proper locking to avoid this. Maybe we need to check whether we write past eof; if yes, take a lock around the resize op, then check again whether it is still past eof and only resize if true.
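
A minimal sketch of that check/lock/re-check idea, using QEMU's coroutine mutex primitives (qemu_co_mutex_lock()/unlock()); the resize_lock CoMutex is an assumption, block/rbd.c has no such field today:

    /* hypothetical: CoMutex resize_lock added to BDRVRBDState and
     * initialized with qemu_co_mutex_init() in qemu_rbd_open() */
    if (cmd == RBD_AIO_WRITE || cmd == RBD_AIO_WRITE_ZEROES) {
        if (offset + bytes > s->image_size) {        /* cheap unlocked check */
            qemu_co_mutex_lock(&s->resize_lock);
            /* re-check: another request may have grown the image meanwhile */
            if (offset + bytes > s->image_size) {
                int r = qemu_rbd_resize(bs, offset + bytes);
                if (r < 0) {
                    qemu_co_mutex_unlock(&s->resize_lock);
                    return r;
                }
            }
            qemu_co_mutex_unlock(&s->resize_lock);
        }
    }

With the re-check under the lock, a request for a lower offset that runs after a larger resize sees the updated s->image_size and skips its own resize, so the image is only ever grown.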

I thought rbd_resize() was synchronous. Indeed, when you said this could serialize writes, it sounded like confirmation to me.

Since we call rbd_resize() before rbd_aio_writev(), I thought this case could not occur.

Can you please elaborate?

Seconding this request, because if rbd_resize() is allowed to shrink data, it being asynchronous might cause data corruption.

I’ll keep your patch, though, because I find this highly unlikely: qemu_rbd_resize() itself is definitely synchronous; it can’t invoke qemu_coroutine_yield().

The only other possibility that comes to my mind is that rbd_resize() might delay the actual resize operation, but I would still expect consecutive resize requests to be executed in order, and since we call rbd_aio_writev()/rbd_aio_write_zeroes() immediately after the rbd_resize() (with no yielding in between), everything should be executed in the order that we expect.


Maybe my assumption of parallelism here was wrong. I was thinking of:


Request A: write at offset (EOF + 4k).

Request A: rbd_resize is invoked (size EOF + 4k)

IIUC Request B can't start until Request A calls qemu_coroutine_yield(), but I'm waiting for a confirmation from Hanna :-)

That’s my impression at least.

Yes, and I would be interested in whether this is also true if coroutines are implemented as threads.

Depends on what you mean by that.  Coroutines are a form of cooperative multitasking, i.e. they can’t be preempted unless they explicitly yield.  Threads are generally supposed to be preemptive, so those are just different things.

Of course you can use coroutines in threads, i.e. run multiple requests in parallel.  But then the coroutine part becomes largely irrelevant, and you’re just facing standard thread-safety questions, and then of course this won’t be safe.  I assume that, to support such a model, all block drivers would need to be fully audited anyway, though.

For example, theoretically, the guest could then issue two resize operations simultaneously, and qemu_rbd_co_truncate() would be called in two concurrent threads.  This would already cause problems, because setting s->image_size would race.  That’s pre-existing regardless of this patch here (or d24f80234b39d2d5c0d91e63b5e4569d37b2399e).

What this means is that of course we could just slap a lock around the qemu_rbd_resize() call in qemu_rbd_start_co() (and its surrounding condition), it wouldn’t cost anything, assuming that this area can’t be run in parallel anyway.  But the rest of the block driver doesn’t contain a single lock yet, which to me signals that nothing in block/rbd.c is thread-safe anyway.
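
A minimal sketch of where such a lock could live, reusing the hypothetical resize_lock field from the sketch further up; nothing like it exists in block/rbd.c yet:

    /* assumed new field in BDRVRBDState: */
    CoMutex resize_lock;

    /* assumed initialization in qemu_rbd_open(): */
    qemu_co_mutex_init(&s->resize_lock);

Taking and dropping this mutex around the resize condition in qemu_rbd_start_co() is cheap while it is uncontended, but as said above it would not make the rest of the driver thread-safe.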

Hanna



