[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH 1/2] block/rbd: fix handling of holes in .bdrv_co_block_statu
From: |
Peter Lieven |
Subject: |
Re: [PATCH 1/2] block/rbd: fix handling of holes in .bdrv_co_block_status |
Date: |
Wed, 12 Jan 2022 22:27:54 +0100 |
> Am 12.01.2022 um 22:06 schrieb Ilya Dryomov <idryomov@gmail.com>:
>
> On Wed, Jan 12, 2022 at 9:39 PM Peter Lieven <pl@kamp.de> wrote:
>>
>>> Am 12.01.22 um 10:05 schrieb Ilya Dryomov:
>>> On Mon, Jan 10, 2022 at 12:42 PM Peter Lieven <pl@kamp.de> wrote:
>>>> the assumption that we can't hit a hole if we do not diff against a
>>>> snapshot was wrong.
>>>>
>>>> We can see a hole in an image if we diff against base if there exists an
>>>> older snapshot
>>>> of the image and we have discarded blocks in the image where the snapshot
>>>> has data.
>>>>
>>>> Fixes: 0347a8fd4c3faaedf119be04c197804be40a384b
>>>> Cc: qemu-stable@nongnu.org
>>>> Signed-off-by: Peter Lieven <pl@kamp.de>
>>>> ---
>>>> block/rbd.c | 55 +++++++++++++++++++++++++++++++++--------------------
>>>> 1 file changed, 34 insertions(+), 21 deletions(-)
>>>>
>>>> diff --git a/block/rbd.c b/block/rbd.c
>>>> index def96292e0..5e9dc91d81 100644
>>>> --- a/block/rbd.c
>>>> +++ b/block/rbd.c
>>>> @@ -1279,13 +1279,24 @@ static int qemu_rbd_diff_iterate_cb(uint64_t offs,
>>>> size_t len,
>>>> RBDDiffIterateReq *req = opaque;
>>>>
>>>> assert(req->offs + req->bytes <= offs);
>>>> - /*
>>>> - * we do not diff against a snapshot so we should never receive a
>>>> callback
>>>> - * for a hole.
>>>> - */
>>>> - assert(exists);
>>>>
>>>> - if (!req->exists && offs > req->offs) {
>>>> + if (req->exists && offs > req->offs + req->bytes) {
>>>> + /*
>>>> + * we started in an allocated area and jumped over an unallocated
>>>> area,
>>>> + * req->bytes contains the length of the allocated area before the
>>>> + * unallocated area. stop further processing.
>>>> + */
>>>> + return QEMU_RBD_EXIT_DIFF_ITERATE2;
>>>> + }
>>>> + if (req->exists && !exists) {
>>>> + /*
>>>> + * we started in an allocated area and reached a hole. req->bytes
>>>> + * contains the length of the allocated area before the hole.
>>>> + * stop further processing.
>>>> + */
>>>> + return QEMU_RBD_EXIT_DIFF_ITERATE2;
>>>> + }
>>>> + if (!req->exists && exists && offs > req->offs) {
>>>> /*
>>>> * we started in an unallocated area and hit the first allocated
>>>> * block. req->bytes must be set to the length of the unallocated
>>>> area
>>>> @@ -1295,17 +1306,19 @@ static int qemu_rbd_diff_iterate_cb(uint64_t offs,
>>>> size_t len,
>>>> return QEMU_RBD_EXIT_DIFF_ITERATE2;
>>>> }
>>>>
>>>> - if (req->exists && offs > req->offs + req->bytes) {
>>>> - /*
>>>> - * we started in an allocated area and jumped over an unallocated
>>>> area,
>>>> - * req->bytes contains the length of the allocated area before the
>>>> - * unallocated area. stop further processing.
>>>> - */
>>>> - return QEMU_RBD_EXIT_DIFF_ITERATE2;
>>>> - }
>>>> + /*
>>>> + * assert that we caught all cases above and allocation state has not
>>>> + * changed during callbacks.
>>>> + */
>>>> + assert(exists == req->exists || !req->bytes);
>>>> + req->exists = exists;
>>>>
>>>> - req->bytes += len;
>>>> - req->exists = true;
>>>> + /*
>>>> + * assert that we either return an unallocated block or have got
>>>> callbacks
>>>> + * for all allocated blocks present.
>>>> + */
>>>> + assert(!req->exists || offs == req->offs + req->bytes);
>>>> + req->bytes = offs + len - req->offs;
>>>>
>>>> return 0;
>>>> }
>>>> @@ -1354,13 +1367,13 @@ static int coroutine_fn
>>>> qemu_rbd_co_block_status(BlockDriverState *bs,
>>>> }
>>>> assert(req.bytes <= bytes);
>>>> if (!req.exists) {
>>>> - if (r == 0) {
>>>> + if (r == 0 && !req.bytes) {
>>>> /*
>>>> - * rbd_diff_iterate2 does not invoke callbacks for unallocated
>>>> - * areas. This here catches the case where no callback was
>>>> - * invoked at all (req.bytes == 0).
>>>> + * rbd_diff_iterate2 does not invoke callbacks for
>>>> unallocated areas
>>>> + * except for the case where an overlay has a hole where the
>>>> parent
>>>> + * or an older snapshot of the image has not. This here
>>>> catches the
>>>> + * case where no callback was invoked at all.
>>>> */
>>>> - assert(req.bytes == 0);
>>>> req.bytes = bytes;
>>>> }
>>>> status = BDRV_BLOCK_ZERO | BDRV_BLOCK_OFFSET_VALID;
>>>> --
>>>> 2.25.1
>>>>
>>>>
>>> Hi Peter,
>>>
>>> Can we just skip these "holes" by replacing the existing assert with
>>> an if statement that would simply bail from the callback on !exists?
>>>
>>> Just trying to keep the logic as simple as possible since as it turns
>>> out we get to contend with ages-old librbd bugs here...
>>
>>
>> I'm afraid I think this would not work. Consider qemu-img convert.
>>
>> If we bail out we would immediately call get_block_status with the offset
>>
>> where we stopped and hit the !exist again.
>
> I'm suggesting bailing from the callback (i.e. return 0), not from the
> entire rbd_diff_iterate2() instance. The iteration would move on and
> either stumble upon an allocated area within the requested range or run
> off the end of the requested range. Both of these cases are already
> handled by the existing code.
Ah, got it. That’s a smart solution!
Peter
- Re: [PATCH 2/2] block/rbd: workaround for ceph issue #53784, (continued)
- Re: [PATCH 2/2] block/rbd: workaround for ceph issue #53784, Ilya Dryomov, 2022/01/12
- Re: [PATCH 2/2] block/rbd: workaround for ceph issue #53784, Peter Lieven, 2022/01/12
- Re: [PATCH 2/2] block/rbd: workaround for ceph issue #53784, Ilya Dryomov, 2022/01/12
- Message not available
- Message not available
- Re: [PATCH 2/2] block/rbd: workaround for ceph issue #53784, Peter Lieven, 2022/01/12
- Re: [PATCH 2/2] block/rbd: workaround for ceph issue #53784, Ilya Dryomov, 2022/01/12
- Re: [PATCH 2/2] block/rbd: workaround for ceph issue #53784, Peter Lieven, 2022/01/12
[PATCH 1/2] block/rbd: fix handling of holes in .bdrv_co_block_status, Peter Lieven, 2022/01/10