qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-block] [PATCH v2 2/3] qemu-img: rebase: Reduce reads on in-cha


From: Max Reitz
Subject: Re: [Qemu-block] [PATCH v2 2/3] qemu-img: rebase: Reduce reads on in-chain rebase
Date: Thu, 23 May 2019 16:26:55 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.6.1

On 23.05.19 16:09, Sam Eiderman wrote:
> 
> 
>> On 23 May 2019, at 17:01, Max Reitz <address@hidden
>> <mailto:address@hidden>> wrote:
>>
>> On 02.05.19 15:58, Sam Eiderman wrote:
>>> In the following case:
>>>
>>> (base) A <- B <- C (tip)
>>>
>>> when running:
>>>
>>>    qemu-img rebase -b A C
>>>
>>> QEMU would read all sectors not allocated in the file being rebased (C)
>>> and compare them to the new base image (A), regardless of whether they
>>> were changed or even allocated anywhere along the chain between the new
>>> base and the top image (B). This causes many unneeded reads when
>>> rebasing an image which represents a small diff of a large disk, as it
>>> would read most of the disk's sectors.
>>>
>>> Instead, use bdrv_is_allocated_above() to reduce the number of
>>> unnecessary reads.
>>>
>>> Reviewed-by: Karl Heubaum <address@hidden
>>> <mailto:address@hidden>>
>>> Signed-off-by: Sam Eiderman <address@hidden
>>> <mailto:address@hidden>>
>>> Signed-off-by: Eyal Moscovici <address@hidden
>>> <mailto:address@hidden>>
>>> ---
>>> qemu-img.c | 25 ++++++++++++++++++++++++-
>>> 1 file changed, 24 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/qemu-img.c b/qemu-img.c
>>> index d9b609b3f0..7f20858cb9 100644
>>> --- a/qemu-img.c
>>> +++ b/qemu-img.c
>>
>> [...]
>>
>>> @@ -3422,6 +3428,23 @@ static int img_rebase(int argc, char **argv)
>>>                 continue;
>>>             }
>>>
>>> +            if (prefix_chain_bs) {
>>> +                /*
>>> +                 * If cluster wasn't changed since prefix_chain, we
>>> don't need
>>> +                 * to take action
>>> +                 */
>>> +                ret = bdrv_is_allocated_above(bs, prefix_chain_bs,
>>> +                                              offset, n, &n);
>>
>> This will always return true because it definitely is allocated in @bs,
>> or we wouldn’t be here.  (We just checked that with
>> bdrv_is_allocated().)  I think @top should be backing_bs(bs).
>>
>> Max
> 
> I don’t think that’s true:
> 
> Examine the case where we have the following chain:
> 
> A <- B <- C
> 
> When we rebase C directly over A: qemu-img rebase -b A C
> 
> We must check for every offset (sector): bdrv_is_allocated_above(C, A,
> offset, n, &n);
> 
> If a sector from C is allocated above A - it may have been changed - so
> we need to do a read from A and a read from C and compare.
> If the sector is not allocated above, it was not changed - we don’t need
> to read from A or C.

First: Oops, somehow I inverted the bdrv_is_allocated() check in my
head.  (For context: I mean this part above this hunk here:

    /* If the cluster is allocated, we don't need to take action */
    ret = bdrv_is_allocated(bs, offset, n, &n);
    if (ret < 0) {
        error_report("error while reading image metadata: %s",
                     strerror(-ret));
        goto out;
    }



    if (ret) {
        continue;
    }

)  So at this point, the range definitely is *not* allocated in @bs.

But second: That still means that we do not have to check @bs itself,
because we already did.  We know the range isn’t allocated there, so we
can start at its backing file.

On a more abstract level: No, we do not need to read all sectors from A
and C and compare them if they are allocated anywhere above A.  If they
are allocated in C, we’re good, because all we’d do is write them back
to C (which is a no-op).  That’s exactly what the existing
bdrv_is_allocated() check is for.

So we only need to know whether the sectors are allocated above the base
(A) and below the top (C), so in your example, whether they are
allocated in B.  If they are, we need to compare and potentially copy,
if they are not, we can skip them.

So my claim that bdrv_is_allocated_above() would always return true is
wrong, but it still should use backing_bs(bs) for the top because we
have checked @bs already.

Max

Attachment: signature.asc
Description: OpenPGP digital signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]