
From: Peter Lieven
Subject: Re: [Qemu-block] [PATCH] qemu-img: align is_allocated_sectors to 4k
Date: Mon, 25 Jun 2018 22:29:50 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.2.1

On 2018-06-11 16:04, Max Reitz wrote:
> On 2018-06-11 15:59, Peter Lieven wrote:
>> On 2018-06-11 15:30, Max Reitz wrote:
>>> On 2018-06-07 14:46, Peter Lieven wrote:
>>>> We currently don't enforce that the sparse segments we detect during
>>>> convert are aligned. This leads to unnecessary and costly
>>>> read-modify-write cycles, either internally in QEMU or in the
>>>> background on the storage device, as nearly all modern filesystems
>>>> and hardware have a 4k alignment internally.
>>>>
>>>> As we set the min_sparse size to 4k by default, it makes perfect
>>>> sense to ensure that these sparse holes in the file are placed at
>>>> 4k boundaries.
>>>>
>>>> Converting an example image [1] to a raw device with a 4k sector
>>>> size currently causes about 4600 additional 4k read requests while
>>>> performing a total of about 15000 write requests. With this patch
>>>> the 4600 additional read requests are eliminated.
>>>>
>>>> [1] https://cloud-images.ubuntu.com/releases/16.04/release/ubuntu-16.04-server-cloudimg-amd64-disk1.vmdk
>>>>
>>>>
>>>> Signed-off-by: Peter Lieven <address@hidden>
>>>> ---
>>>>   qemu-img.c | 21 +++++++++++++++------
>>>>   1 file changed, 15 insertions(+), 6 deletions(-)
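
(For context: the rounding described above could look roughly like the
sketch below. This is only an illustration of the idea, not the code from
the patch; the helper name and the hard-coded 4k constant are made up for
the example.)

    #include <stddef.h>

    #define SPARSE_ALIGN 4096   /* assumed 4k target alignment */

    /*
     * Illustration only: count the leading zero bytes of 'buf', rounded
     * down to a multiple of SPARSE_ALIGN, so that a hole punched for
     * this run never starts or ends in the middle of a 4k block.
     */
    static size_t zero_run_aligned(const unsigned char *buf, size_t len)
    {
        size_t run = 0;

        while (run < len && buf[run] == 0) {
            run++;
        }
        return run - (run % SPARSE_ALIGN);   /* only whole 4k blocks */
    }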
>>> I like the idea, but it doesn't seem guaranteed that
>>> is_allocated_sectors() is called on aligned offsets, so this alignment
>>> work may still leave things unaligned.
>> I can't imagine why this should happen. As long as the alignment divides
>> the buffer size, we either write or skip aligned bytes. Maybe
>> get_block_status returns an unaligned number of sectors?
> Yes, because the source medium does not need to be the same as the
> destination (so the source may have e.g. 512-byte clusters).
>
>>> Furthermore, we should probably not blindly assume 4k but instead use
>>> some block limit of the target, like pwrite_zeroes_alignment, or
>>> pdiscard_alignment, depending on the case.  (Or probably still
>>> min_sparse, if that's less.)
>>>
>>> Since is_allocated_sectors_min() (the only caller of
>>> is_allocated_sectors()) is called from just a single place, taking those
>>> factors into account should be possible.
>> I thought of this too, but for raw-posix, for instance, I always get a
>> request_alignment of 1.
> Yes, because request_alignment is a hard requirement.  With caching, you
> can send requests with any alignment, so it's 1.
>
> pwrite_zeroes_alignment and pdiscard_alignment are described as "Optimal
> alignment", so those should contain the values we/you want.  If they are
> 0, then you should probably fall back to opt_transfer instead of
> request_alignment.

I am still trying to figure out what the best solution is. If I take the
optimal values into account, I might end up transferring more data than
necessary just to create an optimal request. I just want to avoid unnecessary
RMW cycles. Even if modern byte interfaces advertise a request_alignment of 1,
someone has to do the RMW cycle: either the OS or the hard drive itself.

I am thinking about something like

alignment = MAX(request_alignment, opt_transfer, min_sparse)

as a starting point?

opt_transfer seems to be 0 for everything I was able to test, so maybe we
can even reduce the alignment to MAX(request_alignment, min_sparse).
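
As a rough sketch of what I mean (illustrative only and untested; the
function is made up and the limit values are assumed to come from the
target's BlockLimits):

    #include <stdint.h>

    #define MAX(a, b) ((a) > (b) ? (a) : (b))

    /*
     * Pick the alignment for sparse detection: never below the hard
     * request_alignment, never below the -S/min_sparse setting, and
     * bumped up to opt_transfer when the driver reports one.
     */
    static uint32_t sparse_alignment(uint32_t request_alignment,
                                     uint32_t opt_transfer,
                                     uint32_t min_sparse_bytes)
    {
        uint32_t align = MAX(request_alignment, min_sparse_bytes);

        if (opt_transfer) {
            align = MAX(align, opt_transfer);
        }
        return align;
    }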

Peter




