Re: [Qemu-devel] [Qemu-stable] [PATCH 2/2] block: Pass unaligned discard
From: Eric Blake
Subject: Re: [Qemu-devel] [Qemu-stable] [PATCH 2/2] block: Pass unaligned discard requests to drivers
Date: Fri, 11 Nov 2016 08:22:40 -0600
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.4.0
On 11/11/2016 04:58 AM, Peter Lieven wrote:
> On 11.11.2016 at 00:10, Eric Blake wrote:
>> Discard is advisory, so rounding the requests to alignment
>> boundaries is never semantically wrong from the data that
>> the guest sees. But at least the Dell Equallogic iSCSI SANs
>> have an interesting property: their advertised discard
>> alignment is 15M, yet they document that discarding a sequence
>> of 1M slices will eventually result in the 15M page being
>> marked as discarded, and it is possible to observe which
>> pages have been discarded.
>>
>> @@ -2468,11 +2464,25 @@ int coroutine_fn bdrv_co_pdiscard(BlockDriverState
>> *bs, int64_t offset,
>>
>> max_pdiscard = QEMU_ALIGN_DOWN(MIN_NON_ZERO(bs->bl.max_pdiscard,
>> INT_MAX),
>> align);
>> - assert(max_pdiscard);
>> + assert(max_pdiscard >= bs->bl.request_alignment);
>>
>> while (count > 0) {
>> int ret;
>> - int num = MIN(count, max_pdiscard);
>> + int num = count;
>> +
>> + if (head) {
>> + /* Make a small request up to the first aligned sector. */
>> + num = MIN(count, align - head);
>> + head = (head + num) % align;
>
> Is there any way that head is != 0 after this?
The corresponding write_zero code (after my other pending patch) is:
    num = MIN(MIN(count, max_transfer), align - head);
where it is indeed possible that head is still nonzero after this. But
you are correct that for discard, as written, head is always zero after
this assignment.
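To make the difference concrete, here is a minimal standalone sketch of the two head computations. It is not the QEMU code itself: the struct and both function names are hypothetical, and only the two `MIN(...)` expressions and the `head` update are taken from the snippets quoted above.

```c
#include <assert.h>
#include <stdint.h>

#define MIN(a, b) ((a) < (b) ? (a) : (b))

/* Hypothetical result type: fragment length and the updated head. */
struct head_step {
    int64_t num;
    int64_t head;
};

/* Discard variant, as in the quoted hunk: num = MIN(count, align - head). */
struct head_step discard_head_step(int64_t count, int64_t align, int64_t head)
{
    struct head_step s;

    assert(align > 0 && head > 0 && head < align && count > 0);
    s.num = MIN(count, align - head);
    s.head = (head + s.num) % align;
    return s;
}

/* Write-zeroes variant: the fragment is additionally capped by max_transfer. */
struct head_step write_zeroes_head_step(int64_t count, int64_t max_transfer,
                                        int64_t align, int64_t head)
{
    struct head_step s;

    assert(align > 0 && head > 0 && head < align);
    assert(count > 0 && max_transfer > 0);
    s.num = MIN(MIN(count, max_transfer), align - head);
    s.head = (head + s.num) % align;
    return s;
}
```

With align = 4096 and head = 512, the discard variant issues a 3584-byte fragment and head becomes 0, provided count is at least that large. If count is smaller, head stays nonzero, but the request is then fully consumed and head is never looked at again. The write-zeroes variant with max_transfer = 1024 issues only 1024 bytes and leaves head at 1536, so the loop has to take the head branch again.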
On the other hand, I'm wondering if I should additionally be prepared to
split twice: suppose you have a device with a 512 request_alignment, but
the discard request is byte-aligned. If the device can discard 512
bytes at a time (qcow2 can, if the file was configured with 512-byte
clusters), but the request is not on a 512-byte boundary, it may make
sense to do a really small request up to request_alignment, then a
larger request up to pdiscard_alignment, before doing the bulk at proper
alignment.
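A rough sketch of what such a two-stage head split could look like follows. This is only an illustration of the idea, not the proposed v2: the function name and its parameters are hypothetical, it assumes pdiscard_alignment is a multiple of request_alignment, and it ignores the tail handling that the real loop also needs.

```c
#include <assert.h>
#include <stdint.h>

#define MIN(a, b) ((a) < (b) ? (a) : (b))

/*
 * Hypothetical helper: length of the next fragment for a discard that
 * starts at 'offset' with 'count' bytes remaining.
 *   1. not on a request_alignment boundary: tiny fragment up to it
 *   2. not on a pdiscard_alignment boundary: larger fragment up to it
 *   3. otherwise: bulk fragment, capped at max_pdiscard
 * Tail handling is deliberately left out of this sketch.
 */
int64_t next_discard_len(int64_t offset, int64_t count,
                         int64_t request_alignment,
                         int64_t pdiscard_alignment,
                         int64_t max_pdiscard)
{
    int64_t small_head, big_head;

    assert(offset >= 0 && count > 0 && max_pdiscard > 0);
    assert(request_alignment > 0);
    /* Assumption: the discard alignment is a multiple of request_alignment. */
    assert(pdiscard_alignment >= request_alignment);
    assert(pdiscard_alignment % request_alignment == 0);

    small_head = offset % request_alignment;
    big_head = offset % pdiscard_alignment;

    if (small_head) {
        return MIN(count, request_alignment - small_head);
    }
    if (big_head) {
        return MIN(count, pdiscard_alignment - big_head);
    }
    return MIN(count, max_pdiscard);
}
```

For a 1 MiB discard at byte offset 100 with request_alignment = 512 and pdiscard_alignment = 65536, repeated calls would yield a 412-byte fragment, then a 65024-byte fragment up to offset 65536, and then the aligned bulk.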
Should I send a v2 along those lines? I'm still working on a blkdebug
patch that lets us add qemu-iotests; that will be my ultimate proof of
whether I can indeed come up with a case whose behavior differs
depending on whether I subdivide the head into as many as two parts.
--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org