From: Paolo Bonzini
Subject: Re: [Qemu-devel] [RFC PATCH 06/17] block: use bdrv_{co, aio}_discard for write_zeroes operations
Date: Wed, 14 Mar 2012 13:49:48 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:10.0.1) Gecko/20120216 Thunderbird/10.0.1

On 14/03/2012 13:37, Kevin Wolf wrote:
> On 14.03.2012 13:14, Paolo Bonzini wrote:
>>> Paolo mentioned a use case as a fast way for guests to write zeros, but
>>> is it really faster than a normal write when we have to emulate it by a
>>> bdrv_write with a temporary buffer of zeros?
>>
>> No, of course not.
>>
>>> On the other hand we have
>>> the cases where discard really means "I don't care about the data any
>>> more" and emulating it by writing zeros is just a waste of resources there.
>>>
>>> So I think we only want to advertise that discard zeroes data if we can
>>> do it efficiently. This means that the format does support it, and that
>>> the device is able to communicate the discard granularity (= cluster
>>> size) to the guest OS.
>>
>> Note that the discard granularity is only a hint, so it's really more a
>> maximum suggested value than a granularity. Outside of a cluster
>> boundary the format would still have to write zeros manually.
>
> You're talking about SCSI here, I guess? Would be one case where being
> able to define sane semantics for virtio-blk would have been an
> advantage... I had hoped that SCSI was already sane, but if it doesn't
> distinguish between "I don't care about this any more" and "I want to
> have zeros here", then I'm afraid I can't call it sane any more.

It does make the distinction. "I don't care" is UNMAP (or WRITE
SAME(16) with the UNMAP bit set); "I want to have zeroes" is WRITE
SAME(10) or WRITE SAME(16) with an all-zero payload.

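
To make that mapping concrete, here is a rough sketch of how a CDB could be sorted into the two intents. It is illustration only, not QEMU code: the function name `classify` and the returned strings are made up, and the opcode values and the position of the UNMAP bit (byte 1, bit 3) are quoted from memory of SBC-3, so check them against the spec before relying on them.

```python
# Opcode values as recalled from SBC-3 (assumption: verify against the spec).
UNMAP = 0x42
WRITE_SAME_10 = 0x41
WRITE_SAME_16 = 0x93
WRITE_SAME_UNMAP_BIT = 0x08  # byte 1, bit 3 of the CDB


def classify(cdb: bytes, payload: bytes = b"") -> str:
    """Hypothetical helper: map a CDB to the guest's intent."""
    opcode = cdb[0]
    if opcode == UNMAP:
        return "discard"            # "I don't care about this any more"
    if opcode == WRITE_SAME_16 and cdb[1] & WRITE_SAME_UNMAP_BIT:
        return "discard"            # WRITE SAME(16) with the UNMAP bit set
    if opcode in (WRITE_SAME_10, WRITE_SAME_16):
        if payload and not any(payload):
            return "write_zeroes"   # "I want to have zeroes here"
        return "write_same"         # repeated non-zero block
    return "other"
```

For example, `classify(bytes([0x93, 0x08]) + bytes(14), bytes(512))` is treated as a discard, while the same CDB with byte 1 cleared and an all-zero block is treated as a request for zeroes.
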
> We can make the conditions even stricter, i.e. allow it only if protocol
> can pass through discards for unaligned requests. This wouldn't free
> clusters on an image format level, but at least on a file system level.
>
>> Also, Linux for example will only round the number of sectors down to
>> the granularity, not the start sector. Rereading the code, for SCSI we
>> want to advertise a zero granularity (aka do whatever you want),
>> otherwise we may get only misaligned discard requests and end up writing
>> zeroes inefficiently all the time.
>
> Does this make sense with real hardware or is it a Linux bug?

It's a bug: SCSI defines the "optimal unmap request starting LBA" to be
"(n × optimal unmap granularity) + unmap granularity alignment".

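
A small worked sketch of that formula, to show why rounding only the length hurts. The helper `split_discard` is hypothetical (not from QEMU or Linux); it splits a request into a head and tail that would have to be zeroed by hand and an aligned middle that could be discarded as whole clusters. All values are in sectors, and the numbers in the usage note are invented for illustration.

```python
def split_discard(start, count, granularity, alignment=0):
    """Split [start, start + count) into (head, middle, tail).

    Each part is a (start, count) pair. The middle begins at an optimal
    starting LBA, i.e. n * granularity + alignment, and its length is a
    multiple of granularity. Head and tail are the unaligned leftovers.
    """
    if granularity <= 0:
        raise ValueError("granularity must be positive")
    end = start + count
    # Smallest optimal LBA that is >= start.
    first = start + (alignment - start) % granularity
    # Largest optimal LBA that is <= end.
    last = end - (end - alignment) % granularity
    if first >= last:
        # No whole aligned unit fits: everything is unaligned.
        return (start, count), (end, 0), (end, 0)
    return (start, first - start), (first, last - first), (last, end - last)
```

With a granularity of 8 and no alignment offset, a request for sectors 10..39 splits into head `(10, 6)`, middle `(16, 24)` and an empty tail. If the guest rounds only the length down to 24 and keeps the start at 10, as described above for Linux, the same call gives head `(10, 6)`, middle `(16, 16)` and tail `(32, 2)`, so both ends need manual zeroing.
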
>> The problem is that advertising discard_zeroes_data based on the backend
>> calls for trouble as soon as you migrate between storage formats,
>> filesystems or disks.
>
> True. You would have to emulate if you migrate from a source that can
> discard to zeros efficiently to a destination that can't.
>
> In the end, I guess we'll just have to accept that we can't fix bad
> semantics of ATA and SCSI, and just need to decide whether "I don't
> care" or "I want to have zeros" is more common. My feeling is that "I
> don't care" is the more useful operation because it can't be expressed
> otherwise, but I haven't checked what guests really do.

Yeah, guests right now only use it for unused filesystem pieces, so the
"do not care" semantics are fine. I also hoped to use discard to avoid
blowing up thin-provisioned images when streaming. Perhaps we can use
bdrv_has_zero_init instead, and/or pass down the copy-on-read flag to
the block driver.

Anyhow, there are some patches from this series that are relatively
independent and ready for inclusion; I'll extract them and post them
separately.

Paolo