
Re: [Qemu-block] Performance impact of the qcow2 overlap checks


From: Max Reitz
Subject: Re: [Qemu-block] Performance impact of the qcow2 overlap checks
Date: Wed, 25 Jan 2017 15:03:36 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.6.0

On 24.01.2017 17:43, Alberto Garcia wrote:
> On Mon 23 Jan 2017 05:29:49 PM CET, Max Reitz wrote:
>> Refcount data will only be queried when writing data to the image. If
>> that data has been overwritten, there is a chance that it has been set
>> to 0 (a rather large chance, admittedly, because 0 has a higher
>> probability of appearing in guest data). But there is also a chance
>> that it has been set to something else, which will generally be
>> greater than the number of internal snapshots (+ 1). Therefore, such
>> corruption should be easily detectable before much data is wrongly
>> overwritten.
>>
>> The drawbacks with this approach would be the following:
>> (1) Is printing a warning enough to make the user shut down the VM as
>> fast as possible and run qemu-img check?
> 
>> (2) It is legal to have a greater refcount than the number of internal
>> snapshots plus one. qemu never produces such images, though (or does
>> it?). Could there be existing images where users will be just annoyed by
>> such warnings? Should we add a runtime option to disable them?
> 
> I don't think it's legal, or is there any reason why it would be?

Yes, the spec doesn't disallow it, so it's legal. :-)

I had thought about using it for the conversion of zero clusters to data
clusters in compat-level downgrading. You would only create a couple of
zero-filled data clusters and then just reference them very often.

We could think about disallowing it, but I don't think "because we want
better metadata corruption detection" is a very good reason for that.

> I'll try to summarize my opinion:
> 
> - If, using the refcount method that you propose, we can guarantee that
>   the image is corrupted, then that should clearly cause an I/O error,
>   and I would prevent further writes to the image (if that's possible).

Depends. We could make it an error based on the fact that it probably is
an error. But it is a technically legal configuration.

> - If this method cannot guarantee that it's corrupted but it can only
>   give us an indication that it could be then I don't think I'd bother
>   and I'd simply keep the current overlap check.

Depends. :-)

In practice, I think it's a corruption 100 % of the time. In theory, it
isn't.

> - Printing a warning and expecting the user to see it doesn't seem like
>   a good way to deal with data corruption.

Right. But as long as it is a legal configuration, I wouldn't make it an
error.
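
Just to make the shape of that idea concrete, here is a minimal,
self-contained sketch of what such a warn-only check could look like
(the names and the struct are invented for illustration; this is not
the actual qcow2 driver code):

    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    struct qcow2_state {
        uint32_t nb_snapshots;  /* number of internal snapshots */
    };

    /* Warn (rather than error out) when a refcount read back from disk
     * exceeds what internal snapshots could explain; the spec still
     * allows such values, but in practice they almost certainly mean
     * that the refcount block was overwritten by guest data. */
    static void warn_if_refcount_suspicious(const struct qcow2_state *s,
                                            int64_t cluster_index,
                                            uint64_t refcount)
    {
        if (refcount > (uint64_t)s->nb_snapshots + 1) {
            fprintf(stderr, "qcow2: suspicious refcount %" PRIu64
                    " for cluster %" PRId64 ", the image may be corrupted"
                    " -- consider running qemu-img check\n",
                    refcount, cluster_index);
        }
    }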

I think it could make sense to limit refcounts to snapshot count + 1 in
the spec -- the only issue I take with that is that I'm not sure "better
corruption detection" is a good reason to do so.

Also, while it makes sense to make the change to the spec *now*, it may be
difficult to revert in the future if needed. Let's say someone comes up
with a valid reason for wanting to use refcounts for something other than
internal snapshots. Then we could simply allow arbitrary refcounts
again -- but qemus from before that change would report such new images
as corrupted all the time.

We could circumvent this by adding a compatible has-arbitrary-refcounts
flag at the same time. If that flag is set, refcounts may be arbitrary;
otherwise, they are limited by the snapshot count.
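
Purely as a hypothetical sketch (the flag name and bit position are
invented here; nothing like this exists in the spec), the gating could
be as simple as:

    #include <stdbool.h>
    #include <stdint.h>

    /* Invented compatible feature bit, for illustration only. */
    #define QCOW2_COMPAT_ARBITRARY_REFCOUNTS (1ULL << 1)

    /* The snapshot-count bound on refcounts would only be enforced for
     * images that do not announce arbitrary refcounts. */
    static bool refcount_bound_applies(uint64_t compatible_features)
    {
        return !(compatible_features & QCOW2_COMPAT_ARBITRARY_REFCOUNTS);
    }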

But again, that seems a bit much just so we can detect refcount
overwrites...

>> And of course another approach I already mentioned would be to scrap
>> the overlap checks altogether once we have image locking (and I guess
>> we can keep them around in their current form at least until then).
> 
> I think the overlap checks are fine; at least in my tests I only found
> problems with one of them, and only in some scenarios (*). So if we
> cannot optimize them easily, I'd simply tell the user about the risks
> and suggest disabling them. Maybe the only thing that we need is simply
> good documentation.

Well, overlap-check=none should be simple enough.
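
For reference, the run-time switch already exists; something like the
following should do (the file name is just an example):

    -drive file=test.qcow2,format=qcow2,overlap-check=none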

>                     What are the chances of corrupted qcow2 images that
> are not caused by the user messing up? Do we know how many such cases
> there are?

I personally don't know of any.

> I think the most obvious candidate for optimization is refcount-block,
> and as I said it's the check that would create the bottleneck in the
> most common scenarios. The optimization is simple: if the size of the
> qcow2 image is 7 GB, then you only need to check the first 4 entries in
> the refcount table.
> 
> I can think of two problems with this, which is why I haven't sent a
> patch yet:
> 
> (1) This needs the current size of the image every time we want to
>     perform that check, and that means I/O.

Could be cached (every time some refcount is updated, you update a
max_refcounted_cluster variable in the qcow2 state), but that means code.
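
For what it's worth, the arithmetic behind the "7 GB -> 4 entries"
example works out as follows (a stand-alone sketch assuming the
defaults of 64 KiB clusters and 16-bit refcount entries; not actual
driver code):

    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        uint64_t cluster_size  = 64 * 1024;   /* 64 KiB (default)   */
        uint64_t refcount_bits = 16;          /* refcount_order = 4 */

        /* Each refcount table entry points to one refcount block, which
         * covers (cluster_size * 8 / refcount_bits) host clusters. */
        uint64_t clusters_per_entry = cluster_size * 8 / refcount_bits;
        uint64_t bytes_per_entry = clusters_per_entry * cluster_size;

        uint64_t image_size = 7ULL << 30;     /* 7 GiB image file */
        uint64_t entries_to_check =
            (image_size + bytes_per_entry - 1) / bytes_per_entry;

        /* Prints 4: only the first four entries need to be checked. */
        printf("%" PRIu64 "\n", entries_to_check);
        return 0;
    }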

> (2) The unused entries that we're skipping in the refcount table should
>     be 0, but what if they're not? That would be a sign of data
>     corruption. But should we bother?

No, because we have overlap checks exactly so this doesn't happen.

>                                       Those entries will be checked
>     before they're used if the image grows large enough.
> 
> (*) I actually noticed (I'm talking about a qcow2 image stored in RAM
> now) that disabling the refcount-block check dramatically increases the
> number of IOPS (+90%) when using virtio-blk, but doesn't seem to have
> any effect (my tests even show a slightly negative effect!!) when using
> virtio-scsi. Does that make sense? Am I hitting a SCSI limit, or what
> would be the reason for this?

That's for someone else to answer. O:-)

Max
