qemu-devel


From: Max Reitz
Subject: Re: [Qemu-devel] Can I only commit from active image to corresponding range of its backing file by qemu cmd?
Date: Thu, 13 Sep 2018 22:44:09 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.0

On 13.09.18 22:01, Eric Blake wrote:
> On 9/13/18 1:37 PM, Max Reitz wrote:
>> On 13.09.18 19:05, Eric Blake wrote:

[...]

>>> $ qemu-io -c 'discard 0 1m' --image-opts driver=qcow2,backing=,file.driver=file,file.filename=img.003
>>> warning: Use of "backing": "" is deprecated; use "backing": null instead
>>> discard 1048576/1048576 bytes at offset 0
>>> 1 MiB, 1 ops; 0.0002 sec (4.399 GiB/sec and 4504.5045 ops/sec)
>>>
>>> doesn't work, as 'discard' causes img.003 to now make things read as
>>> zero rather than deferring to the backing chain,
>>
>> Which is intentional because making data re-appear from the backing
>> chain can be a security issue, as far as I remember.
> 
> It can be a potential issue if there is a backing file (exposing data
> that you thought was wiped is not fun).  But where there is NO backing
> file, it's overly cautious, and gets in our way (we read all zeros from
> a file with no backing, whether the cluster is marked as 0 or as
> defer-to-backing).  I'm okay if we still keep the overly cautious way by
> default, but having a knob to say "discard this, and I really do mean
> discard rather than read back as 0" would be useful in qemu (after all,
> that's what fallocate(FALLOC_FL_NO_HIDE_STALE) has recently been used
> for in the kernel, as the knob for whether discarding on a block device
> must read back as zero or may go faster [2]).
> 
> [2] https://lore.kernel.org/patchwork/patch/953421/

Maybe, but I don't see how this would improve anything for qcow2 v3.
Fully unmapping a cluster or making it a zero cluster is basically the
same.  Why would we make qcow2 present effectively random data, when we
can easily make it well-defined?

(It may make a difference for raw images, but this discussion is mainly
about qcow2 and how you could abuse such a feature for making backing
file content reappear. :-))
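
To make the difference between a zero cluster and a fully unmapped cluster concrete, here is a rough toy model in Python.  It is only a sketch: the Image class, the State names and the unmap() method are made up for illustration (unmap() stands in for the hypothetical "really discard" knob, which qemu does not have), and the model ignores everything about the real qcow2 on-disk format.

```python
from enum import Enum
from typing import Dict, Optional, Tuple

ZEROES = "<zeroes>"  # placeholder for a cluster that reads back as all zeroes


class State(Enum):
    UNALLOCATED = 0  # defer to the backing file, if there is one
    ZERO = 1         # reads as zeroes and hides the backing file
    DATA = 2         # cluster holds its own payload


class Image:
    """Toy stand-in for one layer of a backing chain (hypothetical)."""

    def __init__(self, backing: Optional["Image"] = None) -> None:
        self.backing = backing
        self.clusters: Dict[int, Tuple[State, str]] = {}

    def write(self, idx: int, payload: str) -> None:
        self.clusters[idx] = (State.DATA, payload)

    def discard(self, idx: int) -> None:
        # Models the behaviour discussed above: the cluster becomes a zero cluster.
        self.clusters[idx] = (State.ZERO, ZEROES)

    def unmap(self, idx: int) -> None:
        # Hypothetical knob: drop the mapping so reads fall through again.
        self.clusters.pop(idx, None)

    def read(self, idx: int) -> str:
        state, payload = self.clusters.get(idx, (State.UNALLOCATED, ZEROES))
        if state is not State.UNALLOCATED:
            return payload
        if self.backing is not None:
            return self.backing.read(idx)
        return ZEROES  # no backing file: unallocated reads as zeroes


# With a backing file, the two operations differ.
base = Image()
base.write(0, "old data")
top = Image(backing=base)
top.write(0, "new data")
top.discard(0)
after_discard = top.read(0)
top.unmap(0)
after_unmap = top.read(0)

# Without a backing file, they are indistinguishable to the reader.
lone = Image()
lone.write(0, "some data")
lone.discard(0)
lone_after_discard = lone.read(0)
lone.unmap(0)
lone_after_unmap = lone.read(0)

print(after_discard, after_unmap, lone_after_discard, lone_after_unmap)
```

In the model, the discarded cluster of top reads as zeroes, while the unmapped one exposes base's old data again, which is exactly the reappearing-content case.  For the image without a backing file, both variants read as zeroes.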

I just realized I myself have a need to punch such holes, though.  Deep
on my todo list there's this point of making active commit punch holes
in the overlay, because currently, it writes data twice: Once to the
overlay, once to the backing file (like every mirror).  But if for the
respective cluster the backing file is visible from the overlay, we
could simply punch a hole in it and could skip writing the data there.
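
As a sketch of the decision I have in mind (my reading of "visible", not anything that exists in qemu today): the base is visible from the overlay for a cluster if no image between the two has that cluster allocated.  The function names and the representation of the chain as sets of allocated cluster indices are assumptions for illustration only.

```python
from typing import List, Set


def base_visible_from_overlay(chain: List[Set[int]], base_index: int,
                              cluster: int) -> bool:
    """chain[0] is the active overlay, chain[base_index] the commit target.

    Each entry is the set of cluster indices allocated in that image.
    The base shows through once the overlay's cluster is unmapped if no
    intermediate image allocates the cluster.
    """
    return all(cluster not in layer for layer in chain[1:base_index])


def plan_commit_write(chain: List[Set[int]], base_index: int,
                      cluster: int) -> List[str]:
    """Return the (hypothetical) operations for one guest write during commit."""
    if base_visible_from_overlay(chain, base_index, cluster):
        # Data only needs to land in the base; the overlay can defer to it.
        return ["write base", "punch hole in overlay"]
    # An intermediate image would shadow the base, so keep writing twice.
    return ["write base", "write overlay"]


# overlay, one intermediate image, base
chain = [{0, 1}, {1}, {0}]
plan_cluster0 = plan_commit_write(chain, 2, 0)
plan_cluster1 = plan_commit_write(chain, 2, 1)
print(plan_cluster0)
print(plan_cluster1)
```

When committing into the immediate backing file there is no intermediate image, so in this model every cluster qualifies for the hole punch.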

[...]

>> Basically, there is only one way to reliably make an image pass through
>> data from its backing files again.  Well, two, actually.  One is
>> qemu-img commit, which (for compatibility, mainly) makes the image empty
>> after the commit.
> 
> And only if you did NOT use the -b option (in other words, it only
> empties the file if you are committing to the immediate backing file,
> not deep in the chain).

Yep, because all images between base and top will possibly become
garbage due to that operation.  So if we emptied top, it'd become
garbage, too.  Which is why we don't empty it, so it stays valid.

And technically, also only if you did not use the -d option, because
that skips the emptying.  Which is useful if you're just going to delete
the image anyway (as in the example I gave here).
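
Put as a tiny truth table (a sketch of the rule as described in this thread; the function name is made up and this is not qemu-img code):

```python
def top_emptied_after_commit(base_given: bool, drop_given: bool) -> bool:
    """True if qemu-img commit empties the top image afterwards.

    base_given: -b was used (commit into something other than the
                immediate backing file)
    drop_given: -d was used (skip emptying explicitly)
    """
    return not base_given and not drop_given


# Enumerate all four combinations of the two options.
table = {
    (b, d): top_emptied_after_commit(b, d)
    for b in (False, True)
    for d in (False, True)
}
for (b, d), emptied in table.items():
    print(f"-b={b!s:5} -d={d!s:5} -> emptied={emptied}")
```

Only the plain invocation, with neither option, leaves the top image empty.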

>>  The other is just throwing the image away and
>> re-creating it from scratch.
> 
> Well yeah, there's that. But now you have a transient problem of extra
> pressure on your storage, while you have duplicated blocks between old
> and new images, prior to being able to remove the old image.  If the
> goal is to make img.000 not grow during the commit, I was assuming that
> we are already storage-constrained, and any solution that does in-place
> modification is therefore better than one that has to create yet another
> copy of data, even if the end result is the same once all operations
> have finished.

What if you use qemu-img create -n to overwrite it?

(But it's all just academic anyway.  What you'd want is a way to discard
parts of an image, and we just don't have that.)

[...]

>>
>> Now let's set the backing files.  img.003.commit.000 has only data that
>> goes into img.000, so that goes there, and img.003.nocommit is going to
>> replace our old img.003, so that goes where that was:
>>
>> $ qemu-img rebase -u -b img.000 img.003.commit.000
>> $ qemu-img rebase -u -b img.002 img.003.nocommit
>>
>> And now let's commit:
>>
>> $ qemu-img commit img.003.commit.000
>>
>> And let's clean up:
>>
>> $ rm img.003.commit.000
>> $ mv img.003.nocommit img.003
>>
>> Done.
> 
> Done, but with temporary storage usage higher than doing it in place.

Yes, that's true.
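
For reference, the quoted steps can be collected into a small script.  This is only a sketch: it assumes img.003.commit.000 and img.003.nocommit have already been created as described earlier in the thread (that part is not repeated here), and the run_steps() helper with its dry-run default is my own addition.

```python
import shlex
import subprocess
from typing import List

# The commands from the quoted walkthrough, in order.
STEPS: List[List[str]] = [
    # Point the two temporary images at their new backing files.
    ["qemu-img", "rebase", "-u", "-b", "img.000", "img.003.commit.000"],
    ["qemu-img", "rebase", "-u", "-b", "img.002", "img.003.nocommit"],
    # Commit the img.000 part down into img.000.
    ["qemu-img", "commit", "img.003.commit.000"],
    # Clean up and put the remainder in place of the old img.003.
    ["rm", "img.003.commit.000"],
    ["mv", "img.003.nocommit", "img.003"],
]


def run_steps(steps: List[List[str]], dry_run: bool = True) -> List[str]:
    """Return the command lines; execute them only if dry_run is False."""
    lines = []
    for argv in steps:
        lines.append(shlex.join(argv))
        if not dry_run:
            # Stop at the first failing command.
            subprocess.run(argv, check=True)
    return lines


planned = run_steps(STEPS)  # dry run: nothing is executed
for line in planned:
    print(line)
```

Calling run_steps(STEPS, dry_run=False) would actually run the commands, so only do that on images you can afford to lose.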

>> (If you want to commit all three parts of img.003 into the three
>> different base images, you would create img.003.commit.001 and
>> img.003.commit.002 similarly as above, and then commit those into the
>> respective base images.  Then you'd just rm img.003* and you're back to
>> the original state.)
> 
> Your solution of qemu-img convert to concatenate null-co with an offset
> of img.003 is nice.

I'm not sure whether I'd call it "nice".  "Interesting" probably, yes.

But it is rather obscure; probably nobody outside of the qemu-img
developers knows that you can do something like that.  Also, it's only
an offline solution that doesn't readily translate into an online one.

Maybe you could mirror img.003 (filtered) to img.003.nocommit, then
complete the mirror, so the latter replaces the former, and then mirror
the to-be-committed part of img.003 (which is no longer in use) to
img.003.commit.000?  And then...  Well, what exactly.  The right thing
would probably be to attach img.003.commit.000 as an overlay of img.000
(currently requires a blockdev-del and blockdev-add with backing=img.000
(or backing=null and then blockdev-snapshot, but why)).  And then you'd
commit it down, if blockers allow it.

In that time, img.003.nocommit could have received new data in the
img.000 area, though, but that's probably OK.

Max
