qemu-devel

Re: qcow2 preallocation and backing files


From: Vladimir Sementsov-Ogievskiy
Subject: Re: qcow2 preallocation and backing files
Date: Wed, 20 Nov 2019 12:27:53 +0000

20.11.2019 15:06, Alberto Garcia wrote:
> Hi,
> 
> as we discussed yesterday on IRC there's an inconsistency in the way
> qcow2 preallocation works.
> 
> Let's create an image and fill it with data:
> 
>     $ qemu-img create -f raw base.img 1M
>     $ qemu-io -f raw -c 'write -P 0xFF 0 1M' base.img
> 
> Now QEMU won't let us create a new image backed by base.img using
> preallocation:
> 
>     $ qemu-img create -f qcow2 -b base.img -o preallocation=metadata active.img
>     qemu-img: active.img: Backing file and preallocation cannot be used at the same time
> 
> The reason is that once a cluster is preallocated (i.e. it has a valid
> L2 entry pointing to a host offset) the guest won't see the contents
> of the backing file, so those options conflict with each other.
> 
> It is possible however to create an image that is smaller than
> the backing file and then resize it using preallocation. In this
> case qemu-img will happily accept any --preallocation option, with
> different results from the guest's point of view:
> 
>     # This reads as 0xFF (the data comes from base.img)
>     $ qemu-img create -f qcow2 -b base.img active.img 512K
> 
>     # The second half of the image also reads as 0xFF
>     $ qemu-img resize --preallocation=off active.img 1M
> 
>     # Here the second half reads as zeroes
>     $ qemu-img resize --preallocation=metadata active.img 1M
> 
> Apart from "qemu-img resize", the QMP block-resize command can also
> extend an image like this, although it always uses PREALLOC_MODE_OFF
> and the user cannot change that.
> 
> It does not seem right that the guest-visible data changes depending
> on the preallocation mode. This could be solved by returning an error
> when (backing_bs(blk_bs(blk)) && prealloc != PREALLOC_MODE_OFF) on
> img_resize().
> 
> The important question is however: what behavior is the right one?
> Should growing an image that was smaller than the backing file return
> zeroes, or data from the backing file? I would opt for the latter, for
> simplicity and consistency with the current behavior of block-resize,
> although it was pointed out that this could be a security problem (I'm
> not sure that I agree with that, but we can discuss it).

I'm for the zeroes way.

1. I'm sure that if the guest, after some operation, may get access to data
that it should not see, that is a security problem.

2. Seeing the backing file through new clusters is inconsistent with how reads
work: a read will return zeroes, not data from the backing file. Consider the
following example:

      0         x     y
top: [---------------]
mid: [---------]
base:[111111111111111]

Reading [x,y] from top will return zeroes, not ones, because mid ends at x.

So, if we consider data after EOF as zeroes (not UNALLOCATED clusters), we
should not make these clusters UNALLOCATED after truncation.
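
The read rule in the example above can be sketched as a toy Python model.
This is only an illustration of the semantics described here, not QEMU code:
the names Layer and read_cluster are made up, and each layer is simplified to
a list of clusters where None stands for UNALLOCATED.

```python
from typing import List, Optional

# Toy model: a layer is a list of clusters. None means UNALLOCATED
# (defer to the next layer down); an int is data stored in this layer.
Layer = List[Optional[int]]


def read_cluster(chain: List[Layer], index: int) -> int:
    """Read one cluster through a backing chain, top layer first."""
    for layer in chain:
        if index >= len(layer):
            # Past this layer's EOF: treated as zeroes, and the layers
            # below are not consulted.
            return 0
        if layer[index] is not None:
            return layer[index]
    # Unallocated in every layer.
    return 0


# The chain from the diagram: top and base are 15 clusters, mid is 9.
top: Layer = [None] * 15
mid: Layer = [None] * 9
base: Layer = [1] * 15
chain = [top, mid, base]

inside_mid = read_cluster(chain, 4)    # falls through top and mid to base
beyond_mid = read_cluster(chain, 12)   # stops at mid's EOF
```

In this model inside_mid is 1 (data from base) while beyond_mid is 0, even
though base holds ones at that offset.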

3. Also, the latter way is inconsistent with discard. Discarded regions return
zeroes, not clusters from the backing file. I think discard and truncate should
behave in the same safe, zero-returning way.
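
To make the two candidate behaviors for growing an image concrete, here is a
small Python sketch. It is a simplified model under my own assumptions, not
the qcow2 driver: grow, discard and guest_read are hypothetical helpers, the
image is a list of clusters, and UNALLOCATED means "read from the backing
file".

```python
from typing import List, Optional

UNALLOCATED = None  # toy marker: the read falls through to the backing file

Image = List[Optional[int]]


def grow(image: Image, new_len: int, zero_fill: bool) -> Image:
    """Return a copy of image extended to new_len clusters."""
    filler = 0 if zero_fill else UNALLOCATED
    return image + [filler] * (new_len - len(image))


def discard(image: Image, index: int) -> Image:
    """Model discard as described above: the cluster reads as zeroes."""
    out = list(image)
    out[index] = 0
    return out


def guest_read(image: Image, backing: List[int], index: int) -> int:
    """What the guest sees at one cluster of image backed by backing."""
    value = image[index]
    return backing[index] if value is UNALLOCATED else value


backing = [0xFF] * 4           # backing file full of 0xFF
active = [UNALLOCATED] * 2     # active image, half the size of the backing file

leaky = grow(active, 4, zero_fill=False)  # new clusters left UNALLOCATED
safe = grow(active, 4, zero_fill=True)    # new clusters read as zeroes
```

With zero_fill=False the guest reads 0xFF from the new area, i.e. backing data
it could not see before the resize. With zero_fill=True the new area reads as
zeroes, which matches what discard(safe, 0) gives for a discarded cluster.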

> 
> This also has a consequence on how preallocation should be implemented
> for images with subclusters. Extended L2 entries allow us to allocate
> a cluster but leave each one of its subclusters unallocated. That
> would allow us to have a cluster that is simultaneously allocated but
> whose data is read from the backing file. But it's up to us to decide
> if that's what we should do when resizing an image.
> 
> Berto
> 


-- 
Best regards,
Vladimir
