qemu-devel

Re: [Qemu-devel] storing machine data in qcow images?


From: Max Reitz
Subject: Re: [Qemu-devel] storing machine data in qcow images?
Date: Wed, 6 Jun 2018 15:14:03 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.8.0

On 2018-06-06 14:13, Michal Suchánek wrote:
> On Wed, 6 Jun 2018 13:52:35 +0200
> Max Reitz <address@hidden> wrote:
> 
>> On 2018-06-06 13:43, Michal Suchánek wrote:
>>> On Wed, 6 Jun 2018 13:32:47 +0200
>>> Max Reitz <address@hidden> wrote:
>>>   
>>>> On 2018-06-06 13:19, Michal Suchánek wrote:  
>>>>> On Wed, 6 Jun 2018 13:02:53 +0200
>>>>> Max Reitz <address@hidden> wrote:

[...]

>>>>>> What I'm trying to get at is that qcow2 was not designed to be a
>>>>>> container format for arbitrary files.  If you want to make it
>>>>>> such, I'm sure there are existing formats that work better.    
>>>>>
>>>>> Such as?    
>>>>
>>>> ext2?  
>>>
>>> So you want an ext2 driver in qemu instead of expanding qcow2 to
>>> work not only for a single disk but also for an appliance?  
>>
>> Yes, because ext2 was designed to be a proper filesystem.  I'm not an
>> FS designer.  Well, not a good one anyway.  So I don't trust myself on
>> extending qcow2 to be a good FS -- and why would I, when there are
>> already numerous FS around.
> 
> Do you expect that performance of qemu using qcow2 driver over ext2
> driver will be better than using qcow driver directly with some part
> semi-permanently occupied by a configuration blob? My bet is not.

If you want to store multiple disk images in a single file?  I would
think so, yes.  With qcow2, I would assume it leads to fragmentation.  I
would hope that proper filesystems can mitigate this.

> The ext* drivers are designed to work with kernel VM infrastructure
> which must be tuned for different usage scenarios and you would have to
> duplicate that tuning in qemu to get competitive performance. Also you
> get qcow2 and ext2 metadata which must be allocated, managed, etc. You
> get more storage and performance overhead for no good reason.

Yes, there is a good reason.  You can add arbitrary configuration
options without having to worry about me.

Seriously, though, a real FS would allow you to be more expressive and
really do what you want without having to work around the quirks that
adding a not-real-FS in the most simple way possible to qcow2 would
bring with it.

Because this is part of my fear, that we now add a very simple blob for
just a sprinkle of data.  But over time it gets more and more complex
because we want to store more and more data to make things ever more
convenient[1], we notice that we need more features, the format gets
more complex, and in the end we have an FS that is just worse than a
real FS.

[1] And note that if I'm convinced to store VM configuration data in
qemu, I will agree that we can store any data in there and it would be
nice if any VM could be provisioned and used that way.

> On the other hand, qcow is designed for storing VM disk data and
> hopefully was tuned to do that decently over the years. The primary use
> case remains storing VM disk data. Adding a configuration blob does not
> change that.

True.  So the argument is that qcow2 may be worse for storing arbitrary
data, but we don't have performance requirements for that; but we do
have performance requirements for disk data and adding another format
below qcow2 will not make it better.

I do think it is possible to put a format under qcow2 without making
things worse, but that may require additional complexity that you think
is pointless.

I understand that you think that, but I still believe that putting the
configuration into qcow2 is just the wrong way around and will come back
to bite us in the long run.

>>>>>>>> Unless I have got something terribly wrong (which is indeed a
>>>>>>>> possibility!), to me this proposal means basically to turn
>>>>>>>> qcow2 into (1) a VM description format for qemu, and (2) to
>>>>>>>> turn it into an archive format on the way.      
>>>>>>>
>>>>>>> And if you go all the way you can store multiple disks along
>>>>>>> with the VM definition so you can have the whole appliance in
>>>>>>> one file. It conveniently solves the problem of synchronizing
>>>>>>> snapshots across multiple disk images and the question where to
>>>>>>> store the machine state if you want to suspend it.       
>>>>>>
>>>>>> Yeah, but why make qcow2 that format?  That's what I completely
>>>>>> fail to understand.
>>>>>>
>>>>>> If you want to have a single VM description file that contains
>>>>>> the VM configuration and some qcow2/raw/whatever files along
>>>>>> with it for the guest disk data, sure, go ahead.  But why does
>>>>>> the format of the whole thing need to be qcow2?    
>>>>>
>>>>> Because then qemu can access the disk data from the image directly
>>>>> without any need for extraction, copying to different file,
>>>>> etc.    
>>>>
>>>> This does not explain why it needs to be qcow2.  There is
>>>> absolutely no reason why you couldn't use qcow2 files in-place
>>>> inside of another file.  
>>>
>>> qemu cannot read the disk data from the file in-place.  
>>
>> Huh?  Why not?
> 
> Well, it can possibly read the image if it happens to be contiguous. It
> will not be able to update it without a fs driver, however.

Yes, but first, such an FS driver would be possible (and as long as we
don't need real complexity, it could be very simple, e.g. just using an
offset into a tar file and adjusting the file length field in the tar
header on allocations beyond the EOF).
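
A minimal sketch of the tar-offset idea, in Python for brevity rather than
as QEMU block-driver code.  It assumes a plain ustar archive and that the
image is the last member, so growing it cannot overwrite another file.  The
helper names (locate, grow, write_at) are made up for illustration; the
header offsets are those of the ustar format.

```python
import io
import tarfile

BLOCK = 512                      # tar block size
SIZE_OFF, SIZE_LEN = 124, 12     # ustar "size" field (octal ASCII)
CKSUM_OFF, CKSUM_LEN = 148, 8    # ustar "chksum" field


def locate(f, name):
    """Return (header_offset, data_offset, size, is_last) for a member."""
    f.seek(0)
    with tarfile.open(fileobj=f, mode="r:") as tar:
        info = tar.getmember(name)
        is_last = info is tar.getmembers()[-1]
        return info.offset, info.offset_data, info.size, is_last


def grow(f, header_off, data_off, new_size):
    """Patch the size field in the header and rewrite the end marker."""
    if new_size >= 8 ** (SIZE_LEN - 1):
        raise ValueError("size does not fit the octal size field")
    f.seek(header_off)
    header = bytearray(f.read(BLOCK))
    header[SIZE_OFF:SIZE_OFF + SIZE_LEN] = b"%011o\0" % new_size
    # The checksum is computed with the checksum field set to spaces.
    header[CKSUM_OFF:CKSUM_OFF + CKSUM_LEN] = b" " * CKSUM_LEN
    header[CKSUM_OFF:CKSUM_OFF + CKSUM_LEN] = b"%06o\0 " % sum(header)
    f.seek(header_off)
    f.write(header)
    # Data is padded to a block boundary, followed by two zero blocks.
    padded = -(-new_size // BLOCK) * BLOCK
    f.seek(data_off + padded)
    f.write(bytes(2 * BLOCK))


def write_at(f, name, pos, data):
    """Write data at offset pos inside the member, growing it if needed."""
    header_off, data_off, size, is_last = locate(f, name)
    end = pos + len(data)
    if end > size and not is_last:
        raise ValueError("only the last member can grow in place")
    f.seek(data_off + pos)
    f.write(data)
    if end > size:
        grow(f, header_off, data_off, end)
```

Reads and in-bounds writes only need data_offset; the header is touched
only when a write extends the image past its current end.  Anything more
than that (growing a member that is not last, deleting, several growing
images) is where this stops being simple.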

And secondly, I think adding another format has the advantage of easier
deprecation.  If we think we need something more complex, we are free to
design that and throw away the old format.  But if we add something to
qcow2, I would think it is there to stay.

So, yes, for qcow2 we might want to design something (overly?) complex
from the start that we hope will fulfill all our needs (which it won't,
because things never turn out that way).  But if we'd add a new format,
we could keep it simple in the beginning and start over later.

Max
