qemu-discuss

Re: Exporting qcow2 images as raw data from ova file with qemu-nbd


From: Nir Soffer
Subject: Re: Exporting qcow2 images as raw data from ova file with qemu-nbd
Date: Mon, 29 Jun 2020 16:08:28 +0300

On Mon, Jun 29, 2020 at 3:06 PM Kevin Wolf <kwolf@redhat.com> wrote:
>
> Am 26.06.2020 um 21:42 hat Nir Soffer geschrieben:
> > On Tue, Jun 23, 2020 at 1:21 AM Nir Soffer <nsoffer@redhat.com> wrote:
> > >
> > > I'm trying to export qcow2 images from ova format using qemu-nbd.
> > >
> > > I create 2 compressed qcow2 images, with different data:
> > >
> > > $ qemu-img info disk1.qcow2
> > > image: disk1.qcow2
> > > file format: qcow2
> > > virtual size: 200 MiB (209715200 bytes)
> > > disk size: 384 KiB
> > > ...
> > >
> > > $ qemu-img info disk2.qcow2
> > > image: disk2.qcow2
> > > file format: qcow2
> > > virtual size: 200 MiB (209715200 bytes)
> > > disk size: 384 KiB
> > > ...
> > >
> > > And packed them in a tar file. This is not a valid ova but good enough
> > > for this test:
> > >
> > > $ tar tvf vm.ova
> > > -rw-r--r-- nsoffer/nsoffer 454144 2020-06-22 21:34 disk1.qcow2
> > > -rw-r--r-- nsoffer/nsoffer 454144 2020-06-22 21:34 disk2.qcow2
> > >
> > > To get info about the disks in the ova file, we can use:
> > >
> > > $ python -c 'import tarfile; print(list({"name": m.name, "offset":
> > > m.offset_data, "size": m.size} for m in tarfile.open("vm.ova")))'
> > > [{'name': 'disk1.qcow2', 'offset': 512, 'size': 454144}, {'name':
> > > 'disk2.qcow2', 'offset': 455168, 'size': 454144}]
> > >
> > > First I tried the obvious:
> > >
> > > $ qemu-nbd --persistent --socket=/tmp/nbd.sock --read-only --offset=512 
> > > vm.ova
> > >
> > > And it works, but it exposes the qcow2 data. I want the raw data so I
> > > can upload the guest data to oVirt, where it may be converted to
> > > qcow2 format.
> > >
> > > $ qemu-img info --output json "nbd+unix://?socket=/tmp/nbd.sock"
> > > {
> > >     "virtual-size": 209715200,
> > >     "filename": "nbd+unix://?socket=/tmp/nbd.sock",
> > >     "format": "qcow2",
> > >  ...
> > > }
> > >
> > > Looking in the qemu manual and qapi/block-core.json, I was able to
> > > construct this command:
> > >
> > > $ qemu-nbd --persistent --socket=/tmp/nbd.sock --read-only
> > > 'json:{"driver": "qcow2", "file": {"driver": "raw", "offset": 512,
> > > "size": 454144, "file": {"driver": "file", "filename": "vm.ova"}}}'
> > >
> > > And it works:
> > >
> > > $ qemu-img info --output json "nbd+unix://?socket=/tmp/nbd.sock"
> > > {
> > >     "virtual-size": 209715200,
> > >     "filename": "nbd+unix://?socket=/tmp/nbd.sock",
> > >     "format": "raw"
> > > }
> > >
> > > $ qemu-img map --output json "nbd+unix://?socket=/tmp/nbd.sock"
> > > [{ "start": 0, "length": 104857600, "depth": 0, "zero": false, "data":
> > > true, "offset": 0},
> > > { "start": 104857600, "length": 104857600, "depth": 0, "zero": true,
> > > "data": false, "offset": 104857600}]
> > >
> > > $ qemu-img map --output json disk1.qcow2
> > > [{ "start": 0, "length": 104857600, "depth": 0, "zero": false, "data": 
> > > true},
> > > { "start": 104857600, "length": 104857600, "depth": 0, "zero": true,
> > > "data": false}]
> > >
> > > $ qemu-img convert -f raw -O raw nbd+unix://?socket=/tmp/nbd.sock 
> > > disk1.raw
> > >
> > > $ qemu-img info disk1.raw
> > > image: disk1.raw
> > > file format: raw
> > > virtual size: 200 MiB (209715200 bytes)
> > > disk size: 100 MiB
> > >
> > > $ qemu-img compare disk1.raw disk1.qcow2
> > > Images are identical.
> > >
> > > I wonder if this is the best way to stack a qcow2 driver on top of a
> > > raw driver exposing a range from a tar file.
>
> Yes, if you want to specify an offset and a size to access only part of
> a file as the disk image, sticking a raw driver in the middle is the way
> to go.
>
> > Other related challenges with this are:
> >
> > 1. probing image format
> >
> > With standalone images, we probe image format using:
> >
> >     qemu-img info image
> >
> > I know probing is considered dangerous, but I think this is ok when a
> > user runs this code on their own machine, on an image they want to
> > upload to oVirt. On a hypervisor we use prlimit to limit the resources
> > used by qemu-img, so we can use the same solution when it is run by a
> > user, if needed.
> >
> > However not being able to probe image format is a usability issue. It
> > does not make sense that qemu-img cannot probe image format safely, at
> > least for qcow2 format.
> >
> > I can get image info using:
> >
> > $ qemu-img info 'json:{"driver": "qcow2", "file": {"driver": "raw",
> > "offset": 1536, "file": {"driver": "file", "filename":
> > "fedora-32.ova"}}}'
> > image: json:{"driver": "qcow2", "file": {"offset": 1536, "driver":
> > "raw", "file": {"driver": "file", "filename": "fedora-32.ova"}}}
> > file format: qcow2
> > virtual size: 6 GiB (6442450944 bytes)
> > disk size: 645 MiB
> > cluster_size: 65536
> > Format specific information:
> >     compat: 1.1
> >     lazy refcounts: false
> >     refcount bits: 16
> >     corrupt: false
> >
> > But there is no way to probe the format, unless I try first with
> > qcow2, and consider the image as raw otherwise.
>
> Just leave out the top-level "driver" option. This isn't -blockdev
> (which does indeed require a "driver"), but uses the same logic as
> -drive and therefore supports format probing:
>
> $ ./qemu-img info 
> 'json:{"file":{"driver":"raw","offset":512,"size":2424832,"file":{"filename":"/tmp/test.ova"}}}'
> image: json:{"driver": "qcow2", "file": {"offset": 512, "driver": "raw", 
> "size": 2424832, "file": {"driver": "file", "filename": "/tmp/test.ova"}}}

Nice!

> file format: qcow2
> virtual size: 64 MiB (67108864 bytes)
> disk size: 2.32 MiB
> cluster_size: 65536
> Format specific information:
>     compat: 1.1
>     compression type: zlib
>     lazy refcounts: false
>     refcount bits: 16
>     corrupt: false
>
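For reference, building such a json: filename from the tar metadata can be
sketched in Python as below. This is only a sketch: the helper name
tar_member_filename and its fmt parameter are made up for illustration and
are not part of qemu. It uses tarfile's offset_data and size, as in the
one-liner earlier in the thread, and leaves out the top-level "driver" unless
a format is given, so that qemu-img probes the format.

```python
import json
import tarfile


def tar_member_filename(tar_path, member_name, fmt=None):
    """Build a qemu "json:" filename exposing one member of a tar file.

    Hypothetical helper, not a qemu API. If fmt is None the top-level
    "driver" is omitted, so qemu-img probes the image format.
    """
    # Mode "r:" rejects compressed archives; offsets inside a compressed
    # tar would not be offsets into the file on disk.
    with tarfile.open(tar_path, "r:") as tar:
        member = tar.getmember(member_name)  # raises KeyError if missing

    node = {
        "file": {
            "driver": "raw",
            "offset": member.offset_data,
            "size": member.size,
            "file": {"driver": "file", "filename": tar_path},
        }
    }
    if fmt is not None:
        node["driver"] = fmt
    return "json:" + json.dumps(node)
```

The result can be passed as the image argument to qemu-img info, qemu-img
measure or qemu-nbd, for example tar_member_filename("vm.ova", "disk1.qcow2").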
> > We can parse the qcow2 header manually, as we already do in oVirt
> > engine UI in javascript:
> > https://github.com/oVirt/ovirt-engine/blob/9d48ea6274fdd1bef3fc8e309f9161be3b540890/frontend/webadmin/modules/uicommonweb/src/main/java/org/ovirt/engine/ui/uicommonweb/models/storage/ImageInfoModel.java#L103
> >
> > We have used this code for 5 years and had no issues with it yet.
> >
> > In the worst case, if we fail to detect the format, or let the user
> > upload a qcow2 file that oVirt does not support, the upload will fail
> > at the end, in the verification step, when we check the uploaded image
> > using "qemu-img info". This is done using prlimit since we treat this
> > image as untrusted.
> >
> > I think it would be useful if the qemu project published libraries in
> > C/python/javascript supporting format probing for the qcow2 format.
> >
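A minimal sketch of such manual probing in Python, assuming only the fixed
start of the qcow2 header (magic, big-endian version at byte 4, big-endian
virtual size at byte 24). The function name probe_qcow2 and its return value
are made up for illustration; this is not the oVirt code and not a qemu
library. The offset argument allows probing an image stored inside a tar
file.

```python
import struct

QCOW2_MAGIC = b"QFI\xfb"


def probe_qcow2(path, offset=0):
    """Return basic qcow2 info read from the header, or None.

    Hypothetical helper: looks only at the first 32 bytes of the image
    starting at `offset`, and validates nothing beyond magic and version.
    """
    with open(path, "rb") as f:
        f.seek(offset)
        header = f.read(32)

    if len(header) < 32 or header[:4] != QCOW2_MAGIC:
        return None

    (version,) = struct.unpack_from(">I", header, 4)
    (virtual_size,) = struct.unpack_from(">Q", header, 24)
    if version not in (2, 3):
        return None

    return {"format": "qcow2", "version": version, "virtual-size": virtual_size}
```

Anything that returns None here would have to be treated as raw, which is the
same fallback as trying qcow2 first.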
> > 2. getting image virtual size
> >
> > So we can use qemu-img info with a custom json: filename, but this is
> > very complicated and error prone.
>
> How is this complicated and error prone? I would understand the
> reasoning for human use (maybe not really error prone, but the syntax is
> somewhat hard to remember), but isn't the context here use by a machine?

Yes, this is complicated for humans, meaning that someone needs to hide the
complexity from the user. For a machine the json syntax is great.

It would be even nicer if we could also get the block graph in qemu as json
for debugging and understanding how things are wired up under the hood.

> > 3. measuring image required size when converting to qcow2 image on block 
> > device
> >
> > This works if we know the image format:
> >
> > $ qemu-img measure -O qcow2 'json:{"driver": "qcow2", "file":
> > {"driver": "raw", "offset": 1536, "file": {"driver": "file",
> > "filename": "fedora-32.ova"}}}'
> > required size: 1381302272
> > fully allocated size: 6443696128
> >
> > But it is complicated.
> >
> > Can we have better support in qemu-img/qemu-nbd for accessing images
> > in a tar file?
> >
> > Maybe something like:
> >
> >     qemu-img info tar://vm.ova?member=fedora-32.qcow2
>
> The problem with such convenient shortcut URLs is that they always fail
> to cover more than the simplest cases. For example, how would you
> express that you want to use a file from a tar file accessed through NBD
> or HTTP?
>
> Of course, even if you have to revert to JSON (or the equivalent dotted
> key syntax) for these cases, you would still save the work to find out
> the right offsets yourself, so the idea does have some merit.

A tar driver can parse the tar file, find the requested file and use the right
offset and size.
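
To illustrate what that lookup involves, here is a rough Python sketch that
walks the 512-byte tar headers by hand. It is not a proposed implementation:
the name find_tar_member is made up, and it assumes plain headers only (name
in bytes 0-99, octal size in bytes 124-135). The ustar prefix field, GNU long
names, pax name overrides and base-256 sizes are not handled.

```python
BLOCK = 512


def find_tar_member(path, member_name):
    """Return (data_offset, size) for member_name, or None if not found.

    Hypothetical sketch of the lookup a tar driver would need; handles
    only short names and octal size fields.
    """
    with open(path, "rb") as f:
        pos = 0
        while True:
            f.seek(pos)
            header = f.read(BLOCK)
            # A short read or an all-zero block marks the end of the archive.
            if len(header) < BLOCK or header == bytes(BLOCK):
                return None

            name = header[0:100].split(b"\0", 1)[0].decode("utf-8", "replace")
            size_field = header[124:136].strip(b"\0 ").decode("ascii")
            size = int(size_field, 8) if size_field else 0

            if name == member_name:
                # Member data starts right after its header block.
                return pos + BLOCK, size

            # Skip the header and the data, padded to a full block.
            pos += BLOCK + (size + BLOCK - 1) // BLOCK * BLOCK
```

The returned pair is what the raw driver takes as "offset" and "size" today.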

So we can have:

    {"file": {"driver": "tar",
              "file-name": "disk1.qcow2",
              "file": {"driver": "curl",
                       "url": ...

> I wouldn't reject patches to add such a driver.
>
> > This can return information on the file named "fedora-32.qcow2" in the
> > tar file "vm.ova".
> >
> > image:  tar://vm.ova?member=fedora-32.qcow2
> > file format: qcow2
> > virtual size: 6 GiB (6442450944 bytes)
> > disk size: 645 MiB
> > cluster_size: 65536
> > Format specific information:
> >     compat: 1.1
> >     lazy refcounts: false
> >     refcount bits: 16
> >     corrupt: false
> >
> > $ qemu-img measure -O qcow2 tar://vm.ova?member=fedora-32.qcow2
> > required size: 1381302272
> > fully allocated size: 6443696128
> >
> > What if we had a tar driver that can be used like this:
> >
> > {"driver": "qcow2",
> >  "file": {"driver": "tar",
> >           "member": "fedora-32.qcow2",
> >           "file": {"driver": "file",
> >                     "filename": "vm.ova"}}}
> >
> > This driver can be implemented using a tar parser and a raw driver
> > using offset and size.
> >
> > So maybe we don't need a driver, but code in qemu-img parsing tar
> > format, and building the right graph using existing drivers.
>
> I don't think that qemu-img should have any file format code (which
> would then be missing from QEMU proper and the other tools). A block
> driver is the right approach.
>
> > Regardless of how we implement it, qemu-img will have basic support
> > for ova format, which sounds like a good thing, even if ova format is
> > horrible and non-standard. Users don't care about the details, only
> > about compatibility.
>
> Yes, indeed.
>
> Kevin
>



