Re: qemu-img convert vs writing another copy tool

qemu-devel

From:	Eric Blake
Subject:	Re: qemu-img convert vs writing another copy tool
Date:	Thu, 23 Jan 2020 13:21:28 -0600
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.4.1

On 1/23/20 12:35 PM, Richard W.M. Jones wrote:

I guess some people are aware that virt-v2v, which is a tool which
converts guests from VMware to run on KVM, and some other
OpenStack-OpenStack migration tools we have, use "qemu-img convert" to
copy the data around.

Historically we've had bugs here.  The most recent was discussed in
the thread on this list called "Bug? qemu-img convert to preallocated
image makes it sparse"
(https://www.mail-archive.com/address@hidden/msg60479.html)

We've been kicking around the idea of writing some alternate tool.  My
proposal would be a tool (not yet written, maybe it will never be
written) called nbdcp for copying between NBD servers and local files.
An outline manual page for this proposed tool is attached.

Some of the things which this tool might do which qemu-img convert
cannot do right now:

  - Hint that the target already contains zeroes.  It's almost always
    the case that we know this, but we cannot tell qemu.  This was the
    cause of a big performance regression last year.


This has just recently been proposed:
https://lists.gnu.org/archive/html/qemu-devel/2020-01/msg03617.html

I'm also working on a patch that I will post soon that extends the NBD protocol to advertise this information (it will help the situation where the destination is NBD, but as that requires a new enough server to advertise the information, having the feature as a command-line option allows the same speedup even without the server supporting the extension).


  - Declare that we want the target to be either sparse or
    preallocated.  qemu-img convert can sort of do this in a
    round-about way (create the target in advance and use the -n
    option), but also it's broken at the moment.

  - NBD multi-conn.  In my tests this makes a really massive
    performance difference in certain situations.  Again, virt-v2v has
    a lot of information that we cannot pass to qemu: we know, for
    example, exactly if the server supports the feature, how many
    threads are available, in some situations even have information
    about the network and backing disks that the data will travel over
    / be stored on.

Multi-conn for reading the source allows better parallelism. Multi-conn for writing is a bit trickier - it should be safe if the different connections are only touching distinct segments of the export (no overlaps), but as qemu does not advertise multiconn in such situations, you may still need a command-line switch to force multiple writers in spite of the server not advertising it. Here, I'm not aware of anyone with patches underway, but I also think it would be a good ground for exploring.
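
To make the "distinct segments, no overlaps" condition concrete, here is a minimal shell sketch that splits an export of a given size into contiguous, non-overlapping ranges, at most one per writer connection. The name split_ranges and its "offset length" output are made up for illustration; neither qemu nor the proposed nbdcp provides such a helper, and a real tool would probably also want to align the boundaries to the block size.

```sh
# split_ranges SIZE N: print "offset length" for up to N contiguous,
# non-overlapping byte ranges that together cover exactly SIZE bytes.
# Hypothetical helper; the body runs in a subshell so variables stay local.
split_ranges() (
  size=$1 n=$2
  [ "$n" -gt 0 ] || exit 1
  chunk=$(( (size + n - 1) / n ))     # round up so the ranges cover SIZE
  start=0
  while [ "$start" -lt "$size" ]; do
    len=$chunk
    if [ $(( start + len )) -gt "$size" ]; then
      len=$(( size - start ))         # the final range may be shorter
    fi
    echo "$start $len"
    start=$(( start + len ))
  done
)
```

Each output line could then be handed to one connection; because every range starts where the previous one ends, no two writers ever touch the same byte.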


  - Machine-parsable progress bars.  You can, sort of, parse the
    progress bar from qemu-img convert, but it's not as easy as it
    could be.  In particular it would be nice if the format was treated
    as ABI, and if there was a way to have the tool write the progress
    bar info to a precreated file descriptor.


Would be nice, but I'm not aware of anyone currently planning to add it.


  - External block lists.  This is a rather obscure requirement, but
    it's necessary in the case where we can get the allocated block map
    from another source (eg. pyvmomi) and then want to use that with an
    NBD source that does not support extents (eg. nbdkit-ssh-plugin /
    libssh / sftp).  [Having said that, it may be possible to implement
    this as an nbdkit filter, so maybe this is not a blocking feature.]

How are you intending to use this? I'm guessing you have some way of feeding in information to qemu-img about which portions of the source image you want to copy, so that it ignores the remaining portions.

Note that it IS already possible to use qemu's copy-on-read feature as a way to copy only a subset of a source file over to a destination file. When demonstrating incremental backup, I wrote this shell function:


copyif() {
if test $# -lt 2 || test $# -gt 3; then
  echo 'usage: copyif src dst [bitmap]'
  return 1
fi
if test -z "$3"; then
  map_from="-f raw nbd://localhost:10809/$1"
  state=true
else
  map_from="--image-opts driver=nbd,export=$1,server.type=inet"
  map_from+=",server.host=localhost,server.port=10809"
  map_from+=",x-dirty-bitmap=qemu:dirty-bitmap:$3"
  state=false
fi
$qemu_img info -f raw nbd://localhost:10809/$1 || return
$qemu_img info -f qcow2 $2 || return
ret=0
$qemu_img rebase -u -f qcow2 -F raw -b nbd://localhost:10809/$1 $2
while read line; do
  [[ $line =~ .*start.:.([0-9]*).*length.:.([0-9]*).*data.:.$state.* ]] || continue
  start=${BASH_REMATCH[1]} len=${BASH_REMATCH[2]}
  echo
  echo " $start $len:"
  # Record a failed read so that 'Success!' is not printed unconditionally.
  qemu-io -C -c "r $start $len" -f qcow2 $2 || ret=1
done < <($qemu_img map --output=json $map_from)
$qemu_img rebase -u -f qcow2 -b '' $2
if test $ret = 0; then echo 'Success!'; fi
return $ret
}

The key lines here are 'qemu-io -C -c "r $start $len" -f qcow2 $2', which is performed in a loop to read just the targeted portions of the destination qcow2 file with copy-on-read set to pull in each portion from its backing file, and '<($qemu_img map --output=json $map_from)', which was used to derive the extent map driving which portions of the file to read.

We also have 'qemu-img dd' that can copy subsets of a file, although it is not currently the ideal interface, and probably needs to be enhanced (I have a branch where I had tried working on patches for it, but where the feedback was that we want the improvements to be more generic, or even teach 'qemu-img convert' to support offsets the way 'qemu-img dd' tries to; I'd need to revisit that branch...)


One thing which qemu-img convert can do which nbdcp could not:

  - Read or write from qcow2 files.

Although you could still couple things together: nbdcp for new features plus qemu-nbd to drive an NBD wrapper around qcow2 (as source or as destination).


So instead of splitting the ecosystem and writing a new tool that
doesn't do as much as qemu-img convert, I wonder what qemu developers
think about the above missing features?  For example, are they in
scope for qemu-img convert?

I could see all of these being viable additions to qemu-img, but also wonder if writing nbdcp would get those features available in a faster manner.


SYNOPSIS
         nbdcp [-a|--target-allocation allocated|sparse]
               [-b|--block-list <blocksfile>]


These make sense for any qemu-img format.

               [-m|--multi-conn <n>] [-M|--multi-conn-target <n>]

These might make more sense as tunables for how to set up NBD client (destination) or server (source), rather than directly as qemu-img options. That is, I could imagine that we'd use qemu-img --image-format, and then expose new blockdev-style knobs for setting up the NBD endpoint to enable multiconn usage of that endpoint.

               [-p|--progress-bar] [-S|--sparse-detect <n>]
               [-T|--threads <n>] [-z|--target-is-zero]
               'nbd://...'|DISK.IMG 'nbd://...'|DISK.IMG


And these options also seem like they are useful to qemu-img proper.


        This program cannot: copy from file to file (use cp(1) or dd(1)), copy
        to or from formats other than raw (use qemu-img(1) convert), or access
        servers other than NBD servers (also use qemu-img(1)).

Again, depending on how we want to mix-and-match things, using qemu-nbd to create the NBD endpoint for the nbdcp source or destination may be worthwhile (which is different than directly using qemu-img); we'd want some decent examples of building such chains between tools. Or it could help us decide whether we can cut out some overhead by consolidating typical uses into one tool rather than requiring convoluted chains.
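
One such chain might look roughly like the following. This is only a sketch: nbdcp does not exist yet, so its command line here is a guess based on the synopsis, and it is an assumption that it would accept nbd+unix URIs. The function name qcow2_copy_chain and the socket file names are invented for the example. It only prints the commands rather than running them, and it assumes the destination qcow2 file was already created with the right virtual size.

```sh
# qcow2_copy_chain SRC.qcow2 DST.qcow2 SOCKDIR
# Print (do not run) a qcow2-to-qcow2 copy built from two qemu-nbd
# servers and the proposed, not yet written, nbdcp in the middle.
qcow2_copy_chain() (
  src=$1 dst=$2 dir=$3
  # Source: read-only qcow2 exported as raw data over a Unix socket.
  echo "qemu-nbd --read-only --persistent --format=qcow2 --socket=$dir/src.sock $src &"
  # Destination: writable qcow2, likewise exposed as a raw NBD export.
  echo "qemu-nbd --persistent --format=qcow2 --socket=$dir/dst.sock $dst &"
  # Hypothetical nbdcp invocation copying between the two endpoints.
  echo "nbdcp 'nbd+unix:///?socket=$dir/src.sock' 'nbd+unix:///?socket=$dir/dst.sock'"
)
```

With --persistent the two servers keep running after nbdcp disconnects, so a real script would also have to stop them once the copy finishes.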


        -b BLOCKSFILE
        --block-list=BLOCKSFILE
            Load the list of extents from an external file.  nbdcp considers
            this to be the truth for source extents.  The file should contain
            one record per line in the same format as nbdkit-sh-plugin(1), ie:

             offset length type

            with "offset" and "length" in bytes, and the "type" field being a
            comma-separated list of the words "hole" and "zero".  For example:

             0  1M
             1M 9M  hole,zero

Could we also teach this to parse 'qemu-img map --output=json' format? And/or add 'qemu-img map --output=XYZ' (different from the current '--output=human') that gives sufficient information? (Note: --output=human is NOT suitable for extent lists - it intentionally outputs only the data portions, and in so doing coalesces 'hole' and 'hole,zero' segments to be indistinguishable).
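
Until either tool learns the other's format, a small filter could bridge the two. The sketch below is one possible conversion, not an existing feature: the function name map_json_to_blocklist is invented, it assumes one JSON object per line with a single space after each colon (as current qemu-img prints it), and it assumes that "data": false corresponds to "hole" and "zero": true to "zero".

```sh
# map_json_to_blocklist: read 'qemu-img map --output=json' text on stdin
# and write "offset length [type]" records, one per extent, on stdout.
# Hypothetical helper; the field-to-type mapping is an assumption.
map_json_to_blocklist() (
  while read -r line; do
    # Take the digits that follow "start": and "length": on this line.
    rest=${line#*'"start": '};  start=${rest%%[!0-9]*}
    rest=${line#*'"length": '}; len=${rest%%[!0-9]*}
    [ -n "$start" ] && [ -n "$len" ] || continue
    case $line in
      *'"data": false'*) type=hole ;;
      *'"data": true'*)  type= ;;
      *) continue ;;                       # not an extent record
    esac
    case $line in
      *'"zero": true'*) type=${type:+$type,}zero ;;
    esac
    echo "$start $len${type:+ $type}"
  done
)
```

Usage would be along the lines of 'qemu-img map --output=json IMAGE | map_json_to_blocklist > blocksfile', with the result passed to the proposed --block-list option.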


        -p
        --progress-bar
            Display a progress bar during copying.

        -p machine:FD
        --progress-bar=machine:FD
            Write a machine-readable progress bar to file descriptor "FD".
            This progress bar prints lines with the format "COPIED/TOTAL"
            (where "COPIED" and "TOTAL" are 64 bit unsigned integers).
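
A consumer of that format could be as small as the following shell sketch, which turns each "COPIED/TOTAL" line into an integer percentage. The name progress_percent is invented here, and since nbdcp is only a proposal the input format is taken purely from the text above. Shell arithmetic is signed 64 bit, so this only works while COPIED * 100 does not overflow.

```sh
# progress_percent: read "COPIED/TOTAL" lines on stdin and print an
# integer percentage for each well-formed line. Hypothetical helper.
progress_percent() (
  while IFS=/ read -r copied total; do
    # Skip anything that is not two plain decimal integers.
    case $copied in ''|*[!0-9]*) continue ;; esac
    case $total  in ''|*[!0-9]*) continue ;; esac
    [ "$total" -gt 0 ] || continue         # avoid division by zero
    echo $(( copied * 100 / total ))
  done
)
```

If the tool were written as proposed, something like 'nbdcp -p machine:3 SRC DST 3>&1 >/dev/null | progress_percent' would feed it.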

Supporting optional arguments to long options is okay, but supporting optional arguments to short options gets tricky when using getopt. I would recommend two separate options, '-p' with no argument as shorthand for progress to stderr, and '-P description' with a mandatory argument for where to send progress, rather than trying to let '-p' have an optional argument.
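
The two-option split can be sketched with the shell's getopts builtin, which has no notion of optional arguments at all. This is only an analogue: the real tool would presumably be written in C with getopt_long, and the function name parse_progress_opts and the printed values "none" and "stderr" are invented for the example.

```sh
# parse_progress_opts [-p] [-P DESCRIPTION]
# Print where progress should go: "none", "stderr", or DESCRIPTION.
# Hypothetical helper illustrating '-p' (no argument) versus
# '-P' (mandatory argument).
parse_progress_opts() (
  OPTIND=1
  progress=none
  while getopts 'pP:' opt; do
    case $opt in
      p) progress=stderr ;;        # shorthand: progress bar on stderr
      P) progress=$OPTARG ;;       # e.g. machine:3
      *) exit 1 ;;                 # unknown option or missing argument
    esac
  done
  echo "$progress"
)
```

With this split, '-p machine:3' is unambiguous: '-p' selects stderr and 'machine:3' is left over as an ordinary operand.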


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org
