qemu-img convert vs writing another copy tool

qemu-block

From:	Richard W.M. Jones
Subject:	qemu-img convert vs writing another copy tool
Date:	Thu, 23 Jan 2020 18:35:00 +0000
User-agent:	Mutt/1.5.21 (2010-09-15)

I guess some people are aware that virt-v2v, a tool which converts
guests from VMware to run on KVM, and some other
OpenStack-to-OpenStack migration tools we have, use "qemu-img convert"
to copy the data around.

Historically we've had bugs here.  The most recent was discussed in
the thread on this list called "Bug? qemu-img convert to preallocated
image makes it sparse"
(https://www.mail-archive.com/address@hidden/msg60479.html)

We've been kicking around the idea of writing some alternate tool.  My
proposal would be a tool (not yet written, maybe it will never be
written) called nbdcp for copying between NBD servers and local files.
An outline manual page for this proposed tool is attached.

Some of the things which this tool might do which qemu-img convert
cannot do right now:

 - Hint that the target already contains zeroes.  It's almost always
   the case that we know this, but we cannot tell qemu.  This was the
   cause of a big performance regression last year.

 - Declare that we want the target to be either sparse or
   preallocated.  qemu-img convert can sort of do this in a
   round-about way (create the target in advance and use the -n
   option), but also it's broken at the moment.

 - NBD multi-conn.  In my tests this makes a really massive
   performance difference in certain situations.  Again, virt-v2v has
   a lot of information that we cannot pass to qemu: we know, for
   example, exactly whether the server supports the feature and how
   many threads are available, and in some situations we even have
   information about the network and backing disks that the data will
   travel over or be stored on.

 - Machine-parsable progress bars.  You can, sort of, parse the
   progress bar from qemu-img convert, but it's not as easy as it
   could be.  In particular it would be nice if the format was treated
   as ABI, and if there was a way to have the tool write the progress
   bar info to a precreated file descriptor.

 - External block lists.  This is a rather obscure requirement, but
   it's necessary in the case where we can get the allocated block map
   from another source (eg. pyvmomi) and then want to use that with an
   NBD source that does not support extents (eg. nbdkit-ssh-plugin /
   libssh / sftp).  [Having said that, it may be possible to implement
   this as an nbdkit filter, so maybe this is not a blocking feature.]

One thing which qemu-img convert can do which nbdcp could not:

 - Read or write from qcow2 files.

So instead of splitting the ecosystem and writing a new tool that
doesn't do as much as qemu-img convert, I wonder what qemu developers
think about the above missing features?  For example, are they in
scope for qemu-img convert?

Rich.



----------------------------------------------------------------------

nbdcp(1)                            LIBNBD                            nbdcp(1)

NAME
       nbdcp - copy between NBD servers and local files

SYNOPSIS
        nbdcp [-a|--target-allocation allocated|sparse]
              [-b|--block-list <blocksfile>]
              [-m|--multi-conn <n>] [-M|--multi-conn-target <n>]
              [-p|--progress-bar] [-S|--sparse-detect <n>]
              [-T|--threads <n>] [-z|--target-is-zero]
              'nbd://...'|DISK.IMG 'nbd://...'|DISK.IMG

DESCRIPTION
       nbdcp is a utility that can copy quickly between NBD servers and local
       raw format files (or block devices).  It can copy:

       from NBD server to file (or block device)
           For example, this command copies from the NBD server listening on
           port 10809 on "example.com" to a local file called disk.img:

            nbdcp nbd://example.com disk.img

       from file (or block device) to NBD server
           For example, this command copies from a local block device /dev/sda
           to the NBD server listening on Unix domain socket /tmp/socket:

            nbdcp /dev/sda 'nbd+unix:///?socket=/tmp/socket'

       from NBD server to NBD server
           For example this copies between two different exports on the same
           NBD server:

            nbdcp nbd://example.com/export1 nbd://example.com/export2

       This program cannot: copy from file to file (use cp(1) or dd(1)), copy
       to or from formats other than raw (use qemu-img(1) convert), or access
       servers other than NBD servers (also use qemu-img(1)).

       NBD servers are specified by their URI, following the NBD URI standard
       at https://github.com/NetworkBlockDevice/nbd/blob/master/doc/uri.md
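
The URI forms used in the examples above can be taken apart with the
Python standard library.  This is only a sketch of how a front end might
do it, not code from nbdcp (which is not written).  The names
NbdEndpoint and parse_nbd_uri are made up for illustration, and the
sketch handles only the nbd, nbds, nbd+unix and nbds+unix schemes.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import parse_qs, unquote, urlsplit

DEFAULT_NBD_PORT = 10809  # port used when the URI does not name one


@dataclass
class NbdEndpoint:
    """Hypothetical container for the pieces of an NBD URI."""
    tls: bool
    host: Optional[str]
    port: Optional[int]
    socket: Optional[str]
    export: str


def parse_nbd_uri(uri: str) -> NbdEndpoint:
    parts = urlsplit(uri)
    if parts.scheme not in ("nbd", "nbds", "nbd+unix", "nbds+unix"):
        raise ValueError(f"not a supported NBD URI: {uri!r}")
    tls = parts.scheme.startswith("nbds")
    # The export name is the path without its single leading slash.
    path = parts.path[1:] if parts.path.startswith("/") else parts.path
    export = unquote(path)
    if parts.scheme.endswith("+unix"):
        # Unix domain sockets are named by the "socket" query parameter.
        sockets = parse_qs(parts.query).get("socket")
        if not sockets:
            raise ValueError("nbd+unix URI needs a socket= parameter")
        return NbdEndpoint(tls, None, None, sockets[0], export)
    if not parts.hostname:
        raise ValueError("TCP NBD URI needs a host name")
    port = parts.port or DEFAULT_NBD_PORT
    return NbdEndpoint(tls, parts.hostname, port, None, export)
```

Calling parse_nbd_uri on 'nbd+unix:///?socket=/tmp/socket' gives an
endpoint with only the socket path set, and 'nbd://example.com' falls
back to port 10809 with an empty export name.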

   Controlling sparseness or preallocation in the target
       The options -a (--target-allocation), -S (--sparse-detect) and -z
       (--target-is-zero) together control sparseness in the target file.

       By default nbdcp both preserves sparseness from the source and
       detects runs of allocated zeroes, turning them into sparseness.  To
       turn off detection of sparseness use "-S 0".

       The -z option should be used if and only if you know that the target
       block device is zeroed already.  This allows an important optimization
       where nbdcp can skip zeroing or trimming parts of the disk that are
       already zero.

       The -a option is used to control the desired final preallocation state
       of the target.  The default is "-a sparse" which makes the target as
       sparse as possible.  "-a allocated" makes the target fully allocated.
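
One way to read the interaction of -a and -z is as a small decision
table applied to each source extent.  The sketch below is my
interpretation, not a specification: the Action names and plan_extent
are hypothetical, and the choice to write zeroes for "-a allocated" even
when -z is given is an assumption (a zeroed target may still be sparse).

```python
from enum import Enum


class Action(Enum):
    """Hypothetical per-extent actions for a copy tool."""
    COPY = "copy data from the source"
    SKIP = "leave the target untouched"
    TRIM = "trim the target range so it reads back as zero"
    WRITE_ZEROES = "write allocated zeroes to the target"


def plan_extent(is_zero: bool, allocation: str = "sparse",
                target_is_zero: bool = False) -> Action:
    """Pick an action for one extent.

    is_zero        -- the source extent is a hole or reads as zeroes
    allocation     -- value of -a: "sparse" or "allocated"
    target_is_zero -- True when -z was given
    """
    if allocation not in ("sparse", "allocated"):
        raise ValueError(f"unknown allocation mode: {allocation!r}")
    if not is_zero:
        return Action.COPY
    if allocation == "allocated":
        # Assumption: allocation is requested, so zeroes are written
        # even if the target already reads as zero.
        return Action.WRITE_ZEROES
    # Sparse target: nothing to do if it is already known to be zero.
    return Action.SKIP if target_is_zero else Action.TRIM
```

With the defaults, a zero extent is trimmed; adding -z turns that into
a no-op, which is the optimization the -z paragraph describes.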

OPTIONS
       --help
           Display brief command line help and exit.

       -a allocated
       --target-allocation=allocated
           Make the target fully allocated.

       -a sparse
       --target-allocation=sparse
           Make the target as sparse as possible.  This is the default.  See
           also "Controlling sparseness or preallocation in the target".

       -b BLOCKSFILE
       --block-list=BLOCKSFILE
           Load the list of extents from an external file.  nbdcp considers
           this to be the truth for source extents.  The file should contain
           one record per line in the same format as nbdkit-sh-plugin(1), i.e.:

            offset length [type]

           with "offset" and "length" in bytes, and the optional "type" field
           being a comma-separated list of the words "hole" and "zero".  A
           record with no "type" describes allocated data.  For example:

            0  1M
            1M 9M  hole,zero

           Any parts of the source which don't have descriptions are assumed
           to be of type "hole,zero".
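
A reader for this file format might look like the following sketch.  It
is not taken from any existing tool.  The function names are
hypothetical, the K/M/G/T suffixes are assumed to be powers of 1024
(as the "1M" in the example suggests), and the caller is assumed to
know the size of the source disk so that trailing gaps can be filled.

```python
import re

HOLE_ZERO = frozenset({"hole", "zero"})
SUFFIXES = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_size(text):
    """Parse '4096' or '1M' into bytes (suffixes are an assumption)."""
    m = re.fullmatch(r"(\d+)([KMGT]?)", text.strip(), re.IGNORECASE)
    if not m:
        raise ValueError(f"bad size: {text!r}")
    return int(m.group(1)) * SUFFIXES[m.group(2).upper()]


def parse_block_list(lines, disk_size):
    """Return sorted (offset, length, types) covering [0, disk_size)."""
    extents = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        if len(fields) not in (2, 3):
            raise ValueError(f"bad record: {line!r}")
        offset, length = parse_size(fields[0]), parse_size(fields[1])
        # A missing type field means allocated data.
        types = frozenset(fields[2].split(",")) if len(fields) == 3 else frozenset()
        if not types <= HOLE_ZERO:
            raise ValueError(f"bad type in record: {line!r}")
        extents.append((offset, length, types))
    extents.sort(key=lambda e: e[0])

    covered = []
    pos = 0
    for offset, length, types in extents:
        if offset < pos:
            raise ValueError("overlapping extents")
        if offset > pos:
            # Undescribed parts of the source are assumed "hole,zero".
            covered.append((pos, offset - pos, HOLE_ZERO))
        covered.append((offset, length, types))
        pos = offset + length
    if pos < disk_size:
        covered.append((pos, disk_size - pos, HOLE_ZERO))
    return covered
```

Feeding it the two example records with a 12M disk yields three
extents: 1M of data, 9M of hole, and a final 2M gap treated as hole.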

       -m N
       --multi-conn=N
           Enable NBD multi-conn with up to "N" connections.  Only some NBD
           servers support this but it can greatly improve performance.

           The default is to enable multi-conn if we detect that the server
           supports it, with up to 4 connections.

       -M N
       --multi-conn-target=N
           If you are copying between NBD servers, use -m to control the
           multi-conn setting for the source server, and this option (-M) to
           control the multi-conn setting for the target server.

       -p
       --progress-bar
           Display a progress bar during copying.

       -p machine:FD
       --progress-bar=machine:FD
           Write a machine-readable progress bar to file descriptor "FD".
           This progress bar prints lines with the format "COPIED/TOTAL"
           (where "COPIED" and "TOTAL" are 64 bit unsigned integers).
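
A consumer of this format could be as small as the sketch below.  The
function names are hypothetical, and it assumes one "COPIED/TOTAL"
record per newline-terminated line on a file descriptor that the caller
created (for example the read end of a pipe whose write end was passed
to the copying process).

```python
import os


def parse_progress_line(line):
    """Parse one 'COPIED/TOTAL' record into a pair of integers."""
    copied_text, sep, total_text = line.strip().partition("/")
    if not sep:
        raise ValueError(f"bad progress line: {line!r}")
    copied, total = int(copied_text), int(total_text)
    if copied < 0 or total < 0 or copied > total:
        raise ValueError(f"bad progress values: {line!r}")
    return copied, total


def read_progress(fd):
    """Yield (copied, total) pairs until the writer closes the descriptor."""
    with os.fdopen(fd, "r") as stream:
        for line in stream:
            if line.strip():
                yield parse_progress_line(line)


def percent(copied, total):
    """Completion as a percentage; an empty disk counts as complete."""
    return 100.0 if total == 0 else 100.0 * copied / total
```

Because each record carries both numbers, a caller does not need to
keep any state between lines to draw its own progress display.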

       -S 0
       --sparse-detect=0
           Turn off sparseness detection.

       -S N
       --sparse-detect=N
           Detect runs of zero bytes at least "N" bytes long and turn them
           into sparse blocks on the target (if "-a sparse" is used).
           Detection is on by default, with "N" set to 512 bytes.
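
The detection described here amounts to scanning a buffer for runs of
zero bytes and keeping only the long ones.  The sketch below shows the
idea in its simplest form.  find_zero_runs is a hypothetical name, and a
real implementation would work on large aligned blocks rather than byte
by byte.

```python
def find_zero_runs(data, min_run=512):
    """Return (offset, length) for each run of zero bytes >= min_run.

    min_run plays the role of "N"; a value of 0 disables detection,
    matching "-S 0".
    """
    if min_run <= 0:
        return []
    runs = []
    start = None  # offset where the current run of zeroes began
    for i, byte in enumerate(data):
        if byte == 0:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_run:
                runs.append((start, i - start))
            start = None
    # A run may extend to the end of the buffer.
    if start is not None and len(data) - start >= min_run:
        runs.append((start, len(data) - start))
    return runs
```

Runs shorter than min_run are copied as ordinary data, so a larger
value trades some sparseness for fewer, larger writes.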

       -T N
       --threads=N
           Use at most "N" threads when copying.  Usually more threads leads
           to better performance, up to the limit of the number of cores on
           your machine and the parallelism of the underlying disk or network.
           The default is to use the number of online processors.

       -z
       --target-is-zero
           Declare that the target block device contains only zero bytes (or
           sparseness that reads back as zeroes).  You must only use this
           option if you are sure that this is true, since it means that nbdcp
           will enable an optimization where it skips zeroing parts of the
           disk that are zero on the source.

       -V
       --version
           Display the package name and version and exit.

SEE ALSO
       qemu-img(1), libnbd(3), nbdsh(1).

AUTHORS
       Richard W.M. Jones

COPYRIGHT
       Copyright (C) 2020 Red Hat Inc.

LICENSE
       This library is free software; you can redistribute it and/or modify it
       under the terms of the GNU Lesser General Public License as published
       by the Free Software Foundation; either version 2 of the License, or
       (at your option) any later version.

       This library is distributed in the hope that it will be useful, but
       WITHOUT ANY WARRANTY; without even the implied warranty of
       MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
       Lesser General Public License for more details.

       You should have received a copy of the GNU Lesser General Public
       License along with this library; if not, write to the Free Software
       Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
       02110-1301 USA

libnbd-1.3.1                      2020-01-23                          nbdcp(1)


-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
Fedora Windows cross-compiler. Compile Windows programs, test, and
build Windows installers. Over 100 libraries supported.
http://fedoraproject.org/wiki/MinGW
