bug-coreutils

bug#13243: [PATCH] enhancement: modify md5sum to allow piping


From: Eric Blake
Subject: bug#13243: [PATCH] enhancement: modify md5sum to allow piping
Date: Thu, 20 Dec 2012 15:49:50 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/17.0 Thunderbird/17.0

tag 13243 notabug
thanks

On 12/20/2012 03:09 PM, Daniel Santos wrote:
> There are many times, usually when doing system backups, maintenance,
> recovery, etc., that I would like to pipe large files through md5sum to
> produce or verify a hash so that I do not have to read the file multiple
> times.  This is especially the case when backing up a system from a
> livecd across the network:
> 
> dd if=/dev/sda3 | pbzip2 -c2 | netcat 192.168.1.123 45678
> or
> tar c /mnt/sda3 | pbzip2 -c2 | netcat 192.168.1.123 45678
> 
> Attached is a preliminary patch set that will allow for this as in the
> following example
> 
> dd if=/dev/sda3 | pbzip2 -c2 | md5sum -po /tmp/sda3.dat.bzip2.md5 |
> netcat 192.168.1.123 45678

Thanks for the report, and even for the attempted patch.  However, I'm
reluctant to even read through the patch, as I think that you can
already do what you want with existing tools.  In particular, 'info
coreutils tee' mentions:

>    The `tee' command is useful when you happen to be transferring a
> large amount of data and also want to summarize that data without
> reading it a second time.  For example, when you are downloading a DVD
> image, you often want to verify its signature or checksum right away.
> The inefficient way to do it is simply:
> 
>      wget http://example.com/some.iso && sha1sum some.iso
> 
>    One problem with the above is that it makes you wait for the
> download to complete before starting the time-consuming SHA1
> computation.  Perhaps even more importantly, the above requires reading
> the DVD image a second time (the first was from the network).
> 
>    The efficient way to do it is to interleave the download and SHA1
> computation.  Then, you'll get the checksum for free, because the
> entire process parallelizes so well:
> 
>      # slightly contrived, to demonstrate process substitution
>      wget -O - http://example.com/dvd.iso \
>        | tee >(sha1sum > dvd.sha1) > dvd.iso

In your case, you can do the following (in a shell that supports
process substitution, such as bash):

dd if=/dev/sda3 | pbzip2 -c2 | tee >(md5sum > /tmp/sda3.dat.bzip2.md5) |
 netcat 192.168.1.123 45678
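
To see that the checksum taken mid-stream matches one computed by rereading the data, here is a minimal local sketch, assuming bash and GNU md5sum. The file names and sample payload are hypothetical stand-ins for the disk image, and the fd 3 plumbing is an addition for this demonstration only: it makes the script wait for md5sum to finish, since a process substitution otherwise runs asynchronously.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical scratch area and sample data, standing in for the
# output of 'dd if=/dev/sda3 | pbzip2 -c2' in the thread.
workdir=$(mktemp -d)
printf 'sample backup stream\n' > "$workdir/input.dat"

# tee sends one copy of the stream to md5sum and one to stdout.
# md5sum writes to fd 3, which the trailing cat collects, so the
# pipeline does not return until md5sum has finished.
{ tee >(md5sum >&3) < "$workdir/input.dat" > "$workdir/copy.dat"; } 3>&1 \
  | cat > "$workdir/stream.md5"

# Reference checksum computed the slow way, by reading the data again.
md5sum < "$workdir/input.dat" > "$workdir/reread.md5"

cmp "$workdir/stream.md5" "$workdir/reread.md5" && echo "checksums match"
cmp "$workdir/input.dat" "$workdir/copy.dat" && echo "copy intact"
```

In the real pipeline, the redirection to copy.dat would be replaced by the pipe into netcat. The checksum file may then be completed slightly after tee exits, so it is worth confirming it is non-empty before relying on it.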

Besides, isn't it nicer to use something that already works than to
worry about a preliminary patch still needing lots of work to come up to
coding standards, not to mention copyright assignment paperwork?

As such, I'm going to close the bug report so that we don't spin our
wheels re-implementing something that already works.  But you should
still feel welcome to contribute, and even add further comments to this
thread as appropriate (we can always reopen this bug if there is
convincing reason that I missed something in my decision to close it).

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
