Re: diff on .gz and .bz2 files

From: jellybean stonerfish
Subject: Re: diff on .gz and .bz2 files
Date: Sun, 31 Jan 2010 07:20:54 GMT
User-agent: Pan/0.132 (Waxed in Black)

On Sat, 30 Jan 2010 20:34:05 -0600, Peng Yu wrote:

> On Sat, Jan 30, 2010 at 10:11 AM, jellybean stonerfish
> <address@hidden> wrote:
>> On Fri, 29 Jan 2010 09:20:22 -0600, Peng Yu wrote:
>>> It seems that diff can not do comparison on the decompressed files in
>>> .gz and .bz2 files. I could first decompress the .gz and .bz2 file and
>>> then do the comparison. But it would be convenient to be able to
>>> directly compare without explicitly decompressing any files. Could
>>> somebody add this feature to diff?
>> You could make a little script pretty easy.
>> $ cat /home/js/bin/gzbzdiff
>> #!/bin/bash
>> mkfifo gzi
>> mkfifo bzi
>> gunzip -c $1 > gzi &
>> bunzip2 -c $2  > bzi &
>> diff -s bzi gzi
>> rm bzi gzi
>> $ gzbzdiff nsd.gz nsd.bz2
>> Files bzi and gzi are identical
> Of course, I could. But I think this may not be an efficient way if the
> .gz file is too large, say of the order of GB.

I tried a stupider idea first: expand the bzip2 file to stdout, pipe that
through gzip, and pipe the gzipped stream to diff to compare it against the
existing gzip file. But diff always says the files are different.

  bunzip2 -c nsd.bz2 | gzip - | diff nsd.gz -
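
Comparing the decompressed streams, rather than the compressed bytes, avoids
this problem. Here is a minimal sketch of that, assuming bash (it relies on
process substitution) and that gunzip and bunzip2 are installed. The names
decompress_to_stdout and zdiff_any are made up for illustration. It needs no
temporary files or named pipes on disk, though diff itself may still hold
both inputs in memory.

```shell
#!/bin/bash
# Sketch: diff the decompressed contents of two files without
# writing temporary files or creating named pipes.
# decompress_to_stdout and zdiff_any are hypothetical helper names.

decompress_to_stdout() {
    # Pick a decompressor from the file extension; pass other files through.
    case "$1" in
        *.gz)  gunzip -c -- "$1" ;;
        *.bz2) bunzip2 -c -- "$1" ;;
        *)     cat -- "$1" ;;
    esac
}

zdiff_any() {
    # Bash process substitution gives diff two pipe-backed file names.
    diff -s <(decompress_to_stdout "$1") <(decompress_to_stdout "$2")
}

# Run the comparison only when the script is given two file arguments.
if [ "$#" -eq 2 ]; then
    zdiff_any "$1" "$2"
fi
```

Saved as a script, this could be called as "gzbzdiff nsd.gz nsd.bz2". If only
a same/different answer is needed, swapping "diff -s" for "cmp" should let
large files be compared as a stream.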

Looking further, I found that if I gzip the same file twice, the resulting
.gz files are different. This was not what I expected.

$ cat nsd | gzip - > nsd1.gz
$ cat nsd | gzip - > nsd2.gz
$ diff nsd1.gz nsd2.gz 
Binary files nsd1.gz and nsd2.gz differ

Yet if I gunzip the two gz files, the expanded files are identical.

$ gunzip nsd1.gz 
$ gunzip nsd2.gz 
$ diff -s nsd1 nsd2
Files nsd1 and nsd2 are identical

Maybe the gzip algorithm uses some randomness in its compression?
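
One plausible explanation, not confirmed in this thread, is the header
rather than the algorithm. The gzip file format (RFC 1952) stores a 4-byte
modification time in bytes 4 to 7 of the header, and when the input comes
from a pipe some gzip versions write the current time there. The sketch
below is one way to check that, assuming bash plus the standard od, tail and
cmp tools. The helper names gz_header_mtime and gz_same_payload are made up
for illustration.

```shell
#!/bin/bash
# Sketch: check whether two gzip files differ only in the header timestamp.
# gz_header_mtime and gz_same_payload are hypothetical helper names.

gz_header_mtime() {
    # Print the 4 MTIME bytes (offsets 4..7, little-endian) as hex.
    od -A n -t x1 -j 4 -N 4 "$1" | tr -d ' \n'
}

gz_same_payload() {
    # Compare everything after the fixed 10-byte gzip header.
    cmp -s <(tail -c +11 "$1") <(tail -c +11 "$2")
}

# Example use, with the files from the session above:
#   gz_header_mtime nsd1.gz
#   gz_header_mtime nsd2.gz
#   gz_same_payload nsd1.gz nsd2.gz && echo "only the header differs"
```

If the timestamp is the cause, "gzip -n" (which leaves out the original name
and timestamp) should produce byte-identical output on repeated runs.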
