bug-coreutils

Re: Compression Comparison


From: Bob Proulx
Subject: Re: Compression Comparison
Date: Mon, 3 Dec 2007 17:24:19 -0700
User-agent: Mutt/1.5.13 (2006-08-11)

Jim Meyering wrote:
> I'm surprised you'd compare in such a pessimistic manner.

The obvious conclusion is that I am a pessimist! :-)  More seriously
though I think that is the proper way to compare the data.  We are
talking about how well compression works which means we have to take
into consideration the data that is being compressed.

> I look at it differently:
> 
> compare download time:
>   going from gzip to lzma, I see a speed-up of 2.39
>   going from bzip2 to lzma, I see a speed-up of 1.55

But when the times involved are less than a minute, is the difference
significant?  Not to me.  I am sure the bandwidth I waste on spam
swamps it.

> compare disk usage:
>   going from gzip to lzma, I can store more than twice as much (2.39x) data
>   going from bzip2 to lzma, I can store 55% more data

A 100G disk could hold 10,000 copies of 10M projects.  Compressed
source code projects are not usually what fills up the typical disk.
Being able to store 2.39x more projects means about 24,000 copies.
A 100G disk is now small.  Being able to buy a 500G disk for $100
means being able to store 50,000 copies.  Combining the two gives
about 120,000 copies.  "Enough is equal to a feast."  I won't be disk
space limited due to the relatively small difference in compressed
distribution files.
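
The arithmetic behind those figures can be checked with a short
sketch.  It assumes decimal units (1G = 1000M) and treats the 2.39
gzip-to-lzma ratio quoted above as a plain multiplier on capacity; the
function name and its parameters are only illustrative.

```python
MB_PER_GB = 1000  # assumption: decimal units, 1G = 1000M


def copies(disk_gb, project_mb, ratio=1.0):
    """Number of compressed project copies that fit on a disk.

    ratio is the extra capacity from better compression
    (1.0 = baseline gzip, 2.39 = the gzip-to-lzma figure quoted above).
    """
    return round(disk_gb * MB_PER_GB * ratio / project_mb)


if __name__ == "__main__":
    for disk_gb in (100, 500):
        for ratio in (1.0, 2.39):
            n = copies(disk_gb, 10, ratio)
            print(f"{disk_gb}G disk, ratio {ratio}: {n} copies")
```

The exact results are 23,900 and 119,500, which the text above rounds
to 24,000 and 120,000.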

Alternatively, a git clone contains the complete history (as much as
was converted to it) and weighs in at 53M.  It can reproduce any
version in that history very efficiently.

Comparing lzma to gzip does not produce the huge benefit that is
obtained when comparing, for example, git to cvs.  That change is so
clearly an improvement that it is worth the cost of switching.

I am not unhappy with lzma.  It seems quite reasonable.  I am happy to
use it.  I am just providing a counterpoint to balance some of the
discussion.  But I do not want to be too quarrelsome about this
(just a little bit but not too much) so I will quiet down about
it. :-)

Bob



