Re: [Lzip-bug] lzip memory & performance issue
From: Antonio Diaz Diaz
Subject: Re: [Lzip-bug] lzip memory & performance issue
Date: Thu, 03 Mar 2011 21:11:43 +0100
User-agent: Mozilla/5.0 (X11; U; Linux i586; en-US; rv:1.7.11) Gecko/20050905
address@hidden wrote:
Regarding the performance difference between lzip and xz (at least on my
system), reducing --match-length did speed up lzip somewhat. With
--match-length=32, lzip took about 13:50 to compress the 633169920-byte
file mentioned in my previous message.
From http://www.nongnu.org/lzip/manual/lzip_manual.html
--------------------------------------------------------------------
`--match-length=length'
`-m length'
Set the match length limit in bytes. After a match this long is
found, the search is finished. Valid values range from 5 to 273. Larger
values usually give better compression ratios but longer compression times.
--------------------------------------------------------------------
But the differences you are reporting are too big. I think you have a
problem with your compiler, compiler options, or test file.
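One way to check whether the test file or the build is responsible is to time several match length limits on the same input. The following is only a minimal sketch: the helper names (lzip_command, time_match_lengths) and the chosen limits are made up for illustration, it assumes an lzip binary is on the PATH, and it leaves the dictionary size at whatever lzip uses by default.

```python
import subprocess
import time

# Valid range for --match-length according to the manual excerpt above.
MIN_MATCH_LEN, MAX_MATCH_LEN = 5, 273


def lzip_command(path, match_length):
    """Build an lzip command line for one match length limit (hypothetical helper)."""
    if not MIN_MATCH_LEN <= match_length <= MAX_MATCH_LEN:
        raise ValueError(f"match length must be in {MIN_MATCH_LEN}..{MAX_MATCH_LEN}")
    # --stdout writes the compressed data to standard output,
    # so the input file is left in place.
    return ["lzip", "--stdout", f"--match-length={match_length}", path]


def time_match_lengths(path, lengths=(16, 32, 64, 273)):
    """Return {match_length: seconds} for compressing `path` with each limit."""
    timings = {}
    for match_length in lengths:
        start = time.perf_counter()
        # The compressed output is discarded; only the elapsed time is kept.
        subprocess.run(lzip_command(path, match_length),
                       stdout=subprocess.DEVNULL, check=True)
        timings[match_length] = time.perf_counter() - start
    return timings
```

Calling time_match_lengths("testfile.tar") on the same file with differently built binaries should show whether the slowdown follows the match length limit or the build.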
That's down from about 19 minutes with the (default?) --match-length=273,
Also taken from the manual:
--------------------------------------------------------------------
`-0 .. -9'
Set the compression parameters (dictionary size and match length
limit) as shown in the table below. Note that `-9' can be much slower
than `-0'. These options have no effect when decompressing.
The bidimensional parameter space of LZMA can't be mapped to a
linear scale optimal for all files. If your files are large, very
repetitive, etc, you may need to use the `--match-length' and
`--dictionary-size' options directly to achieve optimal performance.
Level   Dictionary size   Match length limit
 -0         64 KiB             16 bytes
 -1          1 MiB              5 bytes
 -2        1.5 MiB              6 bytes
 -3          2 MiB              8 bytes
 -4          3 MiB             12 bytes
 -5          4 MiB             20 bytes
 -6          8 MiB             36 bytes
 -7         16 MiB             68 bytes
 -8         24 MiB            132 bytes
 -9         32 MiB            273 bytes
--------------------------------------------------------------------
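The table above can be written down as a small lookup, which makes it easy to start from a level and then adjust one of the two parameters by hand. This is just a sketch transcribed from the manual excerpt; the names LZIP_PRESETS and explicit_options are invented here, and it assumes that `--dictionary-size' accepts a plain byte count.

```python
KIB = 1024
MIB = 1024 * KIB

# level -> (dictionary size in bytes, match length limit in bytes),
# transcribed from the manual table quoted above.
LZIP_PRESETS = {
    0: (64 * KIB, 16),
    1: (1 * MIB, 5),
    2: (3 * MIB // 2, 6),   # 1.5 MiB
    3: (2 * MIB, 8),
    4: (3 * MIB, 12),
    5: (4 * MIB, 20),
    6: (8 * MIB, 36),
    7: (16 * MIB, 68),
    8: (24 * MIB, 132),
    9: (32 * MIB, 273),
}


def explicit_options(level, match_length=None):
    """Spell out a level as explicit options, optionally overriding the match length."""
    dictionary_size, preset_match_length = LZIP_PRESETS[level]
    if match_length is None:
        match_length = preset_match_length
    return [f"--dictionary-size={dictionary_size}",
            f"--match-length={match_length}"]
```

For example, explicit_options(9, match_length=32) keeps the 32 MiB dictionary of `-9' but uses the shorter match length limit discussed in this thread.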
So 273 isn't necessarily optimal for all files.
You are right. This is why a `--match-length' option is provided.
Interestingly, xz -9e uses a default match length (it calls the option
"nice") of 64. But for some reason the xz-compressed file was still
smaller than the lzip one.
The reason is the 'e', which activates the "extreme" mode in xz. I think
it uses a different matchfinder in extreme mode.
You are also using "lzma2", which I think can detect uncompressed areas
and copy them without compressing, saving some output size.
Certainly xz has a lot more switches to play with than lzip, which may
allow it to compress a given file further, if you have the time to
experiment with them.
Lzip OTOH is a simple general-purpose compressor, just like gzip or bzip2.
I try to optimize it for a wide range of files with minimum need for
user intervention. Did you notice that gzip and bzip2 have even fewer
compression options than lzip?
Regards,
Antonio.