Re: [Lzip-bug] lzip memory & performance issue

From: Antonio Diaz Diaz
Subject: Re: [Lzip-bug] lzip memory & performance issue
Date: Thu, 03 Mar 2011 21:11:43 +0100
User-agent: Mozilla/5.0 (X11; U; Linux i586; en-US; rv:1.7.11) Gecko/20050905

address@hidden wrote:
Regarding the performance difference between lzip and xz (at least on my
system), reducing --match-length did speed up lzip somewhat. With
--match-length=32, lzip took about 13:50 to compress the 633169920-byte
file mentioned in my previous message.

From http://www.nongnu.org/lzip/manual/lzip_manual.html
`-m length'
Set the match length limit in bytes. After a match this long is found, the search is finished. Valid values range from 5 to 273. Larger values usually give better compression ratios but longer compression times.

But the differences you are reporting are too big. I think you have a problem with your compiler, compiler options, or test file.

That's down from about 19 minutes with the (default?) --match-length=273,

Also taken from the manual:
`-0 .. -9'
Set the compression parameters (dictionary size and match length limit) as shown in the table below. Note that `-9' can be much slower than `-0'. These options have no effect when decompressing.

The bidimensional parameter space of LZMA can't be mapped to a linear scale optimal for all files. If your files are large, very repetitive, etc, you may need to use the `--match-length' and `--dictionary-size' options directly to achieve optimal performance.

    Level   Dictionary size   Match length limit
    -0      64 KiB            16 bytes
    -1      1 MiB             5 bytes
    -2      1.5 MiB           6 bytes
    -3      2 MiB             8 bytes
    -4      3 MiB             12 bytes
    -5      4 MiB             20 bytes
    -6      8 MiB             36 bytes
    -7      16 MiB            68 bytes
    -8      24 MiB            132 bytes
    -9      32 MiB            273 bytes

So 273 isn't necessarily optimal for all files.

You are right. This is why a `--match-length' option is provided.
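
As a rough illustration of overriding one parameter of a level, here is a minimal Python sketch. It copies the level table from the manual excerpt quoted above and builds an lzip command line with explicit `--dictionary-size' and `--match-length' values. The names LEVELS and lzip_args and the file name big.tar are made up for this example; they are not part of lzip. The sketch only builds the argument list and does not run lzip.

```python
# Level table copied from the lzip manual excerpt quoted in this message.
KIB = 1024
MIB = 1024 * KIB

# level -> (dictionary size in bytes, match length limit in bytes)
LEVELS = {
    0: (64 * KIB, 16),
    1: (1 * MIB, 5),
    2: (3 * MIB // 2, 6),   # 1.5 MiB
    3: (2 * MIB, 8),
    4: (3 * MIB, 12),
    5: (4 * MIB, 20),
    6: (8 * MIB, 36),
    7: (16 * MIB, 68),
    8: (24 * MIB, 132),
    9: (32 * MIB, 273),
}


def lzip_args(path, level=9, match_length=None, dictionary_size=None):
    """Hypothetical helper: start from a level, optionally override a parameter."""
    dict_size, match_len = LEVELS[level]
    if dictionary_size is not None:
        dict_size = dictionary_size
    if match_length is not None:
        match_len = match_length
    # The manual states that valid match lengths range from 5 to 273.
    if not 5 <= match_len <= 273:
        raise ValueError("match length must be between 5 and 273")
    return [
        "lzip",
        f"--dictionary-size={dict_size}",
        f"--match-length={match_len}",
        path,
    ]


if __name__ == "__main__":
    # Dictionary size of -9, but with the shorter match length tried above.
    print(" ".join(lzip_args("big.tar", level=9, match_length=32)))
```

Passing the resulting list to subprocess.run would start the compression; timing a few match lengths on the real file is the only way to see which trade-off suits it.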

Interestingly, xz -9e uses a default match length (it calls the option
"nice") of 64. But for some reason the xz-compressed file was still
smaller than the lzip one.

The reason is the 'e', which activates the "extreme" mode in xz. I think it uses a different matchfinder in extreme mode.

You are also using "lzma2", which I think can detect uncompressed areas and copy them without compressing, saving some output size.
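
The effect of storing incompressible areas uncompressed can be observed with a small experiment. The sketch below uses Python's standard lzma module, not lzip or the xz tool, so it is only an approximation: FORMAT_ALONE there is the legacy .lzma container (plain LZMA), which is not the lzip file format, and FORMAT_XZ uses LZMA2. The function name compare_containers and the choice of pseudo-random input are assumptions made for the illustration.

```python
import lzma
import random


def compare_containers(size=256 * 1024, seed=0):
    """Compress pseudo-random (incompressible) data with LZMA and with LZMA2.

    Returns (input size, legacy .lzma output size, .xz output size).
    """
    rng = random.Random(seed)
    data = rng.getrandbits(8 * size).to_bytes(size, "little")
    # Legacy .lzma container: plain LZMA, every byte goes through the coder.
    alone = lzma.compress(data, format=lzma.FORMAT_ALONE)
    # .xz container: LZMA2, which may store a chunk uncompressed.
    xz = lzma.compress(data, format=lzma.FORMAT_XZ)
    return len(data), len(alone), len(xz)


if __name__ == "__main__":
    n, alone_size, xz_size = compare_containers()
    print(f"input:        {n} bytes")
    print(f"LZMA (.lzma): {alone_size} bytes")
    print(f"LZMA2 (.xz):  {xz_size} bytes")
```

On data like this the plain LZMA output should come out larger than the input, while the LZMA2 output stays close to the input size. Real files are rarely this extreme, so the difference there is usually much smaller.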

Certainly xz has a lot more switches to play with than lzip, which may allow it to compress a given file further, if you have the time to experiment with them.

Lzip OTOH is a simple general-purpose compressor, just like gzip or bzip2. I try to optimize it for a wide range of files with minimum need for user intervention. Did you notice that gzip and bzip2 have even fewer compression options than lzip?

