Re: [Lzip-bug] lzip vs. zstd

From: Antonio Diaz Diaz
Subject: Re: [Lzip-bug] lzip vs. zstd
Date: Fri, 14 Oct 2016 17:37:18 +0200

Hi Aruna,

address@hidden wrote:
> do you already have numbers, opinions and maybe a comparison in
> reliability, speed, compression ratio etc. against the new zstd?

Testing zstd is on my TODO list. My first impression is that it may be well suited to the needs of Facebook, but after a glimpse at its format specification I wouldn't touch it with a 3-meter pole for long-term archiving. The specification says:

"The format uses the Zstandard compression method,
and optional [xxHash-64 checksum method](http://www.xxhash.org),
for detection of data corruption.
A compliant decompressor must be able to decompress
at least one working set of parameters
that conforms to the specifications presented here.
It may also ignore informative fields, such as checksum."

Note that:
1) It is a fragmented format; having a decompressor does not guarantee that you'll be able to decode a given file.

2) The checksum is optional (and the decompressor may ignore it even if it is present), but the flag indicating the presence of the checksum does not seem to be protected[1].

[1] http://www.nongnu.org/lzip/xz_inadequate.html#unprot_len
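
To illustrate point 2, here is a minimal sketch of why an unprotected "checksum present" flag weakens the checksum. It uses a hypothetical toy frame (one flag byte, payload, CRC32 trailer), not the real zstd layout; the names FLAG_HAS_CHECKSUM, pack, unpack and demo are assumptions made up for this example.

```python
import zlib

FLAG_HAS_CHECKSUM = 0x01  # hypothetical flag bit in a toy one-byte header


def pack(payload: bytes) -> bytes:
    """Build a toy frame: 1 flag byte, the payload, then a 4-byte CRC32."""
    crc = zlib.crc32(payload).to_bytes(4, "little")
    return bytes([FLAG_HAS_CHECKSUM]) + payload + crc


def unpack(frame: bytes) -> bytes:
    """Return the payload; the CRC is verified only if the flag is set."""
    flags, body = frame[0], frame[1:]
    if flags & FLAG_HAS_CHECKSUM:
        payload, crc = body[:-4], body[-4:]
        if zlib.crc32(payload).to_bytes(4, "little") != crc:
            raise ValueError("checksum mismatch")
        return payload
    return body  # flag clear: nothing is verified


def demo():
    """Return (corruption_detected, corruption_accepted_after_flag_flip)."""
    frame = bytearray(pack(b"archive data"))
    frame[3] ^= 0x20  # corrupt one payload byte
    try:
        unpack(bytes(frame))
        detected = False
    except ValueError:
        detected = True
    frame[0] ^= FLAG_HAS_CHECKSUM  # a second flipped bit clears the flag
    try:
        unpack(bytes(frame))
        accepted = True
    except ValueError:
        accepted = False
    return detected, accepted


if __name__ == "__main__":
    print(demo())
```

In this toy format the damaged payload is rejected while the flag is intact, but once the flag bit itself is flipped the decoder returns the damaged data without complaint, because nothing covers the flag.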

> It seems this is a competitor for xz, lzip and friends?

I don't think so. Zstd does not seem able to beat LZMA in compression ratio. For the silesia corpus the compressed size of 'zstd -9' (60414774) is between those of 'lzip -1' (61358052) and 'lzip -2' (58938275).

'zstd --ultra -22' reduces the size to 52750811 bytes in 5m17s, still far from the 48314899 bytes achieved by 'lzip -9' in 4m33s.
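
The sizes quoted above can be put side by side with a few lines of Python. This is only a sketch over the five figures given in this message; the helper name percent_larger is made up for the example, and no other measurements are assumed.

```python
# Compressed sizes in bytes for the silesia corpus, as quoted in this message.
sizes = {
    "zstd -9": 60414774,
    "lzip -1": 61358052,
    "lzip -2": 58938275,
    "zstd --ultra -22": 52750811,
    "lzip -9": 48314899,
}


def percent_larger(a: str, b: str) -> float:
    """How much larger (in percent) the output of setting a is than that of b."""
    return 100.0 * (sizes[a] - sizes[b]) / sizes[b]


if __name__ == "__main__":
    # List the settings from smallest to largest output.
    for name, size in sorted(sizes.items(), key=lambda kv: kv[1]):
        print(f"{name:18s} {size:10d}")
    gap = percent_larger("zstd --ultra -22", "lzip -9")
    print(f"zstd --ultra -22 is {gap:.2f}% larger than lzip -9")
```

With these figures the output of 'zstd --ultra -22' comes out roughly 9% larger than that of 'lzip -9'.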

(As a side note, the speeds announced for zstd are surpassed by plzip[2] for large files on multiprocessor machines.)

[2] http://www.nongnu.org/lzip/plzip_benchmark.html

BTW, I like to see free software in terms of collaboration rather than competition. I develop lzip because IMHO it is the best format for some uses. I wouldn't have developed it just to "compete". Keeping the number of incompatible formats to a minimum is better for all, and it seems that the existing formats already cover all the ratio/speed spectrum of zstd and beyond.

Best regards,
