Re: [Lzip-bug] Tarball indexing and plzip


From: Antonio Diaz Diaz
Subject: Re: [Lzip-bug] Tarball indexing and plzip
Date: Sun, 10 Mar 2019 16:53:54 +0100
User-agent: Mozilla/5.0 (X11; U; Linux i586; en-US; rv:1.9.1.19) Gecko/20110420 SeaMonkey/2.0.14

Hello Dennis,

Dennis Katsonis wrote:
> I was wondering whether it would be difficult or not, to add
> functionality to plzip, or create a variant of it, which had tarball
> indexing capabilities like pixz.

I am in the process of implementing something like that, and more, but in tarlz, not in plzip: http://www.nongnu.org/lzip/tarlz.html


> Pixz allows a more random access to the compressed tarball.  Listing is
> very quick, and even extracting a file at the end of a large tarball is
> quite fast, not too much slower than extracting it from an uncompressed,
> indexed tarball.  A major advantage when extracting select files from an
> archived compressed tarball.

Tarlz is not complete yet, but it can already list an archive pretty quickly if the archive is created with the right options[1]. Parallel extraction should be similarly quick once it is implemented.

[1] http://www.nongnu.org/lzip/manual/tarlz_manual.html#Multi_002dthreaded-tar

If the files in the archive are large, multi-threaded '--list' on a regular (seekable) tar.lz archive can be hundreds of times faster than sequential '--list' because, in addition to using several processors, it only needs to decompress part of each lzip member. See the following example listing the Silesia corpus on a dual core machine:

     tarlz -9 --no-solid -cf silesia.tar.lz silesia
     time lzip -cd silesia.tar.lz | tar -tf -            (5.032s)
     time plzip -cd silesia.tar.lz | tar -tf -           (3.256s)
     time tarlz -tf silesia.tar.lz                       (0.020s)
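
To illustrate why a seekable multimember archive can be listed without decompressing everything, here is a minimal Python sketch of one way to locate member boundaries. It relies only on the lzip file format (a 6-byte header starting with "LZIP", and a 20-byte trailer whose last 8 bytes hold the member size) and walks the trailers backwards from the end of the file. The function name `member_index` is hypothetical, the sketch ignores trailing data and does no CRC checking, and it is not taken from tarlz's source code.

```python
import struct

HEADER_SIZE = 6    # "LZIP" magic (4), version number (1), coded dict size (1)
TRAILER_SIZE = 20  # CRC32 (4), data size (8), member size (8), little-endian


def member_index(f):
    """Return [(offset, member_size, data_size), ...] for a seekable
    multimember lzip file, in file order.

    Hypothetical helper for illustration: assumes no trailing data
    after the last member and does not verify CRCs.
    """
    f.seek(0, 2)              # jump to end of file
    end = f.tell()
    members = []
    while end > 0:
        if end < HEADER_SIZE + TRAILER_SIZE:
            raise ValueError("truncated member")
        # The trailer sits in the last 20 bytes of the member.
        f.seek(end - TRAILER_SIZE)
        _crc, data_size, member_size = struct.unpack("<IQQ", f.read(TRAILER_SIZE))
        start = end - member_size
        if member_size < HEADER_SIZE + TRAILER_SIZE or start < 0:
            raise ValueError("bad member size in trailer")
        # Check that the computed start really looks like a member header.
        f.seek(start)
        if f.read(4) != b"LZIP":
            raise ValueError("bad magic at offset %d" % start)
        members.append((start, member_size, data_size))
        end = start           # continue with the previous member
    members.reverse()
    return members
```

Each step reads only 24 bytes, so the cost of building the index depends on the number of members, not on the size of the archive. With the offsets known, a lister could then hand different members to different threads and stop decoding each one as soon as the tar headers it needs have been seen.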


I expect that tarlz, or something based on the same principles, will obsolete conventionally compressed tar archives.


Best regards,
Antonio.


