

From: ERSEK Laszlo
Subject: Re: [Help-tar] tar patch to selectively divert named compressors to alternatives
Date: Thu, 8 Oct 2009 21:56:33 +0200 (CEST)

On Thu, 8 Oct 2009, Karl Berry wrote:

   I'd rather provide compilation time options for that. Variants are:

   1. ./configure --with-compressor=bzip2=/etc/alternatives/bzip2-filter,gzip=...
   2. ./configure --with-bzip2=/etc/alternatives/bzip2-filter --with-gzip=...
   3. ./configure BZIP2_PROGRAM=/etc/alternatives/bzip2-filter GZIP_PROGRAM=...

> All of this is only being contemplated because of lbzip?  No one has
> ever wanted "alternative" implementations of any other compressor,

No user has ever wanted a decompressor that would exercise all four cores of his new quad-core computer when extracting the 300M openoffice tarball, instead of staring at the CPU load desktop applet and cursing about three quarters of his CPU idling? No system administrator has ever wanted a bzip2 compressor to compress his/her daily 7G of logs in 1/16th of the time on his/her 16-core server?

For a long time, bzip2 was the most space-efficient mainstream free software compressor, but it was relatively slow. Thus any speedup was useful. *Lots* of parallel bzip2 compressors and some decompressors were written before lbzip2, but in my interpretation, they never got the multi-threaded decompression quite right. See [0] if you care.

And what did Mark Adler write pigz for, then? Why didn't he just extend gzip? He's a big name; maybe you'll accept from him that parallelism by way of multi-threading is not an additive property: you cannot just slap it on a pre-existing bitstream format.

Why did Tim Cook write tamp?

I'm not pushing for lbzip2 to be integrated better with tar, or at least not for myself. I asked first when I was asked to ask. I'm perfectly fine with --use and I would be fine even without it.


> So how about working (Sergey, I don't mean you) on making lbzip an
> actual replacement for bzip?

If you mean me, I won't work on that, sorry. Please, anybody feel free to fork lbzip2. Nice move, though, trying to allocate my time. I worked my ass off on lbzip2 after my day job, nights till 4 o'clock in the morning, till I was falling out of my chair, running kilometers in my room like a rat in a maze, thinking about the decompressor design. Don't care if it doesn't show, if it's "trivial" or "convoluted" for some (yeah, why didn't anybody implement it before, among the 3 or so pre-existing parallelizations?). lbzip2 is done for me. I have some experiments in the queue, if some really nice people will help me out. I'll document those experiments, maybe lose face on them, release 1.00, then just abandon it.


> I know that isn't being done now, but that doesn't mean it couldn't be done. Free software and all that.

Sure. I'm not interested in toiling away on that shit for a year. You get my code and the four freedoms with it; you don't get my time and effort. I contacted Julian Seward, author of the original bzip2, both when I had questions about bzip2 itself and when the idea first emerged to extend bzip2 with multi-threading, for example by merging lbzip2 into it. He didn't seem interested. So what? Do it yourself, free software and all that. Everybody will be thankful, me included, if you manage to do that.


> Finally, does lbzip offer any advantages over xz
> (http://tukaani.org/xz)?  Which already compresses better and
> decompresses faster than bz2, in general.

I looked at their 1.0 file format when it came out and it was great, the blocks are length-prefixed which makes it obvious that it was designed with parallelism in mind. (The format is great for a whole series of other reasons, too.) Once they get the multi-threading done, I'll be the first to throw away lbzip2, use xz for efficiency-oriented compression and tamp (multi-threaded QuickLZ) for speed-oriented compression. In the meantime, there are *lots* of single-stream tar.bz2's on the net, and there are people (I am for sure) issuing

$ wget -O - URL | tee -i f.tar.bz2 | tar -x --use=lbzip2
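
On the point above about length-prefixed blocks: a minimal sketch of why such a prefix makes parallel decompression straightforward. This is a toy framing invented for illustration, not the xz container and not how lbzip2 works; the names `pack`, `split_blocks`, `unpack_parallel` and the `BLOCK_SIZE` value are all hypothetical. Each block is compressed independently and preceded by its compressed length, so a reader can locate every block boundary without decoding anything and hand the blocks to separate workers. A plain single-stream bzip2 file carries no such lengths, which is the harder case.

```python
import bz2
import struct
from concurrent.futures import ThreadPoolExecutor

BLOCK_SIZE = 64 * 1024  # hypothetical block size for this toy format


def pack(data: bytes) -> bytes:
    """Compress fixed-size chunks independently, each prefixed with its compressed length."""
    out = bytearray()
    for i in range(0, len(data), BLOCK_SIZE):
        block = bz2.compress(data[i:i + BLOCK_SIZE])
        out += struct.pack(">I", len(block))  # 4-byte big-endian length prefix
        out += block
    return bytes(out)


def split_blocks(stream: bytes) -> list:
    """Locate block boundaries by reading the length prefixes only; nothing is decoded here."""
    blocks, pos = [], 0
    while pos < len(stream):
        (n,) = struct.unpack_from(">I", stream, pos)
        pos += 4
        blocks.append(stream[pos:pos + n])
        pos += n
    return blocks


def unpack_parallel(stream: bytes, workers: int = 4) -> bytes:
    """Decompress the blocks on a thread pool and join the results in their original order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in input order, so the output is reassembled correctly.
        return b"".join(pool.map(bz2.decompress, split_blocks(stream)))
```

Usage would be along the lines of `unpack_parallel(pack(data)) == data`. Threads are used here only to keep the sketch short; whether they give a real speedup depends on the interpreter and the compression library underneath.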


Just forget the damn thing, I'm fucking tired of it. Defending lbzip2 like it was some crusade of mine, sending around the same old links a thousand times, identifying the use cases where lbzip2 is "unique", proving I'm not doing this only for "fame" or some shit like that. Forget it.

lacos

[0] http://lists.debian.org/debian-mentors/2009/02/msg00135.html



