[Zutils-bug] [RFE] Please consider adding support for lbzip2


From: Mikołaj Izdebski
Subject: [Zutils-bug] [RFE] Please consider adding support for lbzip2
Date: Wed, 26 Jun 2013 07:19:05 +0200

Please consider adding support for lbzip2 in Zutils.

lbzip2 [1] is a free software implementation of bzip2 that aims for
reliability and performance.  It is meant to be fully compatible with
bzip2 at both the file format and command-line interface levels.

It is closer to the GNU philosophy than bzip2 (bzip2 is under a
permissive license and uses the term "open-source", while lbzip2 is
under GPLv3+ and describes itself as free software).  It is also
technically closer to the GNU system (it uses the GNU build system and
Gnulib).

Adding support for alternative bzip2 decompressors has already been
suggested [2].  I believe that using lbzip2 instead of bzip2 would be
advantageous.  In [2] you mentioned three disadvantages of using
independent bzip2 decompressors; I'll try to refute each of them.

1) Portability.  lbzip2 is written in C89 and uses the GNU build
   system and Gnulib.  It works on virtually any hardware architecture
   and any reasonably modern system.  It even does some crazy
   portability things like limiting the length of string literals to
   509 characters (the minimum that C89 requires compilers to
   support).  I consider lbzip2 to be more portable than Zutils.

2) Resource consumption.  Even when using a single thread, lbzip2
   uses less CPU time to decompress bz2 files than bzip2 does (about
   13 s versus 17 s of user time in the benchmark below).  Memory
   usage stays close to that of bzip2 (about 6 MB versus 4 MB peak in
   the same benchmark).  lbzip2 performance scales almost linearly
   with the number of processors available in the system, up to more
   than one hundred [3] -- there is only a little overhead from using
   multiple threads.

3) Need for additional configuration.  Zutils works fine after
   replacing bzip2 with lbzip2 (changing "bzip2" to "lbzip2" in
   decompressor_names[] in zutils.h).  No additional configuration
   is needed.
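
As a rough illustration of point 3, the sketch below models a table that
maps each compressed format to the program used to decompress it, in the
spirit of decompressor_names[].  It is written in Python for brevity and
is not Zutils code: DECOMPRESSOR_NAMES, decompress_command and the list
of formats are assumptions made for this example, and the actual layout
in zutils.h may differ.

```python
# Hypothetical model of a format -> decompressor table; not taken from Zutils.
DECOMPRESSOR_NAMES = {
    "gz": "gzip",
    "bz2": "lbzip2",  # was "bzip2"; this is the only change being proposed
    "lz": "lzip",
    "xz": "xz",
}


def decompress_command(filename):
    """Return the argv that would decompress `filename` to standard output.

    Relies on lbzip2 accepting the same -c and -d options as bzip2.
    """
    extension = filename.rsplit(".", 1)[-1]
    program = DECOMPRESSOR_NAMES.get(extension)
    if program is None:
        raise ValueError("unknown compressed format: " + filename)
    return [program, "-cd", filename]


print(decompress_command("linux-3.9.7.tar.bz2"))
```

Because lbzip2 aims to keep the bzip2 command-line interface, only the
program name changes; the options passed to it stay the same.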

If you absolutely want to make sure that resource consumption remains
minimal, you can force lbzip2 to use a single thread, although I
wouldn't recommend doing that.  Users can set their individual
preferences in the LBZIP2 environment variable.
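
A minimal sketch of how a caller could force a single thread, assuming
that lbzip2's -n option sets the number of worker threads and that
lbzip2 reads default options from LBZIP2, as its manual describes.  The
helper names are hypothetical, and lbzip2 must be installed for the
second function to work.

```python
import os
import subprocess


def single_thread_env(base_env=None):
    """Return a copy of the environment with LBZIP2 set to one worker thread."""
    env = dict(os.environ if base_env is None else base_env)
    # Assumption: "-n 1" limits lbzip2 to a single worker thread.
    env["LBZIP2"] = "-n 1"
    return env


def decompress_single_threaded(path):
    """Decompress `path` with lbzip2 and return the raw bytes."""
    result = subprocess.run(
        ["lbzip2", "-cd", path],
        env=single_thread_env(),
        stdout=subprocess.PIPE,
        check=True,
    )
    return result.stdout
```

Note that this sketch overwrites whatever the user already placed in
LBZIP2, which is one reason to leave the choice to the user.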

Benchmark showing the performance of lbzip2 on a single-processor
machine compared to bzip2 (decompressing the latest Linux source
tarball):

    Command being timed: "lbzcat linux-3.9.7.tar.bz2"
    User time (seconds): 13.08
    System time (seconds): 0.98
    Percent of CPU this job got: 98%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 0:14.29
    Maximum resident set size (kbytes): 6200

    Command being timed: "bzcat linux-3.9.7.tar.bz2"
    User time (seconds): 17.12
    System time (seconds): 0.18
    Percent of CPU this job got: 98%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 0:17.59
    Maximum resident set size (kbytes): 4060

This is not a scientific benchmark, but you are free to conduct your
own.  I'm sure the results will be similar.
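
For reference, the relative differences implied by the two runs above
can be worked out as follows.  The figures are copied from the output;
only the dictionary and function names are invented here.

```python
# Figures copied from the two runs above.
lbzip2 = {"user_s": 13.08, "wall_s": 14.29, "rss_kb": 6200}
bzip2 = {"user_s": 17.12, "wall_s": 17.59, "rss_kb": 4060}


def percent_change(new, old):
    """Percentage change of `new` relative to `old` (negative means lower)."""
    return (new - old) / old * 100.0


for key in ("user_s", "wall_s", "rss_kb"):
    print("%-7s %+6.1f%%" % (key, percent_change(lbzip2[key], bzip2[key])))
```

On these numbers lbzip2 uses roughly 24% less user CPU time and 19% less
wall-clock time than bzip2, with about 53% more peak resident memory.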

I can prepare a patch if needed.

--
Mikolaj Izdebski


[1] http://github.com/kjn/lbzip2
[2] http://lists.nongnu.org/archive/html/zutils-bug/2011-02/msg00001.html
[3] http://lacos.hu/lbzip2-scaling/scaling.html
