Re: [PATCH 2/2] maint: use an optimal-for-grep xz compression setting
From: Gilles Espinasse
Subject: Re: [PATCH 2/2] maint: use an optimal-for-grep xz compression setting
Date: Sun, 4 Mar 2012 10:38:01 +0100
----- Original Message -----
From: "Jim Meyering" <address@hidden>
To: "GNU" <address@hidden>
Sent: Saturday, March 03, 2012 11:14 AM
Subject: [PATCH 2/2] maint: use an optimal-for-grep xz compression setting
...
> From 4b2224681fbc297bf585630b679d8540a02b78d3 Mon Sep 17 00:00:00 2001
> From: Jim Meyering <address@hidden>
> Date: Sat, 3 Mar 2012 10:51:11 +0100
> Subject: [PATCH 2/2] maint: use an optimal-for-grep xz compression setting
>
> * cfg.mk (XZ_OPT): Use -6e (determined empirically, see comments).
> This sacrifices a meager 60 bytes of compressed tarball size for a
> 55-MiB decrease in the memory required during decompression. I.e.,
> using -9e would shave off only 60 bytes from the tar.xz file, yet
> would force every decompression process to use 55 MiB more memory.
> ---
...
> +export XZ_OPT = -6e
> +
> old_NEWS_hash = 347e90ee0ec0489707df139ca3539934
>
-9 should be used only when the file to compress is really big enough to benefit.
-6 is xz's default compression setting.
-6e approximately doubles the required compression time (for about a 1% size gain).
-6{,e} works well for a file of approximately the same size as grep-2.11.tar,
but compressing a much bigger .tar with it may not give good compression
results.
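The trade-off can be checked directly. The sketch below (file names are
illustrative, and the data is synthetic stand-in text, not a real grep
tarball) compresses the same input at -6e and -9e, then uses `xz --list -vv`
to report each stream's decompression memory requirement, assuming an xz
recent enough for list mode to print a "Memory needed" line:

```shell
#!/bin/sh
# Sketch: compare compressed size and decompression memory for -6e vs -9e.
# "sample.tar" is synthetic stand-in data, not a real grep tarball.
seq 1 200000 > sample.tar                # roughly 1.3 MiB of compressible text
xz -6e --stdout sample.tar > sample-6e.xz
xz -9e --stdout sample.tar > sample-9e.xz
# xz --list -vv reports the memory needed to decompress each .xz stream
for f in sample-6e.xz sample-9e.xz; do
  printf '%s: %s bytes, decompression needs %s MiB\n' \
    "$f" "$(wc -c < "$f")" \
    "$(xz --list -vv "$f" | awk '/Memory needed/ { print $3 }')"
done
```

The exact numbers depend on the xz version, but the pattern matches the one
described above: -9e barely changes the compressed size while its 64 MiB
dictionary multiplies the decompressor's memory requirement.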
rm -f dummy; for i in 1 2 3 4 5; do echo " $i x grep-2.11.tar size"; \
  cat grep-2.11.tar >> dummy; xz -vv -6 < dummy > /dev/null; done; rm dummy
 1 x grep-2.11.tar size
xz: Filter chain: --lzma2=dict=8MiB,lc=3,lp=0,pb=2,mode=normal,nice=64,mf=bt4,depth=0
xz: 94 MiB of memory is required. The limit is 17592186044416 MiB.
xz: Decompression will need 9 MiB of memory.
100 % 1112.9 KiB / 9240.0 KiB = 0.120 746 KiB/s 0:12
 2 x grep-2.11.tar size
xz: Filter chain: --lzma2=dict=8MiB,lc=3,lp=0,pb=2,mode=normal,nice=64,mf=bt4,depth=0
xz: 94 MiB of memory is required. The limit is 17592186044416 MiB.
xz: Decompression will need 9 MiB of memory.
100 % 2130.4 KiB / 18.0 MiB = 0.115 721 KiB/s 0:25
 3 x grep-2.11.tar size
xz: Filter chain: --lzma2=dict=8MiB,lc=3,lp=0,pb=2,mode=normal,nice=64,mf=bt4,depth=0
xz: 94 MiB of memory is required. The limit is 17592186044416 MiB.
xz: Decompression will need 9 MiB of memory.
100 % 3147.8 KiB / 27.1 MiB = 0.114 708 KiB/s 0:39
 4 x grep-2.11.tar size
xz: Filter chain: --lzma2=dict=8MiB,lc=3,lp=0,pb=2,mode=normal,nice=64,mf=bt4,depth=0
xz: 94 MiB of memory is required. The limit is 17592186044416 MiB.
xz: Decompression will need 9 MiB of memory.
100 % 4165.2 KiB / 36.1 MiB = 0.113 709 KiB/s 0:52
 5 x grep-2.11.tar size
xz: Filter chain: --lzma2=dict=8MiB,lc=3,lp=0,pb=2,mode=normal,nice=64,mf=bt4,depth=0
xz: 94 MiB of memory is required. The limit is 17592186044416 MiB.
xz: Decompression will need 9 MiB of memory.
100 % 5182.4 KiB / 45.1 MiB = 0.112 707 KiB/s 1:05
So with -6, the decompression memory requirement can far exceed the size of
the compressed file itself.
Contrast that with setting the dictionary size explicitly (I know this
benchmark is extreme).
The useful upper limit on the dictionary size is theoretically the size of
the file to be compressed; the 3/4 factor here is entirely arbitrary and only
decreases the memory requirement a bit.
for i in 1 2 3 4 5; do cat grep-2.11.tar >> dummy; \
  XZ_OPT=--lzma2=dict=$(du -h dummy | awk '{ printf "%dMiB", $1 / 4 * 3 }') \
  xz -vv < dummy > /dev/null; done; rm dummy
xz: Filter chain: --lzma2=dict=6MiB,lc=3,lp=0,pb=2,mode=normal,nice=64,mf=bt4,depth=0
xz: 75 MiB of memory is required. The limit is 17592186044416 MiB.
xz: Decompression will need 7 MiB of memory.
100 % 1118.2 KiB / 9240.0 KiB = 0.121 748 KiB/s 0:12
xz: Filter chain: --lzma2=dict=14MiB,lc=3,lp=0,pb=2,mode=normal,nice=64,mf=bt4,depth=0
xz: 167 MiB of memory is required. The limit is 17592186044416 MiB.
xz: Decompression will need 15 MiB of memory.
100 % 1114.1 KiB / 18.0 MiB = 0.060 752 KiB/s 0:24
xz: Filter chain: --lzma2=dict=21MiB,lc=3,lp=0,pb=2,mode=normal,nice=64,mf=bt4,depth=0
xz: 265 MiB of memory is required. The limit is 17592186044416 MiB.
xz: Decompression will need 22 MiB of memory.
100 % 1115.5 KiB / 27.1 MiB = 0.040 739 KiB/s 0:37
xz: Filter chain: --lzma2=dict=27MiB,lc=3,lp=0,pb=2,mode=normal,nice=64,mf=bt4,depth=0
xz: 322 MiB of memory is required. The limit is 17592186044416 MiB.
xz: Decompression will need 28 MiB of memory.
100 % 1116.8 KiB / 36.1 MiB = 0.030 752 KiB/s 0:49
xz: Filter chain: --lzma2=dict=34MiB,lc=3,lp=0,pb=2,mode=normal,nice=64,mf=bt4,depth=0
xz: 389 MiB of memory is required. The limit is 17592186044416 MiB.
xz: Decompression will need 35 MiB of memory.
100 % 1118.2 KiB / 45.1 MiB = 0.024 751 KiB/s 1:01
Here, even after the input grows to five times its original size, the
compressed file size stays the same (within 1%).
Perhaps tar should learn to set the xz dictionary size to the size of the
.tar when using -J?
That would be the most efficient way to compress without wasting memory.
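In the meantime, the idea can be approximated in shell: size the LZMA2
dictionary to the input file, rounded up to a whole MiB, while keeping the
-6e match-finder settings via the preset= option inside --lzma2. A sketch,
with "input.tar" as a placeholder for the tarball to compress:

```shell
#!/bin/sh
# Sketch: compress with an LZMA2 dictionary rounded up to the input size,
# keeping the -6e settings via preset=6e.
# "input.tar" stands in for the real tarball.
seq 1 200000 > input.tar                     # stand-in data for the example
bytes=$(wc -c < input.tar)
dict=$(( (bytes + 1048575) / 1048576 ))MiB   # round up to a whole MiB
xz --lzma2=preset=6e,dict="$dict" --stdout input.tar > input.tar.xz
xz --list -vv input.tar.xz | grep 'Memory needed'
```

Note xz requires the dictionary to be at least 4 KiB, so very small inputs
would need a floor on the computed value; for a multi-MiB .tar the rounding
above is enough.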
Gilles