bug#67022: Gzip decompression can be 60% faster using zlib's CRC32
From: Young Mo Kang
Subject: bug#67022: Gzip decompression can be 60% faster using zlib's CRC32
Date: Thu, 9 Nov 2023 12:40:24 -0500
User-agent: Mozilla Thunderbird
Hello,
I have noticed that GNU Gzip's CRC32 calculation is the main bottleneck
in decompression, and that decompression runs more than 60% faster if we
replace it with the crc32 function from zlib.
I tested the decompression speed on a tar.gz file of the Linux source
code before and after replacing the CRC32 computation. On an AMD 7735HS
system, I get
GNU Gzip unmodified
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:05.11
GNU Gzip with CRC32 from zlib
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:03.16
I saw an even larger improvement when testing on an Apple Silicon M1
system.
GNU Gzip unmodified
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:06.83
GNU Gzip with CRC32 from zlib
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:03.72
Since both GNU Gzip and zlib are written by the same authors, I was
wondering whether GNU Gzip could share zlib's CRC32 calculation and obtain
this performance gain. I am not sure whether there would be a license
issue, though.
The following bash script should reproduce the result:
```
# download GNU Gzip and zlib
wget -O- https://ftp.gnu.org/gnu/gzip/gzip-1.13.tar.gz | tar xzf -
wget -O- https://zlib.net/zlib-1.3.tar.gz | tar xzf -
# download linux source code as a test file for decompression speed
wget -O- https://cdn.kernel.org/pub/linux/kernel/v6.x/linux-6.6.1.tar.xz \
  | xz -d | gzip > linux.tar.gz
# compile zlib
cd zlib-1.3
CFLAGS="-O2 -g" ./configure --static && make -j
cd ..
# compile GNU Gzip
cd gzip-1.13
CFLAGS="-O2 -g" ./configure && make -j
# measure decompression speed
/usr/bin/time -v ./gzip -d < ../linux.tar.gz > linux.tar 2> ../gzip1.time
# use crc32 from zlib
cat > util.diff << EOF
@@ -27,6 +27,7 @@
#include <stdlib.h>
#include <errno.h>
+#include "crc32.h"
#include "tailor.h"
#include "gzip.h"
#include <dirname.h>
@@ -136,25 +137,14 @@ copy (int in, int out)
ulg
updcrc (uch const *s, unsigned n)
{
- register ulg c; /* temporary variable */
-
- if (s == NULL) {
- c = 0xffffffffL;
- } else {
- c = crc;
- if (n) do {
- c = crc_32_tab[((int)c ^ (*s++)) & 0xff] ^ (c >> 8);
- } while (--n);
- }
- crc = c;
- return c ^ 0xffffffffL; /* (instead of ~c for 64-bit machines) */
+ crc = crc32(crc, s, n);
}
/* Return a current CRC value. */
ulg
getcrc ()
{
- return crc ^ 0xffffffffL;
+ return crc;
}
#ifdef IBM_Z_DFLTCC
EOF
patch < util.diff util.c
# create header file
cat > crc32.h << EOF
#pragma once
unsigned long crc32(unsigned long crc, const unsigned char *buf,
unsigned int len);
EOF
# copy crc32 object file from zlib
cp ../zlib-1.3/crc32.o .
# re-compile GNU Gzip
gcc -O2 -g -c util.c -Ilib
gcc -O2 -g *.o lib/libgzip.a -o gzip
# measure decompression speed
/usr/bin/time -v ./gzip -d < ../linux.tar.gz > linux.tar 2> ../gzip2.time
# print out time difference
cd ..
echo
echo "GNU Gzip unmodified"
grep Elapsed gzip1.time
echo "GNU Gzip with CRC32 from zlib"
grep Elapsed gzip2.time
```
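For anyone wondering why the patch can drop the `^ 0xffffffffL` in getcrc(), here is a minimal sketch of the two calling conventions side by side. The names `gzip_style_updcrc` and `zlib_style_crc32` are hypothetical stand-ins written for illustration. The second one only mimics the interface of zlib's crc32() (start from 0, pass the previous return value back in, get a finalized CRC out) using the same byte-at-a-time loop; it is not zlib's actual implementation and has none of its speed.

```c
#include <stddef.h>
#include <stdint.h>

/* Lookup table for the reflected CRC-32 polynomial 0xEDB88320. */
static uint32_t crc_table[256];
static int crc_table_ready = 0;

static void make_crc_table(void)
{
    for (uint32_t i = 0; i < 256; i++) {
        uint32_t c = i;
        for (int k = 0; k < 8; k++)
            c = (c & 1) ? (0xEDB88320u ^ (c >> 1)) : (c >> 1);
        crc_table[i] = c;
    }
    crc_table_ready = 1;
}

/* gzip-style convention (mirrors the loop removed by the patch):
   *state holds the raw shift register, s == NULL resets it, and the
   return value is the register with the final inversion applied. */
static uint32_t gzip_style_updcrc(uint32_t *state,
                                  const unsigned char *s, size_t n)
{
    uint32_t c;

    if (!crc_table_ready)
        make_crc_table();
    if (s == NULL) {
        c = 0xffffffffu;
    } else {
        c = *state;
        while (n--)
            c = crc_table[(c ^ *s++) & 0xff] ^ (c >> 8);
    }
    *state = c;
    return c ^ 0xffffffffu;
}

/* zlib-style convention (hypothetical stand-in, NOT zlib's code):
   the caller passes in and gets back the finalized CRC, and a NULL
   buffer yields the initial value 0. The inversion happens on entry
   and on exit, so the caller never applies it. */
static uint32_t zlib_style_crc32(uint32_t crc,
                                 const unsigned char *buf, size_t len)
{
    uint32_t c;

    if (buf == NULL)
        return 0;
    if (!crc_table_ready)
        make_crc_table();
    c = crc ^ 0xffffffffu;
    while (len--)
        c = crc_table[(c ^ *buf++) & 0xff] ^ (c >> 8);
    return c ^ 0xffffffffu;
}
```

With either convention, the nine ASCII bytes "123456789" give the standard CRC-32 check value 0xCBF43926, whether they are fed in one call or split across several. Since the zlib-style value stored in crc is already finalized, getcrc() in the patched util.c can return it unchanged.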