bug#48680: "zgrep -q" failing with some large files
From: David Yoder
Subject: bug#48680: "zgrep -q" failing with some large files
Date: Wed, 26 May 2021 18:02:25 +0000
I've run into a problem with zgrep -q. On some large bz2 compressed files it
returns false/error for a search that should have returned true. bzgrep and
"bzip2 -cd | grep -q" both work as expected.
Either "-q" or "-l" is required to show the problem. I suspect that grep exits
at the first match, so that the decompressor gets SIGPIPE on its next write. But
I don't know why the behavior differs between zgrep and
"bzip2 -cd <file> | grep -q".
There is some minimum size for the compressed file to show the problem. The
attached file is the shortest file with which I could duplicate the problem.
It is 1024 lines of the following text, compressed with bzip2:
The quick brown fox jumped over the lazy dog. The quick brown fox jumped over
the lazy dog. The quick brown fox jumped over the lazy dog. The quick brown fox
jumped over the lazy dog. The quick brown fox jumped over the lazy dog. The
quick brown fox jumped over the lazy dog. The quick brown fox jumped over the
lazy dog. The quick brown fox jumped over the lazy dog.
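For anyone without the attachment, a test file along these lines can probably be
regenerated. This is only a sketch: it assumes the text above is one long line of
eight sentences repeated 1024 times, and make_synthetic_log is a hypothetical
helper name. The result may not be byte-identical to the attached file, so it is
not guaranteed to trigger the problem.

```shell
# Hypothetical helper: write 1024 copies of the eight-sentence line to a file.
# Assumption: the excerpt above is a single line wrapped by the mail client.
make_synthetic_log() {
    out=${1:-synthetic.log}
    sentence='The quick brown fox jumped over the lazy dog.'

    # Build one line made of eight sentences separated by single spaces.
    line=$sentence
    n=1
    while [ "$n" -lt 8 ]; do
        line="$line $sentence"
        n=$((n + 1))
    done

    # Emit the line 1024 times.
    i=0
    while [ "$i" -lt 1024 ]; do
        printf '%s\n' "$line"
        i=$((i + 1))
    done > "$out"
}
```

Running "make_synthetic_log synthetic.log" and then "bzip2 -k synthetic.log"
should leave both synthetic.log and synthetic.log.bz2 in the current directory.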
Here is a failing zgrep that should have returned true (0) but instead returns
false (141).
>zgrep -q fox synthetic.log.bz2 ; echo $?
141
vl-dyoder-ecdca:~>zgrep --version
zgrep (gzip) 1.10
Copyright (C) 2010-2018 Free Software Foundation, Inc.
This is free software. You may redistribute copies of it under the terms of
the GNU General Public License https://www.gnu.org/licenses/gpl.html.
There is NO WARRANTY, to the extent permitted by law.
Written by Jean-loup Gailly.
This also fails when used within find and prints a bit more information about
the signal, SIGPIPE:
>find . -maxdepth 1 -name synthetic.log.bz2 -exec zgrep -q fox {} \; -print
find: 'zgrep' terminated by signal 13
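The two reports agree with each other: a shell reports a command killed by
signal N as exit status 128 + N, and 141 - 128 = 13, which is SIGPIPE on Linux.
A small sketch for decoding such a status (signal_from_status is a hypothetical
helper name, not part of gzip or grep):

```shell
# Hypothetical helper: name the signal behind a shell exit status above 128.
signal_from_status() {
    status=$1
    if [ "$status" -gt 128 ]; then
        # "kill -l" accepts an exit status and prints the matching signal name.
        kill -l "$status"
    else
        echo "no signal, plain exit status $status"
    fi
}

signal_from_status 141   # prints PIPE
```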
Here are two examples showing expected behavior. Both bzgrep and "bzip2 -cd |
grep -q" return true (0).
>bzgrep -q fox synthetic.log.bz2 ; echo $?
0
>bzip2 -cd synthetic.log.bz2 | grep -q fox; echo $?
0
Also, the identical file either uncompressed or compressed with gzip works as
expected with zgrep:
>zgrep -q fox synthetic.log; echo $?
0
>zgrep -q fox synthetic.log.gz; echo $?
0
zgrep seems to use a more complicated version of "bzip2 -cd <file> | grep",
which works as expected. So perhaps the rather complicated pipe operations in
zgrep are related. If so, perhaps the shell I'm using matters:
>bash --version
GNU bash, version 4.3.48(1)-release (x86_64-suse-linux-gnu)
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software; you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
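One way to look further into the pipe behavior is to print the exit status of
every stage of the plain pipeline, not just the last one. Without pipefail, bash
reports only grep's status for "bzip2 -cd <file> | grep -q", so a decompressor
killed by SIGPIPE would go unnoticed there, whereas a wrapper that inspects the
decompressor's status would see it. Whether that is what happens inside zgrep is
an assumption, not something verified here. The sketch below needs bash (it uses
PIPESTATUS), and pipeline_statuses is a hypothetical helper name.

```shell
# Hypothetical helper (bash only): run "<producer command> | grep -q <pattern>"
# and print the exit status of each stage, producer first.
pipeline_statuses() {
    local pattern=$1
    shift
    "$@" | grep -q -- "$pattern"
    # PIPESTATUS must be read immediately after the pipeline finishes.
    echo "${PIPESTATUS[*]}"
}
```

For the case in this report it would be called as
"pipeline_statuses fox bzip2 -cd synthetic.log.bz2". A first field of 141 would
mean bzip2 was terminated by SIGPIPE even though grep found a match.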
Attachment: synthetic.log.bz2