bug-grep
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#44754: Extreme performance degradation in GNU grep 3.4 / 3.6


From: Frank Heckenbach
Subject: bug#44754: Extreme performance degradation in GNU grep 3.4 / 3.6
Date: Fri, 20 Nov 2020 06:20:55 +0100

Hi,

I have a use case where I run grep with a large number of search
patterns on a large text file. It works well with grep-3.3, but with
grep-3.4 it quickly burned through GBs of memory and almost locked
up my system due to swapping.

To avoid attaching those large files, I could mostly reproduce the
effects like this:

ulimit -d 5000000  # avoid system lockup due to excessive swapping
export LC_ALL=C    # make sure no Unicode case conversions are needed

% time  ./grep-3.3 -Fwif <(seq 300000 | tr 0-9 A-J) <<<foo

real    0m0.054s
user    0m0.048s
sys     0m0.012s

% time  ./grep-3.4 -Fwif <(seq 30000 | tr 0-9 A-J) <<<foo
./grep-3.4: Memory exhausted
Aborted

real    0m1.291s
user    0m0.696s
sys     0m0.599s

% time  ./grep-3.6 -Fwif <(seq 300000 | tr 0-9 A-J) <<<foo

real    0m13.162s
user    0m12.955s
sys     0m0.211s

grep-3.3 behaves well, even with much larger number of patterns.
Time seems to grow linearly, and memory usage is constant.

grep-3.4 behaves the worst of these 3 versions. Even with just 30000
patterns it exceeds the ulimit of 5 GB.

grep-3.6 behaves a bit better than 3.4, but still bad. Time seems to
be quadratic in the number of patterns, and though memory usage in
this case seems to be almost constant, in my actual use case it also
runs out of memory where grep-3.3 works well with just a few 100 MB
used.

Without "-i", grep-3.4 seems to run as fast as grep-3.3, but
grep-3.6 is almost as slow as with "-i".

So there might actually be two different issues here, one that
affects 3.4 with "-i" and one that affects 3.6 with or without "-i".

Regards,
Frank





reply via email to

[Prev in Thread] Current Thread [Next in Thread]