bug#22357: grep -f not only huge memory usage, but also huge time cost
From: Norihiro Tanaka
Subject: bug#22357: grep -f not only huge memory usage, but also huge time cost
Date: Thu, 22 Dec 2016 08:04:42 +0900
On Tue, 20 Dec 2016 21:17:01 -0800
Paul Eggert <address@hidden> wrote:
> I installed the attached patches into grep master. These fix the
> performance regressions noted at the start of Bug#22357. I see that
> the related performance problems noted in Bug#21763 seem to be fixed
> too, I expect because of Norihiro Tanaka's recent changes, so I'll
> boldly close both bug reports.
>
> To some extent the attached patches restore the old behavior for grep
> -F, when grep is given two or more patterns. The patch doesn't change
> the underlying algorithms; it merely uses a different heuristic to
> decide whether to use the -F matcher. Although I wouldn't be
> surprised if the attached patches hurt performance in some cases, I
> didn't uncover any such cases in my performance testing, which I
> admit mostly consisted of running the examples in the abovementioned
> bug reports.
>
> I'll leave Bug#22239 open, as I get the following performance figures
> (user+system CPU time) for the Bug#22239 benchmark, where list.txt is
> created by "aspell dump master | head -n 100000 >list.txt", and the
> grep commands all use the operands "-F -f list.txt /etc/passwd" in
> the en_US.utf8 locale on Fedora 24 x86-64.
>
>   no -i     -i      grep version
>   0.25      0.33    2.16
>   0.26     10.95    2.21
>   0.11      2.90*   current master (including attached patches)
>
> In the C locale, the current grep master is always significantly
> faster than grep 2.16 or 2.21 on the benchmark, so the only
> significant problem is the number marked "*". I ran the benchmarks on
> an AMD Phenom II X4 910e.
Thanks.
BTW, are you aware of an extreme slowdown in the following case after
the third patch?
    yes $(printf %040d 0) | head -10000000 >inp
    printf '0\n1\n' >pat
    env LC_ALL=C src/grep -w -f pat inp
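
For anyone who wants to try this without a build tree, here is a reduced sketch of the same case. It is not the exact test: it assumes 100000 input lines rather than 10000000, and it runs whatever grep is first on PATH instead of src/grep, so the elapsed time it reports will depend on the installed version. The -c option and the date-based timing are additions for illustration only.

```shell
#!/bin/sh
# Reduced variant of the reproduction above.
# Assumptions (not in the original report): 100000 lines instead of
# 10000000, and the grep on PATH instead of src/grep from a build tree.

# Every input line is forty '0' characters.
yes "$(printf %040d 0)" | head -n 100000 >inp

# Two one-character patterns, one per line.
printf '0\n1\n' >pat

start=$(date +%s)
# With -w, a '0' surrounded by other '0's is not a whole word, and '1'
# never occurs, so no line matches. grep exits with status 1 when
# nothing matches, hence the "|| true".
count=$(env LC_ALL=C grep -c -w -f pat inp || true)
end=$(date +%s)

echo "matching lines: $count"
echo "elapsed seconds: $((end - start))"
```

Raising the line count back toward 10000000 and pointing the command at src/grep should give the original test.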