From: Paul Eggert
Subject: bug#22239: bug#22357: grep -f not only huge memory usage, but also huge time cost
Date: Tue, 20 Dec 2016 21:17:01 -0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.5.1

I installed the attached patches into grep master. These fix the performance
regressions noted at the start of Bug#22357. The related performance problems
noted in Bug#21763 seem to be fixed too, I expect because of Norihiro Tanaka's
recent changes, so I'll boldly close both bug reports.

To some extent the attached patches restore the old behavior for grep -F when
grep is given two or more patterns. The patches don't change the underlying
algorithms; they merely use a different heuristic to decide whether to use the
-F matcher. Although I wouldn't be surprised if the patches hurt performance in
some cases, I didn't uncover any such cases in my performance testing, which I
admit mostly consisted of running the examples in the above-mentioned bug
reports.

I'll leave Bug#22239 open, as I get the following performance figures
(user+system CPU time) for the Bug#22239 benchmark, where list.txt is created by
"aspell dump master | head -n 100000 >list.txt", and the grep commands all use
the operands "-F -f list.txt /etc/passwd" in the en_US.utf8 locale on Fedora 24
x86-64.

  no -i     -i   grep version
   0.25   0.33   2.16
   0.26  10.95   2.21
   0.11   2.90*  current master (including attached patches)

In the C locale, the current grep master is always significantly faster than
grep 2.16 or 2.21 on the benchmark, so the only significant problem is the
number marked "*". I ran the benchmarks on an AMD Phenom II X4 910e.
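
For anyone who wants to repeat the measurement, here is a rough sketch of how the user+system CPU figures could be collected. It is not the script used for the numbers above; the helper names cpu_time and benchmark are made up for illustration, and it assumes list.txt has already been created with "aspell dump master | head -n 100000 >list.txt" and that the en_US.utf8 locale is installed.

```python
import os
import subprocess


def cpu_time(argv, env=None):
    """Return user+system CPU seconds consumed by running argv as a child."""
    before = os.times()
    # Output is discarded; only the CPU cost of the child matters here.
    subprocess.run(argv, stdout=subprocess.DEVNULL,
                   stderr=subprocess.DEVNULL, env=env, check=False)
    after = os.times()
    return ((after.children_user - before.children_user)
            + (after.children_system - before.children_system))


def benchmark(grep="grep", locale="en_US.utf8"):
    """Time 'grep -F -f list.txt /etc/passwd' with and without -i.

    'grep' is the path of the binary under test (hypothetical parameter),
    so different builds can be compared by passing different paths.
    """
    env = dict(os.environ, LC_ALL=locale)
    operands = ["-F", "-f", "list.txt", "/etc/passwd"]
    results = {}
    for label, extra in (("no -i", []), ("-i", ["-i"])):
        results[label] = cpu_time([grep, *extra, *operands], env)
    return results
```

Calling benchmark("/path/to/grep") once per grep build gives one row of the table; passing locale="C" gives the C-locale comparison. A single run is noisy, so repeating each measurement a few times is advisable.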
0001-grep-simplify-line-counting-in-patterns.patch
Description: Text Data
0002-grep-simplify-matcher-configuration.patch
Description: Text Data
0003-grep-fix-performance-with-multiple-patterns.patch
Description: Text Data