--- Begin Message ---
Subject: |
[PATCH] grep: a kwset matcher not work in a grep matcher |
Date: |
Sat, 23 Mar 2019 08:06:35 +0900 |
A kwset matcher is not built in a grep matcher after token re-order is
introduced in commit 5c7a0371823876cca7a1347fa09ca26bbbff0c98 in dfa.
It caused performance degradation in some typical cases. This bug is
introduced in grep-3.2.
DFAMUST() does not work if tokens which are parsed in dfa matcher are
re-ordered. Therefore, change as it is called after parse and before
tokens re-order.
BTW, this change does not affect programs that do not use DFAMUST(),
such as sed or gawk.
$ yes $(printf '%040d' 0) | head -10000000 >inp
$ grep-2.2/src/grep 01.2 inp
real 1.61
user 1.53
sys 0.07
$ grep-2.3/src/grep 01.2 inp
real 1.57
user 1.48
sys 0.08
$ grep-2.4/src/grep 01.2 inp
real 1.50
user 1.44
sys 0.05
$ grep-2.4.1/src/grep 01.2 inp
real 1.53
user 1.48
sys 0.05
$ grep-2.4.2/src/grep 01.2 inp
real 1.52
user 1.47
sys 0.04
$ grep-2.5.4/src/grep 01.2 inp
real 1.53
user 1.47
sys 0.05
$ grep-2.6/src/grep 01.2 inp
real 1.51
user 1.47
sys 0.04
$ grep-2.6.1/src/grep 01.2 inp
real 1.50
user 1.44
sys 0.05
$ grep-2.6.2/src/grep 01.2 inp
real 1.52
user 1.46
sys 0.05
$ grep-2.6.3/src/grep 01.2 inp
real 1.52
user 1.47
sys 0.05
$ grep-2.7/src/grep 01.2 inp
real 1.53
user 1.49
sys 0.04
$ grep-2.8/src/grep 01.2 inp
real 1.52
user 1.46
sys 0.05
$ grep-2.9/src/grep 01.2 inp
real 1.54
user 1.50
sys 0.04
$ grep-2.10/src/grep 01.2 inp
real 1.51
user 1.46
sys 0.05
$ grep-2.11/src/grep 01.2 inp
real 1.53
user 1.48
sys 0.05
$ grep-2.12/src/grep 01.2 inp
real 1.51
user 1.47
sys 0.03
$ grep-2.13/src/grep 01.2 inp
real 1.52
user 1.47
sys 0.03
$ grep-2.14/src/grep 01.2 inp
real 1.52
user 1.47
sys 0.04
$ grep-2.15/src/grep 01.2 inp
real 1.55
user 1.49
sys 0.05
$ grep-2.16/src/grep 01.2 inp
real 1.53
user 1.48
sys 0.04
$ grep-2.17/src/grep 01.2 inp
real 1.53
user 1.48
sys 0.05
$ grep-2.18/src/grep 01.2 inp
real 1.51
user 1.44
sys 0.06
$ grep-2.19/src/grep 01.2 inp
real 0.06
user 0.02
sys 0.04
$ grep-2.20/src/grep 01.2 inp
real 0.07
user 0.01
sys 0.05
$ grep-2.21/src/grep 01.2 inp
real 0.06
user 0.02
sys 0.04
$ grep-2.22/src/grep 01.2 inp
real 0.06
user 0.01
sys 0.05
$ grep-2.23/src/grep 01.2 inp
real 0.09
user 0.04
sys 0.05
$ grep-2.24/src/grep 01.2 inp
real 0.09
user 0.04
sys 0.04
$ grep-2.25/src/grep 01.2 inp
real 0.09
user 0.05
sys 0.04
$ grep-2.26/src/grep 01.2 inp
real 0.09
user 0.04
sys 0.05
$ grep-2.27/src/grep 01.2 inp
real 0.09
user 0.04
sys 0.04
$ grep-2.28/src/grep 01.2 inp
real 0.09
user 0.04
sys 0.04
$ grep-3.0/src/grep 01.2 inp
real 0.09
user 0.04
sys 0.04
$ grep-3.1/src/grep 01.2 inp
real 0.11
user 0.05
sys 0.06
$ grep-3.2/src/grep 01.2 inp
real 0.37
user 0.32
sys 0.04
$ grep-3.3/src/grep 01.2 inp
real 0.29
user 0.25
sys 0.04
Thanks,
Norihiro
0001-dfa-separate-parse-and-compile-phase.patch
Description: Text document
0001-grep-a-kwset-matcher-not-work-in-a-grep-matcher.patch
Description: Text document
--- End Message ---
--- Begin Message ---
Subject: |
Re: bug#34951: [PATCH] grep: a kwset matcher not work in a grep matcher |
Date: |
Thu, 19 Dec 2019 19:41:47 -0800 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.2.2 |
On 12/11/19 3:25 PM, Paul Eggert wrote:
> On 3/22/19 7:49 PM, Norihiro Tanaka wrote:
>> Missing a patch for dfa. Re-send correct patch file.
>
> Thanks, I installed the DFA-relevant parts of your proposed fix into Gnulib.
> (The grep parts still need doing.)
I finally got around to reviewing the grep parts, and installed them into 'grep'
master. Thanks again for the fix, and sorry about the delay. Closing the bug
report, as the original bug has been fixed (though we can still talk about what
name to give ptrdiff_t, in bug-gnulib perhaps).
I followed up with this NEWS entry:
A performance bug has been fixed for patterns like '01.2' that
cause grep to reorder tokens internally.
[Bug#34951 introduced in grep 3.2]
--- End Message ---