bug#17700: [PATCH] dfa: speed-up for a pattern that many atoms are catenated
Thu, 05 Jun 2014 09:48:41 -0700
Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.5.0
On 06/05/2014 04:32 AM, Norihiro Tanaka wrote:
> + memchr (cp, *lookfor, lenin - (cp - lookin));
> + if (!cp)
Thanks, but this part can't be right, as memchr's result is discarded.
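
To illustrate the problem: memchr does not modify its arguments, so a bare
call leaves cp unchanged and the following "if (!cp)" test never sees the
search result. A minimal sketch of a call that keeps the result is below;
the helper name next_candidate and its parameter list are hypothetical and
are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not the code from the patch: return a pointer to
   the first occurrence of byte C at or after CP within the buffer
   LOOKIN[0..LENIN), or NULL if there is none.  */
static char const *
next_candidate (char const *lookin, size_t lenin, char const *cp, char c)
{
  /* CP must point into the buffer (or one past its end).  */
  assert (lookin <= cp && cp <= lookin + lenin);

  /* The return value is what carries the answer.  Calling memchr and
     ignoring its result would leave CP exactly where it was.  */
  return memchr (cp, c, lenin - (cp - lookin));
}
```

A caller would write "cp = next_candidate (lookin, lenin, cp, *lookfor);"
and only then test "if (!cp)".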
It seems to me that much of the performance benefit comes from using a
faster implementation of strstr, and that the DFA code will be better
off if it simply uses the system strstr rather than rolling its own.
(The DFA code predates the standardization of strstr, which is why it
has its own implementation.)
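
As a rough sketch of what "simply uses the system strstr" can look like,
assuming NUL-terminated strings: the wrapper name find_in below is
hypothetical and is not the function name used in dfa.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper, for illustration only: return a pointer to the
   first occurrence of LOOKFOR in LOOKIN, or NULL if it does not occur.
   Both arguments must be NUL-terminated strings.  */
static char const *
find_in (char const *lookin, char const *lookfor)
{
  assert (lookin && lookfor);

  /* Delegate to the C library, which is free to use a faster search
     algorithm than a hand-written byte-by-byte loop.  */
  return strstr (lookin, lookfor);
}
```

With this in place a caller needs no search loop of its own; for example,
find_in ("hello world", "wor") returns a pointer to "world".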
I installed the attached patch to do that and got a big speedup:
$ printf '%08192d\n' 0 | time -p src/old/grep -f - /dev/null
$ printf '%08192d\n' 0 | time -p src/grep -f - /dev/null
Could you please look at the remaining part of your patch and see
whether it's a win if it's merged to what's now installed? Thanks.
PS. Aharon, I assume this'll affect Gawk, in that you'll need to
provide a strstr if you want to be portable to ancient systems that lack
it. strstr was standardized in C89 so it'd have to be a pretty ancient
system, and it may be better just to let this slide.