bug-grep

bug#24009: [PATCH] grep: use fastmap in regex


From: Jens Schleusener
Subject: bug#24009: [PATCH] grep: use fastmap in regex
Date: Sat, 16 Jul 2016 22:06:53 +0200 (CEST)
User-agent: Alpine 2.20 (LSU 67 2015-01-07)

Hi Norihiro.

sed and gawk use fastmap in regex, but grep does not.  By using fastmap,
I expect grep to speed up for patterns where regex is used.

before:
$ time -p env LC_ALL=ja_JP.eucjp src/grep '\([a-b]\)\1' k
real 7.83
user 7.62
sys 0.07

after:
$ time -p env LC_ALL=ja_JP.eucjp src/grep '\([a-b]\)\1' k
real 0.46
user 0.38
sys 0.07

However, if grep uses fastmap, it fails the case-fold-titlecase test.  It
means that grep's behavior differs from sed and gawk, as they use fastmap,
although it seems to be a bug in regex.
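
For readers unfamiliar with the term: in the GNU regex API, a fastmap is a 256-entry table of bytes that can start a match, which lets the searcher skip positions that cannot match. The following is only a minimal sketch of how a caller can ask for one, assuming glibc's `<regex.h>`; it is not the patch under discussion, and `search_with_fastmap` is a hypothetical helper name.

```c
#define _GNU_SOURCE   /* glibc: expose the GNU re_* API and struct field names */
#include <assert.h>
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from the patch): returns the offset of the first
   match of PATTERN in TEXT, -1 if there is no match, -2 on error.  */
int
search_with_fastmap (const char *pattern, const char *text)
{
  assert (pattern != NULL && text != NULL);

  struct re_pattern_buffer buf;
  memset (&buf, 0, sizeof buf);

  /* A non-NULL fastmap with room for one entry per byte value tells the
     matcher to build and use the table.  */
  buf.fastmap = malloc (256);
  if (buf.fastmap == NULL)
    return -2;

  /* Assumption: plain POSIX basic syntax, chosen here for illustration.  */
  re_set_syntax (RE_SYNTAX_POSIX_BASIC);

  /* re_compile_pattern returns NULL on success, else an error message.  */
  if (re_compile_pattern (pattern, strlen (pattern), &buf) != NULL)
    {
      regfree (&buf);   /* releases the fastmap we allocated */
      return -2;
    }

  int len = (int) strlen (text);
  int pos = (int) re_search (&buf, text, len, 0, len, NULL);

  regfree (&buf);       /* releases the compiled pattern and the fastmap */
  return pos;
}
```

Calling `search_with_fastmap ("\\([a-b]\\)\\1", "xyzaab")` searches for the same back-reference pattern used in the timings above. Leaving `buf.fastmap` as NULL would run the same search without the table.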

Wow, that is a spectacular speed improvement. Since I use grep with regex patterns heavily in some of my scripts, I could not resist running some first simple tests (including your example pattern with a back reference). The non-representative results using grep 2.25 show a gain of a factor of 5-10, while the unpatched self-compiled grep 2.25 itself was already a factor of 1.4-2.8 faster than the grep 2.16 offered by the OS (OpenSUSE Leap 42.1). At least in my tests all the grep outputs were identical.

By the way, I had to remove one of the two "=" in your patch; otherwise gcc issued an error (but caution, I am a C layman).

Regards

Jens
