bug#17420: [PATCH] grep: always convert fgrep to grep
From: Norihiro Tanaka
Subject: bug#17420: [PATCH] grep: always convert fgrep to grep
Date: Sat, 10 May 2014 00:15:59 +0900
Paul Eggert wrote:
> but as long as significant slowdowns are rare, that's OK.
That's fair. However, I am concerned about two cases that get slower.
$ echo a | env LC_ALL=C time -p src/grep -Ff /usr/share/dict/linux.words
real 1.34 user 1.26 sys 0.07
$ echo a | env LC_ALL=C time -p src/grep -f /usr/share/dict/linux.words
real 56.79 user 6.33 sys 48.79
$ yes /usr/share/dict/linux.words | head -100 | xargs cat > k
$ printf 'Python\nPerl\nPascall\nProlog\nPHP\nRuby\nHaskell\nLisp\nScheme\n' |
env LC_ALL=C time -p src/grep -Ff - k >/dev/null
real 1.84 user 1.78 sys 0.05
$ printf 'Python\nPerl\nPascall\nProlog\nPHP\nRuby\nHaskell\nLisp\nScheme\n' |
env LC_ALL=C time -p src/grep -f - k >/dev/null
real 2.26 user 2.19 sys 0.06
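Timings like these are only comparable if both matchers select the same lines. The following is a minimal sketch of such a check, not part of the grep tree: the function name `same_matches` and the `GREP` variable are hypothetical, and `GREP` defaults to the `grep` found on `PATH` rather than `src/grep`.

```shell
# Hypothetical helper, not part of the grep tree.
# GREP defaults to the grep on PATH; set GREP=src/grep beforehand
# to check a freshly built binary instead.
GREP=${GREP:-grep}

# same_matches PATTERN_FILE INPUT_FILE
# Succeeds when the fgrep matcher (-F) and the default grep matcher
# select the same lines of INPUT_FILE for the patterns in PATTERN_FILE.
same_matches() {
    # cksum keeps the comparison small even when the output is large.
    f_out=$(LC_ALL=C "$GREP" -F -f "$1" "$2" | cksum)
    g_out=$(LC_ALL=C "$GREP" -f "$1" "$2" | cksum)
    [ "$f_out" = "$g_out" ]
}
```

The check passes only when the patterns contain no regular-expression metacharacters that change the match, which is the case for the keyword lists used here.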
Currently, the Commentz-Walter algorithm in KWset is used only by the
fgrep matcher. Therefore, where it is effective, the fgrep matcher is
faster than the grep matcher. In addition, the Commentz-Walter algorithm
consumes less memory than the DFA.
However, the case below is very slow, because the Commentz-Walter
algorithm in KWset does not implement the Galil rule yet.
$ yes jjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjj | head -10000000 >k
$ printf 'kjjjjjjjjjjjjjjjjjjj\nq\n' | env LC_ALL=C time -p src/grep -Ff - k
real 22.67 user 18.31 sys 3.64
$ printf 'kjjjjjjjjjjjjjjjjjjj\nq\n' | env LC_ALL=C time -p src/grep -f - k
real 1.09 user 1.03 sys 0.05
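The cost of the `-Ff` run can be pictured with a toy model. This is a rough sketch, not the kwset code: the function name `count_compares` is hypothetical, and it assumes a right-to-left scan that advances only one position per attempt and remembers nothing about characters it has already matched. The Galil rule is one known way to avoid that kind of re-reading.

```shell
# count_compares PATTERN TEXT
# Toy model (an assumption, not kwset): count the character comparisons
# made by a right-to-left scan that shifts one position per attempt.
count_compares() {
    awk -v p="$1" -v t="$2" 'BEGIN {
        m = length(p); n = length(t); c = 0
        # "last" is the text position aligned with the end of the pattern.
        for (last = m; last <= n; last++) {
            # Compare from the right until a mismatch or a full match.
            for (i = 0; i < m; i++) {
                c++
                if (substr(p, m - i, 1) != substr(t, last - i, 1)) break
            }
        }
        print c
    }'
}

count_compares kjjj jjjjjjjj   # prints 20: 5 attempts, 4 comparisons each
count_compares q jjjjjjjj      # prints 8: 1 comparison per position
```

In this model, a keyword that ends in a long run of `j` is re-read almost in full at every position of a line of `j`s, so the work grows with the keyword length times the input length.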
Thanks,
Norihiro