bug-grep

bug#17420: [PATCH] grep: always convert fgrep to grep


From: Norihiro Tanaka
Subject: bug#17420: [PATCH] grep: always convert fgrep to grep
Date: Sat, 10 May 2014 00:15:59 +0900

Paul Eggert wrote:
> but as long as significant slowdowns are rare, that's OK.

That sounds fine.  However, I am concerned about two cases that get slower.

$ echo a | env LC_ALL=C time -p src/grep -Ff /usr/share/dict/linux.words
    real 1.34       user 1.26       sys 0.07
$ echo a | env LC_ALL=C time -p src/grep -f /usr/share/dict/linux.words
    real 56.79      user 6.33       sys 48.79

$ yes /usr/share/dict/linux.words | head -100 | xargs cat > k
$ printf 'Python\nPerl\nPascall\nProlog\nPHP\nRuby\nHaskell\nLisp\nScheme\n' |
  env LC_ALL=C time -p src/grep -Ff - k >/dev/null
    real 1.84       user 1.78       sys 0.05
$ printf 'Python\nPerl\nPascall\nProlog\nPHP\nRuby\nHaskell\nLisp\nScheme\n' |
  env LC_ALL=C time -p src/grep -f - k >/dev/null
    real 2.26       user 2.19       sys 0.06
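
For anyone who wants to sanity-check the comparison on a small scale,
here is a minimal sketch.  It assumes an installed `grep` on PATH
rather than src/grep, and it uses a tiny made-up input file instead of
the concatenated word list.  It only checks that both matchers select
the same lines for these metacharacter-free patterns; it says nothing
about timing.

```shell
# Small stand-in for the word list (contents are illustrative only).
input=$(mktemp)
printf 'Perl\nPython\nOCaml\nHaskell\nperl\nRuby on Rails\n' > "$input"

# Same pattern list as in the timing runs above.
patterns=$(printf 'Python\nPerl\nPascall\nProlog\nPHP\nRuby\nHaskell\nLisp\nScheme\n')

# Count matching lines with the fixed-string matcher (-F) and the default one.
# "|| true" keeps the script going even if grep exits 1 for "no match".
fixed_count=$(printf '%s\n' "$patterns" | LC_ALL=C grep -c -Ff - "$input" || true)
regex_count=$(printf '%s\n' "$patterns" | LC_ALL=C grep -c -f - "$input" || true)

echo "fixed=$fixed_count regex=$regex_count"
rm -f "$input"
```

Both counts should agree, since none of the patterns contains a
character that is special in a basic regular expression.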

Currently, Beate Commentz-Walter's algorithm in kwset is used only by
the fgrep matcher.  Therefore, in cases where it is effective, the fgrep
matcher is faster than the grep matcher.  In addition, the
Commentz-Walter algorithm consumes less memory than the DFA.

However, the case below is very slow, because the Commentz-Walter
algorithm in kwset does not implement the Galil rule yet.

  $ yes jjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjj | head -10000000 >k
  $ printf 'kjjjjjjjjjjjjjjjjjjj\nq\n' | env LC_ALL=C time -p src/grep -Ff - k
    real 22.67      user 18.31      sys 3.64
  $ printf 'kjjjjjjjjjjjjjjjjjjj\nq\n' | env LC_ALL=C time -p src/grep -f - k
    real 1.09       user 1.03       sys 0.05
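
A rough way to see where the time goes is the simplified model below.
It is not kwset's code, only a sketch under stated assumptions: the
keyword is compared right to left at every alignment, and the window
advances by one character each time (the one-character keyword `q`
limits the shift to the shortest keyword length).  Nothing learned from
the previous alignment is reused, which is the kind of re-comparison the
Galil rule is meant to avoid.  The text length of 200 and the variable
names are made up for illustration.

```shell
# Count character comparisons for a backward scan that shifts by one.
comparisons=$(awk 'BEGIN {
  pattern = "kjjjjjjjjjjjjjjjjjjj"      # keyword from the test above
  m = length(pattern)
  n = 200                               # illustrative text length
  text = ""
  for (i = 0; i < n; i++) text = text "j"

  count = 0
  # e is the 1-based text position aligned with the last pattern character.
  for (e = m; e <= n; e++) {
    # Compare right to left until a mismatch (here always at the "k").
    for (k = 0; k < m; k++) {
      count++
      if (substr(text, e - k, 1) != substr(pattern, m - k, 1)) break
    }
  }
  print count
}')

echo "comparisons=$comparisons"
```

In this model every alignment re-reads the same 19 matching `j`
characters before failing on `k`, so the work is about 20 comparisons
per input character instead of about one.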

Thanks,
Norihiro





