bug-grep

why is grep so slow?


From: Wacek Kusnierczyk
Subject: why is grep so slow?
Date: Fri, 24 Apr 2009 10:50:19 +0200
User-agent: Thunderbird 2.0.0.21 (X11/20090318)

i have a >1GB text file (say, input), and want to count lines matching
some pattern (say, '^>>').  using grep, i got the following timings:

    time (grep -c '^>>' input)
    # ~6m20s

    time (grep '^>>' input | wc -l)
    # ~5m20s

sed is much faster:

    time (sed -n '/^>>/p' input | wc -l)
    # ~0m5s

what's the difference between grep and sed that makes grep so much
slower here?

interestingly,

    time (grep -cP '^>>' input)
    # ~0m0.2s

it could be that grep buffers the lines before it outputs them, and this
causes the slowdown on large files, but then -P would not change it, would
it?  or does -P change not only the regex matching, but also how output
is written?

in all the examples above, the actual output (the line count) was correct.
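
A minimal sketch of how the count agreement could be checked on a small scale. The temporary file and its five lines are made up for illustration and are not the >1GB input from the timings above; only the three commands are taken from the post. The -P variant is left out because not every grep build supports it.

```shell
#!/bin/sh
# Hypothetical small-scale check: the sample file and its contents are
# invented for illustration, not the original >1GB input.
input=$(mktemp)
trap 'rm -f "$input"' EXIT

# Five lines, three of which start with '>>'.
# 'a >> b' contains '>>' but not at the start, so '^>>' must not match it.
printf '%s\n' '>> one' 'two' '>> three' 'a >> b' '>> five' > "$input"

# The three counting approaches from the post.
# tr strips the padding that some wc implementations add.
count_grep_c=$(grep -c '^>>' "$input")
count_grep_wc=$(grep '^>>' "$input" | wc -l | tr -d ' ')
count_sed_wc=$(sed -n '/^>>/p' "$input" | wc -l | tr -d ' ')

echo "grep -c:       $count_grep_c"
echo "grep | wc -l:  $count_grep_wc"
echo "sed  | wc -l:  $count_sed_wc"
```

To compare timings rather than counts, each command can be wrapped in time as in the examples above, on a file large enough for the difference to show. One variable that might be worth isolating is the locale (for example, running the same commands with LC_ALL=C set); that is an assumption here, since the locale in use for the timings is not stated.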

vQ



