bug-grep

bug#18454: Improve performance when -P (PCRE) is used in UTF-8 locales


From: Paul Eggert
Subject: bug#18454: Improve performance when -P (PCRE) is used in UTF-8 locales
Date: Tue, 16 Sep 2014 18:43:26 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.1.1

I worked on this some more, and came up with the attached patches proposed against the current grep Savannah master (commit 9ea9254ea58456b84ed2f0c1481ca91cdd325bf7).

For years I've been wanting to write the last patch (0006, which skips past holes efficiently), and I finally got around to it. It improves grep -P's performance by a factor of 1.2 trillion on one (admittedly artificial) benchmark. I hope its 1 ZB/s scan rate is some kind of record. The last patch probably won't help your test cases, though I hope the other patches do help somewhat.

Attachment: 0001-grep-refactor-binary-vs-unknown-vs-text-flags-for-cl.patch
Description: Text document

Attachment: 0002-grep-z-no-longer-considers-200-to-be-binary-data.patch
Description: Text document

Attachment: 0003-grep-non-text-bytes-in-binary-data-may-be-treated-as.patch
Description: Text document

Attachment: 0004-grep-minor-P-speedup-with-jit_stack.patch
Description: Text document

Attachment: 0005-grep-improve-P-performance-in-typical-cases.patch
Description: Text document

Attachment: 0006-grep-skip-past-holes-efficiently.patch
Description: Text document

