bug#16232: [PATCH] grep: make --ignore-case (-i) faster (sometimes 10x) in multibyte locales
Mon, 23 Dec 2013 15:12:26 -0800
On Mon, Dec 23, 2013 at 2:52 PM, Eric Blake <address@hidden> wrote:
> On 12/23/2013 03:39 PM, Jim Meyering wrote:
>> FYI, here is a quick and clean/safe performance improvement for grep -i.
>> I expect to push this commit right after the upcoming bug-fix release.
>> Currently, this optimization is enabled when the search string is
>> ASCII and contains neither '\' (backslash) nor '['. I expect to
>> eliminate the latter two constraints in a follow-on commit including
>> tests to exercise all of the corner cases.
>> + /* Worst case is that every byte of keys will be alpha,
>> + so every byte B will map to the sequence of 4 bytes [Bb]. */
> Umm, is this always true? Consider the UTF-8 Turkish locale, where
Thanks for the review.
Did you miss the "isascii" check in the new trivial_case_convert function?
If you can describe circumstances in which the new patch malfunctions,
please do, but everything you wrote seems to rely on a false assumption.
E.g., your Turkish-I example works fine with my patch.
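
For readers following the thread, here is a minimal sketch of the kind of
conversion discussed above: an ASCII-only pattern containing neither '\'
nor '[' has each alphabetic byte B rewritten as the 4-byte sequence [Bb],
so the output is at most four times the input length. The function name
expand_ascii_case and its signature are hypothetical, chosen for
illustration; this is not the trivial_case_convert code from the patch. It
uses explicit ASCII range checks instead of the <ctype.h> functions so that
the result does not depend on the current locale.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not the function from the patch).
   Return a malloc'd, NUL-terminated copy of KEYS (LEN bytes) in which
   every ASCII letter B is replaced by the bracket expression [Bb].
   Store the length of the result in *NEW_LEN.
   Return NULL if KEYS is not eligible (it contains a non-ASCII byte,
   a backslash, or '[') or if allocation fails.  */
static char *
expand_ascii_case (char const *keys, size_t len, size_t *new_len)
{
  /* Eligibility check: ASCII only, and neither '\' nor '['.  */
  for (size_t i = 0; i < len; i++)
    {
      unsigned char c = keys[i];
      if (c >= 0x80 || c == '\\' || c == '[')
        return NULL;
    }

  /* Worst case: every byte is a letter and grows to 4 bytes.  */
  char *out = malloc (4 * len + 1);
  if (!out)
    return NULL;

  char *p = out;
  for (size_t i = 0; i < len; i++)
    {
      unsigned char c = keys[i];
      int is_ascii_letter = (('a' <= c && c <= 'z')
                             || ('A' <= c && c <= 'Z'));
      if (is_ascii_letter)
        {
          *p++ = '[';
          *p++ = c & ~0x20;     /* ASCII uppercase form */
          *p++ = c | 0x20;      /* ASCII lowercase form */
          *p++ = ']';
        }
      else
        *p++ = c;               /* digits, punctuation, etc. unchanged */
    }
  *p = '\0';
  *new_len = p - out;
  return out;
}
```

With this sketch, the 3-byte input "ab1" becomes the 9-byte pattern
"[Aa][Bb]1", while any input containing a non-ASCII byte is rejected up
front and would be left to the existing slower code path.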