bug#43577: wrong result for grep -io in turkish locale


From: Paul Eggert
Subject: bug#43577: wrong result for grep -io in turkish locale
Date: Wed, 23 Sep 2020 19:57:36 -0700

On 9/23/20 6:47 PM, Norihiro Tanaka wrote:
> I attach the fix for the bug.  Regex is fixed in Paul, thank you.


Thanks, I had written a similar patch, and your patch helped me find a bug in what I wrote. The patch I wrote uses an auxiliary ok_fold table that lets fgrep_icase_charlen avoid calling mbrtowc for single-byte characters in the pattern; this may help performance for long patterns. More important, fgrep_icase_charlen does not return -1 for a character like 'a' in an en_US.UTF-8 locale merely because 'a' has a case-folded counterpart 'A'; the idea is that we should be OK if the case-folded counterparts are single-byte.
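
To make the single-byte idea concrete, here is a minimal sketch of how such a table might be built. It is not the code in the patch: other_case is a hypothetical stand-in for the locale's case conversion that models only ASCII letters and the Turkish i/İ and ı/I pairs, and utf8_len, build_ok_fold and NBYTES are names invented for this illustration. The sketch also assumes a UTF-8 encoding.

```c
#include <assert.h>
#include <stdbool.h>

enum { NBYTES = 256 };

/* Number of bytes UTF-8 uses to encode code point CP.  */
int
utf8_len (unsigned long cp)
{
  assert (cp <= 0x10FFFF);
  if (cp < 0x80)
    return 1;
  if (cp < 0x800)
    return 2;
  if (cp < 0x10000)
    return 3;
  return 4;
}

/* Hypothetical stand-in for locale-aware case conversion: return the
   other-case counterpart of CP, or CP itself if it has none.  Only
   ASCII letters and the Turkish i/I pairs are modeled.  */
unsigned long
other_case (unsigned long cp, bool turkish)
{
  if (turkish)
    switch (cp)
      {
      case 'i':    return 0x0130;  /* i -> LATIN CAPITAL LETTER I WITH DOT ABOVE */
      case 0x0130: return 'i';
      case 'I':    return 0x0131;  /* I -> LATIN SMALL LETTER DOTLESS I */
      case 0x0131: return 'I';
      }
  if ('a' <= cp && cp <= 'z')
    return cp - 'a' + 'A';
  if ('A' <= cp && cp <= 'Z')
    return cp - 'A' + 'a';
  return cp;
}

/* Set OK[B] for each byte B: true if B is a single-byte character
   whose other-case counterpart is also a single byte, so a
   case-insensitive match cannot change the byte length.  */
void
build_ok_fold (bool ok[NBYTES], bool turkish)
{
  for (int b = 0; b < NBYTES; b++)
    ok[b] = b < 0x80 && utf8_len (other_case (b, turkish)) == 1;
}
```

With a table like this, a length check can accept 'a' immediately in either kind of locale, while 'i' and 'I' are rejected only when the Turkish pairs apply, because their counterparts take two bytes in UTF-8.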

I had added more-extensive tests than were in your patch, and some of them found a crash in kwsinit that indicated a similar change is needed there. I assume this was because the patch I wrote had a more-generous fgrep_icase_charlen. As this simplifies kwsinit, this patch does that too.

While looking into this I found a performance glitch I recently introduced (I double-counted some regular expressions, messing up later heuristics). I also checked this on our old Solaris 10 box and fixed a couple of porting glitches. I installed the attached patches into the master branch, to make it easier for you to compare your changes to mine. Patch 0003 is the enhanced version of the patch that you wrote.

Thanks again for working on this.

Attachments:
0001-grep-fix-recently-introduced-performance-glitch.patch
0002-build-update-gnulib-submodule-to-latest.patch
0003-grep-fix-more-Turkish-eyes-bugs.patch
0004-grep-pacify-Sun-C-5.15.patch
0005-grep-don-t-assume-PCRE-in-tests.patch