[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
bug#16232: [PATCH] grep: make --ignore-case (-i) faster (sometimes 10x)
From: |
Pádraig Brady |
Subject: |
bug#16232: [PATCH] grep: make --ignore-case (-i) faster (sometimes 10x) in multibyte locales |
Date: |
Sun, 12 Jan 2014 12:56:23 +0000 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130110 Thunderbird/17.0.2 |
On 01/12/2014 04:36 AM, Jim Meyering wrote:
> On Sat, Jan 11, 2014 at 6:15 AM, Pádraig Brady <address@hidden> wrote:
>> On 01/11/2014 11:33 AM, Pádraig Brady wrote:
> ...
>> This is also a good summary of stuff to consider with case:
>> http://www.unicode.org/faq/casemap_charprop.html
>>
>> So picking another case situation from there:
>> "in the Greek script, capital sigma (U+03A3) is the uppercase form of both
>> the regular (U+03C2) and final (U+03C3) lowercase sigma."
>>
>> One can see that sed handles this:
>> $ printf '\u03C2\u03C3\n' | sed 's/.*/&\U&/'
>> ςσΣΣ
>> $ printf '\u03A3\n' | sed 's/.*/&\L&/'
>> Σσ
>>
>> Though I was surprised the grep (2.14) didn't match any combo of these
>> $ printf '\u03C2\u03C3\n' | grep -Fi "$(printf \u03A3)"
>> $ printf '\u03A3\n' | grep -Fi "$(printf \u03C2)"
>> $ printf '\u03A3\n' | grep -Fi "$(printf \u03C3)"
>
> Actually, if you quote the argument to the latter printf, two of those
> do match, both with -F and without:
>
> $ printf '\u03C2\u03C3\n' | grep -Fi "$(printf '\u03A3')"
> ςσ
> $ printf '\u03A3\n' | grep -Fi "$(printf '\u03C2')"
> $ printf '\u03A3\n' | grep -Fi "$(printf '\u03C3')"
> Σ
Oops right.
So that's still no regression with the new scheme
since grep is 1:1 here for Σ and σ.
thanks,
Pádraig.
- bug#16232: [PATCH] grep: make --ignore-case (-i) faster (sometimes 10x) in multibyte locales, Jim Meyering, 2014/01/07
- bug#16232: [PATCH] grep: make --ignore-case (-i) faster (sometimes 10x) in multibyte locales, Pádraig Brady, 2014/01/10
- bug#16232: [PATCH] grep: make --ignore-case (-i) faster (sometimes 10x) in multibyte locales, Jim Meyering, 2014/01/10
- bug#16232: [PATCH] grep: make --ignore-case (-i) faster (sometimes 10x) in multibyte locales, Jim Meyering, 2014/01/11
- bug#16232: [PATCH] grep: make --ignore-case (-i) faster (sometimes 10x) in multibyte locales, Pádraig Brady, 2014/01/11
- bug#16232: [PATCH] grep: make --ignore-case (-i) faster (sometimes 10x) in multibyte locales, Pádraig Brady, 2014/01/11
- bug#16232: [PATCH] grep: make --ignore-case (-i) faster (sometimes 10x) in multibyte locales, Jim Meyering, 2014/01/11
- bug#16232: [PATCH] grep: make --ignore-case (-i) faster (sometimes 10x) in multibyte locales, Jim Meyering, 2014/01/11
- bug#16232: [PATCH] grep: make --ignore-case (-i) faster (sometimes 10x) in multibyte locales,
Pádraig Brady <=