Re: [PATCH] fall back to glibc matcher if a MBCSET is found
From: Jim Meyering
Subject: Re: [PATCH] fall back to glibc matcher if a MBCSET is found
Date: Sun, 12 Sep 2010 10:46:30 +0200
Paolo Bonzini wrote:
> On 09/08/2010 11:05 AM, Jim Meyering wrote:
>> Thank you for the patch.
>>
>> If this change really does fix a correctness bug,
>> then it deserves a NEWS entry with enough detail to confirm that,
>> and, if at all possible, a test suite addition.
>
> It fixes equivalence classes (e.g. matching [[=a=]] against à), but
> only --without-included-regex. See attached patches.
>
> The presence of this check in regex.m4
>
> if (sizeof (regoff_t) < sizeof (ptrdiff_t)
> || sizeof (regoff_t) < sizeof (ssize_t))
>
> unfortunately means that all existing systems will use the inferior
> gnulib regex rather than glibc regex. In turn, this means that grep
> will nowhere support equivalence classes out-of-the-box.
>
>> Similarly, if it works around a performance problem,
>> it would help me evaluate it if you were to provide evidence.
>
> yes 1234567890123456789012345678901234567890123456789012567890 | \
> sed 100000q | time ./grep '[a-z]'
>
> shows 0.91s with the patch and 1.21s without. Since this is not an
> asymptotic improvement, it is hard to test it reliably, and is
> secondary anyway compared to the correctness problem above.
Hi Paolo,
That patch induces a performance *decrease* on at least one system.
Built using --without-included-regex and run on an idle
i920 @ 2.67GHz, kernel 2.6.18-194.11.3.el5PAE, i686:
yes 1234567890123456789012345678901234567890123456789012567890 | sed 100000q > in
for i in $(seq 10); do env time --f=%E env LC_ALL=fr_FR.UTF8 \
./grep '[a-z]' in; done
With your patch:
0:01.76
0:01.76
0:01.82
0:01.77
0:01.77
0:01.84
0:01.76
0:01.78
0:01.80
0:01.80
Without it:
0:01.71
0:01.68
0:01.70
0:01.73
0:01.72
0:01.71
0:01.70
0:01.70
0:01.71
0:01.70
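To compare the two sets of runs at a glance, the timings above can be averaged. This is only a sketch: the helper name mean_elapsed is hypothetical, and it assumes every line is in the M:SS.ss form that %E prints for runs shorter than an hour.

```shell
# Hypothetical helper (not part of grep or its test suite): read %E-style
# timings (M:SS.ss), one per line, and print their mean in seconds.
# Assumes every run took less than an hour, so there is no hours field.
mean_elapsed() {
  LC_ALL=C awk -F: '{ total += $1 * 60 + $2; n++ }
                    END { if (n) printf "%.3f\n", total / n }'
}

# The ten timings reported above for each build.
with_patch=$(printf '%s\n' 0:01.76 0:01.76 0:01.82 0:01.77 0:01.77 \
                           0:01.84 0:01.76 0:01.78 0:01.80 0:01.80 | mean_elapsed)
without_patch=$(printf '%s\n' 0:01.71 0:01.68 0:01.70 0:01.73 0:01.72 \
                              0:01.71 0:01.70 0:01.70 0:01.71 0:01.70 | mean_elapsed)

echo "with patch:    ${with_patch}s"
echo "without patch: ${without_patch}s"
```

On these numbers the means come out to roughly 1.79s with the patch and 1.71s without it, a slowdown of about 0.08s per run on this system.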
Also, on that same system, which happens to run CentOS 5.5 with
glibc-2.5-49.el5_5.4, your new test fails when grep is built
--without-included-regex.
Sorry, I don't have time to investigate.