Re: [RFC PATCH] fall back to glibc matcher if a multibyte match is foun
From: Paolo Bonzini
Subject: Re: [RFC PATCH] fall back to glibc matcher if a multibyte match is found
Date: Sat, 1 May 2010 09:45:46 +0200
>> This patch works around the performance problems that are still in
>> current grep. Red Hat will probably be using it in its own 2.6.x.
>>
>> For UTF-8 it should trigger only in the presence of MBCSET, e.g. [a-z]
>> or [à] (and the latter case could be avoided).
>>
>> For other character sets all brackets, and `.' as well, will trigger it.
>
> Sounds like a good change, but please add a comment.
> Can you suggest a pathologically bad example
> with which we can try to come up with a performance-measuring
> addition to the test suite?
If I read the matcher code correctly, it is still an NFA, so it's
O(nodes * input-length). So it's difficult to find a pathological
case, even though the slowdown is over 200x.
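To illustrate the bound, here is a rough sketch of a set-of-states NFA
simulation. It is not the glibc or grep code; the function name, the
transition-table layout and the example automaton for (a|b)*abb are all
made up for illustration. Each input character touches every node at
most once, so the visit count stays within nodes * (input-length + 1)
whatever the input looks like, which is why no input is pathological in
the exponential sense.

```python
# Illustrative sketch only: a generic set-of-states NFA simulation,
# not the matcher from glibc or grep.

def simulate(transitions, start, accept, text):
    """Run the NFA over text and return (matched, node_visits).

    transitions maps node -> list of (label, target) pairs;
    a label of None marks an epsilon transition.
    """
    visits = 0

    def closure(nodes):
        # Follow epsilon edges; every node is visited at most once per call.
        nonlocal visits
        seen = set(nodes)
        stack = list(nodes)
        while stack:
            node = stack.pop()
            visits += 1
            for label, target in transitions.get(node, []):
                if label is None and target not in seen:
                    seen.add(target)
                    stack.append(target)
        return seen

    current = closure({start})
    for ch in text:
        stepped = {
            target
            for node in current
            for label, target in transitions.get(node, [])
            if label == ch
        }
        current = closure(stepped)
    return accept in current, visits


# Hypothetical 4-node automaton for (a|b)*abb, anchored at both ends.
ABB_NFA = {
    0: [("a", 0), ("b", 0), ("a", 1)],
    1: [("b", 2)],
    2: [("b", 3)],
}

matched, visits = simulate(ABB_NFA, 0, 3, "ab" * 1000 + "abb")
```

The constant factor per node is what differs between matchers, so a
slowdown of this kind shows up as a uniform multiplier rather than as a
blow-up on particular inputs.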
Paolo