Re: Bracket expressions with character ranges are slow

bug-grep

From:	Paolo Bonzini
Subject:	Re: Bracket expressions with character ranges are slow
Date:	Tue, 07 Jun 2011 13:35:57 +0200
User-agent:	Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.17) Gecko/20110428 Fedora/3.1.10-1.fc14 Lightning/1.0b3pre Mnenhy/0.8.3 Thunderbird/3.1.10

On 05/18/2011 10:40 PM, Paolo Bonzini wrote:

Suppose grep had a preprocessor that converted any bracket
expression containing elements of different byte sizes, whether
[美国a] or a range not all of whose characters are a single byte,
into a parenthesized alternation like (美|国|a).  Would this use
more memory, constituting a space-for-time tradeoff?  If not, is
there some other reason not to do this?
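
To make the proposed rewrite concrete, here is a minimal Python sketch. It is not grep's code: the function names are hypothetical, it assumes the bracket expression has already been parsed into its member characters, and it ignores ranges and negation. It only shows that the alternation form accepts the same strings as the bracket form, and what the alternatives look like at the byte level under UTF-8.

```python
import re

def bracket_to_alternation(members):
    # `members` is the already-parsed content of a bracket expression,
    # e.g. "美国a" for [美国a]; ranges and negation are out of scope here.
    # re.escape keeps regex metacharacters such as "." literal.
    return "(" + "|".join(re.escape(ch) for ch in members) + ")"

def utf8_byte_alternation(members):
    # Byte-level view: each alternative is the UTF-8 encoding of one member,
    # so alternatives may be one byte ("a") or several bytes ("美").
    encoded = [re.escape(ch.encode("utf-8")) for ch in members]
    return b"(" + b"|".join(encoded) + b")"

if __name__ == "__main__":
    members = "美国a"
    bracket = "[" + members + "]"
    alternation = bracket_to_alternation(members)
    # Compare the two forms on a few single-character inputs.
    for text in ["美", "国", "a", "b", "中"]:
        same = bool(re.fullmatch(bracket, text)) == bool(re.fullmatch(alternation, text))
        print(text, "agree" if same else "differ")
```

The sketch says nothing about memory use or speed inside grep's matcher; it only illustrates the equivalence that the question relies on.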


There's no justification but laziness. :)  We have already optimized a large
number of character ranges, basically all that can be optimized except
this one.


This is now implemented in grep.git.

I realize this is only potentially possible for egrep, at least at
the surface level of rewriting the regular expression.

Since the optimization is done inside the matcher, it does not depend on extended regex.


Paolo
