Re: Stack overflow in regexp matcher

Barry Margolin
Thu, 17 Dec 2009 12:01:10 -0500
User-agent: MT-NewsWatcher/3.5.3b3 (Intel Mac OS X)

In article <address@hidden>,
 "address@hidden" <address@hidden> wrote:

> A function of mine gets "Stack overflow in regexp matcher" on a certain 
> file using the regexp
>      "^< \\(.+].+=|\\)"
> The file is a single line of about 73000 characters, but the regexp matches 
> ending at character 56.
> Is this a bug?  And if not, why not?
> djc

The problem is that + is greedy, so the first .+ has to scan the entire 
line and then back up looking for the last "]", then see if there's an 
"=" somewhere later.  If it can't find an "=", it has to back up to the 
previous "]" and search again, and so on.  If there are lots of "]" 
characters, the matcher has to save its state for each of them on the 
matching stack, and on a line of 73000 characters that stack can overflow.
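
To make the greedy behaviour concrete, here is a small sketch in Python's re module rather than Emacs itself, so it only illustrates where each variant ends its match and does not reproduce the stack overflow. The sample line is made up, and the pattern is my translation of the Emacs regexp: the group is written with plain parentheses, and since an unescaped | is a literal character in Emacs syntax, it becomes \| here.

```python
import re

# Hypothetical sample line (not from the original file): two "]...=|"
# groups so that greedy and non-greedy matching end in different places.
line = "< a]b=| c]d=|"

# Python rendering of the Emacs regexp "^< \\(.+].+=|\\)".
greedy = re.compile(r"^< (.+\].+=\|)")
# Same pattern with non-greedy repetition.
lazy = re.compile(r"^< (.+?\].+?=\|)")
# Negated character classes: each repetition stops at its delimiter
# instead of running to the end of the line. Note this is stricter:
# it rejects any "=" between the "]" and the closing "=|".
bounded = re.compile(r"^< ([^\]]+\][^=]+=\|)")

greedy_end = greedy.match(line).end()    # 13: runs to the last "]...=|"
lazy_end = lazy.match(line).end()        # 7: stops at the first one
bounded_end = bounded.match(line).end()  # 7

# A long tail, similar in size to the line described in the question.
long_line = "< a]b=|" + "x" * 73000
long_end = lazy.match(long_line).end()   # 7
```

Emacs regexps also have the non-greedy operators +? and *?, so a pattern along the lines of "^< \\(.+?].+?=|\\)" or one built from "[^]]+" may keep the matcher from running to the end of the line, but I have not tested either against the original file.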

Barry Margolin
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
*** PLEASE don't copy me on replies, I'll read them in the group ***

