From: Tassilo Horn
Subject: Re: [Emacs-diffs] emacs-25 fe27e03: Fix rx matcher overflow without limiting
Date: Mon, 14 Dec 2015 18:56:06 +0100
User-agent: Gnus/5.130014 (Ma Gnus v0.14) Emacs/25.0.50 (gnu/linux)

Stefan Monnier <address@hidden> writes:

>> -      "\\[[^][]\\{0,2000\\}\\<label[[:space:]]*=[[:space:]]*{?\\(?1:[^],}]+\\)}?")
>> +      ;;
>> +      ;; If you think the first shy group is a bit strange, it is like
>> +      ;; that in order not to overflow the regexp matcher stack in the
>> +      ;; presence of unbalanced brackets, i.e., a [ and then no
>> +      ;; closing bracket anymore.  In "[^[],]*,", the "*" repetition
>> +      ;; will be done without any need to record state for eventual
>> +      ;; backtracking because the "," is mutually exclusive with the
>> +      ;; "[^][,]", and the regexp matcher includes a special
>> +      ;; optimization for that case since it's common and very
>> +      ;; useful).  (Hint by Stefan Monnier)
>> +      "\\[\\(?:[^][,]*,\\)*[ \t]*\\<label[[:space:]]*=[[:space:]]*{?\\(?1:[^],}]+\\)}?")
>
> As mentioned in my email, this doesn't remove all risks of overflows,
> since the second * still requires stashing the state onto the stack for
> backtracking, but that's only "once per comma" instead of "once
> per char", so in practice we can hope we won't bump into it.

I thought I had understood that but apparently not.  My guess was that
commas improve the situation whereas it's exactly the other way round.
So with my test file

--8<---------------cut here---------------start------------->8---
\documentclass{article}
\begin{document} 
[\}

foo foo foo foo foo foo foo foo foo foo foo foo foo foo foo foo foo
foo foo foo foo foo foo foo foo foo foo foo foo foo foo foo foo foo
[100000 times that line again]
--8<---------------cut here---------------end--------------->8---

I did not get an overflow.  But after changing the file so that every
foo-line ends with a comma (which is more realistic "prose"), the
overflow re-appeared.
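
For anyone who wants to repeat the experiment without a LaTeX file, here
is a minimal sketch.  The names `my-label-regexp', `my-make-test-text'
and `my-try-label-search' are made up for this illustration and are not
part of RefTeX; the regexp is copied from the diff quoted above.  Whether
the matcher actually overflows depends on the Emacs version and on the
size of the input, so take it as a way to probe the behavior, not as a
guaranteed reproduction.

```elisp
;; Sketch only: all `my-' names are hypothetical, not part of RefTeX.
(defconst my-label-regexp
  "\\[\\(?:[^][,]*,\\)*[ \t]*\\<label[[:space:]]*=[[:space:]]*{?\\(?1:[^],}]+\\)}?"
  "The regexp from the reverted commit, copied from the quoted diff.")

(defun my-make-test-text (lines with-commas)
  "Return an unbalanced \"[\" followed by LINES lines of filler text.
If WITH-COMMAS is non-nil, every filler line ends with a comma."
  (concat "[\n"
          (mapconcat (lambda (_)
                       (if with-commas "foo foo foo," "foo foo foo"))
                     (number-sequence 1 lines)
                     "\n")))

(defun my-try-label-search (text)
  "Search TEXT for `my-label-regexp' in a temporary buffer.
Return the matched label, nil if nothing matched, or the symbol
`overflow' if the regexp matcher signaled an error."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (condition-case nil
        (and (re-search-forward my-label-regexp nil t)
             (match-string 1))
      (error 'overflow))))
```

Evaluating (my-try-label-search (my-make-test-text 100000 nil)) and then
the same form with t instead of nil compares the comma-free and the
comma-terminated input described above.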

So in the end I reverted my commit.  On the one hand, the test file is
not overly realistic, i.e., usually when you use unbalanced brackets
you're probably writing some math stuff, and then you'll probably also
have a closing ] somewhere within reach.  But on the other hand, requiring
the label=... to be at most 2000 chars away from the beginning of the
optional arguments block is safer and simpler, and no input can
invalidate it.

Bye,
Tassilo


