Re: (error "Stack overflow in regexp matcher") and (?)wrong display of r
From: Mattias Engdegård
Subject: Re: (error "Stack overflow in regexp matcher") and (?)wrong display of regexp in backtrace
Date: Sun, 15 Mar 2020 13:22:20 +0100
On 15 March 2020 at 11:39, Alan Mackenzie <address@hidden> wrote:
> Hello, Emacs.
Hello Alan. Thanks for the nice example!
> First of all, note the regexp, "\\(\\\\\\(.\\|\n\\)\\|[^\\\n\15]\\)*"
>                                                             ^^^
> In the source, the "\15" is "\r". Why is this substitution being made
> for the backtrace? Is it intentional (in which case, why not do the
> same to the "\n"?), or is it a bug? To me, it is more like a bug.
I agree; there are some ad-hoc switches like print-escape-newlines (which only
works on \n and \f) and print-escape-control-characters (which produces octal),
but nothing that gives human-friendly escapes for other known control
characters.
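To illustrate, here is a small sketch of what those two switches do when bound together, which appears to be what the backtrace printer does. The sample string and the my- names are made up for the example.

```elisp
;; Hypothetical sample: contains a newline, a carriage return and a formfeed.
(defvar my-sample-string "a\nb\rc\fd")

(defun my-print-escaped (string)
  "Return the printed form of STRING with both escape switches enabled."
  (let ((print-escape-newlines t)            ; only affects \n and \f
        (print-escape-control-characters t)) ; other control chars go octal
    (prin1-to-string string)))

(my-print-escaped my-sample-string)
```

The newline and formfeed should come out as \n and \f, while the carriage return comes out in octal as \15 rather than as \r, which is the inconsistency seen in the backtrace.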
> More importantly, why is there a stack overflow here at all? Even
> though the regexp matcher has a long, long piece of buffer to scan over,
> the regexp is a simple linear search, without any nesting to speak of.
Let's ask xr for help:
(xr-pp "\\(\\\\\\(.\\|\n\\)\\|[^\\\n\15]\\)*")
=>
(zero-or-more
 (group
  (or (seq "\\"
           (group anything))
      (not (any "\n\r\\")))))
(note that xr pretty-prints \r properly)
There are two capture groups here, neither of which is actually used. Remove
them (the outer one in particular) and the regexp no longer overflows.
Navigating the file also becomes noticeably faster. Like this:
(rx (zero-or-more
     (or (seq "\\" anything)
         (not (any "\n\r\\")))))
(rx will use a slightly more efficient rendition of 'anything', but that isn't
actually important in this case.)
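As a quick sanity check that dropping the groups does not change what is matched, something like the following sketch can be used. The sample text and the my- names are invented for the example, and the sample is far too short to trigger the overflow; it only compares where each regexp stops.

```elisp
(require 'rx)

;; The original regexp, with its two capture groups.
(defconst my-old-regexp "\\(\\\\\\(.\\|\n\\)\\|[^\\\n\15]\\)*")

;; The same pattern without capture groups, written with rx.
(defconst my-new-regexp
  (rx (zero-or-more
       (or (seq "\\" anything)
           (not (any "\n\r\\"))))))

(defun my-match-end (regexp string)
  "Return the end position of the match of REGEXP at the start of STRING."
  (and (string-match (concat "\\`" regexp) string)
       (match-end 0)))

;; Hypothetical sample: an escaped newline, an escaped backslash,
;; then an unescaped newline that should end the match.
(defconst my-sample-text "foo \\\n bar \\\\ baz\nrest")

(list (my-match-end my-old-regexp my-sample-text)
      (my-match-end my-new-regexp my-sample-text))
```

Both regexps should stop at the same position, just before the unescaped newline.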