Re: How to grok a complicated regex?
From: Alan Mackenzie
Subject: Re: How to grok a complicated regex?
Date: Wed, 18 Mar 2015 16:40:35 +0000 (UTC)
User-agent: tin/2.2.0-20131224 ("Lochindaal") (UNIX) (FreeBSD/10.1-RELEASE (amd64))
Hi, Marcin.
Sorry if I'm a bit late to this discussion.
Marcin Borkowski <mbork@wmi.amu.edu.pl> wrote:
> Hi all,
> so I have this monstrosity [note: I know, there are much worse ones,
> too!]:
> "\\`\\(?:\\\\[([]\\|\\$+\\)?\\(.*?\\)\\(?:\\\\[])]\\|\\$+\\)?\\'"
> (it's in the org-latex--script-size function in ox-latex.el, if you're
> curious).
> I'm not asking "what does this match?" - I can read it myself. But it
> comes with a considerable effort. Are you aware of any tools that might
> help to understand such regexen?
> I know about re-builder, but it's well suited for constructing a regex
> matching a given string, not the other way round.
> For instance, show-paren-mode does not really help here, since it seems
> to pair "\\(" with unescaped ")".
> Any ideas?
I wrote myself the following tool. It's not production quality, but you
might find it useful nonetheless. To use it, type
M-: (pp-regexp re-horror).
It displays the regexp at the end of the *scratch* buffer, dropping the
contents of any \(..\) construct by one line. I find it useful. So might
you. Feel free to adapt it, or pass it on to other people.
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
(defun pp-regexp (regexp)
  "Pretty print a regexp. This means, contents of \\\\\(s are lowered a line."
  (or (stringp regexp) (error "parameter is not a string."))
  (let ((depth 0)
        ;; Replace whitespace control characters with visible stand-ins.
        (re (replace-regexp-in-string
             "[\t\n\r\f]"
             (lambda (s)
               (or (cdr (assoc s '(("\t" . "??")
                                   ("\n" . "??")
                                   ("\r" . "??"))))
                   "??"))
             regexp))
        (start 0)     ; earliest position still without an acm-depth property.
        (pos 0)       ; current analysis position.
        (max-depth 0) ; How many lines do we need to print?
        (min-depth 0) ; Pick up "negative depth" errors.
        pr-line       ; output line being constructed
        line-no       ; line number of pr-line, varies between min-depth and
                      ; max-depth.
        ch
        )
    ;(translate-rnt re)
    ;; apply acm-depth properties to the whole string.
    (while (< start (length re))
      (setq pos (string-match ;; "\\\\\\((\\(\\?:\\)?\\||\\|)\\)"
                 "\\\\\\(\\\\\\|(\\(\\?:\\)?\\||\\|)\\)"
                 re start))
      (put-text-property start (or pos (length re)) 'acm-depth depth re)
      (when pos
        (setq ch (aref (match-string 1 re) 0))
        (cond
         ((eq ch ?\\)                   ; escaped backslash: stays at this depth
          (put-text-property pos (match-end 1) 'acm-depth depth re))
         ((eq ch ?\()                   ; opener stays here, contents go deeper
          (put-text-property pos (match-end 1) 'acm-depth depth re)
          (setq depth (1+ depth))
          (if (> depth max-depth) (setq max-depth depth)))
         ((eq ch ?\|)                   ; alternation sits one level up
          (put-text-property pos (match-end 1) 'acm-depth (1- depth) re)
          (if (< (1- depth) min-depth) (setq min-depth (1- depth))))
         (t                             ; (eq ch ?\))
          (setq depth (1- depth))
          (if (< depth min-depth) (setq min-depth depth))
          (put-text-property pos (match-end 1) 'acm-depth depth re))))
      (setq start (if pos (match-end 1) (length re))))
    ;; print out the strings
    (setq line-no min-depth)
    (while (<= line-no max-depth)
      (with-current-buffer "*scratch*"
        (goto-char (point-max)) (insert ?\n)
        (setq pr-line "")
        (setq start 0)
        (while (< start (length re))
          (setq pos (next-single-property-change start 'acm-depth re (length re)))
          (setq depth (get-text-property start 'acm-depth re))
          (setq pr-line
                (concat pr-line
                        (if (= depth line-no)
                            (substring re start pos)
                          (make-string (- pos start) ?\ ))))
          (setq start pos))
        (insert pr-line)
        (setq line-no (1+ line-no))))))
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
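To illustrate the "dropped by one line" layout, here is a small sketch.
It assumes pp-regexp from this message has already been evaluated. The
wrapper my-pp-regexp-demo is a made-up name used only for this example;
all it does is hand back whatever pp-regexp appended to *scratch*.

```elisp
;; Hypothetical helper (not part of the tool above): run `pp-regexp' and
;; return only the text it appended to the *scratch* buffer.
(defun my-pp-regexp-demo (regexp)
  (with-current-buffer (get-buffer-create "*scratch*")
    (let ((old-end (point-max)))
      (pp-regexp regexp)
      (buffer-substring-no-properties old-end (point-max)))))

;; Only meaningful once `pp-regexp' is defined.
(when (fboundp 'pp-regexp)
  (my-pp-regexp-demo "a\\(b\\|c\\)d"))
;; With the definition above, the appended text should be a blank
;; separator followed by these two lines:
;;
;;   a\( \| \)d
;;      b  c
```

The \( \| \) delimiters stay on the line of the enclosing level, and the
group's contents land directly beneath them on the next line.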
> (Note: if there are no such tools, I might be tempted to craft one. Two
> things that come to my mind are proper highlighting of matching parens
> of various kinds and eldoc-like hints for all the regex constructs -
> I never seem to remember what does "\\`" do, for instance. Also,
> displaying the string with single backslashes and not in the way it is
> actually typed in in Elisp, with all the backslash escaping, might be
> helpful. Would there be a demand for such a tool larger than one
> person?)
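On the single-backslash point: the string the regexp matcher receives
already contains single backslashes; only the Lisp read syntax doubles
them. A minimal sketch showing both forms side by side (the name
my-regexp-forms is invented for this example and is not an Emacs
function):

```elisp
;; Hypothetical helper: return REGEXP as a list of two strings, first
;; the raw characters the matcher sees, then the Lisp read syntax.
(defun my-regexp-forms (regexp)
  (list (format "%s" regexp)         ; raw text, single backslashes
        (prin1-to-string regexp)))   ; quoted, backslashes doubled

;; Show the raw form of the regexp from the original question in the
;; echo area.
(message "%s"
         (car (my-regexp-forms
               "\\`\\(?:\\\\[([]\\|\\$+\\)?\\(.*?\\)\\(?:\\\\[])]\\|\\$+\\)?\\'")))
;; This should display:
;;   \`\(?:\\[([]\|\$+\)?\(.*?\)\(?:\\[])]\|\$+\)?\'
```

The raw form is also the text that pp-regexp lays out, so the two views
can be read together.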
> Best,
> --
> Marcin Borkowski
> http://octd.wmi.amu.edu.pl/en/Marcin_Borkowski
> Faculty of Mathematics and Computer Science
> Adam Mickiewicz University
--
Alan Mackenzie (Nuremberg, Germany).