New rx implementation with extension constructs
From: Mattias Engdegård
Subject: New rx implementation with extension constructs
Date: Mon, 2 Sep 2019 23:19:47 +0200
The rx regexp notation is nice to use but the implementation isn't wonderful;
there is a proposed replacement rewritten from the ground up. It is cleaner,
has fewer bugs, and is maybe twice as fast.
Most importantly, there is now a proper extension mechanism: for global
definitions,
  (rx-define snobol-identifier (seq alpha (0+ alnum)))
which are available anywhere, and local ones,
  (rx-let ((natnum (1+ digit))
           (integer (seq (opt "-") natnum)))
    ...body...)
where a set of definitions is only available in a lexical scope. This
zero-cost construct can be placed inside a function, or at top level enclosing
multiple variable and function definitions, all sharing the same named rx forms.
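As a rough sketch of how the two constructs might be used together, assuming an rx that provides rx-define and rx-let (as in Emacs 27 or later); the my- names are made up for illustration:

```elisp
(require 'rx)

;; Global definition: available to any rx form evaluated afterwards.
(rx-define snobol-identifier (seq alpha (0+ alnum)))

;; Local definitions at top level, shared by a variable and a function.
;; `my-integer-regexp' and `my-integer-p' are hypothetical names.
(rx-let ((natnum (1+ digit))
         (integer (seq (opt "-") natnum)))
  (defconst my-integer-regexp
    (rx string-start integer string-end)
    "Regexp matching a whole string that is a decimal integer.")
  (defun my-integer-p (string)
    "Return non-nil if STRING is an optionally negative decimal integer."
    (and (string-match-p (rx string-start integer string-end) string) t)))
```

Outside the rx-let body, natnum and integer are no longer defined, while snobol-identifier remains usable everywhere.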
Both rx-define and rx-let admit two kinds of definitions:
  NAME RX-FORM
  NAME (ARGS...) RX-FORM
for plain rx symbols and for parametrised forms, respectively. For example:
  (rx-let ((name (1+ letter))
           (comma-separated (x) (seq x (0+ "," x))))
    (rx (comma-separated name)))
works just as expected. &rest arguments are permitted, and expand to implicit
(seq ...) forms.
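A small sketch of a &rest parameter, again assuming an rx with rx-let (Emacs 27 or later); bracketed and my-pair-regexp are hypothetical names chosen for the example:

```elisp
(require 'rx)

;; `bracketed' takes any number of rx forms and wraps them, in sequence,
;; between literal square brackets.
(rx-let ((bracketed (&rest items) (seq "[" items "]")))
  (defconst my-pair-regexp
    (rx string-start
        (bracketed (1+ digit) ":" (1+ digit))
        string-end)
    "Regexp matching strings such as \"[12:34]\"."))

;; Non-nil for a bracketed pair, nil when the brackets are missing.
(string-match-p my-pair-regexp "[12:34]")
(string-match-p my-pair-regexp "12:34")
```

Here the three arguments are matched one after another, so the form behaves like (seq "[" (1+ digit) ":" (1+ digit) "]").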
No provision was made for macros able to execute arbitrary Lisp code; I just
couldn't find a use for them, and decided to wait until someone would tell me
otherwise. Thus, all parametrised forms work by plain substitution.
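To illustrate what plain substitution means in practice, here is a sketch comparing a parametrised form with its hand-expanded equivalent. It assumes an rx with rx-let (Emacs 27 or later), and the my- variable names are hypothetical:

```elisp
(require 'rx)

;; The parametrised form from the example above.
(defconst my-substituted-regexp
  (rx-let ((name (1+ letter))
           (comma-separated (x) (seq x (0+ "," x))))
    (rx (comma-separated name))))

;; The same form with `x' replaced by `name', and `name' by its
;; definition, written out by hand.
(defconst my-hand-expanded-regexp
  (rx (seq (1+ letter) (0+ "," (1+ letter)))))

;; Both are expected to produce the same regexp string.
(equal my-substituted-regexp my-hand-expanded-regexp)
```

Since no Lisp code runs during expansion, a parametrised form cannot compute anything from its arguments; it can only place them into a template.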
The code currently resides at https://gitlab.com/mattiase/ry; it will naturally
be renamed to `rx' once it's in the Emacs tree. It can be integrated in a
separate branch of the Emacs source repo if you wish, or as patches if you
prefer that for reviewing. The diffs don't make much sense since it is a
reimplementation with very little in common with the old code.
The exact form of the extension mechanism isn't set in stone, and I'd welcome
any suggestions for improvement.