bug-gnu-emacs

bug#37849: composable character alternatives in rx


From: Mattias Engdegård
Subject: bug#37849: composable character alternatives in rx
Date: Fri, 6 Dec 2019 22:58:46 +0100

This patch adds `union' and `intersection' to rx. Both take zero or more 
charsets as arguments. A charset is either an `any' form that does not contain 
character classes, a `union' or `intersection' form, or a `not' form with a 
charset argument.

Examples:

(rx (union (any "a-f") (any "b-m")))
=> "[a-m]"

(rx (intersection (any "a-f") (any "b-m")))
=> "[b-f]"
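
To make the set semantics of the two examples above concrete, here is a small
model in Python. It is only a sketch and not how the patch is implemented: it
treats a charset as a plain set of characters, and the helper names `charset'
and `to_bracket' are made up for illustration. It ignores character classes
and the quoting of special characters such as `]' and `^'.

```python
def charset(spec):
    """Expand an rx-style `any' string such as "a-f" into a set of characters.

    Hypothetical helper: handles only literal characters and X-Y ranges.
    """
    chars = set()
    i = 0
    while i < len(spec):
        if i + 2 < len(spec) and spec[i + 1] == "-":
            # A range X-Y: add every character from X through Y inclusive.
            chars.update(chr(c) for c in range(ord(spec[i]), ord(spec[i + 2]) + 1))
            i += 3
        else:
            chars.add(spec[i])
            i += 1
    return chars


def to_bracket(chars):
    """Render a set of characters as a bracket expression, merging runs."""
    codes = sorted(ord(c) for c in chars)
    parts = []
    i = 0
    while i < len(codes):
        j = i
        # Extend j to the end of the run of consecutive code points.
        while j + 1 < len(codes) and codes[j + 1] == codes[j] + 1:
            j += 1
        lo, hi = chr(codes[i]), chr(codes[j])
        if j - i >= 2:
            parts.append(lo + "-" + hi)   # three or more: write as a range
        elif j > i:
            parts.append(lo + hi)         # exactly two: list both
        else:
            parts.append(lo)              # single character
        i = j + 1
    return "[" + "".join(parts) + "]"


a = charset("a-f")
b = charset("b-m")
print(to_bracket(a | b))   # set union
print(to_bracket(a & b))   # set intersection
```

In this model, union and intersection are ordinary set operations, and the
result is converted back to a single bracket expression at the end.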

The character class limitation stems from the inability to complement or 
intersect classes in general. It would be possible to partially lift this 
restriction for `union'; it is clear that

(rx (union (any "ab" space) (any "bc" space digit)))
=> "[abc[:space:][:digit:]]"

but supporting this would make the facility harder to explain to the user in a 
coherent way. Still, it could be a future extension.

A `difference' operator was not included but could be added; it is trivially 
defined in rx as

(rx-define difference (a b)
  (intersection a (not b)))
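
The identity behind this definition can be checked with plain sets. The
following Python sketch is a model only and not part of the patch; it assumes
a toy universe of lowercase ASCII letters so that the complement is finite,
and the function names simply mirror the rx forms.

```python
import string

# Assumption for illustration: the universe is just the lowercase ASCII letters.
UNIVERSE = set(string.ascii_lowercase)


def complement(s):
    """Model of (not S): everything in the universe that is not in S."""
    return UNIVERSE - s


def intersection(a, b):
    """Model of (intersection A B)."""
    return a & b


def difference(a, b):
    """Model of the proposed definition: (intersection a (not b))."""
    return intersection(a, complement(b))


a = set("abcdef")          # stands for (any "a-f")
b = set("bcdefghijklm")    # stands for (any "b-m")

print(sorted(difference(a, b)))
print(difference(a, b) == a - b)   # agrees with ordinary set difference
```

Intersecting with the complement removes exactly the members of the second
set, which is why no separate primitive is needed.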

The names `union' and `intersection' are verbose, but they should be used 
rarely enough that descriptive names are preferable.
SRE, from which the concept was taken, uses `|' and `&' respectively, `~' for 
complement, and `-' for difference.

Attachment: 0001-Add-union-and-intersection-to-rx-bug-37849.patch
Description: Binary data

