bug#37849: composable character alternatives in rx
From: Mattias Engdegård
Subject: bug#37849: composable character alternatives in rx
Date: Fri, 6 Dec 2019 22:58:46 +0100
This patch adds `union' and `intersection' to rx. Both take zero or more
charsets as arguments. A charset is either an `any' form that does not contain
character classes, a `union' or `intersection' form, or a `not' form with a
charset argument.
Example:
  (rx (union (any "a-f") (any "b-m")))
  => "[a-m]"
  (rx (intersection (any "a-f") (any "b-m")))
  => "[b-f]"
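For readers who want to see the set arithmetic behind these two results, here is a minimal sketch in Python. It is a language-independent model, not the code in the patch: it treats a charset as a sorted list of code-point intervals. The names `parse`, `normalize`, `union`, `intersection` and `render` are hypothetical helpers, `MAX_CHAR` is an assumed upper bound (Emacs's own character range is larger), and `render` ignores the special handling that `]', `^' and `-' need inside a bracket expression. What the patch returns for zero arguments is not stated in this message; the sketch simply follows the usual set-theoretic convention.

```python
MAX_CHAR = 0x10FFFF  # assumed upper bound for this sketch only


def normalize(intervals):
    """Sort intervals and merge those that overlap or touch."""
    merged = []
    for lo, hi in sorted(intervals):
        if merged and lo <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged


def parse(spec):
    """Turn an `any'-style string such as "a-fx" into code-point intervals."""
    out, i = [], 0
    while i < len(spec):
        if i + 2 < len(spec) and spec[i + 1] == "-":
            out.append((ord(spec[i]), ord(spec[i + 2])))
            i += 3
        else:
            out.append((ord(spec[i]), ord(spec[i])))
            i += 1
    return normalize(out)


def union(*charsets):
    """Union of zero or more charsets; zero arguments give the empty set."""
    return normalize([iv for cs in charsets for iv in cs])


def _intersect2(a, b):
    """Intersect two normalized interval lists with a two-pointer sweep."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        lo, hi = max(a[i][0], b[j][0]), min(a[i][1], b[j][1])
        if lo <= hi:
            out.append((lo, hi))
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


def intersection(*charsets):
    """Intersection of zero or more charsets; zero arguments give everything."""
    result = [(0, MAX_CHAR)]
    for cs in charsets:
        result = _intersect2(result, cs)
    return result


def render(intervals):
    """Render intervals as a bracket expression (no escaping of specials)."""
    parts = [chr(lo) if lo == hi else f"{chr(lo)}-{chr(hi)}" for lo, hi in intervals]
    return "[" + "".join(parts) + "]"


print(render(union(parse("a-f"), parse("b-m"))))         # [a-m]
print(render(intersection(parse("a-f"), parse("b-m"))))  # [b-f]
```

Keeping every charset in normalized interval form is what makes the operators composable: each one accepts and returns the same representation, so they can be nested freely.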
The character class limitation stems from the inability to complement or
intersect classes in general. It would be possible to partially lift this
restriction for `union'; it is clear that
(rx (union (any "ab" space) (any "bc" space digit)))
=> "[abc[:space:][:digit:]]"
but doing so would make the facility harder to explain to the user in a
sensible way. Still, it could be a future extension.
A `difference' operator was not included but could be added; it is trivially
defined in rx as
  (rx-define difference (a b)
    (intersection a (not b)))
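The identity behind this definition, a minus b equals a intersected with the complement of b, can be checked with a small Python model. This is only a sketch under stated assumptions, not the patch's implementation: charsets are sorted, non-overlapping lists of code-point intervals, `MAX_CHAR` is an assumed upper bound, and `complement`, `intersection` and `difference` are hypothetical helper names.

```python
MAX_CHAR = 0x10FFFF  # assumed upper bound for this sketch only


def complement(charset):
    """Return the gaps between the sorted, disjoint intervals of charset."""
    out, nxt = [], 0
    for lo, hi in charset:
        if lo > nxt:
            out.append((nxt, lo - 1))
        nxt = hi + 1
    if nxt <= MAX_CHAR:
        out.append((nxt, MAX_CHAR))
    return out


def intersection(a, b):
    """Intersect two sorted, disjoint interval lists with a two-pointer sweep."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        lo, hi = max(a[i][0], b[j][0]), min(a[i][1], b[j][1])
        if lo <= hi:
            out.append((lo, hi))
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


def difference(a, b):
    """Mirror of the rx-define above: a with everything in b removed."""
    return intersection(a, complement(b))


a_to_f = [(ord("a"), ord("f"))]
b_to_m = [(ord("b"), ord("m"))]
print(difference(a_to_f, b_to_m))  # [(97, 97)], i.e. just "a"
```

Because `difference` needs nothing beyond the two existing operations, leaving it out of the core does not reduce what can be expressed.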
The names `union' and `intersection' are verbose, but they should be used
rarely enough that descriptive names are preferable.
SRE, from which the concept was taken, uses `|' and `&' respectively, with `~'
for complement and `-' for difference.
Attachment: 0001-Add-union-and-intersection-to-rx-bug-37849.patch