emacs-tangents
[Top][All Lists]

## Negating a regexp

 From: Yuri Khan Subject: Negating a regexp Date: Thu, 20 May 2021 17:29:50 +0700

```> > How could I negate the regexp that I have defined?
>
> I don't even know what you mean by "negating a regexp".

The automata theory, where the notion of a regexp comes from, defines
a “regular set” as one that can be described using a regexp (where
regexps are defined to be implicitly anchored to beginning and end of
string, and support concatenation, alternative, and iteration, but not
backreferences). It then proves that for each regexp there exists an
equivalent nondeterministic finite automaton, and for every NDFA there
exists an equivalent DFA, and for every DFA there exists an equivalent
regexp. It also proves that the class of regular sets is closed under
set theory operations — union, intersection, and complement.

It follows that, theoretically, a regexp ‘R’ can be negated — one can
construct a regexp ‘(not R)’ that matches exactly all strings that R
does not match.

However, the proofs and constructions are complex enough that in
practice such a regexp would be unreadable.

As an example, consider the regexp ‘a’ which matches only a single
one-character string. Its complement would be a regexp that matches
the empty string, all one-character strings whose character is not
‘a’, and all strings longer than one character. The shortest way to
express that idea that I can think of is ‘|[^a]|.{2,}’ and that’s
ignoring the issue of newlines.

So, to steve-humphreys: There is no practical general way to negate a
regexp. You need to either negate the result of attempting to match,
or to think hard and write a new regexp that matches what you want.

```