[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: bash "extglob" needs to upgrade at least like zsh "kshglob"

From: Martin D Kealey
Subject: Re: bash "extglob" needs to upgrade at least like zsh "kshglob"
Date: Sun, 30 Oct 2022 19:58:22 +1000

On Sat, 29 Oct 2022 at 22:15, Greg Wooledge <greg@wooledge.org> wrote:

> On Sat, Oct 29, 2022 at 04:50:00PM +1100, Martin D Kealey wrote:
> > This seems like a good reason to simply translate extglobs into regexes,
> > which should run in linear time, rather than put effort into building and
> > debugging a parallel implementation.
> This isn't straightforward, because of the !(list) feature of extglob.
> There's no analogous construct for that in standard regexes.

That's true. and my "simply" is understating the complexity of the task,
but it's not impossible.

There was a recent discussion about this very point on
irc://libera.chat#Bash where it was pointed out that supporting the !(list)
style of inversions in the regex compiler is simply a matter of inverting
the "success" or "failure" indications attached to the states when
compiling a regex, which means that it works correctly when these
constructs are nested.

So the options would seem to be:
(a) prohibit inversions (you get to pick EITHER extglob or rexglob, not
(b) bypass convert-to-regex when inversions are present;
(c) use PCRE or Vim RE, which already support negations (though not in the
same form); note that these do not have linear ("regular") time performance
in any but the trivial cases;
(d) compute the "inverse" regex for a given negation (this may require Vim
RE, see below);
(e) post-process the compiled regex (this would be highly dependent on the
specific RE implementation);
(f) pick an existing regex engine, and add the necessary logic to handle
negations and conjunctions (see below);

A particular difficulty is handling !(!(X)|!(Y)) which by de Morgan's law
translates into @(X&Y) - meaning a target needs to match BOTH X and Y. Only
Vim's Regex engine has direct support for that construct, and it exhibits
non-linear timing when using it.

A naïve implementation could easily require state table space that's
exponential in the breadth of the conjunctions, meaning that the user could
very easily write a rexglob that cannot be compiled with available memory.
So there needs to be a fallback position: either fail, telling the user we
cannot honour the linear speed guarantee, or "try anyway" with a procedure
that could exhibit truly horrible time complexity.


PS: I say "we" and "us" advisedly; let's not leave everything to Chet.

reply via email to

[Prev in Thread] Current Thread [Next in Thread]