bug-grep
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#62483: echo a | grep -E -w '((()|a)|())*' # does not terminate


From: Koen Claessen
Subject: bug#62483: echo a | grep -E -w '((()|a)|())*' # does not terminate
Date: Mon, 3 Apr 2023 14:07:00 +0200

I found it when I was testing various new regular expression algorithms
against grep (which I used as the golden standard for this).

I used a random generator for regular expressions (using the QuickCheck
framework) and then shrinking/delta debugging to automatically find the
smallest failing test case.

BTW, if you are interested, I could do a larger more targeted effort stress
testing grep like this and possibly find more test cases with unexpected
behavior. I would need some guidance on where to put most effort in order
to be as useful as this can be. I could find a MSc student to help out with
that. Let me know if this sounds like an interesting thing to do!

kind regards,
/Koen

On Sun, Apr 2, 2023 at 6:15 AM Jim Meyering <jim@meyering.net> wrote:

> On Mon, Mar 27, 2023 at 6:15 AM Koen Claessen <koen@chalmers.se> wrote:
> > Running the command:
> >
> >   echo a | grep -E -w '((()|a)|())*'
> >
> > does not terminate, and uses a LOT of processor time, for all versions of
> > grep I have tried.
> >
> > This is the smallest case that could be found; simplifying anything in
> the
> > input and/or expression leads to correct behavior.
>
> Thank you! How did you find that?
>
> FYI, this strikes grep-3.10 (on Fedora 37/glibc-2.36-9.fc37.x86_64)
> when using LC_ALL=en_US.UTF-8, but not with LC_ALL=C.
> I.e., this infloops:
>    echo a | LC_ALL=en_US.UTF-8 grep -E -w '((()|a)|())*'
>
> but this works as expected and promptly prints its line of input:
>      echo a | LC_ALL=C grep -E -w '((()|a)|())*'
>
> For now, I've added an expected-failing test case for this bug:
>


reply via email to

[Prev in Thread] Current Thread [Next in Thread]