bug#39678: POSIXLY_CORRECT removal, and oddball regex doc
From: Paul Eggert
Subject: bug#39678: POSIXLY_CORRECT removal, and oddball regex doc
Date: Sun, 22 May 2022 15:22:59 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.9.1
On 5/21/22 11:40, Jim Meyering wrote:
> In my experience, there are many lurking uses of things like '\a', and
> would like to ease into this gently, so I much prefer your latter
> approach: warn now, and change grep's exit status later
Sounds good.
When I started looking into that, I discovered that the grep manual
doesn't cover these lurkers well. And although I installed a patch
yesterday about this, after looking at the POSIX spec again today I
discovered that I'd missed quite a few lurkers. So I just now installed
the attached documentation fix, which attempts to cover all the
remaining problem regexps, and to give us room to add warnings for some
of them soon.
We shouldn't warn about all these problems, not without a --pedantic
flag or something like that (something I'm probably too busy to add).
But I expect it'd be good to warn about areas where grep's semantics
don't match any reasonable expectation.
We've already uncovered one area, where \a doesn't work as expected and
where a warning diagnostic would be helpful. Here's another one, where
an oddly-placed '*' doesn't work as one would expect:
$ printf '*\na\n*a\n' | grep '\(*\)'
*
*a
$ printf '*\na\n*a\n' | grep -E '(*)'
grep: Unmatched ( or \(
$ printf '*\na\n*a\n' | grep '\(*a\)'
*a
$ printf '*\na\n*a\n' | grep -E '(*a)'
a
*a
Although not a POSIX violation, here 'grep -E' is "wrong" for any
reasonable definition of "wrong" that I can think of. The attached patch
changes the doc to say that this regular expression has unspecified
behavior (something that POSIX allows).
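For scripts that simply want a literal '*' after an opening parenthesis, a bracket expression avoids the unspecified case in both syntaxes. The following is only a sketch and is not part of the attached patch; the helper name `sample` is made up for illustration, and the input is the same three lines used in the examples above.

```shell
#!/bin/sh
# Hypothetical helper: emits the three sample lines "*", "a" and "*a".
sample() { printf '*\na\n*a\n'; }

# Inside a bracket expression '*' is an ordinary character in both BRE
# and ERE, so neither command depends on how a leading '*' is parsed.
bre_out=$(sample | grep '\([*]a\)')
ere_out=$(sample | grep -E '([*]a)')

printf 'BRE: %s\nERE: %s\n' "$bre_out" "$ere_out"
```

Both commands should select only the line '*a', so plain grep and 'grep -E' agree here.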
(Who would have thought regular expressions were so complicated? :-)
Attachment: 0001-doc-document-regex-corner-cases-better.patch
Description: Text Data