
From: Eric Blake
Subject: Re: [1003.1(2008)/Issue 7 0000305]: Allow RE handling to reject suspicious uses
Date: Wed, 01 Sep 2010 16:55:25 -0600

[adding bug-grep, for any additional feedback from others involved in raising this issue in the first place - this is feedback to http://austingroupbugs.net/view.php?id=305]

On 09/01/2010 04:34 PM, Glenn Fowler wrote:

just to you to verify my understanding:

so the forms in question are the following outside of [ ... ]:

        [:<name>:]
        [=<collating-element>=]
        [.<collating-element>.]

Yes, those are the only three forms that current GNU grep would like to warn about at the moment. An infrastructure was proposed that would allow warnings for other questionable REs to be added in the future, but we couldn't think of any at the moment (and any such future warnings would also have to undergo scrutiny as to whether POSIX allows them).
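
To make the three forms concrete, here is a minimal sketch of how a checker might spot them. This is not GNU grep's actual code; the function name find_suspicious_forms and the simplified bracket-expression scan are assumptions made for illustration. It only looks for a bracket expression at the top level of the RE whose body starts and ends with the same one of ':', '=' or '.'.

```python
def find_suspicious_forms(pattern):
    """Return (start, text) pairs for top-level [:x:], [=x=] and [.x.] forms.

    Hypothetical helper: a simplified scan of POSIX bracket expressions,
    not the logic of any real grep implementation.
    """
    hits = []
    i, n = 0, len(pattern)
    while i < n:
        ch = pattern[i]
        if ch == "\\":              # escaped character outside brackets
            i += 2
            continue
        if ch != "[":
            i += 1
            continue
        start = i                   # start of a bracket expression
        j = i + 1
        if j < n and pattern[j] == "^":
            j += 1
        if j < n and pattern[j] == "]":   # a leading ']' is a literal
            j += 1
        while j < n and pattern[j] != "]":
            # Skip properly nested [:..:], [=..=] and [...] elements whole.
            if pattern[j] == "[" and j + 1 < n and pattern[j + 1] in ":=.":
                close = pattern.find(pattern[j + 1] + "]", j + 2)
                if close != -1:
                    j = close + 2
                    continue
            j += 1
        if j >= n:
            break                   # unterminated bracket expression
        body = pattern[start + 1:j]
        if len(body) >= 3 and body[0] in ":=." and body[-1] == body[0]:
            hits.append((start, pattern[start:j + 1]))
        i = j + 1
    return hits


print(find_suspicious_forms("[:space:]"))    # [(0, '[:space:]')]
print(find_suspicious_forms("[[:space:]]"))  # []
```

The correctly nested form is not reported, because the scan consumes the inner [:space:] as a single element before it looks at the shape of the outer bracket expression.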


and the proposal is to allow an implementation to reject the forms

I admit getting bit by this while working on grep and tr at the same time
(tr allows "tr '[:lower:]' '[:upper:]'")

Which is why GNU grep added the warning in the first place; the question at hand on the GNU list is whether the warning must be inhibited by POSIXLY_CORRECT, or whether POSIX can condone the warning as a useful quality-of-implementation (QoI) feature.


how about taking the extra step of specifying the behavior of the
above forms:

        RE                              INTERPRET AS
        [:<name>:]                        [[:<name>:]]
        [=<collating-element>=]           [[=<collating-element>=]]
        [.<collating-element>.]           [[.<collating-element>.]]

Possible, but requiring a particular interpretation different from the current one would render existing implementations non-compliant.


I don't remember the details of the original discussions leading
to [...[:<name>:]...] -- how controversial was it since it potentially
changes the semantics of previously valid REs?  if it wasn't
earth-shattering maybe specifying the probably-meant-this behavior
would be possible

if not, then allowing diagnostics would be ok with me

In my mind, the 'unspecified behavior' route is the best. It allows an implementer to choose between at least three behaviors that I would consider reasonable: treat the form as an ordinary bracket expression matching only the characters in [<name>:], so that existing implementations need not change; interpret it as [[:<name>:]], to match tr; or issue a warning or error to alert users to the non-portable construct.
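
As a rough illustration of those three choices, the sketch below applies one of them to a pattern. Everything in it is an assumption made for the example: the function name apply_policy, the policy names "literal", "class" and "reject", and the premise that some detector has already located the suspicious forms and reported them as (start, text) pairs. It does not describe how any existing implementation behaves.

```python
def apply_policy(pattern, spans, policy):
    """Apply one of three behaviors to the suspicious forms listed in spans.

    spans holds (start, text) pairs such as [(1, "[:digit:]")], assumed to
    come from a separate (hypothetical) detector.
    """
    if policy == "literal":
        # Historical meaning: an ordinary bracket expression, left unchanged.
        return pattern
    if policy == "class":
        # Read [:name:] as [[:name:]] by wrapping each form in one more
        # pair of brackets.
        out, pos = [], 0
        for start, text in sorted(spans):
            out.append(pattern[pos:start])
            out.append("[" + text + "]")
            pos = start + len(text)
        out.append(pattern[pos:])
        return "".join(out)
    if policy == "reject":
        # Diagnose the non-portable form instead of guessing what was meant.
        if spans:
            start, text = sorted(spans)[0]
            raise ValueError(
                f"suspicious form {text!r} at offset {start}; "
                f"did you mean '[{text}]'?"
            )
        return pattern
    raise ValueError(f"unknown policy: {policy!r}")


print(apply_policy("x[:digit:]y", [(1, "[:digit:]")], "class"))  # x[[:digit:]]y
```

With "literal" the pattern comes back untouched, and with "reject" the same call raises ValueError, which is roughly the choice an 'unspecified behavior' wording would leave to the implementer.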

would this mean an implementation could add a new <regex.h> error code?

It may indeed be worthwhile to reword the bug 305 proposal to allow an implementation to extend <regex.h> in that manner, although it is not strictly necessary (GNU grep's warning was implemented without the need of a new <regex.h> error code).

--
Eric Blake   address@hidden    +1-801-349-2682
Libvirt virtualization library http://libvirt.org


