bug-grep

bug#15483: POSIXLY_CORRECT documentation vis a vis some simple EREs


From: Eric Blake
Subject: bug#15483: POSIXLY_CORRECT documentation vis a vis some simple EREs
Date: Mon, 30 Sep 2013 08:21:30 -0600
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130923 Thunderbird/17.0.9

On 09/29/2013 11:25 AM, Glenn Golden wrote:
> --
> Eric Blake <address@hidden> Sat, 28 Sep 2013 17:17:50 -0600:
>>
>> the intent of POSIXLY_CORRECT is only to change behavior where we do
>> not comply with the requirements by default.
>>
> 
> OK, that's a good start at clarifying the intent of POSIXLY_CORRECT, and glad
> to see it.  IMO it would be worthwhile for us to hash this out further, if
> you're not against doing so, since it has (to me) always been a bit ambiguous
> as to what it implies in specific tools. Maybe others would benefit from it 
> too.

You are welcome to submit documentation patches to clarify this as you
see fit (I'm probably a bit too biased with the current status quo to
write such a patch myself, as I'm liable to overlook what seems obvious
to me but not so obvious to a newcomer).

>    3. Is grep an 'application' as defined by POSIX:   yes
> 
>       3.17 Application: A computer program that performs some
>       desired function

What are the desired functions of GNU grep? I claim that ONE of the
desired functions is to be usable as an implementation that can serve in
a strictly-conforming POSIX environment with known behavior.  I also
claim that another desired function of GNU grep is to provide useful
extensions beyond the bare minimum specified by POSIX.

Next, is it required that grep itself be implemented solely as a
strictly conforming POSIX application in order to be usable in a
strictly conforming environment?
Heavens no.  Just because you CAN write a program in a strictly
conforming environment does not imply that you MUST write a program in
that limited environment, and we'd much rather write grep to take
advantage of extensions outside the realm of POSIX.  Doing so makes our
life easier as maintainers, and makes grep more useful in its target
audience of GNU/Linux where people have come to expect having extensions
enabled by default.

Case in point: have you ever typed 'grep --help' to see what grep can
do, or 'grep --version' to see what build you are using?  Well, such a
command line is not a strictly conforming use of grep (i.e., POSIX
states that it is undefined behavior to pass a command line argument
that is longer than 2 characters and starts with two dashes).  And we
implement it in grep via the non-POSIX getopt_long() function
(initially exported by glibc, but made available via the gnulib project
for use on other platforms where libc does not provide that
extension).  So our handling of your non-compliant input is best done by
using a non-compliant function within our code.
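
To make the rule above concrete, here is a minimal sketch in Python (not
grep's actual C code).  The function names are hypothetical and chosen
only for illustration.  The first function encodes the rule as stated
in this message; the second uses Python's standard getopt module, which
accepts GNU-style long options, as a rough stand-in for getopt_long().

```python
import getopt


def is_strictly_conforming_arg(arg):
    """Return False for arguments such as '--help'.

    Encodes the rule as stated in the message: an argument longer than
    2 characters that starts with two dashes is undefined by POSIX.
    """
    return not (len(arg) > 2 and arg.startswith("--"))


def parse_with_long_options(argv):
    """Parse argv, accepting the long options as an extension.

    The short option 'E' and the long options 'help' and 'version' are
    a tiny illustrative subset, not grep's real option table.
    """
    opts, operands = getopt.getopt(argv, "E", ["help", "version"])
    return [name for name, _ in opts], operands
```

A caller that wanted to stay within strict conformance would never pass
'--version' in the first place, so the extension is invisible to it; a
caller that does pass it gets the behavior the implementation chose.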

>    4. Does grep 'use' the ERE '*xyz'                  yes

Only insofar as GNU grep has decided to give defined semantics, as an
extension, in its handling of what POSIX has declared as undefined.  But
if you only ever pass strictly-conforming input to grep, you will never
know what grep would have done with your input.  And if you pass '*xyz'
to grep, you are already outside the realm of strict conformance, so you
no longer have POSIX to back your argument about what should happen,
and therefore POSIXLY_CORRECT should have no bearing on the result.

> 
> Now let's ask what is probably the more relevant question: Can an application
> which is non-conforming be part of a conforming implementation?

Yes, insofar as its behavior when limited to conforming input is
likewise conforming.  POSIX specifically permits extensions.  If ALL
implementations were REQUIRED to reject '*xyz', then the behavior of
'*xyz' would be well-defined, and then you could talk about conformance
issues if GNU grep didn't reject it.  But by very definition, POSIX
cannot define what happens on undefined input, and the very reason that
POSIX documents some things as undefined is precisely to allow for
extensions.

In all of this, you seem to be asking for a mode of operation in grep
that explicitly rejects all POSIX-undefined behavior as an immediate
error.  GNU sed has tried to implement such an approach: 'sed --posix'
tries as hard as possible to gracefully reject anything that is not
strictly conforming.  That is a different mode from
'POSIXLY_CORRECT=1 sed', where extensions are still left enabled and
the only changes are where behavior is not compliant by default.  If you
wanted to, you could submit a patch to implement 'grep --posix' which
explicitly rejected all undefined POSIX behavior (well, all except for
the fact that the very command line to enter such a mode is via an
extension to POSIX).  But I'm not going to write such a patch myself (I
see very little benefit to be gained compared to the cost of maintaining
the patch).
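
As a rough illustration of the distinction drawn above, the following
Python sketch models three modes: the default, POSIXLY_CORRECT, and a
strict mode triggered by '--posix'.  This is purely hypothetical: grep
has no such option, the function and mode names are invented here, and
the only undefined construct checked is the leading '*' from this
thread.

```python
def select_mode(argv, environ):
    """Pick an operating mode (hypothetical; grep has no --posix)."""
    if "--posix" in argv:
        return "strict"            # reject POSIX-undefined input
    if "POSIXLY_CORRECT" in environ:
        return "posixly_correct"   # extensions stay enabled
    return "default"


def check_ere(ere, mode):
    """Return the ERE unchanged, or raise in strict mode.

    Only the leading '*' case discussed in the thread is checked; a
    real implementation would have to cover every undefined construct.
    """
    if mode == "strict" and ere.startswith("*"):
        raise ValueError("ERE begins with '*': undefined by POSIX")
    return ere
```

Note that in this sketch POSIXLY_CORRECT alone never causes '*xyz' to
be rejected, which matches the intent described earlier: it only
changes behavior that is non-compliant by default.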

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
