
Re: [bug-gawk] use of ;; as terminator, request for grammar help


From: Brian Kernighan
Subject: Re: [bug-gawk] use of ;; as terminator, request for grammar help
Date: Fri, 18 Apr 2014 09:03:51 -0400 (EDT)
User-agent: Alpine 2.02 (LRH 1266 2009-07-14)

Hi, all --

Arnold kindly linked me in on this conversation, since we talk about
compatibility issues regularly.

Should multiple semicolons be legal between pattern-action
statements?  They are legal in my current version of Awk, but it's
entirely an artifact of implementation; I'm pretty sure that Al and
Peter and I would never have written an Awk program to use that
flexibility.  And it seems unlikely that typical Awk programmers
would write code that way either; one semicolon seems like just the
right number.

Should one semicolon be required between pattern-action statements?  I
agree strongly with Arnold on this one: yes.  The language is already
entirely too sloppy in how it uses adjacency to mean something, and
adding more cases seems like a bad idea.  The explicit semicolon between
p-a statements is consistent with the action language, and makes it
clear what's going on when code is written on a single line (as in a
short command-line sequence).  The FIXES note from 1988 makes it clear
that we were very uneasy about allowing an optional separator at the
time; if I were faced with the same decision today, I would require a
single semicolon.
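
For concreteness, a minimal sketch of the single-semicolon form, next to
the same two rules separated by a newline.  The sample input and the
"first:"/"second:" labels are invented for illustration; neither comes
from the thread.

```shell
# Two pattern-action statements on one line, separated by exactly one
# semicolon.  Neither rule has a pattern, so both run for every input line.
one_line=$(printf 'a\nb\n' | awk '{ print "first:", $0 }; { print "second:", $0 }')

# The same two rules, separated by a newline instead of a semicolon.
two_lines=$(printf 'a\nb\n' | awk '{ print "first:", $0 }
{ print "second:", $0 }')

printf '%s\n' "$one_line"
```

Both forms should print "first: a", "second: a", "first: b", "second: b",
one per line.  The ';;' and no-separator variants are left out on purpose,
since their behavior differs between implementations.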

Hope this helps your deliberations a bit.  Thanks for all your good work
on the standardization effort.

Brian


On Thu, 17 Apr 2014, Aharon Robbins wrote:

Hi Eric and Austin Group folks,

I apologize for the delay in replying. Real Life(tm) gets in the way
of these things.

I am cc'ing Brian Kernighan for his opinion on these issues as well.

Date: Thu, 03 Apr 2014 10:18:54 -0600
From: Eric Blake <address@hidden>
To: address@hidden
Cc: Austin Group <address@hidden>
Subject: [bug-gawk] use of ;; as terminator, request for grammar help

Hello GNU awk readers,

On today's Austin Group call (the people in charge of POSIX), we visited
http://austingroupbugs.net/view.php?id=226.

This is in regards to the POSIX awk specification at:
http://pubs.opengroup.org/onlinepubs/9699919799/utilities/awk.html

Among other things, there were two action items pointed out that this
list might be able to help with:

1. GNU awk has a bug regarding ;; as a terminator.  The POSIX grammar
allows for:
awk '{print};;{print}'
but gawk rejects this case.  This was deemed to be a bug in gawk, since
POSIX was based on the nawk behavior at the time POSIX was standardized,
and nawk has always supported this.

I'm not convinced this is a real bug.  In particular, accidents of the
Unix awk implementation should not necessarily be formally codified
in the standard.  mawk, which was written based on the 1988 awk book,
also does not support this.

If there are awk programs that use this, they should best be changed to
have only one ';', in my humble opinion; there's no real added value
to codifying this into the language.

2. Based on existing implementations, there is consensus that the POSIX
grammar is overly restrictive, and that we should change it to permit:
    awk '{print} {print}'
and:
    awk '/foo/; {print}'

since existing implementations all support it.  But to do that, we need
someone with experience in writing grammars to propose the changes to the one
appearing on the POSIX page.  Any input would be appreciated.

I disagree with the first desired change.  The ground I'm standing on here is
firmer. The 1988 awk book disallowed rules without any separators, on the
grounds that rules and statements within them should be syntactically
consistent (a semicolon is required when multiple Xs [rules or statements] appear
on one line).  And the very early released versions of nawk in fact enforced
this rule. (I remember testing against it.)

Later on, after the awk book, Brian changed his awk. If you look at his FIXES
file, you will see:

        Nov 27, 1988:
                With fear and trembling, modified the grammar to permit
                multiple pattern-action statements on one line without
                an explicit separator.  By definition, this capitulation
                to the ghost of ancient implementations remains undefined
                and thus subject to change without notice or apology.
                DO NOT COUNT ON IT.

The sentiment here is quite clear - while it might work, it should
not be formalized.

The gawk documentation follows this example, documenting clearly that
a semicolon is required between multiple rules on one line, and NOT
documenting that it can be left off. I do not plan to change this, either.

The second change (awk '/foo/; { print }') should be supported by the POSIX
grammar, since that is clearly two different rules.
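
A small sketch of why the semicolon makes this two rules: '/foo/;' is a
pattern with the default action (print the line), and '{ print }' is an
action with no pattern.  Without the semicolon the same text is a single
rule.  The two-line input is invented for illustration, and the sketch
assumes an awk that accepts this form, as the existing implementations
are reported to.

```shell
# Two rules: a pattern-only rule (default action prints the matching line),
# followed by a pattern-less rule that prints every line.
two_rules=$(printf 'foo\nbar\n' | awk '/foo/; { print }')

# One rule: the pattern /foo/ guards the action, so only matching lines print.
one_rule=$(printf 'foo\nbar\n' | awk '/foo/ { print }')

printf 'two rules:\n%s\n' "$two_rules"
printf 'one rule:\n%s\n' "$one_rule"
```

With two rules, "foo" is printed twice and "bar" once; with one rule,
only "foo" is printed, once.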

As an aside, there are one or two other areas where gawk implements
undocumented (= unspecified) behavior for compatibility with Unix awk,
but those remain purposely undocumented in the gawk manual; the case
I'm thinking about even has this comment in the code:

        /*
         * A simple_stmt exists to satisfy a constraint in the POSIX
         * grammar allowing them to occur as the 1st and 3rd parts
         * in a `for (...;...;...)' loop.  This is a historical oddity
         * inherited from Unix awk, not at all documented in the AK&W
         * awk book.  We support it, as this was reported as a bug.
         * We don't bother to document it though. So there.
         */

In my humble opinion, the ';;' issue is so trivial that it's not even worth
the effort I put in for simple statements in for loops.

I hope all this helps.  Further discussion is welcome.

Arnold



