So, per the current trouble ticket
(http://austingroupbugs.net/view.php?id=1198) it looks like the
Austin Group will change the comparison type of numeric string vs
numeric string from String to Numeric in the POSIX spec to match
what gawk does, so all good there.
It looks like they will NOT, however, change their expected behavior
when comparing an empty field to zero. Today gawk treats an empty or
absent field as a null string, not as a zero-or-null numeric-string:
$ echo 'a,,c' | gawk -F, '{print typeof($2), $2, ($2==0 ? "==" : "!="), typeof(0), 0}'
string != number 0
$ echo 'a,,c' | gawk -F, '{print typeof($8), $8, ($8==0 ? "==" : "!="), typeof(0), 0}'
unassigned != number 0
while POSIX requires the behavior that some other awks (e.g. all
awks on Solaris, apparently) exhibit, which is to treat an empty or
absent field as it would an uninitialized variable:
$ echo 'a,,c' | gawk -F, '{print typeof(foo), foo, (foo==0 ? "==" : "!="), typeof(0), 0}'
untyped == number 0
$ echo 'a,,c' | /usr/xpg4/bin/awk -F, '{print ($2==0 ? "==" : "!=")}'
==
I understand why gawk behaves as it does, and I think it gives
more intuitive results (especially for the mid-record empty field
case). It might be a good idea for one of you to comment on the
Austin Group ticket (http://austingroupbugs.net/view.php?id=1198) to
persuade them to define the POSIX behavior the way gawk currently
behaves. Otherwise gawk would have to behave differently when
invoked with --posix to really be POSIX-compliant, and that
difference should be documented in the gawk manual.
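For anyone who wants to check which way a given awk goes, here is a minimal probe, offered as a sketch only. The function name probe_empty_field and its optional argument (the awk binary to run, defaulting to whatever "awk" is on the PATH) are made up for illustration. It prints the result of the implicit comparison discussed above, followed by the result of a comparison that forces the field to a number with "+ 0".

```shell
# Hypothetical helper: probe how an awk compares an empty field to 0.
# $1 (optional) is the awk binary to test; defaults to "awk".
probe_empty_field() {
    printf 'a,,c\n' | "${1:-awk}" -F, '{
        # Implicit comparison: the result depends on how this awk
        # types an empty field (null string vs. uninitialized value).
        implicit = ($2 == 0)     ? "==" : "!="
        # Adding 0 forces a numeric operand, so this is always numeric.
        numeric  = ($2 + 0 == 0) ? "==" : "!="
        print implicit, numeric
    }'
}

probe_empty_field
```

The first word of the output varies between awks as described above; the second should be "==" everywhere, since an empty string converts to 0. Writing "$2 + 0 == 0" is therefore one way to get the same answer regardless of how the ticket is resolved.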
Ed.
On 8/7/2018 8:25 AM, Ed Morton wrote:
FYI we now have related tickets at the Open Group
(https://help.opengroup.org/hc/en-us/requests/193457) and the
Austin Group (http://austingroupbugs.net/view.php?id=1198). Only
staff can see the Open Group one; the Austin Group one is visible
to anyone but requires a login to comment on.
Ed.
On 8/5/2018 7:05 AM, Ed Morton wrote:
Arnold - Thanks for getting back to me. I
don’t think anyone’s getting excited about this in the slightest
and I’ll google how to file an interpretation request with the
Open Group, thanks for the suggestion.
Ed Morton
On Aug 5, 2018, at 2:53 AM, address@hidden wrote:
Hi Ed.
I saw your earlier note also but have not had time to read the
comp.lang.awk thread in detail.
Ed Morton <address@hidden> wrote:
OK, so apparently gawk really doesn't behave per the POSIX standard
when comparing numeric-string to numeric-string:
In the Expressions In Awk
<http://pubs.opengroup.org/onlinepubs/9699919799/utilities/awk.html#tag_20_06_13_02>
section POSIX says:
Comparisons (with the '<', "<=", "!=", "==", '>', and ">=" operators)
shall be made numerically if both operands are numeric, if one is
numeric and the other has a string value that is a numeric string, or
if one is numeric and the other has the uninitialized value.
Otherwise, operands shall be converted to strings as required and a
string comparison shall be made
The text in POSIX is bogus. The intent and prior art are that as soon
as one operand is a numeric string then a numeric comparison is done.
Otherwise something as basic as
echo 5.0 10.0 | awk '{ print ($1 < $2) }'
would print 0.
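To make the difference concrete, here is a small sketch, assuming any awk on the PATH. The helper name compare_both is made up for illustration. It prints the result of the ordinary field comparison, then the result of the same comparison with "" appended to each field, which forces the string comparison that a literal reading of the POSIX wording would call for.

```shell
# Hypothetical helper: compare two values as awk fields, first as-is
# and then as forced strings (concatenating "" makes each operand a string).
compare_both() {
    printf '%s %s\n' "$1" "$2" | awk '{
        printf "%d %d\n", ($1 < $2), (($1 "") < ($2 ""))
    }'
}

compare_both 5.0 10.0
```

For 5.0 and 10.0 this prints "1 0": numerically 5.0 is less than 10.0, but as strings "5.0" sorts after "10.0" because "5" comes after "1".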
You might want to file an interpretation request with the Open
Group.
I see no reason to:
- get excited
- change gawk's behavior
- issue any warnings
- or update any documentation (except maybe POSIX.STD)
Thanks,
Arnold