bug-gawk

Re: unassigned/untyped behaviour


From: arnold
Subject: Re: unassigned/untyped behaviour
Date: Tue, 21 Nov 2023 12:32:58 -0700
User-agent: Heirloom mailx 12.5 7/5/10

Hi All.

I finally took a look at this.  "M", thanks for the
report.  Andy, thanks for the test case.

"Andrew J. Schorr" <aschorr@telemetry-investments.com> wrote:

> Hi,
>
> And it's a bit different for a regular variable as opposed to an
> array element. Using the master branch:
>
> ./gawk 'BEGIN {print typeof(a); printf "test %d\n", a; print typeof(a); 
> printf "test2 %s\n", a; print typeof(a)}'

Andy, I have a favor to ask. Please don't send these small test programs
as long one-liners; they become harder to copy/paste that way.

> untyped
> test 0
> unassigned
> test2 
> unassigned
>
> I think that looks right to me, based on the descriptions in 9.1.8.

Yes, it's right.

> But the behavior for array elements does not seem consistent with the manual.
> Section 9.1.8 gives this example:
>
>      ‘"unassigned"’
>           X is a scalar variable that has not been assigned a value yet.
>           For example:
>
>                BEGIN {
>                    # creates a[1] but it has no assigned value
>                    a[1]
>                    print typeof(a[1])  # unassigned
>                }
>
> But in fact, that's not what the master branch produces:

That example is bad. I will fix the doc.

> bash-4.2$ ./gawk 'BEGIN {a[1]; print typeof(a[1])}'
> untyped
>
> I'd guess that gawk is right and the documentation is wrong in that case. 
>
> The coercion to string type by saying 'printf "%s", a[1]' seems like it needs
> some investigation...

The coercion is actually correct.

> bash-4.2$ ./gawk 'BEGIN {print typeof(a[1]); printf "test %s\n", a[1]; print 
> typeof(a[1])}'
> untyped
> test 
> string
>
> I'm not clear on whether accessing a[1] in a scalar context should leave it
> untyped; perhaps it should be unassigned at that point, based on the section
> 9.1.8 descriptions? I imagine the internal implementation details are rather
> complicated...

Indeed.

Let's start with Andy's original case:

$ cat x.awk
BEGIN {
        a[1]
        print typeof(a[1])
        printf "test %d\n", a[1]
        print typeof(a[1])
        printf "test2 <%s>\n", a[1]
        print typeof(a[1])
}

$ gawk-5.3.0 -f x.awk
untyped
test 0
untyped
test2 <>
string

The problem we're grappling with is that an unadorned `a[1]' is
in a Schroedinger's Cat kind of state. We don't know if it's a
scalar or an array until we use it one way or the other. In the
above test, it gets used as a scalar, so the "untyped" result after
the first printf is incorrect. Consider this modified test case:

$ cat x2.awk 
BEGIN {
        a[1]
        print typeof(a[1])
        printf "test %d\n", a[1]        # <-- used as scalar!
        print typeof(a[1])
        a[1][2] = 5             # <-- should crap out here
        print typeof(a[1])
        printf "test2 %s\n", a[1]
        print typeof(a[1])
}

$ gawk-5.3.0 -f x2.awk 
untyped
test 0
untyped
array
gawk-5.3.0: x2.awk:8: fatal: attempt to use array `a["1"]' in a scalar context

Oops! a[1] gets turned into an array without a qualm. Not good.

After fixing it:

$ ./gawk -f x2.awk 
untyped
test 0
number
gawk: x2.awk:6: fatal: attempt to use scalar `a["1"]' as an array

What about non-arrays? Andy's second example:

$ cat y.awk
BEGIN {
        print typeof(a)
        printf "test %d\n", a
        print typeof(a)
        printf "test2 %s\n", a
        print typeof(a)
}
$ gawk-5.3.0 -f y.awk
untyped
test 0
unassigned
test2 
unassigned
$ ./gawk -f y.awk
untyped
test 0
unassigned
test2 
unassigned

So here we're good. `a' is never treated as an array, so it's
a scalar, but it just hasn't been assigned to.

Yes, there's a difference in behavior: `a' stays unassigned
while `a[1]' becomes a number. But "fixing" that is next to impossible
given the way Node_elem_new works.  Forcing numeric or string
type is the next best thing.
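
To make the state change concrete, here is a minimal toy model in C. It is
only a sketch, not gawk's code: the struct, the flag values, and the toy_*
helpers are all hypothetical, and it assumes that a fresh Node_elem_new
carries both STRING and NUMCUR (which is what the assert in the patch below
suggests). It shows that once an element has been forced to a number, a
later attempt to use it as an array can be refused.

```c
#include <assert.h>
#include <string.h>

/* Toy stand-ins; names echo the patch, values are invented for this sketch. */
enum node_type { Node_elem_new, Node_val, Node_var_array };
enum { STRING = 1 << 0, NUMCUR = 1 << 1 };

struct node {
    enum node_type type;
    unsigned flags;
    double numbr;
};

/* A freshly referenced element such as a[1]: not yet scalar, not yet array.
 * Assumption: it starts out with both STRING and NUMCUR set. */
struct node toy_new_elem(void)
{
    struct node n = { Node_elem_new, STRING | NUMCUR, 0.0 };
    return n;
}

/* Numeric use commits the element to being a scalar number. */
struct node *toy_force_number(struct node *n)
{
    if (n->type == Node_elem_new) {
        n->type = Node_val;
        n->flags &= ~STRING;
        assert((n->flags & NUMCUR) != 0);
    }
    return n;
}

/* Subarray use: returns 0 on success, -1 if the element is already a scalar. */
int toy_use_as_array(struct node *n)
{
    if (n->type == Node_val)
        return -1;
    n->type = Node_var_array;
    return 0;
}

/* Rough analogue of typeof() for the three toy states. */
const char *toy_typeof(const struct node *n)
{
    switch (n->type) {
    case Node_elem_new:  return "untyped";
    case Node_var_array: return "array";
    default:             return (n->flags & STRING) ? "string" : "number";
    }
}
```

In this model, calling toy_force_number() on a fresh element and then
toy_use_as_array() returns -1, which corresponds to the "attempt to use
scalar ... as an array" fatal error in the fixed x2.awk run above. A fresh
element that is used as an array first becomes an array without complaint.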

I will work on the documentation. Here is a code fix.

Arnold
-----------------------------------
diff --git a/awk.h b/awk.h
index cbc0a7e8..18bbfc78 100644
--- a/awk.h
+++ b/awk.h
@@ -2006,6 +2006,14 @@ unref(NODE *r)
 static inline NODE *
 force_number(NODE *n)
 {
+       if (n->type == Node_elem_new) {
+               n->type = Node_val;
+               n->flags &= ~STRING;
+
+               assert((n->flags & NUMCUR) != 0);
+
+               return n;
+       }
        return (n->flags & NUMCUR) != 0 ? n : str2number(n);
 }
 



