bug-gnu-utils

Re: gawk substr() problem


From: Paul Eggert
Subject: Re: gawk substr() problem
Date: Thu, 21 Nov 2002 12:20:44 -0800

> From: Aharon Robbins <address@hidden>
> Date: Thu, 21 Nov 2002 16:13:22 +0200
> 
> How does casting -1 to a size_t improve anything?  size_t is unsigned
> long (or long long), so you just end up with MAX_ULONG or some such.
> Wouldn't just (size_t) ~0  be a better value?

The C standard guarantees that size_t is unsigned, and that unsigned
arithmetic is modulo MAX+1, so (size_t) -1 is guaranteed to equal the
maximum size_t value.  C99 defines SIZE_MAX as a macro, but we can't
assume C99 yet, so it's safer to use (size_t) -1, which works with
both C89 and K&R C.
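
As a minimal sketch of the point above (the function names are
hypothetical and are not taken from gawk), the conversion can be
written without relying on C99's SIZE_MAX:

```c
#include <stddef.h>

/* Hypothetical helper, not gawk code: the largest size_t value,
   obtained without C99's SIZE_MAX. */
size_t max_size(void)
{
    /* Converting -1 to an unsigned type adds (maximum + 1) to it,
       so the result is the maximum value of that type. */
    return (size_t) -1;
}

/* Returns 1 if adding 1 to the result wraps around to 0, which is
   what one expects of the maximum value of an unsigned type. */
int max_size_wraps_to_zero(void)
{
    size_t m = max_size();
    return m > 0 && (size_t) (m + 1) == 0;
}
```

The second function is only a sanity check on the first; it relies
on unsigned arithmetic wrapping modulo MAX+1, as described above.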

(size_t) ~0 is equivalent to (size_t) -1 on two's complement hosts,
but on one's complement hosts (size_t) ~0 equals 0 (because there ~0
is negative zero, which compares equal to 0).  It's unlikely gawk will
be running on a one's complement host, but we might as well be
portable if it's easy, and anyway the "(size_t) -1" notation is the
longstanding tradition in C circles.
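
To see which case a given host falls into, a small check along these
lines could be used.  This is only a sketch; the function name is
made up for illustration, and the result of 1 is what I would expect
on a two's complement host.

```c
#include <stddef.h>

/* Hypothetical check, not gawk code: does the (size_t) ~0 spelling
   agree with (size_t) -1 on this host? */
int tilde_zero_matches_minus_one(void)
{
    /* ~0 is evaluated as an int first and only then converted to
       size_t, so the result depends on how the host represents
       negative ints. */
    size_t from_tilde = (size_t) ~0;

    /* -1 is converted by value, independent of representation. */
    size_t from_minus = (size_t) -1;

    return from_tilde == from_minus;
}
```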


>         if (t1->stlen == 0) {
> -               if (do_lint)
> +               if (do_lint && (indx | length) != 0)
>                         lintwarn(_("substr: source string is zero length"));
> 
> That's not right: If the source string length is zero, there's
> no reason to check the index and length.  Or are you trying to
> avoid a warning on
> 
>       substr("", 1, 0)
> 
> ?

Yes, that's what I'm trying to avoid.
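
Spelled out as a stand-alone sketch, the intent of the patched test
is roughly the following.  The function and parameter names are
hypothetical, and I am assuming here that indx and length are the
already-converted unsigned values, so that substr("", 1, 0) arrives
with both equal to zero:

```c
#include <stddef.h>

/* Hypothetical stand-in for the lint decision in the patch above;
   not the actual gawk source.  Returns 1 if a "source string is
   zero length" warning should be issued. */
int should_warn_empty_source(int do_lint, size_t src_len,
                             size_t indx, size_t length)
{
    if (src_len != 0)
        return 0;   /* source is not empty: this warning does not apply */

    /* (indx | length) is zero only when both operands are zero, so
       a request for an empty substring at the start stays quiet. */
    return do_lint && (indx | length) != 0;
}
```

With this test, asking for a zero-length piece at the start of an
empty string is silent, while any other request on an empty string
still draws the warning when linting is enabled.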

> Neither of us has done a scientific survey.

True, but I've surveyed a reasonable amount of awk code, including my
own code, which I unfortunately can't publish, and Gawk itself, which
I reported on in an earlier message.  I haven't seen any survey with
contrary results.

> > I didn't know about --lint until recently, but if --lint generates
> > warnings for perfectly-reasonable things like zero-length substrings,
> > I don't wonder that nobody uses it.
> 
> I find this remark rude and uncalled-for, especially on a public
> mailing list.  I think you owe me an apology.

Sorry, I was not intending to be rude.  In rereading the message I
don't see the rudeness that you do; but I do apologize for any harm,
which was certainly unintended.

I understand that no matter how much or how little lint Gawk
generates, somebody's bound to complain.  However, I still think that
Gawk is way over the line in complaining about zero-length substrings.
To my mind that is almost as extreme as reporting a warning if two
numbers sum to zero.  Sure, such a warning would catch real
programming errors (e.g., catastrophic cancellation when a-b == 0),
but the number of false alarms would be way too large for such a
warning to be useful for most programs.  Similarly, warning about
zero-length substrings will not be useful for most real Awk programs,
in my experience.

GCC has many options to enable and disable warnings selectively.
If Gawk did the same, perhaps more people would use --lint.



