Re: [PATCH 5/6] canonicalize: prefer signed integer types
From: Bruno Haible
Subject: Re: [PATCH 5/6] canonicalize: prefer signed integer types
Date: Thu, 03 Dec 2020 22:27:34 +0100
User-agent: KMail/5.1.3 (Linux/4.4.0-193-generic; KDE/5.18.0; x86_64; ; )

Paul Eggert wrote:
> More generally, when I'm reviewing code I naturally look for relationships
> like 0 <= i < j < n. I can see where one might want to say "j is of type
> i+1 .. n-1" but all things considered it'd be better for the compiler and/or
> human reader to infer that sort of thing, than to clutter the code with
> something like "int __attribute__ ((range (i+1 .. n-1))) j;" when declaring j.

The human reader, which is what I care about here, is not good at doing this
without a hint. When I studied computer science (3rd semester), we were given
a piece of code like this (for binary search within a sorted array):

  size_t hi = table_size;
  size_t lo = 0;
  while (lo < hi)
    {
      size_t mid = (hi + lo) >> 1;
      int cmp = strcmp (table[mid].alias, codeset);
      if (cmp < 0)
        lo = mid + 1;
      else if (cmp > 0)
        hi = mid;
      else
        {
          /* Found an i with
               strcmp (table[i].alias, codeset) == 0.  */
          codeset = table[mid].canonical;
          goto done_table_lookup;
        }
    }

and asked to fill in the invariants that are necessary to prove that the
code produces the expected result. Many students didn't succeed. And
while I succeed in general in this kind of task, I sometimes make off-by-one
mistakes that cause an endless loop or wrong results, if I don't write
down the invariants as comments.
So, some way of writing it down is necessary. Whether as a comment with
math notation, or an __attribute__ ((range ...)), I don't really mind.
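
As an illustration of the "comment with math notation" option, here is a minimal sketch of the same search with its invariants written down. The struct alias_entry type and the lookup_canonical function are hypothetical names introduced only for this example. The midpoint is computed as lo + ((hi - lo) >> 1) instead of the (hi + lo) >> 1 used above, which is a deliberate substitution to avoid overflow of the sum. The assert is optional and merely restates part of the comment.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry, modelled on the fields used in the loop above.  */
struct alias_entry
{
  const char *alias;
  const char *canonical;
};

/* Return the canonical name for CODESET, or NULL if TABLE has no entry
   whose alias equals CODESET.  TABLE must be sorted by strcmp order of
   its alias fields.  */
static const char *
lookup_canonical (const struct alias_entry *table, size_t table_size,
                  const char *codeset)
{
  size_t lo = 0;
  size_t hi = table_size;

  /* Loop invariant:
       0 <= lo <= hi <= table_size,
       for all i with 0 <= i < lo:           strcmp (table[i].alias, codeset) < 0,
       for all i with hi <= i < table_size:  strcmp (table[i].alias, codeset) > 0.
     Termination: hi - lo strictly decreases in every iteration.  */
  while (lo < hi)
    {
      /* Since lo < hi, we have lo <= mid < hi, so table[mid] is valid.  */
      size_t mid = lo + ((hi - lo) >> 1);
      assert (lo <= mid && mid < hi);

      int cmp = strcmp (table[mid].alias, codeset);
      if (cmp < 0)
        lo = mid + 1;   /* all of table[0..mid] compare < codeset */
      else if (cmp > 0)
        hi = mid;       /* all of table[mid..] compare > codeset */
      else
        return table[mid].canonical;
    }

  /* Here lo == hi, so by the invariant every index compares either
     less than or greater than CODESET: there is no match.  */
  return NULL;
}
```

With the invariant spelled out, the two typical off-by-one mistakes become visible: writing "lo = mid" breaks the termination argument, and writing "hi = mid - 1" breaks the third line of the invariant (and can wrap around when mid is 0).
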

> Of course "idx_t j;" is much less clutter than the __attribute__ stuff but
> it's not clear that it's worth the bother to have yet another integer type
> for this sort of thing.

In the example

  ptrdiff_t dest_offset = dest - rpath;

the benefit of writing 'idx_t' was high. But even in simpler code like

  idx_t k = strlen (s);

I would have to
  1. verify that k is not assigned later, so the initialization provides
     the definitive value,
  2. understand that strlen always returns values >= 0.
In this case the benefit of writing 'idx_t' is not large. But it adds up.
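
To make the point concrete, here is a small sketch of how the type name can carry the ">= 0" information to the reader. The typedef below is an assumption meant to mirror the one in Gnulib's idx.h, and dir_prefix_len is a hypothetical helper invented for this example, not code from the patch series.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: same underlying type as the idx_t in Gnulib's idx.h.
   The name tells the reader "signed type, but the value is never negative".  */
typedef ptrdiff_t idx_t;

/* Return the length of the directory prefix of PATH, including the
   trailing slash, or 0 if PATH contains no slash.  (Hypothetical helper.)  */
static idx_t
dir_prefix_len (const char *path)
{
  const char *slash = strrchr (path, '/');
  if (slash == NULL)
    return 0;

  /* A pointer difference has type ptrdiff_t.  Declaring the result as
     idx_t documents that it is known to be nonnegative here (in fact
     >= 1), without the reader having to re-derive that from the code.  */
  idx_t len = slash + 1 - path;
  assert (0 < len);
  return len;
}
```

Had len been declared as plain ptrdiff_t, a reviewer would have to check by hand that slash is never before path.
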
20 years ago, many people probably thought the same thing ("worth the bother
to have yet another integer type?") about 'bool', versus just writing 'int'.
Now that the hassles with <stdbool.h> portability are resolved, I find that
using 'bool' is definitely worth it.
Bruno