We have quite a few undertested and underdocumented integer parsing
corner cases. To ensure that any changes we make in the code are
intentional rather than accidental semantic changes, it is time to add
more unit tests of existing behavior.

In particular, this demonstrates that parse_uint() and qemu_strtou64()
behave differently. For "-0", it's hard to argue why parse_uint() needs
to reject it (it's not a negative integer), although the documentation
sort of mentions it. It is intentional, however, that all other
negative values are treated as ERANGE with value 0 (compared to
qemu_strtou64() treating "-2" as success and UINT64_MAX-1, for
example).
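
For comparison, here is a minimal sketch of what plain C strtoull()
does with a leading minus sign, which matches the qemu_strtou64()
result described above. plain_strtoull() is a hypothetical helper
written for this illustration, not QEMU code, and the sketch assumes
unsigned long long is 64 bits:

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not part of QEMU): parse str in base 10 with
 * plain strtoull() and report the resulting errno through *err.
 */
static unsigned long long plain_strtoull(const char *str, int *err)
{
    char *end;
    unsigned long long val;

    assert(err != NULL);
    errno = 0;                  /* strtoull() never clears errno itself */
    val = strtoull(str, &end, 10);
    *err = errno;
    return val;
}
```

With this helper, "-2" yields ULLONG_MAX - 1 with errno left at 0,
because C negates the parsed magnitude in the unsigned type, and "-0"
yields 0 with errno left at 0.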

Also, when mixing overflow/underflow with a check for no trailing
junk, parse_uint_full favors ERANGE over EINVAL, while qemu_strto[iu]*
favor EINVAL. This behavior is outside the C standard, so we can pick
whatever we want, but it would be nice to be consistent.
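
The two precedence policies can be sketched roughly as follows.
parse_whole() and its prefer_erange flag are hypothetical names made up
for this illustration; this is not how either QEMU function is
implemented, only a model of the observable difference when a string
both overflows and has trailing junk:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical wrapper, not QEMU code: parse the whole string with
 * strtoull() and return 0 or a negative errno. When overflow and
 * trailing junk happen together, prefer_erange picks which one wins.
 */
static int parse_whole(const char *str, bool prefer_erange,
                       unsigned long long *result)
{
    char *end;
    bool overflow, junk;

    assert(result != NULL);
    errno = 0;
    *result = strtoull(str, &end, 10);
    overflow = errno == ERANGE;
    /* No digits consumed, or leftover characters after the number */
    junk = end == str || *end != '\0';

    if (overflow && junk) {
        return prefer_erange ? -ERANGE : -EINVAL;
    }
    if (overflow) {
        return -ERANGE;
    }
    return junk ? -EINVAL : 0;
}
```

For an input such as "99999999999999999999999x", prefer_erange = true
models the parse_uint_full choice (-ERANGE), while prefer_erange =
false models the qemu_strto[iu]* choice (-EINVAL).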

Note that C requires that "9223372036854775808" fail strtoll() with
ERANGE/INT64_MAX, but "-9223372036854775808" pass with INT64_MIN; we
weren't testing this. For strtol(), the behavior depends on whether
long is 32 or 64 bits (the cutoff point either being the same as
strtoll() or at "-2147483648"). Meanwhile, C is clear that
"-18446744073709551615" pass strtoull() (but not strtoll()) with value
1, even though we want it to fail parse_uint(). And although
qemu_strtoui() has no C counterpart, it makes more sense if we design
it like 32-bit strtoul() (that is, where "-4294967295" is an alternate
acceptable spelling for "1", but "-0xffffffff00000001" should be
treated as overflow and return 0xffffffff rather than 1). We aren't
there yet, so some of the tests added in this patch have FIXME
comments.
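
The C-mandated 64-bit cutoffs mentioned above can be checked with a
small standalone sketch. The two helpers are hypothetical names used
only for this illustration (they are not part of QEMU or of the test
file), and the sketch assumes long long is 64 bits:

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: plain strtoll() in base 10, errno via *err. */
static long long std_strtoll(const char *str, int *err)
{
    long long val;

    assert(err != NULL);
    errno = 0;
    val = strtoll(str, NULL, 10);
    *err = errno;
    return val;
}

/* Hypothetical helper: plain strtoull() in base 10, errno via *err. */
static unsigned long long std_strtoull(const char *str, int *err)
{
    unsigned long long val;

    assert(err != NULL);
    errno = 0;
    val = strtoull(str, NULL, 10);
    *err = errno;
    return val;
}
```

Under those assumptions, std_strtoll("9223372036854775808") reports
ERANGE with LLONG_MAX, std_strtoll("-9223372036854775808") succeeds
with LLONG_MIN, and std_strtoull("-18446744073709551615") succeeds with
1, while one digit further ("-18446744073709551616") reports ERANGE
with ULLONG_MAX.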

However, note that C2x will (likely) be adding a SILENT semantic
change, where C17 strtol("0b1", &ep, 2) returns 0 with ep="b1", but
C2x will have it return 1 with ep="". I did not feel like adding
testing for those corner cases, in part because the next version of C
is not yet standardized and libc support for binary parsing is not yet
widespread (as of this patch, glibc.git still misparses bare "0b":
https://sourceware.org/bugzilla/show_bug.cgi?id=30371).

Signed-off-by: Eric Blake <eblake@redhat.com>
---
v3: use cmpuint in more places [Hanna], expose another strtoui flaw
and add companion tests to strtoul, expand commit message, R-b dropped
---
tests/unit/test-cutils.c | 929 ++++++++++++++++++++++++++++++++++++---
1 file changed, 864 insertions(+), 65 deletions(-)
+ endptr = "somewhere";
+ res = 999;
+ err = qemu_strtol(str, &endptr, 0, &res);
+ g_assert_cmpint(err, ==, -ERANGE);
+ g_assert_cmpint(res, ==, LONG_MIN);
+ g_assert_true(endptr == str + strlen(str));
+ }
static void test_qemu_strtoul_underflow(void)
{
- const char *str = "-99999999999999999999999999999999999999999999";
- char f = 'X';
- const char *endptr = &f;
- unsigned long res = 999;
+ const char *str;
+ const char *endptr;
+ unsigned long res;
int err;
+ /* 1 less than -ULONG_MAX */
+ str = ULONG_MAX == UINT_MAX ? "-4294967296" : "-18446744073709551616";
+ endptr = "somewhere";
+ res = 999;
err = qemu_strtoul(str, &endptr, 0, &res);
+ g_assert_cmpint(err, ==, -ERANGE);
+ g_assert_cmpuint(res, ==, ULONG_MAX);