Re: [PATCH 1/3] utils: Improve qemu_strtosz() to have 64 bits of precisi

qemu-block

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 1/3] utils: Improve qemu_strtosz() to have 64 bits of precisi

From:	Eric Blake
Subject:	Re: [PATCH 1/3] utils: Improve qemu_strtosz() to have 64 bits of precision
Date:	Fri, 5 Feb 2021 08:36:06 -0600
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.7.0

On 2/5/21 5:34 AM, Daniel P. Berrangé wrote:
> On Thu, Feb 04, 2021 at 01:07:06PM -0600, Eric Blake wrote:
>> We have multiple clients of qemu_strtosz (qemu-io, the opts visitor,
>> the keyval visitor), and it gets annoying that edge-case testing is
>> impacted by implicit rounding to 53 bits of precision due to parsing
>> with strtod().  As an example posted by Rich Jones:
>>  $ nbdkit memory $(( 2**63 - 2**30 )) --run \
>>    'build/qemu-io -f raw "$uri" -c "w -P 3 $(( 2**63 - 2**30 - 512 )) 512" '
>>  write failed: Input/output error
>>

>> @@ -1970,7 +1977,7 @@ static void test_qemu_strtosz_simple(void)
>>      g_assert_cmpint(err, ==, 0);
>>      g_assert_cmpint(res, ==, 12345);
>>
>> -    /* Note: precision is 53 bits since we're parsing with strtod() */
>> +    /* Note: If there is no '.', we get full 64 bits of precision */
> 
> IIUC, our goal is that we should never loose precision for the
> non-fractional part. 

Correct. Maybe I can tweak that comment more as part of splitting this
patch into testsuite improvements (highlighting existing shortfalls)
then the actual implementation change (with its corresponding testsuite
tweaks to demonstrate how it is fixed).

>> +
>> +    str = "0xa";
>> +    err = qemu_strtosz(str, &endptr, &res);
>> +    g_assert_cmpint(err, ==, 0);
>> +    g_assert_cmpint(res, ==, 10);
>> +    g_assert(endptr == str + 3);
> 
> I'd stick a  '0xab' or "0xae" in there, to demonstrate that we're
> not accidentally applyin the scaling units for "b" and "e" in hex.

Will do.

>> +    str = "0x1.8k";
>> +    err = qemu_strtosz(str, &endptr, &res);
>> +    g_assert_cmpint(err, ==, -EINVAL);
>> +    g_assert(endptr == str);
>>  }
> 
> A test case to reject negative numers ?

We already have one, under the _erange section.  Then again, there's an
oddity: with ERANGE, we tend to leave endptr pointing to the end of what
we did parse, while with EINVAL, we tend to leave endptr pointing to
nptr saying we parsed nothing at all.  It's hard to decide whether -1 is
an ERANGE (negative is not permitted, but we successfully parsed two
characters and so advanced endptr), or an EINVAL (we don't allow
negative at all, so endptr is nptr).  If anyone has preferences, now
would be the time to document the semantics we want; although since we
already have an existing test under _range for -1, I tried to keep that
behavior unchanged.

>> -    retval = qemu_strtod_finite(nptr, &endptr, &val);
>> +    /* Parse integral portion as decimal. */
>> +    retval = qemu_strtou64(nptr, &endptr, 10, &val);
>>      if (retval) {
>>          goto out;
>>      }
>> -    fraction = modf(val, &integral);
>> -    if (fraction != 0) {
>> -        mul_required = 1;
>> +    if (strchr(nptr, '-') != NULL) {
>> +        retval = -ERANGE;
>> +        goto out;
>> +    }
>> +    if (val == 0 && (*endptr == 'x' || *endptr == 'X')) {
> 
> This works as an approach but it might be more clear to
> just check for hex upfront.
> 
>   if (g_str_has_prefix("0x", val) ||
>       g_str_has_prefix("0X", val)) {

Won't work nicely.  strtoull() allows for leading whitespace, at which
point we are duplicating a lot of functionality to likewise skip such
leading whitespace before checking the prefix.

>       ... do hex parse..
>   } else {
>       ... do dec parse..
>   }
>       
> 
>> +        /* Input looks like hex, reparse, and insist on no fraction. */
>> +        retval = qemu_strtou64(nptr, &endptr, 16, &val);
>> +        if (retval) {
>> +            goto out;
>> +        }
>> +        if (*endptr == '.') {
>> +            endptr = nptr;
>> +            retval = -EINVAL;
>> +            goto out;
>> +        }
>> +    } else if (*endptr == '.') {
>> +        /* Input is fractional, insist on 0 <= fraction < 1, with no 
>> exponent */
>> +        retval = qemu_strtod_finite(endptr, &endptr, &fraction);
>> +        if (retval) {
>> +            endptr = nptr;
>> +            goto out;
>> +        }
>> +        if (fraction >= 1.0 || memchr(nptr, 'e', endptr - nptr)
>> +            || memchr(nptr, 'E', endptr - nptr)) {
> 
> Can this be simplified ?  Fraction can be > 1, if-and only-if exponent
> syntax is used IIUC.

True for > 1.0, but false for == 1.0, because of the possibility of
overflow while parsing a fraction like .99999999999999999999999999999999

> 
> Shouldn't we be passing 'endptr+1' as the first arg to memchr ?

Not quite.  I used endptr as both input and output in the strtod() call,
so at this point, I'd have to add another variable to track where the
'.' since endptr is no longer pointing there.  But being lazy, I know
that the integral portion contains no 'e', so even though the memchr is
wasting effort searching bytes before the '.' for 'e', it is not wrong.

> 
> nptr points to the leading decimal part, and we only need to
> scan the fractional part.
> 
> Also what happens with   "1.1e" - that needs to be treated
> as a exabyte suffix, and not rejected as an exponent. We
> ought to test that if we don't already.

Agree that we need to support it (well, depending on patch 3, it may be
cleaner to write the test to look for support for 1.5e).  Will make sure
the test covers it.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

[Prev in Thread]

Current Thread

[Next in Thread]

Re: [PATCH 1/3] utils: Improve qemu_strtosz() to have 64 bits of precision, (continued)
- [PATCH 2/3] utils: Deprecate hex-with-suffix sizes, Eric Blake, 2021/02/04
  - Re: [PATCH 2/3] utils: Deprecate hex-with-suffix sizes, Vladimir Sementsov-Ogievskiy, 2021/02/05
    - Re: [PATCH 2/3] utils: Deprecate hex-with-suffix sizes, Richard W.M. Jones, 2021/02/05
    - Re: [PATCH 2/3] utils: Deprecate hex-with-suffix sizes, Eric Blake, 2021/02/05
  - Re: [PATCH 2/3] utils: Deprecate hex-with-suffix sizes, Daniel P . Berrangé, 2021/02/05
    - Re: [PATCH 2/3] utils: Deprecate hex-with-suffix sizes, Eric Blake, 2021/02/05
    - Re: [PATCH 2/3] utils: Deprecate hex-with-suffix sizes, Daniel P . Berrangé, 2021/02/05
- [PATCH 3/3] utils: Deprecate inexact fractional suffix sizes, Eric Blake, 2021/02/04
  - Re: [PATCH 3/3] utils: Deprecate inexact fractional suffix sizes, Eric Blake, 2021/02/04
  - Re: [PATCH 3/3] utils: Deprecate inexact fractional suffix sizes, Richard W.M. Jones, 2021/02/05

Prev by Date: Re: [PATCH 3/3] utils: Deprecate inexact fractional suffix sizes
Next by Date: Re: [PATCH 1/3] utils: Improve qemu_strtosz() to have 64 bits of precision
Previous by thread: Re: [PATCH 1/3] utils: Improve qemu_strtosz() to have 64 bits of precision
Next by thread: [PATCH 2/3] utils: Deprecate hex-with-suffix sizes
Index(es):
- Date
- Thread