Re: [lmi] Two kinds of precision loss
From: Greg Chicares
Subject: Re: [lmi] Two kinds of precision loss
Date: Thu, 23 Mar 2017 00:27:46 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Icedove/45.6.0
On 2017-03-22 15:44, Vadim Zeitlin wrote:
> On Tue, 21 Mar 2017 22:34:35 +0000 Greg Chicares <address@hidden> wrote:
>
> GC> There are three problems in mapping between floating and integral types:
> GC>
> GC> (1) Loss of range, e.g., DBL_MAX --> int: UB; definitely not wanted.
>
> Agreed.
>
> GC> (2) Truncation, e.g., M_PI --> int: we agree that we don't want this.
>
> And one of the reasons for this is that this conversion is possibly
> ambiguous, e.g. converting 3.5 to int could be reasonably expected to give
> either 3 or 4 and there is no universally correct answer.
To link this point to your next: here we have a choice among various
answers, and that choice is best expressed as a parameter to a rounding
function. And bourn_cast() and round_to() are separate functions because
they address separate concerns.
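To illustrate the idea of passing the choice as a parameter, here is a minimal sketch. The names rounding_style and round_with are hypothetical, made up for this example; this is not the interface of lmi's round_to().

```cpp
#include <cassert>
#include <cmath>

// Hypothetical names for illustration; not lmi's round_to() interface.
enum class rounding_style {toward_zero, to_nearest, upward, downward};

// Round x to an integral value, with the caller resolving the ambiguity.
inline double round_with(double x, rounding_style style)
{
    switch(style)
        {
        case rounding_style::toward_zero: return std::trunc(x);
        case rounding_style::to_nearest:  return std::round(x); // ties away from zero
        case rounding_style::upward:      return std::ceil(x);
        case rounding_style::downward:    return std::floor(x);
        }
    assert(false); // not reached for a valid enumerator
    return x;
}
```

With this sketch, 3.5 becomes 3 under toward_zero and 4 under to_nearest, so the caller states which answer is wanted instead of the cast guessing.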
> GC> (3) Loss of precision, e.g., ULLONG_MAX --> float: here, static_cast
> GC> gives a well-defined result [4.9/2] that is as good an approximation
> GC> as it can be, plus or minus one ulp; a sufficiently wide hypothetical
> GC> "long long double" type could be exact. Is this okay for bourn_cast?
>
> It is if and only if we consider the reason above the only important one:
> then, this case is fine because you really can't expect to get anything
> else, i.e. there is no ambiguity.
Here (unlike case (2)) there is no choice for us to make. (The compiler
chooses which neighboring value to return, and we can't control that.)
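A minimal sketch of what "neighboring value" means here, assuming 32-bit IEEE 754 float and 64-bit unsigned long long as in the quoted example (the static_asserts check those assumptions). The helper names are invented for illustration.

```cpp
#include <cassert>
#include <climits>
#include <cmath>
#include <limits>

static_assert(std::numeric_limits<float>::is_iec559, "assumes IEEE 754 float");
static_assert(24 == std::numeric_limits<float>::digits, "assumes 24-bit significand");
static_assert(64 == std::numeric_limits<unsigned long long>::digits, "assumes 64-bit integer");

// The two float values adjacent to ULLONG_MAX = 2^64 - 1.
inline float upper_neighbor() {return std::ldexp(1.0f, 64);}                   // 2^64
inline float lower_neighbor() {return std::nextafter(upper_neighbor(), 0.0f);} // 2^64 - 2^40

// True if the cast returned one of the two neighbors. Which one is
// implementation-defined; the caller has no say in it.
inline bool cast_yields_a_neighbor()
{
    assert(lower_neighbor() < upper_neighbor());
    float const f = static_cast<float>(ULLONG_MAX);
    return f == upper_neighbor() || f == lower_neighbor();
}
```

The gap between the two candidates is 2^40, which is one ulp for floats in [2^63, 2^64].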
> However if you also expect the cast to be
> round-trip safe, then this one is still not OK.
>
> Of course, this just replaces one question with another: do we need
> round-trip safety from bourn_cast? Personally I think it's nice to have but
> I can't find any reason to absolutely require it here.
I'm not sure a round-trip guarantee is even feasible. In the case under
present discussion:
ULLONG_MAX-2 --> float
a round trip would ideally get us back to the original type:
ULLONG_MAX-2 --> float --> unsigned long long
I'm testing two different implementations: boost::numeric_cast and my
own experimental "branch". Here are the respective behaviors:
- boost::numeric_cast does this: "ULLONG_MAX-2 --> float --> throw!"
    18446744073709551613 = ULLONG_MAX-2 = 2^64-3
    18446744073709551616 = boost::numeric_cast<float>(ULLONG_MAX-2)
  and (somehow) it decides that's too large to convert back
  You take the Bakerloo line from Elephant & Castle to Charing Cross,
  but you can't go back?
- mine does this: "ULLONG_MAX-2 --> float --> zero!"
    18446744073709551613 = ULLONG_MAX-2 = 2^64-3
    18446744073709551616 = bourn_cast<float>(ULLONG_MAX-2)
    0 = bourn_cast<unsigned long long>(bourn_cast<float>(ULLONG_MAX-2))
  You arrive at Charing Cross, but when you reverse your journey, you
  wind up at Harrow & Wealdstone, because that's congruent to Elephant
  & Castle (modulo one ulp) modulo (number of stations on the line)?
Neither of these behaviors seems ideal.
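Whatever either implementation does internally, the float 18446744073709551616 is 2^64, which is one more than ULLONG_MAX, so the return leg is out of range and a plain static_cast back would be undefined behavior. Here is a minimal sketch of a return-leg check under that reading, assuming 32-bit IEEE 754 float and 64-bit unsigned long long. The function float_to_ull_checked is hypothetical; it is neither bourn_cast nor boost::numeric_cast.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <stdexcept>

static_assert(std::numeric_limits<float>::is_iec559, "assumes IEEE 754 float");
static_assert(64 == std::numeric_limits<unsigned long long>::digits, "assumes 64-bit integer");

// Hypothetical helper: convert only if the value is preserved exactly.
inline unsigned long long float_to_ull_checked(float f)
{
    // 2^64 is exactly representable as float; ULLONG_MAX itself is not.
    float const limit = std::ldexp(1.0f, 64);
    assert(limit == 18446744073709551616.0f);
    if(!(0.0f <= f && f < limit)) // written this way so that NaN is rejected too
        throw std::range_error("out of range for unsigned long long");
    if(f != std::trunc(f))
        throw std::range_error("conversion would truncate");
    return static_cast<unsigned long long>(f);
}
```

The comparison uses a strict "<" against 2^64 rather than "<=" against ULLONG_MAX, because ULLONG_MAX would itself be converted to float for the comparison and could become 2^64.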
> GC> Let me share a curious example that arose when I tried to write a unit
> GC> test for this exact case. I'm using
> GC> float = 32-bit IEEE 754
> GC> unsigned long long = 64-bit integer
> GC>
> GC> snprintf() results:
> GC>
> GC> 18446744073709551615 == 2^64 - 1 = ULLONG_MAX
> GC> 18446744073709551616 == static_cast<float>(ULLONG_MAX)
> GC> ^different only in the last digit shown
> GC>
> GC> Cast either to the type of the other, and they compare equal. I think
> GC> I'll wait until April Fools' Day and report this here:
> GC> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=323
BTW, IEEE 754-2008 was approved on April 1:
http://grouper.ieee.org/groups/754/email/msg04172.html
> This would be really cruel. But not as cruel as using this as an interview
> question for C++ programmers -- I'm pretty sure that if I was asked whether
> there exist 2 values which compare equal after a cast to either of their
> types but are still different, I would answer "no" almost without thinking.
An easier question: what floating-point value V compares unequal to itself?
The answer leads to this curiosity:
float const f_qnan = std::numeric_limits<float>::quiet_NaN();
std::cout << boost::numeric_cast<unsigned long long int>(f_qnan) << "\n";
That prints:
9223372036854775808
If you numeric_cast that back to float, it retains that value: it doesn't
become a NaN (there's no way it could set the exponent to 0xFF).
My own implementation refuses to convert a NaN (simply because I thought
about it and added special-case handling; boost could do the same).
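A minimal sketch of that kind of special-case handling, assuming IEEE 754 semantics and no -ffast-math style options (which can break NaN tests). The function names are invented for this example; this is not the actual bourn_cast code.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <stdexcept>

// A NaN is the value that compares unequal to itself.
inline bool unequal_to_itself(float v) {return v != v;}

// Hypothetical guard: refuse NaN before attempting any conversion.
inline float reject_nan(float v)
{
    if(std::isnan(v))
        throw std::domain_error("cannot convert NaN");
    assert(v == v); // holds for every non-NaN value
    return v;
}
```

A cast to an integral type could call such a guard first, so that a NaN never reaches the conversion that produced the curious value above.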
Oh, and here's another question. What happens if you try
static_cast<float>(DBL_MAX)?
C++11 [4.8/1] says "If the source value is between two adjacent destination
values, the result of the conversion is an implementation-defined choice of
either of those values. Otherwise, the behavior is undefined." So is this
cast UB? Or is DBL_MAX "adjacent" to infinity?
And what about
static_cast<float>((double)INFINITY)
? AIUI, the C and C++ standards supposedly defer to the floating-point
standard, and IEEE 754-1985 [6.1] defines "conversion of an infinity into
the same infinity in another format" as an operation that signals no
exceptions, so I think this is supposed to be well defined. OTOH, it's
not "between two adjacent destination values", so is it UB?
Can it be that
  DBL_MAX  is     too big to convert to float, but
  INFINITY is not too big to convert to float
even though DBL_MAX < INFINITY?
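One way to sidestep the question is to avoid performing the doubtful casts at all. The sketch below shows one possible policy, not what the standard mandates and not necessarily what bourn_cast should do: map infinities explicitly, refuse NaN, and throw for finite values beyond FLT_MAX. The name double_to_float_checked is hypothetical.

```cpp
#include <cassert>
#include <cfloat>
#include <cmath>
#include <limits>
#include <stdexcept>

// Hypothetical policy, shown only to make the distinction concrete.
inline float double_to_float_checked(double d)
{
    float const inf = std::numeric_limits<float>::infinity();
    if(std::isnan(d))
        throw std::domain_error("cannot convert NaN");
    if(std::isinf(d))
        return d < 0.0 ? -inf : inf; // no cast performed, so [4.8/1] is not in play
    if(d < -FLT_MAX || FLT_MAX < d)
        throw std::range_error("finite value out of range for float");
    assert(std::isfinite(d));
    return static_cast<float>(d); // exact, or between two adjacent floats
}
```

Under this policy DBL_MAX throws while INFINITY converts, which is exactly the asymmetry asked about above; whether that is the desired behavior remains an open design choice.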
> GC> But seriously...we agree that bourn_cast should throw in case (1)
> GC> above, and should throw also in case (2); but in case (3), should it
> GC> return float(ULLONG_MAX)?
> GC>
> GC> Initially at least, I think the answer should be "yes". Otherwise,
> GC> a double cannot be converted to float unless its last (53-24)
> GC> mantissa bits are all zero, which has a 1 / 536870912 probability
> GC> assuming a uniform distribution.
>
> The example of double->float conversion is just another way to say that
> the answer to your question (3) should be "yes" iff we don't require
> round-trip safety.
A round-trip guarantee would forbid casting almost any double to float.
I think that demonstrates that such a guarantee is too restrictive to
be useful.
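To see how restrictive that would be, here is a minimal sketch of the test such a guarantee implies, assuming IEEE 754 float and double. The name is_exact_as_float is made up for illustration.

```cpp
#include <cassert>
#include <cfloat>
#include <cmath>

// True only if d is finite and survives double --> float --> double unchanged.
// For simplicity this sketch also answers false for infinities and NaN.
inline bool is_exact_as_float(double d)
{
    if(!(std::fabs(d) <= FLT_MAX)) // rejects NaN, infinities, and out-of-range values
        return false;
    float const f = static_cast<float>(d);
    assert(std::isfinite(f));
    return static_cast<double>(f) == d;
}
```

With this test, 0.5 passes but 0.1 does not, and neither does 16777217.0 (2^24 + 1), so a cast that insisted on it would reject most everyday values.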
> Pragmatically speaking, my preferred solution would be to not solve this
> problem at all unless we really have to. So for me the immediate question
> is: do we need to allow double-to-float conversions in bourn_cast<>?
Here's the decision tree as I see it:
- Abandon the goal that value_cast should convert anything to anything,
  including floating <-> integral? I'm not willing to do that.
- Use bourn_cast only for integral types, and...
  - keep boost::numeric_cast for floating <-> integral? no, my original
    motivation was to stop using this boost library; or...
  - write another replacement to handle floating <-> integral conversion?
    no, I think it makes more sense to try to add that into bourn_cast,
    which can always be refactored into two separate pieces later.
> If we
> do, then we will have to live with precision loss anyhow and so we can
> accept it in case (3) as well. But if, as my hope is, we don't actually
> need them right now
To realize that hope, we'd need to keep boost::numeric_cast indefinitely.