qemu-devel

Re: [Qemu-devel] [PATCH] powerpc: fix denorm float->double conversion


From: Richard Henderson
Subject: Re: [Qemu-devel] [PATCH] powerpc: fix denorm float->double conversion
Date: Mon, 8 Apr 2019 08:58:15 -1000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.6.1

On 3/23/19 12:24 PM, Sergei Trofimovich wrote:
> Here denormalization conversion has a few bugs:
> - significand (abs_arg) has 32-bit unsigned wraparound in
>     ret |= abs_arg << (shift + 29);
> - significand does not drop explicit leading '1' in denorm
>   'float' when converting to normalized 'double'
> - significand had an off-by-one shift

Correct on all points.  Thanks for the test case and analysis.
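
To make the first item concrete, here is a minimal sketch of the wraparound
in isolation. The helper names shift_narrow() and shift_wide() are
hypothetical and exist only for this illustration; they are not part of
QEMU. Note also that in the old code "shift + 29" ranges from 29 to 51, and
a shift count of 32 or more on a 32-bit operand is undefined in C, so the
sketch restricts itself to counts below 32.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: the shift is evaluated in 32-bit arithmetic and
 * only the (already truncated) result is widened to 64 bits.
 */
static uint64_t shift_narrow(uint32_t frac, int n)
{
    assert(n >= 0 && n < 32);   /* larger counts are undefined for uint32_t */
    return frac << n;
}

/*
 * Hypothetical helper: the operand is widened first, so no bits are lost.
 */
static uint64_t shift_wide(uint32_t frac, int n)
{
    assert(n >= 0 && n < 64);
    return (uint64_t)frac << n;
}
```

With frac = 0x00400000 and n = 29, shift_narrow() yields 0 because bit 51
does not exist in a 32-bit value, while shift_wide() yields
0x0008000000000000.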


> +        /*
> +         * Conversion mechanics:
> +         * float denorm (2^(-126) - biased):
> +         *    [ sign (1 bit) | exp32 (8 bits)  | sign32 (23 bits) ]
> +         *                 s                0    0001abc...def

FWIW, the overlap between "sign" and "significand" is why I prefer the term
"fraction", even though the term itself is less precise.


>          if (unlikely(abs_arg != 0)) {
>              /* Denormalized operand.  */
> -            int shift = clz32(abs_arg) - 9;
> -            int exp = -126 - shift + 1023;
> -            ret |= (uint64_t)exp << 52;
> -            ret |= abs_arg << (shift + 29);
> +            int lz = clz32(abs_arg);
> +            abs_arg &= ~(1 << (31 - lz)); /* [2a.] */
> +
> +            /* shift within sign32 including leading '1' */
> +            int shift = lz + 1 - (32 - 23);
> +            int exp = -126 + 1023 - shift; /* [2b]. */
> +            ret |= (uint64_t)exp << 52; /* [3.] */
> +            ret |= (uint64_t)abs_arg << (52 - 23 + shift); /* [4.] */
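
For readers who want to try the arithmetic of the quoted hunk outside of
QEMU, the following is a standalone sketch under stated assumptions:
clz32_sketch() is a portable stand-in for QEMU's clz32(), and
denorm_to_double_bits() is a hypothetical wrapper that handles only float32
denormal inputs (exponent field zero, fraction non-zero). It is not the
helper_todouble() implementation itself.

```c
#include <assert.h>
#include <stdint.h>

/* Portable stand-in for clz32(); hypothetical name. Returns 32 for 0. */
static int clz32_sketch(uint32_t v)
{
    int n = 0;

    if (v == 0) {
        return 32;
    }
    while (!(v & 0x80000000u)) {
        v <<= 1;
        n++;
    }
    return n;
}

/*
 * Hypothetical wrapper: convert the bit pattern of a float32 denormal
 * into the bit pattern of the equivalent float64, following the steps
 * of the quoted patch.
 */
static uint64_t denorm_to_double_bits(uint32_t arg)
{
    uint32_t abs_arg = arg & 0x7fffffffu;
    uint64_t ret = (uint64_t)(arg >> 31) << 63;      /* copy the sign bit */

    assert(abs_arg != 0 && abs_arg < 0x00800000u);   /* denormals only */

    int lz = clz32_sketch(abs_arg);
    abs_arg &= ~(UINT32_C(1) << (31 - lz));          /* drop the leading 1 */

    int shift = lz + 1 - (32 - 23);                  /* in the range 1..23 */
    int exp = -126 + 1023 - shift;                   /* rebiased exponent */

    ret |= (uint64_t)exp << 52;
    ret |= (uint64_t)abs_arg << (52 - 23 + shift);   /* 64-bit shift */
    return ret;
}
```

As a worked case, the smallest denormal 0x00000001 has lz = 31, so shift = 23
and exp = 874 (0x36a), giving 0x36a0000000000000, which is 2^-149 as a double.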

I think perhaps using deposit makes things clearer, since we don't have to
explicitly remove the msb in that case:

E.g.

@@ -67,10 +67,10 @@ uint64_t helper_todouble(uint32_t arg)
         ret = (uint64_t)extract32(arg, 31, 1) << 63;
         if (unlikely(abs_arg != 0)) {
             /* Denormalized operand.  */
-            int shift = clz32(abs_arg) - 9;
-            int exp = -126 - shift + 1023;
-            ret |= (uint64_t)exp << 52;
-            ret |= abs_arg << (shift + 29);
+            int msbm1 = 31 - clz32(abs_arg);
+            int exp = 1023 - 126 - (23 - msbm1);
+            ret = deposit64(ret, 52, 11, exp);
+            ret = deposit64(ret, 52 - msbm1, msbm1, abs_arg);
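
The deposit-based variant can be exercised on its own in the same way. The
sketch below is illustrative only: clz32_sketch() and deposit64_sketch() are
hypothetical stand-ins for QEMU's clz32() and deposit64(), and
denorm_to_double_bits_deposit() is a hypothetical wrapper restricted to
float32 denormal inputs. One assumption worth checking against the tree: as
far as I recall, the real deposit64() asserts length > 0, which would trip
for abs_arg == 1 (msbm1 == 0). The stand-in here treats a zero-length field
as a no-op so that case can still be evaluated.

```c
#include <assert.h>
#include <stdint.h>

/* Portable stand-in for clz32(); hypothetical name. Returns 32 for 0. */
static int clz32_sketch(uint32_t v)
{
    int n = 0;

    if (v == 0) {
        return 32;
    }
    while (!(v & 0x80000000u)) {
        v <<= 1;
        n++;
    }
    return n;
}

/*
 * Stand-in for deposit64(): write the low 'length' bits of fieldval into
 * value at bit position 'start'. Unlike the assumed QEMU behaviour, a
 * zero-length field is accepted and leaves value unchanged.
 */
static uint64_t deposit64_sketch(uint64_t value, int start, int length,
                                 uint64_t fieldval)
{
    assert(start >= 0 && length >= 0 && length <= 64 - start);
    if (length == 0) {
        return value;
    }
    uint64_t mask = (~UINT64_C(0) >> (64 - length)) << start;
    return (value & ~mask) | ((fieldval << start) & mask);
}

/* Hypothetical wrapper around the deposit-based hunk; denormals only. */
static uint64_t denorm_to_double_bits_deposit(uint32_t arg)
{
    uint32_t abs_arg = arg & 0x7fffffffu;
    uint64_t ret = (uint64_t)(arg >> 31) << 63;      /* copy the sign bit */

    assert(abs_arg != 0 && abs_arg < 0x00800000u);

    int msbm1 = 31 - clz32_sketch(abs_arg);          /* index of leading 1 */
    int exp = 1023 - 126 - (23 - msbm1);

    ret = deposit64_sketch(ret, 52, 11, (uint64_t)exp);
    /* Only the msbm1 bits below the leading 1 are deposited. */
    ret = deposit64_sketch(ret, 52 - msbm1, msbm1, abs_arg);
    return ret;
}
```

Because the field length equals the index of the leading 1, that bit falls
outside the deposited field and is masked off, which is why no explicit
"abs_arg &= ~(1 << ...)" step is needed in this form.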


Thoughts?


r~


