[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH v2 07/28] softfloat: Move sqrt_float to softfloat-parts.c.inc
From: |
Alex Bennée |
Subject: |
Re: [PATCH v2 07/28] softfloat: Move sqrt_float to softfloat-parts.c.inc |
Date: |
Thu, 03 Jun 2021 10:17:37 +0100 |
User-agent: |
mu4e 1.5.13; emacs 28.0.50 |
Richard Henderson <richard.henderson@linaro.org> writes:
> Rename to parts$N_sqrt.
> Reimplement float128_sqrt with FloatParts128.
>
> Reimplement with the inverse sqrt newton-raphson algorithm from musl.
> This is significantly faster than even the berkeley sqrt n-r algorithm,
> because it does not use division instructions, only multiplication.
>
> Ordinarily, changing algorithms at the same time as migrating code is
> a bad idea, but this is the only way I found that didn't break one of
> the routines at the same time.
I can't pretend to follow the details of the method as well as I could
the original but that's why we have tests so if they are happy I'm
happy:
Tested-by: Alex Bennée <alex.bennee@linaro.org>
<snip>
> +
> + if (N == 64) {
> + /* float64 or smaller */
> +
> + r32 = ((uint64_t)r32 * u32) >> 31;
> + /* |r*sqrt(m) - 1| < 0x1.7Bp-16 */
> +
> + s32 = ((uint64_t)m32 * r32) >> 32;
> + d32 = ((uint64_t)s32 * r32) >> 32;
> + u32 = three32 - d32;
> +
> + if (fmt->frac_size <= 23) {
> + /* float32 or smaller */
> +
> + s32 = ((uint64_t)s32 * u32) >> 32; /* 3.29 */
> + s32 = (s32 - 1) >> 6; /* 9.23 */
> + /* s < sqrt(m) < s + 0x1.08p-23 */
> +
> + /* compute nearest rounded result to 2.23 bits */
> + uint32_t d0 = (m32 << 16) - s32 * s32;
> + uint32_t d1 = s32 - d0;
> + uint32_t d2 = d1 + s32 + 1;
> + s32 += d1 >> 31;
> + a->frac_hi = (uint64_t)s32 << (64 - 25);
> +
> + /* increment or decrement for inexact */
> + if (d2 != 0) {
> + a->frac_hi += ((int32_t)(d1 ^ d2) < 0 ? -1 : 1);
> + }
> + goto done;
> + }
> +
> + /* float64 */
> +
> + r64 = (uint64_t)r32 * u32 * 2;
> + /* |r*sqrt(m) - 1| < 0x1.37-p29; convert to 64-bit arithmetic */
> + mul64To128(m64, r64, &s64, &discard);
> + mul64To128(s64, r64, &d64, &discard);
> + u64 = three64 - d64;
> +
> + mul64To128(s64, u64, &s64, &discard); /* 3.61 */
> + s64 = (s64 - 2) >> 9; /* 12.52 */
> +
> + /* Compute nearest rounded result */
> + uint64_t d0 = (m64 << 42) - s64 * s64;
> + uint64_t d1 = s64 - d0;
> + uint64_t d2 = d1 + s64 + 1;
> + s64 += d1 >> 63;
> + a->frac_hi = s64 << (64 - 54);
> +
> + /* increment or decrement for inexact */
> + if (d2 != 0) {
> + a->frac_hi += ((int64_t)(d1 ^ d2) < 0 ? -1 : 1);
> + }
> + goto done;
> + }
I usually take more convincing about gotos but I can't see a way of
doing it more neatly so have a:
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
as well :-)
--
Alex Bennée
[Prev in Thread] |
Current Thread |
[Next in Thread] |
- Re: [PATCH v2 07/28] softfloat: Move sqrt_float to softfloat-parts.c.inc,
Alex Bennée <=