[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [RFC PATCH 8/8] softfloat: implement addsub_floats128 using Uint128
From: |
Richard Henderson |
Subject: |
Re: [RFC PATCH 8/8] softfloat: implement addsub_floats128 using Uint128 and new style code |
Date: |
Tue, 20 Oct 2020 11:49:56 -0700 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0 |
On 10/20/20 9:37 AM, Alex Bennée wrote:
> +static inline FloatParts128 unpack128_raw(FloatFmt fmt, Uint128 raw)
> +{
> + const int sign_pos = fmt.frac_size + fmt.exp_size;
> +
> + return (FloatParts128) {
> + .cls = float_class_unclassified,
> + .sign = extract128(raw, sign_pos, 1),
> + .exp = extract128(raw, fmt.frac_size, fmt.exp_size),
> + .frac = extract128(raw, 0, fmt.frac_size),
> + };
> +}
This use of extract128 for sign and exp will not work for 32-bit. You can't
just automatically truncate from __uint128_t to int in that case.
I don't think we should necessarily create this function, but rather leave it at
> +static inline FloatParts128 float128_unpack_raw(float128 f)
> +{
> + return unpack128_raw(float128_params, uint128_make128(f.low, f.high));
> +}
... this one, and construct the FloatParts128 directly from the float128
components. E.g.
int f_size = float128_params.frac_size;
int e_size = float128_params.exp_size;
return (FloatParts128) {
.sign = extract64(f.high, f_size + e_size - 64, 1);
.exp = extract64(f.high, f_size - 64, e_size);
.frac = extract128(int128_make128(f.low, f.high),
0, f_size);
};
I don't want to over-generalize this just yet.
> +static inline Uint128 pack128_raw(FloatFmt fmt, FloatParts128 p)
> +{
> + const int sign_pos = fmt.frac_size + fmt.exp_size;
> + Uint128 ret = deposit128(p.frac, fmt.frac_size, fmt.exp_size, p.exp);
> + return deposit128(ret, sign_pos, 1, p.sign);
> +}
Likewise, omit this and only have
> +static inline float128 float128_pack_raw(FloatParts128 p)
> +{
> + Uint128 out = pack128_raw(float128_params, p);
> + return make_float128(uint128_gethi(out), uint128_getlo(out));
> +}
this.
> +/* Almost exactly the same as sf_canonicalize except 128 bit */
> +static FloatParts128 sf128_canonicalize(FloatParts128 part, const FloatFmt
> *parm,
> + float_status *status)
I think we may have reached the point of diminishing returns on structure
returns. This is a 196-bit struct, and will not be passed in registers
anymore. It might be better to do
static void sf128_canonicalize(FloatParts128 *part,
const FloatFmt *parm,
float_status *status)
and modify the FloatParts128 in place.
> + bool frac_is_zero = uint128_eq(part.frac, uint128_zero());
With Int128, we'd use !int128_nz().
> +/* As above but wider */
> +static FloatParts128 round128_canonical(FloatParts128 p, float_status *s,
> + const FloatFmt *parm)
> +{
> + /* Do these by hand rather than widening the FloatFmt structure */
> + const Uint128 frac_lsb = uint128_lshift(1, DECOMPOSED128_BINARY_POINT -
> parm->frac_size);
You can't pass constant 1 on 32-bit.
Maybe add a int128_makepow2(exp) function to make this easier?
> + case float_round_nearest_even:
> + overflow_norm = false;
> + inc = ((frac & roundeven_mask) != frac_lsbm1 ? frac_lsbm1 : 0);
Can't use & or != on 32-bit.
> + inc = frac & frac_lsb ? 0 : round_mask;
...
> + if (frac & round_mask) {
...
> + frac += inc;
> + if (frac & DECOMPOSED128_OVERFLOW_BIT) {
> + frac >>= 1;
...
> + frac >>= frac_shift;
...
> + frac = -1;
...
> + if (frac & round_mask) {
> + inc = ((uint128_and(frac, roundeven_mask)) != frac_lsbm1
...
> + if (exp == 0 && frac == 0) {
...
> + frac = 0;
...
> + frac = 0;
and more. There are lots more later.
This is going to get ugly fast. We need another solution.
> +static bool parts128_is_snan_frac(Uint128 frac, float_status *status)
> +{
> + if (no_signaling_nans(status)) {
> + return false;
> + } else {
> + bool msb = extract128(frac, DECOMPOSED128_BINARY_POINT - 1, 1);
Doesn't work for 32-bit. Again, extract128 by itself is not the right
interface. Do we in fact want to share code with the normal parts_is_snan_frac
by just passing in the high-part?
r~
- [RFC PATCH 0/8] fpu: experimental conversion of float128_addsub, Alex Bennée, 2020/10/20
- [RFC PATCH 1/8] softfloat: Use mulu64 for mul64To128, Alex Bennée, 2020/10/20
- [RFC PATCH 2/8] softfloat: Use int128.h for some operations, Alex Bennée, 2020/10/20
- [RFC PATCH 3/8] softfloat: Tidy a * b + inf return, Alex Bennée, 2020/10/20
- [RFC PATCH 4/8] softfloat: Add float_cmask and constants, Alex Bennée, 2020/10/20
- [RFC PATCH 7/8] tests/fp: add quad support to the benchmark utility, Alex Bennée, 2020/10/20
- [RFC PATCH 6/8] int128.h: add bunch of uint128 utility functions (INCOMPLETE), Alex Bennée, 2020/10/20
- [RFC PATCH 8/8] softfloat: implement addsub_floats128 using Uint128 and new style code, Alex Bennée, 2020/10/20
- Re: [RFC PATCH 8/8] softfloat: implement addsub_floats128 using Uint128 and new style code,
Richard Henderson <=
- [RFC PATCH 5/8] softfloat: Inline pick_nan_muladd into its caller, Alex Bennée, 2020/10/20
- Re: [RFC PATCH 0/8] fpu: experimental conversion of float128_addsub, no-reply, 2020/10/20