Re: [avr-libc-dev] [untested PATCH] Save 11 instructions in vfprintf

From:	George Spelvin
Subject:	Re: [avr-libc-dev] [untested PATCH] Save 11 instructions in vfprintf_flt.o
Date:	8 Dec 2016 13:04:47 -0500

>> + * Unlike "if (src & smask) dst |= dmask", which is also two instructions

> This is confusing because the BST + BLD code below is not a replacement
> for what the C code is indicating.  For example the C code never clears
> the bit as opposed to BLD.

>> + * and two cycles, this overwrites the destination bit (clearing it
>> + * if necessary), and has fewer constraints; it can operate on the low
>> + * 16 registers.

That's *exactly* what I was trying to say in the rest of the sentence you
quoted back to me!  Perhaps I should just give the equivalent C code.
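For illustration, the difference between the two C forms can be checked on a host machine. This is only a sketch: set_only() and copy_bit() are hypothetical helper names, not part of the patch, and copy_bit() is a plain C model of what the bst/bld pair does.

```c
#include <stdint.h>

/* Hypothetical helper: the set-only form from the original comment.
 * It can set the destination bit but never clears it. */
static uint8_t set_only(uint8_t dst, uint8_t dmask, uint8_t src, uint8_t smask)
{
    if (src & smask)
        dst |= dmask;
    return dst;
}

/* Hypothetical helper: C model of bst/bld.  The destination bit is
 * overwritten, so it is cleared first and then set if the source bit is set. */
static uint8_t copy_bit(uint8_t dst, uint8_t dmask, uint8_t src, uint8_t smask)
{
    dst &= (uint8_t)~dmask;
    if (src & smask)
        dst |= dmask;
    return dst;
}
```

The two only differ when the source bit is clear and the destination bit is already set: with dst = 0xff, dmask = 0x04 and a clear source bit, set_only() returns 0xff while copy_bit() returns 0xfb.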

>> + */
>> +#define COPYBIT(dst, dmask, src, smask)      \
>> +    asm(     "bst %2,%3"             \
>> +     "\n     bld %0,%1"              \
>> +     : "=r" (dst)                    \

> This is wrong because the old value of dst does not die here:
> all bits except %1 are surviving. Correct constraint is "+r".

Yes, I noticed that myself a few minutes after posting.  It didn't
seem earth-shaking enough to warrant a followup.

It's now:

/*
 * Copy bit (src & smask) to (dst & dmask).  This expands to a pair of
 * bst/bld instructions which transfer the bit via the T register.
 *
 * Equivalent to "dst &= ~dmask; if (src & smask) dst |= dmask;", but is
 * only two instructions and has fewer constraints; it can operate on
 * the low 16 registers.
 */
#ifdef __BUILTIN_AVR_INSERT_BITS
/*
 * Using a GCC builtin is preferable; it gives the optimizer more
 * information and saves a few more bytes.
 *
 * The first argument to __builtin_avr_insert_bits is an array of 8
 * nibbles, each of which indicates the source of the corresponding
 * bit of the result.  All are 0xf (return the corresponding dst bit
 * unchanged) except the one corresponding to dmask, which specifies
 * the bit position in src to copy from.
 */
#define COPYBIT(dst, dmask, src, smask) \
    ((dst) = __builtin_avr_insert_bits( \
        ~((15ul-ntz(smask))*(dmask)*(dmask)*(dmask)*(dmask)), \
        src, dst))
#else
#define COPYBIT(dst, dmask, src, smask) \
    asm(        "bst %2,%3"             \
        "\n     bld %0,%1"              \
        : "+r" (dst)                    \
        : "n" (ntz(dmask)),             \
          "r" (src),                    \
          "n" (ntz(smask)))
#endif
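As a worked example of the map arithmetic, the following host-side sketch computes the nibble map the way the builtin variant of COPYBIT does and applies it with a small C model of the builtin. Everything here is an assumption made for illustration: ntz8(), copybit_map() and insert_bits_model() are hypothetical names, ntz8() stands in for the ntz() used by the patch, and the model only handles the nibble values that COPYBIT generates (0 to 7 and 0xf).

```c
#include <stdint.h>

/* Hypothetical stand-in for ntz(): index of the lowest set bit of a mask. */
static unsigned ntz8(uint8_t mask)
{
    unsigned n = 0;
    while (n < 8 && !(mask & 1u)) {
        mask >>= 1;
        n++;
    }
    return n;
}

/* Map as computed in COPYBIT.  dmask^4 equals 16^ntz(dmask), which moves
 * (15 - ntz(smask)) into nibble ntz(dmask); the complement then leaves
 * ntz(smask) in that nibble and 0xf in all the others. */
static uint32_t copybit_map(uint8_t dmask, uint8_t smask)
{
    uint32_t d = dmask;
    return ~((uint32_t)(15u - ntz8(smask)) * d * d * d * d);
}

/* Host-side model of the builtin, for nibbles 0..7 and 0xf only:
 * nibble i == 0xf keeps bit i of val, nibble i == n takes bit n of bits. */
static uint8_t insert_bits_model(uint32_t map, uint8_t bits, uint8_t val)
{
    uint8_t result = 0;
    for (unsigned i = 0; i < 8; i++) {
        unsigned nib = (map >> (4 * i)) & 0xfu;
        unsigned bit;
        if (nib == 0xfu)
            bit = (val >> i) & 1u;
        else
            bit = (bits >> (nib & 7u)) & 1u; /* 8..14 not modeled faithfully */
        result |= (uint8_t)(bit << i);
    }
    return result;
}
```

With dmask = 0x04 (bit 2) and smask = 0x20 (bit 5), the product is 10 * 256 = 0xa00, so the map is 0xfffff5ff: nibble 2 holds 5 and every other nibble is 0xf.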
