
Re: [avr-libc-dev] [untested PATCH] Save 11 instructions in vfprintf_flt


From: George Spelvin
Subject: Re: [avr-libc-dev] [untested PATCH] Save 11 instructions in vfprintf_flt.o
Date: 8 Dec 2016 13:04:47 -0500

>> + * Unlike "if (src & smask) dst |= dmask", which is also two instructions

> This is confusing because the BST + BLD code below is not a replacement
> for what the C code is indicating.  For example the C code never clears
> the bit as opposed to BLD.

>> + * and two cycles, this overwrites the destination bit (clearing it
>> + * if necessary), and has fewer constraints; it can operate on the low
>> + * 16 registers.

That's *exactly* what I was trying to say in the rest of the sentence you
quoted back to me!  Perhaps I should just give the equivalent C code.
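To make the distinction concrete, here is a minimal host-side sketch of the two C formulations under discussion. The function names or_bit and copy_bit are made up for illustration and are not part of the patch; only the statements inside them come from the thread.

```c
#include <stdint.h>

/* OR-only form from the original comment: sets the destination bit when
 * the source bit is set, but never clears it. */
static uint8_t or_bit(uint8_t dst, uint8_t dmask, uint8_t src, uint8_t smask)
{
    if (src & smask)
        dst |= dmask;
    return dst;
}

/* Copy form, matching what BST + BLD do: the destination bit is
 * overwritten, so it is cleared when the source bit is clear. */
static uint8_t copy_bit(uint8_t dst, uint8_t dmask, uint8_t src, uint8_t smask)
{
    dst &= (uint8_t)~dmask;
    if (src & smask)
        dst |= dmask;
    return dst;
}
```

With dst = 0xFF, dmask = 0x01 and a clear source bit, or_bit leaves dst at 0xFF while copy_bit yields 0xFE; the two agree whenever the source bit is set.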

>> + */
>> +#define COPYBIT(dst, dmask, src, smask)      \
>> +    asm(     "bst %2,%3"             \
>> +     "\n     bld %0,%1"              \
>> +     : "=r" (dst)                    \

> This is wrong because the old value of dst does not die here:
> all bits except %1 are surviving. Correct constraint is "+r".

Yes, I noticed that myself a few minutes after posting.  It didn't
seem earth-shaking enough to warrant a followup.

It's now:

/*
 * Copy bit (src & smask) to (dst & dmask).  This expands to a pair of
 * bst/bld instructions which transfer the bit via the T register.
 *
 * Equivalent to "dst &= ~dmask; if (src & smask) dst |= dmask;", but is
 * only two instructions and has fewer constraints; it can operate on
 * the low 16 registers.
 */
#ifdef __BUILTIN_AVR_INSERT_BITS
/*
 * Using a GCC builtin is preferable; it gives the optimizer more
 * information and saves a few more bytes.
 *
 * The first argument to __builtin_avr_insert_bits is an array of 8
 * nibbles, each of which indicates the source of the corresponding
 * bit of the result.  All are 0xf (return the corresponding dst bit
 * unchanged) except the one corresponding to dmask, which specifies
 * the bit position in src to copy from.
 */
#define COPYBIT(dst, dmask, src, smask) \
    ((dst) = __builtin_avr_insert_bits( \
        ~((15ul-ntz(smask))*(dmask)*(dmask)*(dmask)*(dmask)), \
        (src), (dst)))
#else
#define COPYBIT(dst, dmask, src, smask) \
    asm(        "bst %2,%3"             \
        "\n     bld %0,%1"              \
        : "+r" (dst)                    \
        : "n" (ntz(dmask)),             \
          "r" (src),                    \
          "n" (ntz(smask)))
#endif
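
The map expression in the builtin branch can be checked off-target. The following is only a sketch under stated assumptions: ntz is written here as a hypothetical run-time helper (the real one is presumably a compile-time macro defined elsewhere), copybit_map is an illustrative name, and emulate_insert_bits is a host-side model of the nibble semantics described in the comment above, not the GCC builtin itself.

```c
#include <stdint.h>

/* Hypothetical helper: index of the lowest set bit of a nonzero mask. */
static unsigned ntz(uint8_t mask)
{
    unsigned n = 0;
    while (mask && !(mask & 1u)) {
        mask >>= 1;
        n++;
    }
    return n;
}

/* Same arithmetic as the macro: dmask^4 == 1 << (4 * ntz(dmask)), so the
 * product places (15 - source bit index) in the destination's nibble;
 * inverting turns that nibble into the source index and the rest into 0xf. */
static uint32_t copybit_map(uint8_t dmask, uint8_t smask)
{
    uint32_t d = dmask;
    return (uint32_t)~((15ul - ntz(smask)) * d * d * d * d);
}

/* Host model of the nibble semantics: nibble n == 0xf keeps bit n of val,
 * nibble n in 0..7 takes that bit of bits. Values 8..0xe are undefined in
 * the builtin and are never produced by copybit_map. */
static uint8_t emulate_insert_bits(uint32_t map, uint8_t bits, uint8_t val)
{
    uint8_t result = 0;
    for (unsigned n = 0; n < 8; n++) {
        unsigned x = (unsigned)(map >> (4 * n)) & 0xFu;
        unsigned bit = (x == 0xFu) ? (val >> n) & 1u
                                   : (bits >> (x & 7u)) & 1u;
        result |= (uint8_t)(bit << n);
    }
    return result;
}
```

For example, copying bit 5 of src into bit 3 of dst (smask = 0x20, dmask = 0x08) gives the map 0xFFFF5FFF: nibble 3 holds 5 and every other nibble is 0xf.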



