freetype-commit

[freetype2] multiply-shift 2f37a71: [smooth] Reduce shift in multiply-sh


From: Werner Lemberg
Subject: [freetype2] multiply-shift 2f37a71: [smooth] Reduce shift in multiply-shift optimization.
Date: Thu, 26 Aug 2021 12:50:05 -0400 (EDT)

branch: multiply-shift
commit 2f37a713cdc2c904f86ea366d933650bdeebda6f
Author: Alexei Podtelezhnikov <apodtele@gmail.com>
Commit: Alexei Podtelezhnikov <apodtele@gmail.com>

    [smooth] Reduce shift in multiply-shift optimization.
    
    Smaller shifts that keep the division operands of FT_UDIVPREP within
    32 bits result in slightly faster divisions, which is noticeable in
    the overall performance.  The loss of precision is tolerable until the
    divisors (the components dx and dy) approach 32 - PIXEL_BITS bits.  With
    PIXEL_BITS = 8, this corresponds to about 65,000 pixels, a bitmap size
    that we refuse to render anyway.
    
    Using `ftbench -p -s60 -t5 -bc timesi.ttf`,
    
    Before: 8.52 us/op
    After:  8.32 us/op
---
 src/smooth/ftgrays.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/src/smooth/ftgrays.c b/src/smooth/ftgrays.c
index c550c33..4cc5e21 100644
--- a/src/smooth/ftgrays.c
+++ b/src/smooth/ftgrays.c
@@ -386,12 +386,11 @@ typedef ptrdiff_t  FT_PtrDist;
   /* divisors to provide sufficient accuracy of the multiply-shift.    */
   /* It should not exceed (64 - PIXEL_BITS) to prevent overflowing and */
   /* leave enough room for 64-bit unsigned multiplication however.     */
-#define FT_UDIVPREP( c, b )                                                 \
-  FT_Int64  b ## _r = c ? (FT_Int64)( ~(FT_UInt64)0 >> PIXEL_BITS ) / ( b ) \
+#define FT_UDIVPREP( c, b )                            \
+  FT_Int64  b ## _r = c ? (FT_Int64)0xFFFFFFFF / ( b ) \
                     : 0
-#define FT_UDIV( a, b )                                         \
-  (TCoord)( ( (FT_UInt64)( a ) * (FT_UInt64)( b ## _r ) ) >>    \
-            ( sizeof( FT_UInt64 ) * FT_CHAR_BIT - PIXEL_BITS ) )
+#define FT_UDIV( a, b )                                           \
+  (TCoord)( ( (FT_UInt64)( a ) * (FT_UInt64)( b ## _r ) ) >> 32 )
 
 
   /* Scale area and apply fill rule to calculate the coverage byte. */
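
The effect of the shorter shift can be tried outside FreeType with a small stand-alone sketch.  It is not the FreeType source: `udiv_prep`, `udiv`, and `udiv_deficit` are hypothetical function stand-ins for the new `FT_UDIVPREP` and `FT_UDIV` macros, and plain `<stdint.h>` types are assumed in place of `FT_Int64`, `FT_UInt64`, and `TCoord`.

```c
#include <stdint.h>

/* Hypothetical stand-in for the new FT_UDIVPREP( c, b ):            */
/* precompute a 32-bit reciprocal of `b`, or 0 if it is not needed.  */
/* Like the macro, this must not be called with needed != 0, b == 0. */
static uint64_t
udiv_prep( int  needed, uint32_t  b )
{
  return needed ? UINT64_C( 0xFFFFFFFF ) / b : 0;
}

/* Hypothetical stand-in for the new FT_UDIV( a, b ):                */
/* approximate a / b as ( a * b_r ) >> 32, using a 64-bit product.   */
static uint32_t
udiv( uint32_t  a, uint64_t  b_r )
{
  return (uint32_t)( ( (uint64_t)a * b_r ) >> 32 );
}

/* How far the approximation falls below the exact quotient.         */
/* Since b_r < 2^32 / b, the result never exceeds a / b; for `a` up  */
/* to 2^24 the shortfall is below 1 before truncation, so this       */
/* returns 0 or 1.  Requires b != 0.                                 */
static uint32_t
udiv_deficit( uint32_t  a, uint32_t  b )
{
  return a / b - udiv( a, udiv_prep( 1, b ) );
}
```

For instance, `udiv( 1000, udiv_prep( 1, 3 ) )` yields 333, the exact quotient, while `udiv( 256, udiv_prep( 1, 256 ) )` yields 0 rather than 1: the truncated reciprocal rounds the product down, so an exact multiple of the divisor comes out one too small.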


