Re: [ft-devel] FT_MulFix assembly

freetype-devel

From:	James Cloos
Subject:	Re: [ft-devel] FT_MulFix assembly
Date:	Sat, 07 Aug 2010 12:36:27 -0400
User-agent:	Gnus/5.110011 (No Gnus v0.11) Emacs/24.0.50 (gnu/linux)

My first cut at FT_MulFix_x86_64() is:

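static __inline__ FT_Int32
FT_MulFix_x86_64 (FT_Int32 a, FT_Int32 b) {
    register FT_Int32 r;
    FT_Int32 hi;  /* dummy output: tells gcc that the asm overwrites rdx */
    __asm__ __volatile__ (
        "movslq %%edx, %%rdx\n"     /* sign-extend b to 64 bits */
        "cltq\n"                    /* sign-extend a (eax) into rax */
        "imul  %%rdx\n"             /* rdx:rax = rax * rdx */
        "addq  %%rdx, %%rax\n"      /* add the sign word: 0 or -1 */
        "addq  $0x8000, %%rax\n"    /* add one half for rounding */
        "sarq  $16, %%rax\n"        /* drop the 16 fractional bits */
        : "=a"(r), "=d"(hi)
        : "a"(a), "d"(b)
        : "cc");
    return r;
}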

It passes a Monte Carlo test comparing its results to the C code and to
the i386 assembly.
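
For anyone who wants to repeat that kind of comparison, here is a
minimal sketch of such a test in portable C.  It is not the test I ran.
All names in it (mulfix_ref, mulfix_candidate, xorshift32,
count_mismatches) are made up for illustration, mulfix_ref assumes the
C code rounds the magnitude and then restores the sign, and
mulfix_candidate is a plain-C stand-in for the assembly rather than the
assembly itself.  It also assumes that >> on a negative value is an
arithmetic shift, which holds for gcc.

```c
#include <stdint.h>

/* Assumed reference: round the magnitude, then restore the sign.
   Written for this sketch, not copied from FreeType. */
static int32_t mulfix_ref(int32_t a, int32_t b)
{
    int sign = 1;
    int64_t ma = a, mb = b;   /* widen first so negating INT32_MIN is safe */

    if (ma < 0) { ma = -ma; sign = -sign; }
    if (mb < 0) { mb = -mb; sign = -sign; }

    int64_t c = (ma * mb + 0x8000) >> 16;
    return (int32_t)(sign > 0 ? c : -c);
}

/* Stand-in for the assembly: 64-bit product, add the sign word
   (0 or -1), add one half, shift right by 16. */
static int32_t mulfix_candidate(int32_t a, int32_t b)
{
    int64_t p = (int64_t)a * b;
    int64_t sign_word = (p < 0) ? -1 : 0;
    return (int32_t)((p + sign_word + 0x8000) >> 16);
}

/* Small deterministic PRNG so that a failing run can be reproduced. */
static uint32_t xorshift32(uint32_t *state)
{
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    return *state = x;
}

/* Feed random 32-bit pairs to both versions and count disagreements. */
static unsigned long count_mismatches(unsigned long trials, uint32_t seed)
{
    uint32_t state = seed ? seed : 1u;   /* xorshift state must be nonzero */
    unsigned long bad = 0;

    for (unsigned long i = 0; i < trials; i++) {
        int32_t a = (int32_t)xorshift32(&state);
        int32_t b = (int32_t)xorshift32(&state);
        if (mulfix_ref(a, b) != mulfix_candidate(a, b))
            bad++;
    }
    return bad;
}
```

Calling count_mismatches(100000, 1) should return 0 if the two versions
agree.  To test the real thing, swap the body of mulfix_candidate for a
call to the assembly.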

The logic is simple.  The first two instructions sign-extend the two
values to 64 bits.  The multiply then leaves the least significant 64
bits of the product in rax and the most significant 64 bits in rdx.
Because the values started out as 32 bits, rdx is guaranteed to hold
only sign bits: zero if the product is >= 0, else -1.  Adding the
resulting rdx to rax serves the same purpose as the ecx value in the
i386 version: it makes the rounding symmetric around zero, just like
the C code.
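
To make the rounding argument concrete, here is a small sketch in
portable C that works on the 64-bit product directly.  The names
round_plain and round_symmetric are hypothetical and not part of the
patch, and the code assumes that >> on a negative value is an
arithmetic shift (as sarq is).

```c
#include <stdint.h>

/* Add one half and shift.  Ties go toward +infinity, so a product of
   +0x8000 rounds to 1 but -0x8000 rounds to 0. */
static int64_t round_plain(int64_t p)
{
    return (p + 0x8000) >> 16;
}

/* Same, but first add the sign word (0 for p >= 0, -1 for p < 0),
   which is what rdx holds after the imul.  Ties now go away from
   zero for both signs. */
static int64_t round_symmetric(int64_t p)
{
    int64_t sign_word = (p < 0) ? -1 : 0;
    return (p + sign_word + 0x8000) >> 16;
}
```

With these, round_plain(-0x8000) gives 0 while round_symmetric(-0x8000)
gives -1, mirroring the +1 that both return for +0x8000.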

An alternative might be to cast the source values to (FT_Int64) and let
the compiler do the sign extension, but I doubt that the compiler would
generate any better code than the explicit movslq and cltq.

I still have to finish the patch, but I thought I'd offer the algorithm
for review in case anyone wants to look it over.

-JimC
-- 
James Cloos <address@hidden>         OpenPGP: 1024D/ED7DAEA6
