[ft-devel] (no subject)


From: David Turner
Subject: [ft-devel] (no subject)
Date: Mon, 8 Jul 2013 18:07:31 -0700

Hello,

Here's a small set of patches that slightly improve the performance of FreeType when compiled for ARM and x86_64 with GCC. I also checked that they don't negatively affect x86 performance.

On ARM, loading glyphs is about 3% faster, and rendering gray bitmaps is 6% faster.
On x86_64, loading glyphs is 6% faster, and rendering gray bitmaps is 2.5% faster.

The optimizations were found by inspecting the generated machine code in hot spots.

Let me know if you find any issues.

- David

PS: Everything measured with:

./ftbench -p -t 5 -s 14 -f 0008 Arial.ttf

(0008 is the hexadecimal load-flags value of FT_LOAD_NO_BITMAP)

ARM:
====

CFLAGS="-O2 -fomit-frame-pointer -march=armv7-a -mthumb" ./configure --disable-shared --without-zlib --without-png --without-bzip2 --host=arm-linux-androideabi

Before:
  Load                      34.287 us/op
  Load_Advances (Normal)    34.317 us/op
  Load_Advances (Fast)      0.176 us/op
  Render                    23.544 us/op
  Get_Glyph                 6.661 us/op
  Get_CBox                  1.957 us/op
  Get_Char_Index            0.261 us/op
  Iterate CMap              121.696 us/op
  New_Face                  115.143 us/op
  Embolden                  1.428 us/op
  Get_BBox                  3.313 us/op

After:
  Load                      33.358 us/op
  Load_Advances (Normal)    33.330 us/op
  Load_Advances (Fast)      0.176 us/op
  Render                    22.079 us/op
  Get_Glyph                 6.494 us/op
  Get_CBox                  1.937 us/op
  Get_Char_Index            0.232 us/op
  Iterate CMap              120.793 us/op
  New_Face                  115.759 us/op
  Embolden                  1.450 us/op
  Get_BBox                  3.384 us/op

x86_64:
=======

CFLAGS="-O2 -fomit-frame-pointer" ./configure --disable-shared --without-zlib --without-png --without-bzip2

Before:
  Load                      4.890 us/op
  Load_Advances (Normal)    4.849 us/op
  Load_Advances (Fast)      0.027 us/op
  Render                    2.813 us/op
  Get_Glyph                 0.473 us/op
  Get_CBox                  0.076 us/op
  Get_Char_Index            0.024 us/op
  Iterate CMap              13.982 us/op
  New_Face                  12.341 us/op
  Embolden                  0.027 us/op
  Get_BBox                  0.303 us/op

After:
  Load                      4.617 us/op
  Load_Advances (Normal)    4.537 us/op
  Load_Advances (Fast)      0.028 us/op
  Render                    2.743 us/op
  Get_Glyph                 0.441 us/op
  Get_CBox                  0.076 us/op
  Get_Char_Index            0.023 us/op
  Iterate CMap              13.508 us/op
  New_Face                  12.298 us/op
  Embolden                  0.027 us/op
  Get_BBox                  0.296 us/op

x86:
====

CFLAGS="-O2 -fomit-frame-pointer -m32" LDFLAGS="-m32" ./configure --disable-shared --without-zlib --without-png --without-bzip2

Before:
  Load                      4.973 us/op
  Load_Advances (Normal)    4.910 us/op
  Load_Advances (Fast)      0.023 us/op
  Render                    3.140 us/op
  Get_Glyph                 0.641 us/op
  Get_CBox                  0.243 us/op
  Get_Char_Index            0.027 us/op
  Iterate CMap              15.303 us/op
  New_Face                  13.041 us/op
  Embolden                  0.167 us/op
  Get_BBox                  0.527 us/op

After:
  Load                      4.930 us/op
  Load_Advances (Normal)    4.895 us/op
  Load_Advances (Fast)      0.023 us/op
  Render                    3.131 us/op
  Get_Glyph                 0.620 us/op
  Get_CBox                  0.237 us/op
  Get_Char_Index            0.027 us/op
  Iterate CMap              15.051 us/op
  New_Face                  13.133 us/op
  Embolden                  0.163 us/op
  Get_BBox                  0.524 us/op
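
For reference, the percentages quoted at the top of this message can be recomputed from the Before/After tables as (before - after) / before. The helper below is only a minimal sketch of that arithmetic; `speedup_percent` is a hypothetical name and is not part of ftbench or FreeType.

```c
#include <assert.h>

/* Hypothetical helper (not part of ftbench): relative improvement in
 * percent, given the time per operation before and after a change.
 * A positive result means the "after" build is faster. */
double speedup_percent(double before_us, double after_us)
{
    assert(before_us > 0.0);  /* a zero or negative timing is meaningless */
    return (before_us - after_us) / before_us * 100.0;
}
```

With the ARM figures, speedup_percent(34.287, 33.358) gives roughly 2.7 for Load and speedup_percent(23.544, 22.079) roughly 6.2 for Render. The x86_64 figures give roughly 5.6 for Load and 2.5 for Render, and the x86 figures stay below 1 for both.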

Attachment: 0001-arm-Enable-FT_MulFix_arm-for-thumb2-compilation.patch
Description: Binary data

Attachment: 0002-x86_64-Optimize-FT_MulFix-for-x86_64-GCC-builds.patch
Description: Binary data

Attachment: 0003-arm-x86-x86_64-Optimized-TT_MulFix14-TT_DivFix14.patch
Description: Binary data

Attachment: 0004-arm-Improve-gray-rasterizer-performance.patch
Description: Binary data
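
For readers unfamiliar with the functions named in the patch titles: FT_MulFix multiplies a value by a 16.16 fixed-point factor, and TT_MulFix14 does the same with a 2.14 factor. The portable C below is only a sketch of that arithmetic using a 64-bit intermediate. It is not the content of the attached patches; the names `mulfix_16_16` and `mulfix_2_14` are hypothetical, and rounding halves away from zero is an assumption made for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not FreeType API: multiply `a` by the 16.16
 * fixed-point factor `b`, rounding to nearest (halves away from zero). */
int32_t mulfix_16_16(int32_t a, int32_t b)
{
    int64_t p   = (int64_t)a * (int64_t)b;  /* exact product, fits in 64 bits */
    int64_t mag = p < 0 ? -p : p;           /* round on the magnitude */
    int64_t r   = (mag + 0x8000) >> 16;     /* add one half, drop 16 fraction bits */

    if (p < 0)
        r = -r;
    assert(r >= INT32_MIN && r <= INT32_MAX);  /* result must fit in 32 bits */
    return (int32_t)r;
}

/* Hypothetical helper, not FreeType API: multiply `a` by the 2.14
 * fixed-point factor `b`, with the same rounding rule. */
int32_t mulfix_2_14(int32_t a, int16_t b)
{
    int64_t p   = (int64_t)a * (int64_t)b;
    int64_t mag = p < 0 ? -p : p;
    int64_t r   = (mag + 0x2000) >> 14;     /* add one half, drop 14 fraction bits */

    if (p < 0)
        r = -r;
    assert(r >= INT32_MIN && r <= INT32_MAX);
    return (int32_t)r;
}
```

As an example, mulfix_16_16(0x18000, 0x20000) multiplies 1.5 by 2.0 and yields 0x30000 (3.0), and mulfix_2_14(1000, 0x2000) multiplies 1000 by 0.5 and yields 500. On a 64-bit target the wide product is a single multiply instruction, which is presumably why this kind of routine is worth tuning per architecture.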

