freetype-devel

Re: [Devel] Optimizatins to ttinterp.c


From: Werner LEMBERG
Subject: Re: [Devel] Optimizatins to ttinterp.c
Date: Sat, 09 Dec 2000 03:24:06 +0100 (CET)

Here are some comparison values from FT2, measured with gprof.

> My test was to open "Arial Unicode MS" and TT_Load_Glyph all the
> glyphs so that I could call TT_Get_Glyph_Extents.  In doing so, I
> found that RunIns() got called 52,794 times for 13,494 ms. (This
> font has a lot of Kanji and Korean glyphs, many of which are
> composites).  I tried loading without hinting, but that made my
> extents inaccurate.

I used the following sample, compiled with gcc 2.95.2, once with -O0 to
switch off optimization and once with -O3 to get maximum optimization
(according to the gcc info pages, the latter also inlines `simple
functions').

The font used is Arial Unicode MS version 0.84, which probably explains
the different number of glyphs.

======================================================================

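#include <freetype/freetype.h>
#include <freetype/ftglyph.h>


/* Load every glyph of the font so that gprof can measure the */
/* TrueType bytecode interpreter.  Error codes are ignored.   */
int main(void)
{
  FT_Library library;
  FT_Face    face;
  FT_Glyph   glyph;
  FT_BBox    bbox, cbox;   /* not used in this test */
  FT_UShort  i;


  (void)FT_Init_FreeType(&library);
  (void)FT_New_Face(library, "arialuni.ttf", 0, &face);
  /* 16pt at 100x100 dpi; the size is given in 1/64th of a point */
  (void)FT_Set_Char_Size(face, 0, 16 * 64, 100, 100);

  /* loop over glyph indices 0..51179 */
  for (i = 0; i < 51180; i++)
  {
    /* FT_LOAD_DEFAULT leaves hinting enabled */
    (void)FT_Load_Glyph(face, i, FT_LOAD_DEFAULT);
    /* copies the glyph; it is never released with FT_Done_Glyph here */
    (void)FT_Get_Glyph(face->glyph, &glyph);
  }

  return 0;
}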

======================================================================

> Of the functions that RunIns() calls, the major ones were...
>
>    function               percent      calls    propagated time
>     Calc_Length            17.57%    9,474,392    2,370 ms
>     Ins_SHP                14.08     1,014,967    1,899 ms
>     Ins_IUP                11.07       103,158    1,493 ms
>     Ins_IP                 10.04       521,928    1,355 ms

Here are the results from gprof for the above test program:

index % time    self  children    called     name

[8]     67.4    5.83    7.59   65063         TT_RunIns [8]
                0.63    1.68  127692/127692      Ins_IUP [10]
                0.97    0.96 1413151/1413151     Ins_SHP [11]
                0.63    0.65  715371/715371      Ins_IP [14]
                0.48    0.26  722667/722667      Ins_MIRP [16]

Calc_Length() no longer exists.  Ins_IUP needs 0.63 + 1.68s, so this
function accounts for (0.63+1.68)/(5.83+7.59) = 17.2% of RunIns(), etc.
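
For the remaining callees the same ratio can be worked out mechanically.
The following is only a sketch: the struct and function names
(CallGraphRow, share_percent) are made up for illustration and are not
part of FreeType or gprof; the numbers are copied from the call graph
entry for TT_RunIns above.

```c
#include <assert.h>

/* One line of a gprof call-graph entry (hypothetical helper type):
   time spent in the function itself and in its children, in seconds. */
typedef struct {
  const char *name;
  double      self;
  double      children;
} CallGraphRow;

/* Figures copied from the TT_RunIns entry [8]. */
static const CallGraphRow runins = { "TT_RunIns", 5.83, 7.59 };

static const CallGraphRow callees[] = {
  { "Ins_IUP",  0.63, 1.68 },
  { "Ins_SHP",  0.97, 0.96 },
  { "Ins_IP",   0.63, 0.65 },
  { "Ins_MIRP", 0.48, 0.26 },
};

/* Percentage of the parent's inclusive time (self + children) that is
   taken by one callee's inclusive time. */
static double share_percent(const CallGraphRow *callee,
                            const CallGraphRow *parent)
{
  double parent_total = parent->self + parent->children;

  assert(parent_total > 0.0);
  return 100.0 * (callee->self + callee->children) / parent_total;
}
```

Calling share_percent(&callees[0], &runins) gives about 17.2 for
Ins_IUP; the other three rows come out at roughly 14.4 (Ins_SHP),
9.5 (Ins_IP), and 5.5 (Ins_MIRP).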

> The big gain came from TT_MulDiv().  It turns out that all the other
> time soaking functions end up calling it.

Interesting.  I get different results.  Here are the first few entries
of the flat profile for -O0:


  %   cumulative   self              self     total
 time   seconds   seconds    calls  us/call  us/call  name
 12.43      6.05     6.05    65063    92.99   592.84  TT_RunIns
 10.75     11.28     5.23  4828603     1.08     1.08  Project_x
  9.47     15.89     4.61  4473580     1.03     1.03  Project_y
  4.46     18.06     2.17  2255569     0.96     0.96  Round_To_Grid
  4.25     20.13     2.07    64221    32.23    55.43  TT_Load_Simple_Glyph
  3.88     22.02     1.89  1413151     1.34     4.28  Ins_SHP
  3.55     23.75     1.73  1677664     1.03     1.03  Direct_Move_X
  3.37     25.39     1.64  1540309     1.06     1.06  Direct_Move_Y
  3.02     26.86     1.47  6152015     0.24     0.48  FT_MulDiv
  2.88     28.26     1.40   715371     1.96    11.02  Ins_IP
  2.73     29.59     1.33  3847019     0.35     0.56  Interp
  2.63     30.87     1.28  7862798     0.16     0.16  FT_MulFix
  2.30     31.99     1.12   722667     1.55     5.83  Ins_MIRP
  2.20     33.06     1.07  1416595     0.76     2.78  Compute_Point_Displacement
  2.03     34.05     0.99   812915     1.22     1.42  Ins_ENDF
  2.03     35.04     0.99   127692     7.75    24.65  Ins_IUP
  1.91     35.97     0.93  3369363     0.28     0.28  SkipCode
  1.89     36.89     0.92  7518177     0.12     0.12  FT_Get_Char
  1.62     37.68     0.79  1606929     0.49     0.49  FT_MulTo64
  1.56     38.44     0.76    64221    11.83   625.13  TT_Process_Simple_Glyph
  1.44     39.14     0.70  2317733     0.30     0.30  FT_Get_Short


And here are the first few entries of the flat profile for -O3:


  %   cumulative   self              self     total
 time   seconds   seconds    calls  us/call  us/call  name
 29.28      5.83     5.83    65063    89.61   206.27  TT_RunIns
  6.58      7.14     1.31  6049655     0.22     0.31  FT_MulDiv
  6.23      8.38     1.24    64221    19.31    36.94  TT_Load_Simple_Glyph
  5.73      9.52     1.14  3847019     0.30     0.44  Interp
  5.32     10.58     1.06  7862794     0.13     0.13  FT_MulFix
  4.87     11.55     0.97  1413151     0.69     1.36  Ins_SHP
  3.62     12.27     0.72  7518177     0.10     0.10  FT_Get_Char
  3.16     12.90     0.63   715371     0.88     1.79  Ins_IP
  3.16     13.53     0.63   127692     4.93    18.07  Ins_IUP
  2.66     14.06     0.53    64221     8.25   230.91  TT_Process_Simple_Glyph
  2.41     14.54     0.48   722667     0.66     1.03  Ins_MIRP
  2.36     15.01     0.47  2317733     0.20     0.20  FT_Get_Short
  1.81     15.37     0.36    51180     7.03   356.17  load_truetype_glyph
  1.61     15.69     0.32  4828603     0.07     0.07  Project_x
  1.51     15.99     0.30  1531719     0.20     0.20  FT_Div64by32
  1.36     16.26     0.27   812862     0.33     0.33  Ins_CALL
  1.16     16.49     0.23  1563135     0.15     0.15  FT_MulTo64
  1.16     16.72     0.23    51180     4.49    12.04  compute_glyph_metrics
  1.05     16.93     0.21    51180     4.10     4.10  FT_Outline_Get_CBox
  1.00     17.13     0.20  1677664     0.12     0.12  Direct_Move_X
  1.00     17.33     0.20    51182     3.91     3.91  TT_Load_Context

The total execution time with -O0 is about 49 seconds; FT_MulDiv()
itself uses about 3.0% of it (1.47s self time).

The total execution time with -O3 is about 20 seconds; FT_MulDiv()
itself uses about 6.6% of it (1.31s self time).
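
The totals are not printed in the excerpts above, but they can be
estimated from any row of a flat profile by dividing its self seconds
by its `% time' fraction.  A small sketch follows; FlatRow,
total_seconds and speedup are hypothetical names chosen for this
example, and since gprof rounds the percentages the result is only
approximate.

```c
#include <assert.h>

/* One row of a gprof flat profile (hypothetical helper type). */
typedef struct {
  const char *name;
  double      percent;       /* "% time" column       */
  double      self_seconds;  /* "self seconds" column */
} FlatRow;

/* Estimate the total profiled run time from a single row:
   self_seconds is `percent' percent of the total. */
static double total_seconds(const FlatRow *row)
{
  assert(row->percent > 0.0);
  return row->self_seconds * 100.0 / row->percent;
}

/* Ratio of the self time of one function in two different builds. */
static double speedup(const FlatRow *slow, const FlatRow *fast)
{
  assert(fast->self_seconds > 0.0);
  return slow->self_seconds / fast->self_seconds;
}

/* Rows copied from the two flat profiles above. */
static const FlatRow runins_O0 = { "TT_RunIns", 12.43, 6.05 };
static const FlatRow runins_O3 = { "TT_RunIns", 29.28, 5.83 };
static const FlatRow projx_O0  = { "Project_x", 10.75, 5.23 };
static const FlatRow projx_O3  = { "Project_x",  1.61, 0.32 };
```

With these rows, total_seconds() yields about 48.7s for -O0 and about
19.9s for -O3, which matches the totals quoted above.  The speedup()
helper shows where the difference comes from: the self time of
TT_RunIns barely changes (a factor of about 1.04), while a small
helper such as Project_x gets roughly 16 times cheaper.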


I haven't actually tried gcc's `inline' option.


    Werner


