Re: [lmi] Micro-optimization in ledger_format


From: Greg Chicares
Subject: Re: [lmi] Micro-optimization in ledger_format
Date: Fri, 18 Jan 2019 14:15:41 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.0

On 2019-01-16 14:08, Vadim Zeitlin wrote:
> On Wed, 16 Jan 2019 12:17:49 +0000 Greg Chicares <address@hidden> wrote:
> 
> GC> On 2018-11-26 15:53, Vadim Zeitlin wrote:
> GC> > 
> GC> >  One of results of profiling the PDF generation is that applying the diff
> GC> > from https://github.com/vadz/lmi/pull/104/files reduces the total time by
> GC> > more than 10% (~14% in our tests), i.e. creating a new stream object and
> GC> > imbuing it with a custom locale every time the function is called is very
> GC> > expensive.
> GC> 
> GC> I was distracted by an "emergency" in early December, but would like to
> GC> return to this now before it gets too stale.
> 
>  Thanks for getting back to it!

Done now:
    commit 33c0f82b596fa610528f175b3eaaef97900798b9
    Speed up ledger_format() [Ilja]
Running an msw binary under 'wine', I measure the time to
generate a PDF for a default "File | New | Illustration | OK"
illustration as:
  580 msec before 33c0f82
  515  "   after   "
which is
  (1/515) / (1/580) - 1 = 580/515 - 1
or about 12.6%, roughly one-eighth faster. That is expected to be
less than the 14% in your tests because of the added call to str()
to reset the stringstream's contents prophylactically.
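
For readers who don't have the commit at hand, here is a minimal
sketch of the general technique: keep one stream with its custom
locale alive across calls, and reset its contents with str() on
each call. This is not the code in 33c0f82; the names comma_punct,
format_slow() and format_fast() are hypothetical, and the facet is
only a stand-in for whatever locale ledger_format() really imbues.

```cpp
#include <iomanip>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet (an assumption, not lmi's): group thousands with commas.
struct comma_punct : std::numpunct<char>
{
    char do_thousands_sep() const override {return ',';}
    std::string do_grouping() const override {return "\3";}
};

// Slow variant: constructs a stream and imbues a locale on every call.
std::string format_slow(double d, int decimals)
{
    std::ostringstream oss;
    oss.imbue(std::locale(std::locale::classic(), new comma_punct));
    oss << std::fixed << std::setprecision(decimals) << d;
    return oss.str();
}

// Faster variant: the stream and its locale are set up only once,
// by an immediately invoked lambda. Not thread safe as written.
std::string format_fast(double d, int decimals)
{
    static std::ostringstream oss = []
        {
        std::ostringstream o;
        o.imbue(std::locale(std::locale::classic(), new comma_punct));
        o.setf(std::ios_base::fixed, std::ios_base::floatfield);
        return o;
        }();
    oss.str(std::string{}); // Discard the previous call's output.
    oss.clear();            // Reset any error flags left behind.
    oss << std::setprecision(decimals) << d;
    return oss.str();
}
```

Both functions should produce the same text; the second merely
avoids repeating the construction and imbue() steps. The str() and
clear() calls are the "prophylactic" reset, which costs a little on
every call.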

Corresponding changes to stream_cast<>() result in these
before-and-after timings in the unit test, where only the
first line is affected by commit 12cb7b91d:

  stream_cast     : 3.279e-003 s mean;       3203 us least of 100 runs
  minimalistic    : 2.585e-003 s mean;       2571 us least of 100 runs
  static stream   : 1.299e-003 s mean;       1159 us least of 100 runs
  static facet too: 8.441e-004 s mean;        841 us least of 100 runs
  same, but IIFE  : 8.469e-004 s mean;        840 us least of 100 runs

  stream_cast     : 1.915e-003 s mean;       1861 us least of 100 runs
  minimalistic    : 2.583e-003 s mean;       2568 us least of 100 runs
  static stream   : 1.528e-003 s mean;       1164 us least of 100 runs
  static facet too: 8.555e-004 s mean;        844 us least of 100 runs
  same, but IIFE  : 8.531e-004 s mean;        843 us least of 100 runs
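
For context on how to read those columns, a "mean" and "least of
N runs" pair could be gathered along these lines. This is only a
sketch under my own assumptions, not lmi's timing code; the names
timing_summary and time_runs() are hypothetical.

```cpp
#include <algorithm>
#include <chrono>
#include <functional>
#include <limits>

// Hypothetical result type: mean and minimum wall-clock time per run.
struct timing_summary
{
    double mean_seconds;
    double least_seconds;
};

// Call f() 'runs' times (assumed to be positive), timing each call.
timing_summary time_runs(std::function<void()> const& f, int runs)
{
    using clock = std::chrono::steady_clock;
    double total = 0.0;
    double least = std::numeric_limits<double>::max();
    for(int i = 0; i < runs; ++i)
        {
        auto const t0 = clock::now();
        f();
        std::chrono::duration<double> const dt = clock::now() - t0;
        total += dt.count();
        least = std::min(least, dt.count());
        }
    return {total / runs, least};
}
```

The least time is usually the steadier figure, because occasional
interruptions by the operating system inflate the mean but not the
minimum.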

Why is that latest version only half as fast as the streamlined
unit test (1861 vs 843 us)? Both have the same improvement; the
only difference is the fancy run-time error reporting in
stream_cast<>().
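
To make "run-time error reporting" concrete, here is a sketch of
the sort of checks a conversion through a static stringstream might
perform. It is not lmi's stream_cast<>(): the name checked_cast()
and the messages are hypothetical, and whether checks like these
account for the gap is exactly the open question.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a checked conversion through a stream.
// One static stream per instantiation; not thread safe as written.
template<typename To, typename From>
To checked_cast(From const& from)
{
    static std::stringstream interpreter;
    interpreter.str(std::string{}); // Discard earlier contents.
    interpreter.clear();            // Reset earlier error flags.

    To result = To();
    if(!(interpreter << from))
        {
        throw std::runtime_error("Writing the source value failed.");
        }
    if(!(interpreter >> result))
        {
        throw std::runtime_error("Reading the target value failed.");
        }
    // Skip trailing whitespace; anything else left over is an error.
    if(!(interpreter >> std::ws).eof())
        {
        throw std::runtime_error("Unconverted data remains.");
        }
    return result;
}
```

With this sketch, checked_cast<int>(std::string("123")) returns 123,
while checked_cast<int>(std::string("12x")) throws because "x" is
left unconverted.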

Next I plan to remove all the experimental unit-test code
(everything except the first timing line) because in retrospect
the optimized version seems obviously good.

Now, as the January code-freeze date draws near, we have two
different release candidates to choose from: one with only the
MDB DBO suppressed (which we strongly prefer to release this
month), and another that adds this speedup (which, while most
desirable, is not as crucial to release immediately); we'll
choose between them on the basis of the time available for
testing. We'd release the latest without giving it much thought
if we had a battery of automated PDF regression tests (which
would be helpful in many other ways, too), but that new topic
deserves a thread of its own.


