Re: OMG - performance issue with Scheme engraver

lilypond-user

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: OMG - performance issue with Scheme engraver

From:	Urs Liska
Subject:	Re: OMG - performance issue with Scheme engraver
Date:	Tue, 10 Jul 2018 16:53:20 +0200
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.8.0

Am 10.07.2018 um 12:22 schrieb David Kastrup:

Urs Liska <address@hidden> writes:

after completing most of the work of reviewing the scholarly.annotate
module I realize that it (presumably one small change) is a total
performance killer, and I need some help tracking it down.

Quadratic behavior or worse.

That was my suspicion too, but I'm not really sure about that.

Observations:

 * The obvious stage of slow-down is when the log prints Interpreting
   music...[8][16][24] etc.

Does it get slower over time?

Yes, it seems. In a longer piece with 300 measures the period for one eight-measure group *tended* to become longer (from 1-4 seconds to 7-10 seconds, but there also were 3-seconds groups towards the end). And while the process constantly consumes 100% of one CPU's time memory kept surprisingly low at about 300MB.
But the three-movment piece that took 14.7 seconds without the engraver is still running ATM after 6 minutes and just started the last movement ...
Well, by now I'm at ~20 seconds per interpreting 8 measures ...

[Edit:]
By now the compilation has finished: 20 pages in 12 minutes 28 versus 14,7 seconds without the engraver !
From the moment the log said "Preprocessing graphical objects" it only took the remaining 5 or so seconds.

Depending on the number of contexts I have consisted the engraver to
the first "Finalize" may be printed *before* the [8], so I'm not
really sure how reliable the order of log output really is here. I'm
also wondering why "process-acknowledged" is printed before
"acknowledger" ...

process-acknowledged is called for any grob/spanner, not just the ones
your engraver is listening to.

OK, but my acknowledger is listening to grob-interface anyway.

Maybe I'm still confused about the order of operations.

And currently it is called far more
often than I consider nice or desirable (in particular, it is called in
lower contexts before the grob may even have been acknowledged in higher
contexts) but my attempts to fix that have so far crashed in the
vertical alignment code.

So currently you just have the guarantee that process-acknowledged will
run after a grob has been announced in all contexts, but it may run
before that as well.

How could I proceed to further track down the issue? But maybe my own
suspicions already help bringing someone on the right track?

Quadratic behavior or worse.

I see two changes to the previous code that I think could have such
impact, one more and one less.

2) The engraver works upon grobs that have a certain grob-property
attached (an 'input-annotation alist that is attached in the
music-function).

In the earlier code that annotation was attached through \once
\override, the new code uses (propertyTweak).

In the earlier code I did that test for the property in the
(acknowledgers) clause and simply skipped all irrelevant grobs. All
grobs with an annotation were collected in a list and that list later
processed in (finalize).

In a recent discussion
(http://lists.gnu.org/archive/html/lilypond-user/2018-06/msg00218.html)
I was directed to move that test to (process-acknowledged) because a)
the tweaks weren't available as the grob property earlier and b) this
was the place to do that.

In order to achieve that I had to use (acknowledgers) to add *all*
grobs to a list and iterate over that list in (process-acknowledged).

What does "add" mean?

annotationCollector =
#(let*
  ((annotated-grobs '())
   (all-grobs '()))
  (lambda (context)
    (make-engraver
     (acknowledgers
      ((grob-interface engraver grob source-engraver)
       ;; 'input-annotation property is only visible in process-acknowledged,
       ;; therefore we have to record *all* grobs here
       (set! all-grobs (cons grob all-grobs))
       ))

As I've understood the recent discussion I thought I had to to do that to have all the grobs accessible in the subsequent (process-acknowledged). I hope I was wrong with that.

I have the impression that *this* is the performance problem: creating
a list with references to *all* grobs in the score, regardless of
whether they are needed at all. Which would explain why the problem
exists even if not a single annotation is present in the score.

Creating such a list is not a performance but a memory problem unless
you do something stupid during the creation that will access the whole
list every time you add to it.

Like making sure you don't add the same element multiple times.  If you
want to do that, do it in a private list in process-acknowledged (which
is called too often) and then move the cleaned list to the front(!) of
your big list at stop-translation-timestep.  That limits potentially
quadratic behavior (or worse) to small lists.

Well, maybe I have to ask again how I get grobs "into" process-acknowledged.
What I need is check each grob *once* whether it has a certain grob property, and I have to do that in process-acknowledged because that property isn't reliably available earlier.

What I *understood* what I have to do is create a list in acknowledgers and iterate over that in process-acknowledged (pseudo-code):

((process-acknowledged translator)
 (for-each
  (lambda (
    (if (need-to-process grob)
        (let
         ((annotation (process-annotation (ly:grob-property grob 'input-annotation))))
         (set! annotated-grobs (cons grob annotated-grobs))
         (set! all-grobs '()))))
    all-grobs))

Go through all the grobs collected in the first step, if necessary process the annotation and prepend it to the all-grobs list and set all-grobs to an empty list so the grobs aren't processed multiple times.
f
Further investigation shows the following:

if I skip the (set! all-grobs line then not much happens, there will be an empty all-grobs list and process-acknowledged iterates over that empty list. The processing goes as fast as without the engraver.
If I replace the (need-to-process) call with #f (suppressing the body of the annotation processing) processing is fast, so it is obvious that the all-grobs list isn't the culprit.
However, if I *do* the "need-to-process" test but suppress the body of that _expression_ (the actual annotation handling) it's slow again - so I have reason to assume that it is this test that takes so much time.

The actual code in that test is:

     ((process-acknowledged translator)
      (for-each
       (lambda (grob)
         (let*
          ((annotation (ly:grob-property grob 'input-annotation))
           (ann-type (assq-ref annotation 'ann-type))
           (is-annotation
            ;; A grob is to be accepted when 'annotation *does* have some content
            ;; and is not marked as ignored.
            (and ann-type
                 (not (member ann-type ignored-types))))) ; ignored-types is a symbol-list
          (if is-annotation
           ...

First a property 'input-annotation is read from the grob. Then a key 'ann-type is retrieved from the annotation (or the empty list). A grob has to be processed when a) an 'ann-type is present and b) this isn't member of an exclusion list (of <= 5 elements).

My checks seem to indicate that it is *this* sequence that is slowing down things so much. Is that even possible??? I don't know how expensive ly:grob-property is, but I of course used this too in the previous version.

However, one more test found the source of my problem, but unfortunately I still don't know what I have to do now: I used a counter to check how often the different parts are executed.

For a short 1-page score:

acknowledgers:            980
process-acknowledged:     711
lambda in process-ack: 330421 ratio: 1 / 464 (ratio between calls to process-acknowledged to the inner for-each loop)

For the mentioned 3-page score:

acknowledgers:            5037
process-acknowledged:     2700
lambda in process-ack: 6726982 ratio: 1 / 2.491

For the 21-page three-movement score

acknowledgers:             21518      30695     46492
process-acknowledged:      17656      25520     35183
lambda in process-ack: 187253429  363211651 779357617    ratios: 1 / 10.605, 1 / 14.232, 1 / 22.151

So obviously the bottleneck is created because the all-grobs list is too long and processed too often.

Which gets back to what I said above: what I need is a way to check every grob once (ideally) to test if it is an annotation to be processed. Obviously I *still* need more clarification about what happens exactly when.

But I'll continue the comment in a reply to your other post, David.

Urs

[Prev in Thread]

Current Thread

[Next in Thread]

OMG - performance issue with Scheme engraver, Urs Liska, 2018/07/10
- Re: OMG - performance issue with Scheme engraver, David Kastrup, 2018/07/10
  - Re: OMG - performance issue with Scheme engraver, Aaron Hill, 2018/07/10
  - Re: OMG - performance issue with Scheme engraver, Urs Liska <=
- Re: OMG - performance issue with Scheme engraver, David Kastrup, 2018/07/10
  - Re: OMG - performance issue with Scheme engraver, Urs Liska, 2018/07/10
    - Re: OMG - performance issue with Scheme engraver, Urs Liska, 2018/07/10
    - Re: OMG - performance issue with Scheme engraver, Urs Liska, 2018/07/11
    - Re: OMG - performance issue with Scheme engraver, David Kastrup, 2018/07/11
    - Re: OMG - performance issue with Scheme engraver, Urs Liska, 2018/07/11
    - Re: OMG - performance issue with Scheme engraver, David Kastrup, 2018/07/11

Prev by Date: Re: OMG - performance issue with Scheme engraver
Next by Date: Re: OMG - performance issue with Scheme engraver
Previous by thread: Re: OMG - performance issue with Scheme engraver
Next by thread: Re: OMG - performance issue with Scheme engraver
Index(es):
- Date
- Thread