|
From: | Urs Liska |
Subject: | Re: OMG - performance issue with Scheme engraver |
Date: | Tue, 10 Jul 2018 16:53:20 +0200 |
User-agent: | Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.8.0 |
Am 10.07.2018 um 12:22 schrieb David
Kastrup:
Urs Liska <address@hidden> writes:after completing most of the work of reviewing the scholarly.annotate module I realize that it (presumably one small change) is a total performance killer, and I need some help tracking it down.Quadratic behavior or worse. That was my suspicion too, but I'm not really sure about that. Observations: * The obvious stage of slow-down is when the log prints Interpreting music...[8][16][24] etc.Does it get slower over time? Yes, it seems. In a longer piece with 300 measures the period for one eight-measure group *tended* to become longer (from 1-4 seconds to 7-10 seconds, but there also were 3-seconds groups towards the end). And while the process constantly consumes 100% of one CPU's time memory kept surprisingly low at about 300MB. But the three-movment piece that took 14.7 seconds without the engraver is still running ATM after 6 minutes and just started the last movement ... Well, by now I'm at ~20 seconds per interpreting 8 measures ... [Edit:] By now the compilation has finished: 20 pages in 12 minutes 28 versus 14,7 seconds without the engraver ! From the moment the log said "Preprocessing graphical objects" it only took the remaining 5 or so seconds. Depending on the number of contexts I have consisted the engraver to the first "Finalize" may be printed *before* the [8], so I'm not really sure how reliable the order of log output really is here. I'm also wondering why "process-acknowledged" is printed before "acknowledger" ...process-acknowledged is called for any grob/spanner, not just the ones your engraver is listening to. OK, but my acknowledger is listening to grob-interface anyway. Maybe I'm still confused about the order of operations. And currently it is called far more often than I consider nice or desirable (in particular, it is called in lower contexts before the grob may even have been acknowledged in higher contexts) but my attempts to fix that have so far crashed in the vertical alignment code. So currently you just have the guarantee that process-acknowledged will run after a grob has been announced in all contexts, but it may run before that as well.How could I proceed to further track down the issue? But maybe my own suspicions already help bringing someone on the right track?Quadratic behavior or worse.I see two changes to the previous code that I think could have such impact, one more and one less.2) The engraver works upon grobs that have a certain grob-property attached (an 'input-annotation alist that is attached in the music-function). In the earlier code that annotation was attached through \once \override, the new code uses (propertyTweak). In the earlier code I did that test for the property in the (acknowledgers) clause and simply skipped all irrelevant grobs. All grobs with an annotation were collected in a list and that list later processed in (finalize). In a recent discussion (http://lists.gnu.org/archive/html/lilypond-user/2018-06/msg00218.html) I was directed to move that test to (process-acknowledged) because a) the tweaks weren't available as the grob property earlier and b) this was the place to do that. In order to achieve that I had to use (acknowledgers) to add *all* grobs to a list and iterate over that list in (process-acknowledged).What does "add" mean? annotationCollector = #(let* ((annotated-grobs '()) (all-grobs '())) (lambda (context) (make-engraver (acknowledgers ((grob-interface engraver grob source-engraver) ;; 'input-annotation property is only visible in process-acknowledged, ;; therefore we have to record *all* grobs here (set! all-grobs (cons grob all-grobs)) ))As I've understood the recent discussion I thought I had to to do that to have all the grobs accessible in the subsequent (process-acknowledged). I hope I was wrong with that. I have the impression that *this* is the performance problem: creating a list with references to *all* grobs in the score, regardless of whether they are needed at all. Which would explain why the problem exists even if not a single annotation is present in the score.Creating such a list is not a performance but a memory problem unless you do something stupid during the creation that will access the whole list every time you add to it. Like making sure you don't add the same element multiple times. If you want to do that, do it in a private list in process-acknowledged (which is called too often) and then move the cleaned list to the front(!) of your big list at stop-translation-timestep. That limits potentially quadratic behavior (or worse) to small lists. Well, maybe I have to ask again how I get grobs "into" process-acknowledged. What I need is check each grob *once* whether it has a certain grob property, and I have to do that in process-acknowledged because that property isn't reliably available earlier. What I *understood* what I have to do is create a list in acknowledgers and iterate over that in process-acknowledged (pseudo-code): ((process-acknowledged translator) (for-each (lambda ( (if (need-to-process grob) (let ((annotation (process-annotation (ly:grob-property grob 'input-annotation)))) (set! annotated-grobs (cons grob annotated-grobs)) (set! all-grobs '())))) all-grobs))Go through all the grobs collected in the first step, if necessary process the annotation and prepend it to the all-grobs list and set all-grobs to an empty list so the grobs aren't processed multiple times. f Further investigation shows the following:
The actual code in that test is: ((process-acknowledged translator) (for-each (lambda (grob) (let* ((annotation (ly:grob-property grob 'input-annotation)) (ann-type (assq-ref annotation 'ann-type)) (is-annotation ;; A grob is to be accepted when 'annotation *does* have some content ;; and is not marked as ignored. (and ann-type (not (member ann-type ignored-types))))) ; ignored-types is a symbol-list (if is-annotation ... First a property 'input-annotation is read from the grob. Then a key 'ann-type is retrieved from the annotation (or the empty list). A grob has to be processed when a) an 'ann-type is present and b) this isn't member of an exclusion list (of <= 5 elements). My checks seem to indicate that it is *this* sequence that is slowing down things so much. Is that even possible??? I don't know how expensive ly:grob-property is, but I of course used this too in the previous version. However, one more test found the source of my problem, but unfortunately I still don't know what I have to do now: I used a counter to check how often the different parts are executed. For a short 1-page score: acknowledgers: 980 process-acknowledged: 711 lambda in process-ack: 330421 ratio: 1 / 464 (ratio between calls to process-acknowledged to the inner for-each loop) For the mentioned 3-page score: acknowledgers: 5037 process-acknowledged: 2700 lambda in process-ack: 6726982 ratio: 1 / 2.491For the 21-page three-movement score acknowledgers: 21518 30695 46492 process-acknowledged: 17656 25520 35183 lambda in process-ack: 187253429 363211651 779357617 ratios: 1 / 10.605, 1 / 14.232, 1 / 22.151 So obviously the bottleneck is created because the all-grobs list is too long and processed too often. Which gets back to what I said above: what I need is a way to check every grob once (ideally) to test if it is an annotation to be processed. Obviously I *still* need more clarification about what happens exactly when. But I'll continue the comment in a reply to your other post, David. Urs |
[Prev in Thread] | Current Thread | [Next in Thread] |