Re: Backend and non-backend (was Re: Stencil bounding box)
From: Han-Wen Nienhuys
Subject: Re: Backend and non-backend (was Re: Stencil bounding box)
Date: Mon, 01 May 2006 01:31:59 +0200
User-agent: Thunderbird 1.5 (X11/20060313)
David Feuer wrote:
On 4/27/06, Han-Wen Nienhuys <address@hidden> wrote:
Frankly, I'm a bit mystified why you're spending so much time on
building the ultimate postscript backend. The backend is not a
performance bottleneck. If you think the current PS code is inefficient,
then you should have a look at the rest of LilyPond.
Could you suggest something else to work on that would be reasonably
easy to understand?
There's one idea that I've been toying with for a bit.
One of the major problems is that we don't have a reliable way to test
LilyPond automatically: the output is a PDF file, and changes in
formatting may subtly alter PDF files between versions, which throws off
a diff or cmp with a reference file.
However, I think it should be possible to generate a "signature" of the
output from within the backend, and use that to detect changes in
formatting. The idea is that the output is characterized by the set of
grobs used for the output.
Each grob is characterized by:
* its name
* its bounding box
* its stencil output expression
The output expression contains a lot of floating point numbers, which
are liable to change subtly across versions. However, the non-number
parts of the expressions (symbols and strings) themselves are likely to
stay constant.
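To illustrate the stripping step, here is a minimal Python sketch. It assumes the stencil expression has been read into nested tuples (standing in for LilyPond's Scheme output); the expression contents below are made up for the example.

```python
def strip_numbers(expr):
    """Recursively drop int/float leaves, keeping symbols and strings."""
    if isinstance(expr, (list, tuple)):
        return tuple(strip_numbers(e) for e in expr
                     if not isinstance(e, (int, float)))
    return expr

# Two versions of the same stencil that differ only in coordinates
# compare equal after stripping.
a = ("draw-line", 0.1, 0.2, 1.3, "round")
b = ("draw-line", 0.1001, 0.2, 1.2999, "round")
assert strip_numbers(a) == strip_numbers(b)
```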
Similarly, the exact dimensions of the bboxes will vary, but in absence
of formatting errors, the area of a bbox should remain more or less
constant.
Now, my idea was to dump a list of
signature = (name, stripped expression, bbox)
entries from the backend. Then we should have a separate program which
computes
d(signature_1, signature_2)
If we have this, we could dump all signatures for generating the
regression test, and compare with an existing set of signatures. We can
then more easily spot regression bugs as soon as we create them: a
change in formatting will cause a large distance between the two signatures.
I was thinking that you could use the area of bbox overlaps (i.e. for
each bbox B_i, the quantity

sum_j area(B_i intersect B_j))

as a mostly-invariant measure. I'm not sure, though, and it might be
useful to do some experimentation; perhaps you could also use a
simpler distance measure for R^2 subsets.
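A quick Python sketch of that overlap measure, assuming boxes are axis-aligned (x1, y1, x2, y2) tuples (the box coordinates below are made up for the example):

```python
def intersect_area(a, b):
    """Area of the intersection of two axis-aligned boxes (x1, y1, x2, y2)."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return w * h if w > 0 and h > 0 else 0.0

def overlap_measure(boxes, i):
    """sum_j area(B_i intersect B_j), including j == i (B_i's own area)."""
    return sum(intersect_area(boxes[i], b) for b in boxes)

boxes = [(0, 0, 2, 2), (1, 1, 3, 3)]
# B_0 overlaps itself (area 4) and B_1 (area 1), so the measure is 5.
```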
You could get a distance by sorting each entry in sig_1 and sig_2 by
(name, stripped expression, bbox overlap area) and then using a standard
R^n norm on the bbox overlap area.
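That distance could be sketched as follows. This assumes both signatures contain the same grobs (real code would have to handle entries added or removed between versions); the entry contents are made up for the example.

```python
import math

def signature_distance(sig1, sig2):
    """Each sig is a list of (name, stripped_expr, overlap_area) entries.

    Sort both by (name, stripped expression), then take a standard
    Euclidean norm over the difference in overlap areas.
    """
    s1 = sorted(sig1, key=lambda e: (e[0], e[1]))
    s2 = sorted(sig2, key=lambda e: (e[0], e[1]))
    return math.sqrt(sum((a[2] - b[2]) ** 2 for a, b in zip(s1, s2)))

old = [("NoteHead", "nh", 1.2), ("Stem", "st", 0.5)]
new = [("Stem", "st", 0.5), ("NoteHead", "nh", 1.5)]
# Only NoteHead's overlap area changed (by 0.3), so the distance is ~0.3.
```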
Dumping the signature is a bit of Scheme, and the computations can then
be done outside of LilyPond, e.g. in Python.
Does that sound interesting?
--
Han-Wen Nienhuys - address@hidden - http://www.xs4all.nl/~hanwen
LilyPond Software Design
-- Code for Music Notation
http://www.lilypond-design.com