Might be an idea to look at Antlr then ...
I don't know how effective it would be, but part of the purpose behind
the v3 rewrite is to increase the number of languages that Antlr can
generate. If you can define the grammar in Antlr it could then chuck out
a lexer/parser written in Scheme :-) Is the current lily parser
hand-written? At the expense of some upfront work, this could save a lot
of effort downstream if so.
It does sound very much as though what you want to do is what Antlr is
designed to do ... you say Scheme rocks for tree manipulation? Well
Antlr was DESIGNED for tree manipulation (and other related tasks :-)
...
And yes, preservation of white space is apparently fairly easy.
Although, I'm half inclined to say that if you "convert-ly" a piece,
then it will or can output stuff according to a "pretty print" standard.
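To make the whitespace point concrete, here is a minimal sketch in Python (not Antlr itself) of the idea behind sending whitespace and comments to a "hidden" channel instead of throwing them away. The token set is a made-up toy, not the real lily lexical rules, and the names `Token`, `TOKEN_RE` and `tokenize` are purely illustrative.

```python
import re
from typing import List, NamedTuple


class Token(NamedTuple):
    kind: str
    text: str
    hidden: bool  # True for whitespace/comments that the parser skips


# Assumption: a toy token set, NOT the real LilyPond lexical rules.
TOKEN_RE = re.compile(r"""
    (?P<WS>\s+)                 # runs of whitespace
  | (?P<COMMENT>%[^\n]*)        # line comment
  | (?P<COMMAND>\\[A-Za-z]+)    # backslash command
  | (?P<BRACE>[{}])             # grouping braces
  | (?P<WORD>[^\s{}%\\]+)       # anything else, e.g. a note
""", re.VERBOSE)


def tokenize(source: str) -> List[Token]:
    """Split source into tokens, keeping whitespace and comments as hidden tokens."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected character at offset {pos}")
        kind = match.lastgroup
        tokens.append(Token(kind, match.group(), kind in ("WS", "COMMENT")))
        pos = match.end()
    return tokens


source = "\\relative c' {\n  c4 d  % comment\n}\n"
tokens = tokenize(source)
visible = [t.text for t in tokens if not t.hidden]   # what a parser would see
restored = "".join(t.text for t in tokens)           # hidden tokens put back
```

Because nothing is discarded, joining every token gives back the input character for character, while a parser only has to look at the visible tokens. A pretty printer would simply ignore the hidden tokens and emit its own layout.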
Take a look at Antlr - www.antlr.org. I'd go straight to v3, which is
due for release this summer. What you would need to do is define the
lily grammar using a BNF-style notation (I think the technical term for
the style of grammar is LL(k), or LL(*) in v3). Antlr itself is written in Java -
run it over your grammar and it will spit out a lexer and parser for
you. You could either use the current C++ templates which would create
your lexer and parser in C++ for feeding the results to Scheme, or write
a bunch of Scheme templates and output the lexer and parser directly in
Scheme. I don't mean to teach grandma to suck eggs but it sounds (from
the implication that the parser is mixed-language) that you're not using
a compiler-compiler such as flex/Bison, PCCTS or Antlr.
And of course, going down this route, things head towards the typical
*nix "toolkit" approach :-) Grammars to input lily ASTs in various
versions, grammars to output "pretty printed" or converted .ly files in
various versions, grammars to transform ASTs especially ones generated
from eg abc, Sibelius, Personal Composer files :-) Just link the
appropriate sequence of tools and away you go ...
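As a rough illustration of the "link the tools" idea, here is a sketch in Python where each tool is just a function and a conversion is a chain of them. Everything here is hypothetical: the toy input format, the tuple-based AST and the names `read_toy_abc`, `double_duration_numbers`, `write_toy_ly` and `link` are invented for the example and do not correspond to real abc or lily syntax.

```python
from functools import reduce
from typing import Callable, List, Tuple

Note = Tuple[str, int]   # (pitch name, duration number): a toy AST node
Ast = List[Note]


def read_toy_abc(text: str) -> Ast:
    """Hypothetical reader: 'C2 D4' becomes [('c', 2), ('d', 4)]."""
    return [(word[0].lower(), int(word[1:])) for word in text.split()]


def double_duration_numbers(ast: Ast) -> Ast:
    """Hypothetical AST-to-AST tool: multiply every duration number by two."""
    return [(pitch, duration * 2) for pitch, duration in ast]


def write_toy_ly(ast: Ast) -> str:
    """Hypothetical writer: print the AST in a brace-delimited toy syntax."""
    return "{ " + " ".join(f"{pitch}{duration}" for pitch, duration in ast) + " }"


def link(*stages: Callable) -> Callable:
    """Chain stages left to right, much like a shell pipeline."""
    return lambda data: reduce(lambda acc, stage: stage(acc), stages, data)


convert = link(read_toy_abc, double_duration_numbers, write_toy_ly)
result = convert("C2 D4 E4")
print(result)
```

Swapping the reader or the writer for another one gives a different converter without touching the stages in between, which is the attraction of the toolkit approach.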
I can see it being a lot of up-front work. The question is, will it save
more than that in future ... and I think your answer to "is it possible"
is "yes", just is it worth it?
Cheers,
Wol
-----Original Message-----
From: Erik Sandberg [mailto:address@hidden]
Sent: 12 July 2006 19:08
To: address@hidden
Cc: Anthony Youngman
Subject: Re: Evolutionary User Strategy - A Compromise
On Wednesday 12 July 2006 17:22, Anthony Youngman wrote:
I don't really understand grammars etc (which is why my DATABASIC thing
is on/off :-).
But from my experience with Antlr, I don't see why you should lose
stuff. Your PEG article mentions ASTs. I don't see that converting a .ly
file into an AST can be that hard. So, for example, we write an Antlr
grammar that creates a lexer/parser that turns the .ly into an AST. We
now write another grammar that converts the AST to a .ly file.
A problem here is code duplication; it takes some effort to maintain two
parsers instead of one. I think it will be difficult to automatically
test that the current antlr parser corresponds well with the actual
grammar the current lilypond version uses.
I have been thinking about moving lily's entire parser out to Scheme;
this way we could keep one old parser for each version, and use it to
generate an AST, which then is converted nicely using rules written in
Scheme (Scheme rocks when it comes to tree manipulation). I'm not sure
if it's possible though.
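A minimal sketch of that scheme, written in Python only for illustration (the proposal above is for Scheme): one parser per version is looked up in a table, each produces the same kind of AST, and conversion is a list of small rewrite rules applied over the tree. The version string, the toy syntax and the command names `oldname`/`newname` are all made up; none of this reflects the real lily parser.

```python
from typing import Callable, Dict, List, Tuple, Union

Node = Union[str, Tuple]   # a leaf string, or a tuple (tag, child, ...)


def parse_toy_old(text: str) -> Node:
    """Hypothetical parser for an 'old' toy syntax of whitespace-separated words."""
    children = tuple(
        ("cmd", word[1:]) if word.startswith("\\") else ("note", word)
        for word in text.split()
    )
    return ("seq",) + children


# One parser kept per input version; the version key is hypothetical.
PARSERS: Dict[str, Callable[[str], Node]] = {"1.0": parse_toy_old}


def rename_command(old: str, new: str) -> Callable[[Node], Node]:
    """Build a rule that rewrites one command node and leaves the rest alone."""
    def rule(node: Node) -> Node:
        return ("cmd", new) if node == ("cmd", old) else node
    return rule


def rewrite(node: Node, rules: List[Callable[[Node], Node]]) -> Node:
    """Bottom-up rewrite: children first, then every rule on the node itself."""
    if isinstance(node, tuple):
        node = (node[0],) + tuple(rewrite(child, rules) for child in node[1:])
    for rule in rules:
        node = rule(node)
    return node


def unparse(node: Node) -> str:
    """Print the toy AST back out as text."""
    tag = node[0]
    if tag == "seq":
        return " ".join(unparse(child) for child in node[1:])
    if tag == "cmd":
        return "\\" + node[1]
    return node[1]


def convert(text: str, version: str, rules: List[Callable[[Node], Node]]) -> str:
    """Parse with the parser for the given version, apply the rules, print."""
    return unparse(rewrite(PARSERS[version](text), rules))


converted = convert("\\oldname c4 d4", "1.0", [rename_command("oldname", "newname")])
```

The rules know nothing about the surface syntax, so the only version-specific code is the parser in the table; whether the real grammar can be split up this cleanly is exactly the open question.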
BTW: Will your solution handle whitespace nicely?