guile-devel

More PEG


From: Noah Lavine
Subject: More PEG
Date: Thu, 8 Sep 2011 22:38:42 -0400

Hello all,

It looks to me like the last thing needed before the peg branch can be
used is to change some of the S-expression representations of the
components. Here are the five that I think need changing, taken from
the manual, with suggested replacements.

 -- PEG Pattern: zero or more a
     Parses A as many times in a row as it can, starting each A at the
     end of the text parsed by the previous A.  Always succeeds.

     `"a*"'

     `(body lit a *)'  change to: `(* a)'

 -- PEG Pattern: one or more a
     Parses A as many times in a row as it can, starting each A at the
     end of the text parsed by the previous A.  Succeeds if at least
     one A was parsed.

     `"a+"'

     `(body lit a +)'  change to: `(+ a)'

 -- PEG Pattern: optional a
     Tries to parse A.  Always succeeds, whether or not A matches.

     `"a?"'

     `(body lit a ?)'  change to: `(? a)'

 -- PEG Pattern: and predicate a
     Makes sure it is possible to parse A, but does not actually parse
     it.  Succeeds if A would succeed.

     `"&a"'

     `(body & a 1)'  change to: `(followed-by a)'

 -- PEG Pattern: not predicate a
     Makes sure it is impossible to parse A, but does not actually
     parse it.  Succeeds if A would fail.

     `"!a"'

     `(body ! a 1)'  change to: `(not-followed-by a)'

The first three were chosen to match the string representation, and
because *, + and ? are the standard notation for those ideas. The last
two were changed to words because I don't think & and ! are standard
notation for followed-by and not-followed-by, and it would be hard to
remember which S-expression corresponds to which meaning, especially
since there is already an S-expression syntax 'and' which means
something completely different from '&'.
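
As a rough illustration of the five renames listed above, they could be
applied mechanically to existing patterns. The sketch below assumes the
old forms look exactly as quoted from the manual; `old->new' is a
hypothetical helper name, not something in the peg branch.

```scheme
(use-modules (ice-9 match))

;; Hypothetical helper: rewrite old-style forms into the proposed
;; spellings, recursing into sub-patterns.  Anything that is not one of
;; the five old forms is copied through unchanged.
(define (old->new pat)
  (match pat
    (('body 'lit a '*) `(* ,(old->new a)))
    (('body 'lit a '+) `(+ ,(old->new a)))
    (('body 'lit a '?) `(? ,(old->new a)))
    (('body '& a 1)    `(followed-by ,(old->new a)))
    (('body '! a 1)    `(not-followed-by ,(old->new a)))
    ((head rest ...)   (cons head (map old->new rest)))
    (other             other)))

;; Example: a sequence containing two old-style forms.
(define converted
  (old->new '(and (body lit "a" +) (body ! "b" 1))))
```

Here `converted' holds the same sequence with the two inner forms
written as (+ "a") and (not-followed-by "b").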

There's something I'm still a little unsure about, though. It's
possible to get deeply nested S-expressions, like '(+ (and (* "a")
"b")). Since +, * and ? only ever take one argument, it is possible to
shorten such a list by letting people merge those elements into the
next item, like this: '(+ and (* "a") "b").
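
One way to picture the merged spelling is as shorthand that gets
expanded back into the nested form before anything else looks at it.
This is only a sketch under that assumption; `unary-ops' and
`expand-merged' are made-up names for illustration.

```scheme
;; Assumption: these operators always take exactly one sub-pattern.
(define unary-ops '(* + ?))

;; Hypothetical expander: turn (op head arg ...) back into
;; (op (head arg ...)), leaving already-nested forms alone.
(define (expand-merged pat)
  (cond
    ;; Strings, symbols and other atoms are left as they are.
    ((not (pair? pat)) pat)
    ;; (op x): already the nested form, just recurse into x.
    ((and (memq (car pat) unary-ops) (= (length pat) 2))
     (list (car pat) (expand-merged (cadr pat))))
    ;; (op head arg ...): merged form, so the tail is one sub-pattern.
    ((memq (car pat) unary-ops)
     (list (car pat) (expand-merged (cdr pat))))
    ;; Any other form: recurse into its arguments.
    (else (cons (car pat) (map expand-merged (cdr pat))))))

(define expanded (expand-merged '(+ and (* "a") "b")))
```

With these rules `expanded' is equal to '(+ (and (* "a") "b")), and a
pattern that is already fully nested passes through unchanged.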

On the other hand, that could get confusing if you later try to extend
peg syntax to match not just strings, but arbitrary sequences (which I
have been thinking of). For instance, say you want to match at least
one copy of the literal list '("a"). Is it (+ '("a")) or (+ quote
("a"))?

What do you all think?

Noah