More PEG
From: Noah Lavine
Subject: More PEG
Date: Thu, 8 Sep 2011 22:38:42 -0400
Hello all,
It looks to me like the last thing needed before the peg branch can be
used is to change some of the S-expression representations of the
components. Here are the five that I think need changing, taken from
the manual, with suggested replacements.
-- PEG Pattern: zero or more a
Parses A as many times in a row as it can, starting each A at the
end of the text parsed by the previous A. Always succeeds.
`"a*"'
`(body lit a *)' change to: `(* a)'
-- PEG Pattern: one or more a
Parses A as many times in a row as it can, starting each A at the
end of the text parsed by the previous A. Succeeds if at least
one A was parsed.
`"a+"'
`(body lit a +)' change to: `(+ a)'
-- PEG Pattern: optional a
Tries to parse A. Succeeds if A succeeds.
`"a?"'
`(body lit a ?)' change to: `(? a)'
-- PEG Pattern: and predicate a
Makes sure it is possible to parse A, but does not actually parse
it. Succeeds if A would succeed.
`"&a"'
`(body & a 1)' change to: `(followed-by a)'
-- PEG Pattern: not predicate a
Makes sure it is impossible to parse A, but does not actually
parse it. Succeeds if A would fail.
`"!a"'
`(body ! a 1)' change to: `(not-followed-by a)'
The first three were chosen to match the string representation, and
because *, + and ? are the standard notation for those ideas. The last
two were changed to words because I don't think & and ! are standard
notation for followed-by and not-followed-by, and it would be hard to
remember which S-expression corresponds to which meaning, especially
since there is already an S-expression syntax 'and' that means
something completely different from '&'.
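The renaming described above is mechanical, so it can be sketched as a small rewrite over pattern trees. The following is only an illustration in Python, with tuples standing in for S-expressions; `modernize` and `NEW_NAME` are hypothetical names, and only the five old forms listed in this message are handled. Any other expression is passed through with its sub-expressions rewritten.

```python
# Maps the (type, count) part of an old (body type a count) form to
# the proposed operator name. Only the five forms from the message.
NEW_NAME = {
    ("lit", "*"): "*",
    ("lit", "+"): "+",
    ("lit", "?"): "?",
    ("&", 1): "followed-by",
    ("!", 1): "not-followed-by",
}

def modernize(expr):
    """Rewrite old (body ...) forms into the proposed short forms."""
    if not isinstance(expr, tuple):
        return expr  # string literal, symbol name, or number
    if len(expr) == 4 and expr[0] == "body":
        _, kind, sub, count = expr
        if (kind, count) in NEW_NAME:
            return (NEW_NAME[(kind, count)], modernize(sub))
    # Anything else: keep the shape, rewrite the children.
    return tuple(modernize(e) for e in expr)
```

With this encoding, `modernize(("body", "lit", "a", "*"))` gives `("*", "a")`, and old forms nested inside an `and` are rewritten in place.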
There's something I'm still a little unsure about, though. It's
possible to get deeply nested S-expressions, like '(+ (and (* "a")
"b")). Since +, * and ? only ever take one argument, it is possible to
shorten such a list by letting people merge those elements into the
next item, like this: '(+ and (* "a") "b").
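One way to picture the merged shorthand is as a preprocessing step that expands it back into the fully nested form. The sketch below is in Python, with tuples standing in for S-expressions; `expand` and `UNARY` are names invented for this example, and it assumes the first element of every tuple is an operator name rather than a literal.

```python
UNARY = {"*", "+", "?"}  # operators that take exactly one argument

def expand(expr):
    """Expand ('+', 'and', x, y) into ('+', ('and', x, y))."""
    if not isinstance(expr, tuple):
        return expr
    if expr and expr[0] in UNARY and len(expr) > 2:
        # More than one item after a unary operator: treat the rest
        # of the list as the single nested argument.
        return (expr[0], expand(expr[1:]))
    return tuple(expand(e) for e in expr)
```

Here `expand(("+", "and", ("*", "a"), "b"))` yields `("+", ("and", ("*", "a"), "b"))`, and a form that is already fully nested comes back unchanged.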
On the other hand, that could get confusing if you later try to extend
peg syntax to match not just strings, but arbitrary sequences (which I
have been thinking about). For instance, say you want to match at least
one copy of the literal list '("a"). Is it (+ '("a")) or (+ quote
("a"))?
What do you all think?
Noah