[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Guile-commits] GNU Guile branch, wip-peg-fixed, updated. v2.0.2-125-gf5
From: |
Noah Lavine |
Subject: |
[Guile-commits] GNU Guile branch, wip-peg-fixed, updated. v2.0.2-125-gf58bb99 |
Date: |
Mon, 19 Sep 2011 14:42:38 +0000 |
This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "GNU Guile".
http://git.savannah.gnu.org/cgit/guile.git/commit/?id=f58bb9989fa3b091d24bf2c11370b39436386054
The branch, wip-peg-fixed has been updated
via f58bb9989fa3b091d24bf2c11370b39436386054 (commit)
via 6a74e135a99b8c61639d43b5e3aaa4a529bddd2d (commit)
via ec9b02820947daa9bd09a2dc5b9f30bbb6221488 (commit)
via 17ad327deff0caf804356c87121cdbcd8ecde4c9 (commit)
via a2a4779e23078d9a40bfd6a195fa3681acfc2a05 (commit)
via 66f09066fb467cfc38bc26a97c9f9c620917f18a (commit)
via fb59c173a55471bb7885d12cf627569f50a50bb5 (commit)
via d32a1ecb0b1880e3b0715f507d32f91ce7c2971d (commit)
from 02be220d042a9b50675441cae5e2945e6d4f37e2 (commit)
Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.
- Log -----------------------------------------------------------------
commit f58bb9989fa3b091d24bf2c11370b39436386054
Author: Noah Lavine <address@hidden>
Date: Mon Sep 19 10:40:28 2011 -0400
Comments in PEG
module/ice-9/peg/string-peg.scm: add comments explaining the format of some
of the parsed PEG forms.
commit 6a74e135a99b8c61639d43b5e3aaa4a529bddd2d
Author: Noah Lavine <address@hidden>
Date: Mon Sep 19 10:36:06 2011 -0400
Remove 'body' PEG
module/ice-9/peg/string-peg.scm: update S-expression generators to use the
new *, +, ?, followed-by, and not-followed-by forms.
module/ice-9/peg/codegen.scm: remove the 'body' form in the PEG s-expression
representation.
commit ec9b02820947daa9bd09a2dc5b9f30bbb6221488
Author: Noah Lavine <address@hidden>
Date: Mon Sep 19 10:33:09 2011 -0400
Add 'not-followed-by' PEG
The PEG s-expression syntax now uses '(not-followed-by ...)' instead of
'(body ! ... 1)'.
commit 17ad327deff0caf804356c87121cdbcd8ecde4c9
Author: Noah Lavine <address@hidden>
Date: Mon Sep 19 10:30:53 2011 -0400
Add 'followed-by' PEG
The PEG s-expression syntax now uses '(followed-by ...)' instead of
'(body & ... 1)'.
commit a2a4779e23078d9a40bfd6a195fa3681acfc2a05
Author: Noah Lavine <address@hidden>
Date: Mon Sep 19 10:28:35 2011 -0400
Add '?' PEG
The PEG s-expression syntax now uses '(? ...)' instead of '(body lit ...
?)'.
commit 66f09066fb467cfc38bc26a97c9f9c620917f18a
Author: Noah Lavine <address@hidden>
Date: Mon Sep 19 10:26:56 2011 -0400
Add '+' PEG
The PEG s-expression syntax now uses '(+ ...)' instead of '(body lit ...
+)'.
commit fb59c173a55471bb7885d12cf627569f50a50bb5
Author: Noah Lavine <address@hidden>
Date: Mon Sep 19 10:24:56 2011 -0400
Add '*' PEG
The s-expression representation of PEG grammars now uses a '(* ...)' form
instead of '(body lit ... *)'.
commit d32a1ecb0b1880e3b0715f507d32f91ce7c2971d
Author: Noah Lavine <address@hidden>
Date: Tue Sep 6 12:14:21 2011 -0400
Clean Up PEG Codegen
* module/ice-9/peg/codegen.scm: remove unnecessary literals in
peg-sexp-compile.
-----------------------------------------------------------------------
Summary of changes:
doc/ref/api-peg.texi | 14 ++--
module/ice-9/peg/codegen.scm | 119 +++++++++++++++++++++++++++++---------
module/ice-9/peg/string-peg.scm | 76 ++++++++++++++----------
3 files changed, 142 insertions(+), 67 deletions(-)
diff --git a/doc/ref/api-peg.texi b/doc/ref/api-peg.texi
index 6d0a346..111f150 100644
--- a/doc/ref/api-peg.texi
+++ b/doc/ref/api-peg.texi
@@ -69,7 +69,7 @@ succeeds.
@code{"a*"}
address@hidden(body lit a *)}
address@hidden(* a)}
@end deftp
@deftp {PEG Pattern} {one or more} a
@@ -79,7 +79,7 @@ least one @var{a} was parsed.
@code{"a+"}
address@hidden(body lit a +)}
address@hidden(+ a)}
@end deftp
@deftp {PEG Pattern} optional a
@@ -87,25 +87,25 @@ Tries to parse @var{a}. Succeeds if @var{a} succeeds.
@code{"a?"}
address@hidden(body lit a ?)}
address@hidden(? a)}
@end deftp
address@hidden {PEG Pattern} {and predicate} a
address@hidden {PEG Pattern} {followed by} a
Makes sure it is possible to parse @var{a}, but does not actually parse
it. Succeeds if @var{a} would succeed.
@code{"&a"}
address@hidden(body & a 1)}
address@hidden(followed-by a)}
@end deftp
address@hidden {PEG Pattern} {not predicate} a
address@hidden {PEG Pattern} {not followed by} a
Makes sure it is impossible to parse @var{a}, but does not actually
parse it. Succeeds if @var{a} would fail.
@code{"!a"}
address@hidden(body ! a 1)}
address@hidden(not-followed-by a)}
@end deftp
@deftp {PEG Pattern} {string literal} ``abc''
diff --git a/module/ice-9/peg/codegen.scm b/module/ice-9/peg/codegen.scm
index 597ead9..372f7eb 100644
--- a/module/ice-9/peg/codegen.scm
+++ b/module/ice-9/peg/codegen.scm
@@ -197,10 +197,9 @@ return EXP."
#`(or (#,(peg-sexp-compile #'first accum) #,str #,strlen #,at)
#,(cg-or-int #'(rest ...) accum str strlen at)))))
-;; Returns a function that parses a BODY element.
-(define (cg-body args accum)
+(define (cg-* args accum)
(syntax-case args ()
- ((type pat num)
+ ((pat)
#`(lambda (str strlen at)
(let ((body '()))
(let lp ((end at) (count 0))
@@ -211,31 +210,91 @@ return EXP."
(if (> new-end end)
(push-not-null! body (single-filter (cadr match))))
(if (and (> new-end end)
- #,(syntax-case #'num (+ * ?)
- (n (number? (syntax->datum #'n))
- #'(< count n))
- (+ #t)
- (* #t)
- (? #'(< count 1))))
+ #,#t)
(lp new-end count)
- (let ((success #,(syntax-case #'num (+ * ?)
- (n (number? (syntax->datum #'n))
- #'(= count n))
- (+ #'(>= count 1))
- (* #t)
- (? #t))))
- #,(syntax-case #'type (! & lit)
- (!
- #`(if success
- #f
- #,(cggr (baf accum) 'cg-body #''() #'at)))
- (&
- #`(and success
- #,(cggr (baf accum) 'cg-body #''() #'at)))
- (lit
- #`(and success
+ (let ((success #,#t))
+ #,#`(and success
+ #,(cggr (baf accum) 'cg-body
+ #'(reverse body) #'new-end)))))))))))
+
+(define (cg-+ args accum)
+ (syntax-case args ()
+ ((pat)
+ #`(lambda (str strlen at)
+ (let ((body '()))
+ (let lp ((end at) (count 0))
+ (let* ((match (#,(peg-sexp-compile #'pat (baf accum))
+ str strlen end))
+ (new-end (if match (car match) end))
+ (count (if (> new-end end) (1+ count) count)))
+ (if (> new-end end)
+ (push-not-null! body (single-filter (cadr match))))
+ (if (and (> new-end end)
+ #,#t)
+ (lp new-end count)
+ (let ((success #,#'(>= count 1)))
+ #,#`(and success
+ #,(cggr (baf accum) 'cg-body
+ #'(reverse body) #'new-end)))))))))))
+
+(define (cg-? args accum)
+ (syntax-case args ()
+ ((pat)
+ #`(lambda (str strlen at)
+ (let ((body '()))
+ (let lp ((end at) (count 0))
+ (let* ((match (#,(peg-sexp-compile #'pat (baf accum))
+ str strlen end))
+ (new-end (if match (car match) end))
+ (count (if (> new-end end) (1+ count) count)))
+ (if (> new-end end)
+ (push-not-null! body (single-filter (cadr match))))
+ (if (and (> new-end end)
+ #,#'(< count 1))
+ (lp new-end count)
+ (let ((success #,#t))
+ #,#`(and success
#,(cggr (baf accum) 'cg-body
- #'(reverse body)
#'new-end)))))))))))))
+ #'(reverse body) #'new-end)))))))))))
+
+(define (cg-followed-by args accum)
+ (syntax-case args ()
+ ((pat)
+ #`(lambda (str strlen at)
+ (let ((body '()))
+ (let lp ((end at) (count 0))
+ (let* ((match (#,(peg-sexp-compile #'pat (baf accum))
+ str strlen end))
+ (new-end (if match (car match) end))
+ (count (if (> new-end end) (1+ count) count)))
+ (if (> new-end end)
+ (push-not-null! body (single-filter (cadr match))))
+ (if (and (> new-end end)
+ #,#'(< count 1))
+ (lp new-end count)
+ (let ((success #,#'(= count 1)))
+ #,#`(and success
+ #,(cggr (baf accum) 'cg-body #''()
#'at)))))))))))
+
+(define (cg-not-followed-by args accum)
+ (syntax-case args ()
+ ((pat)
+ #`(lambda (str strlen at)
+ (let ((body '()))
+ (let lp ((end at) (count 0))
+ (let* ((match (#,(peg-sexp-compile #'pat (baf accum))
+ str strlen end))
+ (new-end (if match (car match) end))
+ (count (if (> new-end end) (1+ count) count)))
+ (if (> new-end end)
+ (push-not-null! body (single-filter (cadr match))))
+ (if (and (> new-end end)
+ #,#'(< count 1))
+ (lp new-end count)
+ (let ((success #,#'(= count 1)))
+ #,#`(if success
+ #f
+ #,(cggr (baf accum) 'cg-body #''()
#'at)))))))))))
;; Association list of functions to handle different expressions as PEGs
(define peg-compiler-alist '())
@@ -249,12 +308,16 @@ return EXP."
(add-peg-compiler! 'capture cg-capture)
(add-peg-compiler! 'and cg-and)
(add-peg-compiler! 'or cg-or)
-(add-peg-compiler! 'body cg-body)
+(add-peg-compiler! '* cg-*)
+(add-peg-compiler! '+ cg-+)
+(add-peg-compiler! '? cg-?)
+(add-peg-compiler! 'followed-by cg-followed-by)
+(add-peg-compiler! 'not-followed-by cg-not-followed-by)
;; Takes an arbitrary expressions and accumulation variable, then parses it.
;; E.g.: (peg-sexp-compile syntax '(and "abc" (or "-" (range #\a #\z))) 'all)
(define (peg-sexp-compile pat accum)
- (syntax-case pat (peg-any range ignore capture peg and or body)
+ (syntax-case pat (peg-any)
(peg-any
(cg-peg-any (baf accum)))
(sym (identifier? #'sym) ;; nonterminal
diff --git a/module/ice-9/peg/string-peg.scm b/module/ice-9/peg/string-peg.scm
index ed09aae..849e742 100644
--- a/module/ice-9/peg/string-peg.scm
+++ b/module/ice-9/peg/string-peg.scm
@@ -65,34 +65,34 @@ RB < ']'
#`(define sym #,syn))))))
(define-sexp-parser peg-grammar all
- (body lit (and peg-nonterminal (or "<--" "<-" "<") peg-sp peg-pattern) +))
+ (+ (and peg-nonterminal (or "<--" "<-" "<") peg-sp peg-pattern)))
(define-sexp-parser peg-pattern all
(and peg-alternative
- (body lit (and (ignore "/") peg-sp peg-alternative) *)))
+ (* (and (ignore "/") peg-sp peg-alternative))))
(define-sexp-parser peg-alternative all
- (body lit (and (body lit (or "!" "&") ?) peg-sp peg-suffix) +))
+ (+ (and (? (or "!" "&")) peg-sp peg-suffix)))
(define-sexp-parser peg-suffix all
- (and peg-primary (body lit (and (or "*" "+" "?") peg-sp) *)))
+ (and peg-primary (* (and (or "*" "+" "?") peg-sp))))
(define-sexp-parser peg-primary all
(or (and "(" peg-sp peg-pattern ")" peg-sp)
(and "." peg-sp)
peg-literal
peg-charclass
- (and peg-nonterminal (body ! "<" 1))))
+ (and peg-nonterminal (not-followed-by "<"))))
(define-sexp-parser peg-literal all
- (and "'" (body lit (and (body ! "'" 1) peg-any) *) "'" peg-sp))
+ (and "'" (* (and (not-followed-by "'") peg-any)) "'" peg-sp))
(define-sexp-parser peg-charclass all
(and (ignore "[")
- (body lit (and (body ! "]" 1)
- (or charclass-range charclass-single)) *)
+ (* (and (not-followed-by "]")
+ (or charclass-range charclass-single)))
(ignore "]")
peg-sp))
(define-sexp-parser charclass-range all (and peg-any "-" peg-any))
(define-sexp-parser charclass-single all peg-any)
(define-sexp-parser peg-nonterminal all
- (and (body lit (or (range #\a #\z) (range #\A #\Z) (range #\0 #\9) "-") +)
peg-sp))
+ (and (+ (or (range #\a #\z) (range #\A #\Z) (range #\0 #\9) "-")) peg-sp))
(define-sexp-parser peg-sp none
- (body lit (or " " "\t" "\n") *))
+ (* (or " " "\t" "\n")))
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;; PARSE STRING PEGS
@@ -125,7 +125,10 @@ RB < ']'
(peg-parser (syntax->datum #'str) x)))))
(define define-grammar-f peg-parser)
-;; Parse a nonterminal and pattern listed in LST.
+;; lst has format (nonterm grabber pattern), where
+;; nonterm is a symbol (the name of the nonterminal),
+;; grabber is a string (either "<", "<-" or "<--"), and
+;; pattern is the parse of a PEG pattern expressed as as string.
(define (peg-nonterm->defn lst for-syntax)
(let* ((nonterm (car lst))
(grabber (cadr lst))
@@ -139,41 +142,50 @@ RB < ']'
(else (datum->syntax for-syntax 'none)))
#,(compressor (peg-pattern->defn pattern for-syntax) for-syntax))))
-;; Parse a pattern.
+;; lst has format ('peg-pattern ...).
+;; After the context-flatten, (cdr lst) has format
+;; (('peg-alternative ...) ...), where the outer list is a collection
+;; of elements from a '/' alternative.
(define (peg-pattern->defn lst for-syntax)
#`(or #,@(map (lambda (x) (peg-alternative->defn x for-syntax))
(context-flatten (lambda (x) (eq? (car x) 'peg-alternative))
(cdr lst)))))
-;; Parse an alternative.
+;; lst has format ('peg-alternative ...).
+;; After the context-flatten, (cdr lst) has the format
+;; (item ...), where each item has format either ("!" ...), ("&" ...),
+;; or ('peg-suffix ...).
(define (peg-alternative->defn lst for-syntax)
#`(and #,@(map (lambda (x) (peg-body->defn x for-syntax))
(context-flatten (lambda (x) (or (string? (car x))
(eq? (car x) 'peg-suffix)))
(cdr lst)))))
-;; Parse a body.
+;; lst has the format either
+;; ("!" ('peg-suffix ...)), ("&" ('peg-suffix ...)), or
+;; ('peg-suffix ...).
(define (peg-body->defn lst for-syntax)
- (let ((suffix '())
- (front (datum->syntax for-syntax 'lit)))
(cond
- ((eq? (car lst) 'peg-suffix)
- (set! suffix lst))
- ((string? (car lst))
- (begin (set! front (datum->syntax for-syntax
- (string->symbol (car lst))))
- (set! suffix (cadr lst))))
- (else `(peg-parse-body-fail ,lst)))
- #`(body #,front #,@(peg-suffix->defn
- suffix
- for-syntax))))
+ ((equal? (car lst) "&")
+ #`(followed-by #,(peg-suffix->defn (cadr lst) for-syntax)))
+ ((equal? (car lst) "!")
+ #`(not-followed-by #,(peg-suffix->defn (cadr lst) for-syntax)))
+ ((eq? (car lst) 'peg-suffix)
+ (peg-suffix->defn lst for-syntax))
+ (else `(peg-parse-body-fail ,lst))))
-;; Parse a suffix.
+;; lst has format ('peg-suffix <peg-primary> (? (/ "*" "?" "+")))
(define (peg-suffix->defn lst for-syntax)
- #`(#,(peg-primary->defn (cadr lst) for-syntax)
- #,(if (null? (cddr lst))
- 1
- (datum->syntax for-syntax (string->symbol (caddr lst))))))
+ (let ((inner-defn (peg-primary->defn (cadr lst) for-syntax)))
+ (cond
+ ((null? (cddr lst))
+ inner-defn)
+ ((equal? (caddr lst) "*")
+ #`(* #,inner-defn))
+ ((equal? (caddr lst) "?")
+ #`(? #,inner-defn))
+ ((equal? (caddr lst) "+")
+ #`(+ #,inner-defn)))))
;; Parse a primary.
(define (peg-primary->defn lst for-syntax)
hooks/post-receive
--
GNU Guile
[Prev in Thread] |
Current Thread |
[Next in Thread] |
- [Guile-commits] GNU Guile branch, wip-peg-fixed, updated. v2.0.2-125-gf58bb99,
Noah Lavine <=