[elpa] master bf49fb6 38/60: Upate README


From: Junpeng Qiu
Subject: [elpa] master bf49fb6 38/60: Upate README
Date: Tue, 25 Oct 2016 17:45:15 +0000 (UTC)

branch: master
commit bf49fb6f5d9067ea41f502782df53770957f28e5
Author: Junpeng Qiu <address@hidden>
Commit: Junpeng Qiu <address@hidden>

    Upate README
---
 README.org |  258 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 258 insertions(+)

diff --git a/README.org b/README.org
index b4f8e2c..b439c25 100644
--- a/README.org
+++ b/README.org
@@ -1,3 +1,261 @@
 #+TITLE: parsec.el
 
 A parser combinator library for Emacs Lisp similar to Haskell's Parsec library.
+
+* Overview
+
+This work is based on [[https://github.com/jwiegley/][John Wiegley]]'s [[https://github.com/jwiegley/emacs-pl][emacs-pl]]. The original [[https://github.com/jwiegley/emacs-pl][emacs-pl]] is awesome,
+but I found the following problems when I tried to use it:
+
+- It only contains a very limited set of combinators
+- Some of its functions (combinators) have different behaviors than their
+  Haskell counterparts
+- It can't show error messages when parsing fails
+
+So I decided to build a new library on top of it. This library contains
+most of the parser combinators in =Text.Parsec.Combinator=, which should be
+enough for most use cases. Of course, more combinators can be added if necessary!
+Most of the parser combinators have the same behavior as their Haskell
+counterparts. =parsec.el= also comes with a simple error handling mechanism so
+that it can display an error message showing how the parser fails.
+
+So we can
+
+- use these parser combinators to write parsers from scratch in Emacs Lisp
+  as easily as we can in Haskell
+- port an existing Haskell program that uses Parsec to an equivalent Emacs
+  Lisp program easily
+
+* Parsing Functions & Parser Combinators
+
+  We compare the functions and macros defined in this library with their Haskell
+  counterparts, assuming you're already familiar with Haskell's Parsec. If you
+  don't have any experience with parser combinators, look at the docstrings of
+  these functions and macros and try them to see the results! They are really
+  easy to learn and use!
+
+** Basic Parsing Functions
+   These parsing functions are used as the basic building blocks for a parser. By
+   default, their return value is a string.
+
+  | parsec.el              | Haskell's Parsec | Usage                                                 |
+  |------------------------+------------------+-------------------------------------------------------|
+  | parsec-ch              | char             | parse a character                                     |
+  | parsec-any-ch          | anyChar          | parse an arbitrary character                          |
+  | parsec-satisfy         | satisfy          | parse a character satisfying a predicate              |
+  | parsec-newline         | newline          | parse '\n'                                            |
+  | parsec-crlf            | crlf             | parse '\r\n'                                          |
+  | parsec-eol             | endOfLine        | parse newline or CRLF                                 |
+  | parsec-eof, parsec-eob | eof              | parse end of file                                     |
+  | parsec-eol-or-eof      | *N/A*            | parse EOL or EOF                                      |
+  | parsec-re              | *N/A*            | parse using a regular expression                      |
+  | parsec-one-of          | oneOf            | parse one of the characters                           |
+  | parsec-none-of         | noneOf           | parse any character other than the supplied ones      |
+  | parsec-str             | *N/A*            | parse a string but consume input only when successful |
+  | parsec-string          | string           | parse a string and consume input for partial matches  |
+  | parsec-num             | *N/A*            | parse a number                                        |
+  | parsec-letter          | letter           | parse a letter                                        |
+  | parsec-digit           | digit            | parse a digit                                         |
+
+  Note:
+  - =parsec-str= and =parsec-string= are different. =parsec-string= behaves the
+    same as =string= in Haskell, and =parsec-str= is more like combining
+    =string= and =try= in Haskell.
+  - Use the power of regular expressions provided by =parsec-re= and simplify the parser!
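+
+  As a minimal sketch of the difference (it assumes =parsec.el= is on your
+  =load-path= and relies only on the behavior described in the tables of this
+  README), both forms below should return "abd". =parsec-str= leaves the input
+  untouched when "abc" fails to match, while =parsec-string= needs
+  =parsec-try= to rewind the "ab" it has already consumed:
+  #+BEGIN_SRC elisp
+  (require 'parsec)
+
+  ;; `parsec-str' consumes nothing on failure, so the second
+  ;; alternative starts again from the beginning of the input.
+  (parsec-with-input "abd"
+    (parsec-or (parsec-str "abc")
+               (parsec-str "abd")))
+
+  ;; `parsec-string' consumes "ab" before failing on "c";
+  ;; `parsec-try' restores the position so "abd" can still match.
+  (parsec-with-input "abd"
+    (parsec-or (parsec-try (parsec-string "abc"))
+               (parsec-string "abd")))
+  #+END_SRC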
+
+** Parser Combinators
+   These combinators can be used to combine different parsers.
+
+  | parsec.el                 | Haskell's Parsec | Usage                                                        |
+  |---------------------------+------------------+--------------------------------------------------------------|
+  | parsec-or                 | choice           | try the parsers until one succeeds                           |
+  | parsec-try                | try              | try parser and consume no input when an error occurs         |
+  | parsec-with-error-message | <?> (similar)    | use the new error message when an error occurs               |
+  | parsec-many               | many             | apply the parser zero or more times                          |
+  | parsec-many1              | many1            | apply the parser one or more times                           |
+  | parsec-many-till          | manyTill         | apply parser zero or more times until end succeeds           |
+  | parsec-until              | *N/A*            | parse until end succeeds                                     |
+  | parsec-not-followed-by    | notFollowedBy    | succeed when the parser fails                                |
+  | parsec-endby              | endBy            | apply parser zero or more times, separated and ended by end  |
+  | parsec-sepby              | sepBy            | apply parser zero or more times, separated by sep            |
+  | parsec-between            | between          | apply parser between open and close                          |
+  | parsec-count              | count            | apply parser n times                                         |
+  | parsec-option             | option           | apply parser, if it fails, return opt                        |
+  | parsec-optional           | *N/A*            | apply parser zero or one time and return the result          |
+  | parsec-optional*          | optional         | apply parser zero or one time and discard the result         |
+  | parsec-optional-maybe     | optionMaybe      | apply parser zero or one time and return the result in Maybe |
+
+  Note:
+  - =parsec-or= can also be used to replace =<|>=.
+  - =parsec-with-error-message= is slightly different from =<?>=. It will
+    replace the error message even when the input is consumed.
+  - By default, =parsec-many-till= behaves as Haskell's =manyTill=. However,
+    =parsec-many-till= and =parsec-until= can accept an optional argument that
+    specifies which part(s) are returned. You can use =:both= or =:end= as the
+    optional argument to change the default behavior. See the docstrings for
+    more information.
+
+** Parser Utilities
+   These utilities can be used together with parser combinators to build a
+   parser and ease the translation process if you're trying to port an existing
+   Haskell program.
+
+  | parsec.el                        | Haskell's Parsec | Usage                                                   |
+  |----------------------------------+------------------+---------------------------------------------------------|
+  | parsec-and                       | do block         | try all parsers and return the last result              |
+  | parsec-return                    | do block         | try all parsers and return the first result             |
+  | parsec-ensure                    | *N/A*            | quit the parsing when an error occurs                   |
+  | parsec-ensure-with-error-message | *N/A*            | quit the parsing when an error occurs with new message  |
+  | parsec-collect                   | sequence         | try all parsers and collect the results into a list     |
+  | parsec-collect*                  | *N/A*            | try all parsers and collect non-nil results into a list |
+  | parsec-start                     | parse            | entry point                                             |
+  | parsec-parse                     | parse            | entry point (same as parsec-start)                      |
+  | parsec-with-input                | parse            | perform parsers on input                                |
+  | parsec-from-maybe                | fromMaybe        | retrieve value from Maybe                               |
+  | parsec-maybe-p                   | *N/A*            | is a Maybe value or not                                 |
+  | parsec-query                     | *N/A*            | change the parser's return value                        |
+
+** Variants that Return a String
+
+   By default, the macros/functions that return multiple values will put the
+   values into a list. These macros/functions are:
+   - =parsec-many=
+   - =parsec-many1=
+   - =parsec-many-till=
+   - =parsec-until=
+   - =parsec-count=
+   - =parsec-collect= and =parsec-collect*=
+
+   They all have a variant that returns a string by concatenating the results in
+   the list:
+   - =parsec-many-as-string=
+   - =parsec-many1-as-string=
+   - =parsec-many-till-as-string=
+   - =parsec-until-as-string=
+   - =parsec-collect-as-string=
+
+   These variants accept the same arguments. The only difference is the return
+   value. In most cases I found myself using these variants instead of the
+   original versions that return a list.
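+
+   For instance, here is a small sketch (assuming =parsec.el= is loaded) that
+   runs the same parser on the same input with both versions. The values in
+   the comments follow from the descriptions above and are not captured
+   output:
+   #+BEGIN_SRC elisp
+   (require 'parsec)
+
+   ;; List version: one string per parsed digit, i.e. ("1" "2" "3").
+   (parsec-with-input "123abc"
+     (parsec-many (parsec-digit)))
+
+   ;; String version: the same results concatenated, i.e. "123".
+   (parsec-with-input "123abc"
+     (parsec-many-as-string (parsec-digit)))
+   #+END_SRC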
+
+* Code Examples
+  Some very simple examples are given here. You can see many code examples in
+  the test files in this GitHub repo.
+
+  The following code extracts the "hello" from the comment:
+  #+BEGIN_SRC elisp
+  (parsec-with-input "/* hello */"
+    (parsec-string "/*")
+    (parsec-many-till-as-string (parsec-any-ch)
+                                (parsec-try
+                                 (parsec-string "*/"))))
+  #+END_SRC
+
+  The equivalent Haskell program:
+  #+BEGIN_SRC haskell
+  import           Text.Parsec
+
+  main :: IO ()
+  main = print $ parse p "" "/* hello */"
+    where
+      p = do string "/*"
+             manyTill anyChar (try (string "*/"))
+  #+END_SRC
+
+  The following code returns the "aeiou" before "end":
+  #+BEGIN_SRC elisp
+  (parsec-with-input "if aeiou end"
+    (parsec-str "if ")
+    (parsec-return
+        (parsec-many-as-string (parsec-one-of ?a ?e ?i ?o ?u))
+      (parsec-str " end")))
+  #+END_SRC
+
+* Parser Examples
+  I translated some Haskell Parsec examples into Emacs Lisp using =parsec.el=.
+  You can see from these examples that it is very easy to write parsers using
+  =parsec.el=, and if you know Haskell, you can see that I basically just
+  translated the Haskell combinators into Emacs Lisp one by one because most of
+  them are the same!
+
+  You can find five examples under the =examples/= directory.
+
+  Three of the examples are taken from the chapter [[http://book.realworldhaskell.org/read/using-parsec.html][Using Parsec]] in the book
+  [[http://book.realworldhaskell.org/read/][Real World Haskell]]:
+  - =simple-csv-parser.el=: a simple CSV parser with no support for quoted cells
+  - =full-csv-parser.el=: a full CSV parser
+  - =url-str-parser.el=: a parser for parameters in a URL
+
+  =pjson.el= is a translation of Haskell's [[https://hackage.haskell.org/package/json-0.9.1/docs/src/Text-JSON-Parsec.html][json library using Parsec]].
+
+  =scheme.el= is a much simplified Scheme parser based on [[https://en.wikibooks.org/wiki/Write_Yourself_a_Scheme_in_48_Hours/][Write Yourself a
+  Scheme in 48 Hours]].
+
+  They're really simple but you can see how this library works!
+
+* Change the Return Values using =parsec-query=
+  Parsing has side effects such as moving the current point forward. In the original
+  [[https://github.com/jwiegley/emacs-pl][emacs-pl]], you can specify some optional arguments to some parsing functions
+  (=pl-ch=, =pl-re= etc.) to change the return values. In =parsec.el=, these
+  functions don't have such a behavior. Instead, we provide a unified interface
+  =parsec-query=, which accepts any parser, and changes the return value of the
+  parser.
+
+  You can specify the following arguments:
+  #+BEGIN_EXAMPLE
+  :beg      --> return the point before applying the PARSER
+  :end      --> return the point after applying the PARSER
+  :nil      --> return nil
+  :groups N --> return Nth group for `parsec-re'.
+  #+END_EXAMPLE
+
+  So instead of returning "b" as the result, the following code returns 2:
+  #+BEGIN_SRC elisp
+  (parsec-with-input "ab"
+    (parsec-ch ?a)
+    (parsec-query (parsec-ch ?b) :beg))
+  #+END_SRC
+
+  Returning a point means that you can also combine =parsec.el= with Emacs
+  Lisp functions that operate on points/regions, such as =goto-char= and
+  =kill-region=.
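+
+  As a sketch of another placeholder, the same parser with =:end= should
+  return 3, the position just after "b" (this assumes point starts at 1, as the
+  =:beg= example above implies):
+  #+BEGIN_SRC elisp
+  (require 'parsec)
+
+  (parsec-with-input "ab"
+    (parsec-ch ?a)
+    ;; Return the point after parsing "b" instead of the string "b".
+    (parsec-query (parsec-ch ?b) :end))
+  #+END_SRC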
+
+* Error Messages
+
+  =parsec.el= implements a simple error handling mechanism. When an error
+  happens, it will show how the parser fails.
+
+  For example, the following code fails:
+  #+BEGIN_SRC elisp
+  (parsec-with-input "aac"
+    (parsec-count 2 (parsec-ch ?a))
+    (parsec-ch ?b))
+  #+END_SRC
+
+  The return value is:
+  #+BEGIN_SRC elisp
+  (parsec-error . "Found \"c\" -> Expected \"b\"")
+  #+END_SRC
+
+  This also works when parser combinators fail:
+  #+BEGIN_SRC elisp
+  (parsec-with-input "a"
+    (parsec-or (parsec-ch ?b)
+               (parsec-ch ?c)))
+  #+END_SRC
+
+  The return value is:
+  #+BEGIN_SRC elisp
+  (parsec-error . "None of the parsers succeeds:
+       Found \"a\" -> Expected \"c\"
+       Found \"a\" -> Expected \"b\"")
+  #+END_SRC
+
+  If an error occurs, the return value is a cons cell that contains the error
+  message in its =cdr=. Compared to Haskell's Parsec, it's really simple, but at
+  least the error message tells us what was found and what was expected. Not
+  perfect, but usable.
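+
+  Since a failure is an ordinary return value, the caller has to check for it.
+  A minimal sketch, relying only on the cons cell shape shown above
+  (=car-safe= and =message= are standard Emacs Lisp; no =parsec.el= helper is
+  assumed):
+  #+BEGIN_SRC elisp
+  (require 'parsec)
+
+  (let ((result (parsec-with-input "aac"
+                  (parsec-count 2 (parsec-ch ?a))
+                  (parsec-ch ?b))))
+    (if (eq (car-safe result) 'parsec-error)
+        ;; The cdr holds the human-readable message.
+        (message "Parse failed: %s" (cdr result))
+      (message "Parsed: %S" result)))
+  #+END_SRC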
+
+* Acknowledgement
+  - Daan Leijen for Haskell's Parsec
+  - [[https://github.com/jwiegley/][John Wiegley]] for [[https://github.com/jwiegley/emacs-pl][emacs-pl]]


