RFC: doc: document new features of parse.error


From: Akim Demaille
Subject: RFC: doc: document new features of parse.error
Date: Mon, 27 Jan 2020 06:53:58 +0100

If someone feels s/he's good at writing technical material, I would appreciate 
some help.

This is not enough, I will have to add details about alias internationalization 
for instance.  And cover the other languages when we're done.

However, I think we should start using these features in real projects right now 
(maybe releasing an alpha would help) so that we can check that it does address 
the problems we found.  Of course I don't expect these projects to merge the PR 
yet, but we really need to see how these changes behave in the wild.

Cheers!


commit b7955e2a0943e998de5d0736c85a80bd480ce58c
Author: Akim Demaille <address@hidden>
Date:   Sat Jan 25 17:26:59 2020 +0100

    doc: document new features of parse.error
    
    * doc/bison.texi (Error Reporting): Rename as...
    (Error Reporting Function): this.
    Adjust dependencies.
    Make it a subsection of this...
    (Error Reporting): new section.
    (Syntax Error Reporting Function): New.
    (parse.error): Update description.

diff --git a/NEWS b/NEWS
index b764c81a..b4f38496 100644
--- a/NEWS
+++ b/NEWS
@@ -14,6 +14,67 @@ GNU Bison NEWS
   (2013-07-25), "%error-verbose" is deprecated in favor of "%define
   parse.error verbose".
 
+** New features
+
+*** Improved syntax error messages
+
+  Two new values for the %define parse.error variable offer more control to
+  the user.
+
+**** %define parse.error detailed
+
+  The behavior of "%define parse.error detailed" closely resembles that
+  of "%define parse.error verbose" with a few exceptions.  First, it is safe
+  to use non-ASCII characters in token aliases (with 'verbose', the result
+  depends on the locale with which bison was run).  Second, a yysymbol_name
+  function is exposed to the user, instead of the yytnamerr function and the
+  yytname table.  Third, token internationalization is supported (see
+  below).
+
+**** %define parse.error custom
+
+  With this directive, the user forges and emits the syntax error message
+  herself by defining a function such as:
+
+    int
+    yyreport_syntax_error (const yyparse_context_t *ctx)
+    {
+      enum { ARGMAX = 10 };
+      int arg[ARGMAX];
+      int n = yysyntax_error_arguments (ctx, arg, ARGMAX);
+      if (n == -2)
+        return 2; // Memory exhausted.
+      YY_LOCATION_PRINT (stderr, *yyparse_context_location (ctx));
+      fprintf (stderr, ": syntax error");
+      for (int i = 1; i < n; ++i)
+        fprintf (stderr, " %s %s",
+                 i == 1 ? "expected" : "or", yysymbol_name (arg[i]));
+      if (n)
+        fprintf (stderr, " before %s", yysymbol_name (arg[0]));
+      fprintf (stderr, "\n");
+      return 0;
+    }
+
+**** Token aliases internationalization
+
+  When the %define variable parse.error is set to `custom` or `detailed`,
+  one may use the _() annotation to specify which token aliases are to be
+  translated.  For instance
+
+    %token
+        PLUS   "+"
+        MINUS  "-"
+        EOF 0  _("end of file")
+      <double>
+        NUM _("double precision number")
+      <symrec*>
+        FUN _("function")
+        VAR _("variable")
+
+  In that case the user must define _() and N_(), and yysymbol_name returns
+  the translated symbol (i.e., it returns '_("variable")' rather than
+  '"variable"').
+
 * Noteworthy changes in release 3.5.1 (2020-01-19) [stable]
 
 ** Bug fixes
@@ -3881,7 +3942,9 @@ along with this program.  If not, see <http://www.gnu.org/licenses/>.
  LocalWords:  Wdeprecated yytext Variadic variadic yyrhs yyphrs RCS README
  LocalWords:  noexcept constexpr ispell american deprecations backend Teoh
  LocalWords:  YYPRINT Mangold Bonzini's Wdangling exVal baz checkable gcc
- LocalWords:  fsanitize Vogelsgesang lis redeclared stdint automata
+ LocalWords:  fsanitize Vogelsgesang lis redeclared stdint automata yytname
+ LocalWords:  yysymbol yytnamerr yyreport ctx ARGMAX yysyntax stderr
+ LocalWords:  symrec
 
 Local Variables:
 ispell-dictionary: "american"
diff --git a/doc/bison.texi b/doc/bison.texi
index a3b947b0..844e21b5 100644
--- a/doc/bison.texi
+++ b/doc/bison.texi
@@ -305,7 +305,7 @@ Parser C-Language Interface
 * Parser Delete Function::  How to call @code{yypstate_delete} and what it returns.
 * Lexical::                 You must supply a function @code{yylex}
                               which reads tokens.
-* Error Reporting::         You must supply a function @code{yyerror}.
+* Error Reporting::         Passing error messages to the user.
 * Action Features::         Special features for use in actions.
 * Internationalization::    How to let the parser speak in the user's
                               native language.
@@ -322,6 +322,11 @@ The Lexical Analyzer Function @code{yylex}
 * Pure Calling::        How the calling convention differs in a pure parser
                           (@pxref{Pure Decl, ,A Pure (Reentrant) Parser}).
 
+Error Reporting
+
+* Error Reporting Function::         You must supply a function @code{yyerror}.
+* Syntax Error Reporting Function::  You can supply a function @code{yyreport_syntax_error}.
+
 The Bison Parser Algorithm
 
 * Lookahead::         Parser looks one token ahead when deciding what to do.
@@ -5437,13 +5442,13 @@ reentrant.  It looks like this:
 
 The result is that the communication variables @code{yylval} and
 @code{yylloc} become local variables in @code{yyparse}, and a different
-calling convention is used for the lexical analyzer function
-@code{yylex}.  @xref{Pure Calling, ,Calling Conventions for Pure
-Parsers}, for the details of this.  The variable @code{yynerrs}
-becomes local in @code{yyparse} in pull mode but it becomes a member
-of @code{yypstate} in push mode.  (@pxref{Error Reporting, ,The Error
-Reporting Function @code{yyerror}}).  The convention for calling
-@code{yyparse} itself is unchanged.
+calling convention is used for the lexical analyzer function @code{yylex}.
+@xref{Pure Calling, ,Calling Conventions for Pure Parsers}, for the details
+of this.  The variable @code{yynerrs} becomes local in @code{yyparse} in
+pull mode but it becomes a member of @code{yypstate} in push mode.
+(@pxref{Error Reporting Function, ,The Error Reporting Function
+@code{yyerror}}).  The convention for calling @code{yyparse} itself is
+unchanged.
 
 Whether the parser is pure has nothing to do with the grammar rules.
 You can generate either a pure parser or a nonreentrant parser from any
@@ -6095,8 +6100,8 @@ used, then both parsers have the same signature:
 void yyerror (YYLTYPE *llocp, int *nastiness, char const *msg);
 @end example
 
-(@pxref{Error Reporting, ,The Error
-Reporting Function @code{yyerror}})
+(@pxref{Error Reporting Function, ,The Error Reporting Function
+@code{yyerror}})
 
 @item Default Value: @code{false}
 
@@ -6509,22 +6514,41 @@ constructed and destroyed properly.  This option checks these constraints.
 @item Languages(s):
 all
 @item Purpose:
-Control the kind of error messages passed to the error reporting
-function.  @xref{Error Reporting, ,The Error Reporting Function
-@code{yyerror}}.
+Control the generation of syntax error messages.  @xref{Error Reporting}.
 @item Accepted Values:
 @itemize
 @item @code{simple}
 Error messages passed to @code{yyerror} are simply @w{@code{"syntax
 error"}}.
+
+@item @code{detailed}
+Error messages report the unexpected token, and possibly the expected ones.
+However, this report can often be incorrect when LAC is not enabled
+(@pxref{LAC}).  Token name internationalization is supported.
+
 @item @code{verbose}
+Similar (but inferior) to @code{detailed}.
+
 Error messages report the unexpected token, and possibly the expected ones.
 However, this report can often be incorrect when LAC is not enabled
 (@pxref{LAC}).
+
+Does not support token internationalization.  Using non-ASCII characters in
+token aliases is not portable.
+
+@item @code{custom}
+The user is in charge of generating the syntax error message by defining the
+@code{yyreport_syntax_error} function.  @xref{Syntax Error Reporting
+Function, ,The Syntax Error Reporting Function
+@code{yyreport_syntax_error}}.
 @end itemize
 
 @item Default Value:
 @code{simple}
+
+@item History:
+introduced in 3.0 with support for @code{simple} and @code{verbose}.  Values
+@code{custom} and @code{detailed} were introduced in 3.6.
 @end itemize
 @end deffn
 @c parse.error
@@ -6826,7 +6850,7 @@ in the grammar file, you are likely to run into trouble.
 * Parser Delete Function::  How to call @code{yypstate_delete} and what it returns.
 * Lexical::                 You must supply a function @code{yylex}
                               which reads tokens.
-* Error Reporting::         You must supply a function @code{yyerror}.
+* Error Reporting::         Passing error messages to the user.
 * Action Features::         Special features for use in actions.
 * Internationalization::    How to let the parser speak in the user's
                               native language.
@@ -7265,8 +7289,21 @@ int yylex   (YYSTYPE *lvalp, YYLTYPE *llocp,
 int yyparse (parser_mode *mode, environment_type *env);
 @end example
 
+
 @node Error Reporting
-@section The Error Reporting Function @code{yyerror}
+@section Error Reporting
+
+During its execution the parser may have error messages to pass to the user,
+such as a syntax error or memory exhaustion.  How these messages are delivered
+to the user must be specified by the developer.
+
+@menu
+* Error Reporting Function::         You must supply a function @code{yyerror}.
+* Syntax Error Reporting Function::  You can supply a function @code{yyreport_syntax_error}.
+@end menu
+
+@node Error Reporting Function
+@subsection The Error Reporting Function @code{yyerror}
 @cindex error reporting function
 @findex yyerror
 @cindex parse error
@@ -7284,7 +7321,7 @@ called by @code{yyparse} whenever a syntax error is found, and it
 receives one argument.  For a syntax error, the string is normally
 @w{@code{"syntax error"}}.
 
-@findex %define parse.error
+@findex %define parse.error verbose
 If you invoke @samp{%define parse.error verbose} in the Bison declarations
 section (@pxref{Bison Declarations, ,The Bison Declarations Section}), then
 Bison provides a more verbose and specific error message string instead of
@@ -7352,13 +7389,76 @@ reported so far.  Normally this variable is global; but if you
 request a pure parser (@pxref{Pure Decl, ,A Pure (Reentrant) Parser})
 then it is a local variable which only the actions can access.
 
+
+@node Syntax Error Reporting Function
+@subsection The Syntax Error Reporting Function @code{yyreport_syntax_error}
+
+@findex %define parse.error custom
+If you invoke @samp{%define parse.error custom} in the Bison declarations
+section (@pxref{Bison Declarations, ,The Bison Declarations Section}), then
+the parser no longer passes syntax error messages to @code{yyerror}; rather,
+it leaves that task to the user by calling the @code{yyreport_syntax_error}
+function.
+
+@deftypefun int yyreport_syntax_error (@code{const yyparse_context_t *}@var{ctx})
+Report a syntax error to the user.  Return 0 on success, 2 on memory exhaustion.
+@end deftypefun
+
+Use the following functions to build the error message.
+
+@deftypefun {YYLTYPE *} yyparse_context_location (@code{const yyparse_context_t *}@var{ctx})
+The location of the syntax error.
+@end deftypefun
+
+
+@deftypefun int yysyntax_error_arguments (@code{const yyparse_context_t *}@var{ctx}, @code{int} @var{argv}@code{[]}, @code{int} @var{argc})
+Fill @var{argv} with first the internal number of the token that caused the
+error, then the internal numbers of the expected tokens.  Never put more
+than @var{argc} elements into @var{argv}, and on success return the
+number of entries actually stored in @var{argv}, which can be 0.
+
+If @var{argv} is null, return the size needed to store all the possible
+values, which is always less than @code{YYNTOKENS}.  When LAC is enabled,
+may return -2 on memory exhaustion.
+@end deftypefun
+
+@deftypefun {const char *} yysymbol_name (@code{int} @var{symbol})
+The name of the symbol whose internal number is @var{symbol}, possibly
+translated.  Must be called with valid symbol numbers.
+@end deftypefun
+
+A custom syntax error function looks as follows.
+
+@example
+int
+yyreport_syntax_error (const yyparse_context_t *ctx)
+@{
+  enum @{ ARGMAX = 10 @};
+  int arg[ARGMAX];
+  int n = yysyntax_error_arguments (ctx, arg, ARGMAX);
+  if (n == -2)
+    return 2;
+  fprintf (stderr, "syntax error");
+  for (int i = 1; i < n; ++i)
+    fprintf (stderr, " %s %s",
+             i == 1 ? "expected" : "or", yysymbol_name (arg[i]));
+  if (n)
+    fprintf (stderr, " before %s", yysymbol_name (arg[0]));
+  fprintf (stderr, "\n");
+  return 0;
+@}
+@end example
+
+You still must provide a @code{yyerror} function, used for instance to
+report memory exhaustion.
+
 @node Action Features
 @section Special Features for Use in Actions
 @cindex summary, action features
 @cindex action features summary
 
-Here is a table of Bison constructs, variables and macros that
-are useful in actions.
+Here is a table of Bison constructs, variables and macros that are useful in
+actions.
 
 @deffn {Variable} $$
 Acts like a variable that contains the semantic value for the
@@ -13929,8 +14029,7 @@ token is reset to the token that originally caused the violation.
 @end deffn
 
 @deffn {Directive} %error-verbose
-An obsolete directive standing for @samp{%define parse.error verbose}
-(@pxref{Error Reporting, ,The Error Reporting Function @code{yyerror}}).
+An obsolete directive standing for @samp{%define parse.error verbose}.
 @end deffn
 
 @deffn {Directive} %file-prefix "@var{prefix}"
@@ -14155,7 +14254,7 @@ instead.
 
 @deffn {Function} yyerror
 User-supplied function to be called by @code{yyparse} on error.
-@xref{Error Reporting, ,The Error Reporting Function @code{yyerror}}.
+@xref{Error Reporting Function, ,The Error Reporting Function @code{yyerror}}.
 @end deffn
 
 @deffn {Macro} YYFPRINTF
@@ -14210,7 +14309,7 @@ Management}.
 Global variable which Bison increments each time it reports a syntax error.
 (In a pure parser, it is a local variable within @code{yyparse}. In a
 pure push parser, it is a member of @code{yypstate}.)
-@xref{Error Reporting, ,The Error Reporting Function @code{yyerror}}.
+@xref{Error Reporting Function, ,The Error Reporting Function @code{yyerror}}.
 @end deffn
 
 @deffn {Function} yyparse

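For anyone who wants to try the message layout used in the yyreport_syntax_error example without generating a parser first, here is a minimal sketch of the same loop writing into a buffer instead of stderr. Everything prefixed with demo_ is hypothetical and is not part of Bison: demo_symbol_name only stands in for the generated yysymbol_name, and the symbol numbers are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the generated yysymbol_name(): maps a
   made-up internal symbol number to a printable name.  */
static const char *const demo_names[] =
  { "end of file", "+", "-", "number" };

const char *
demo_symbol_name (int symbol)
{
  return demo_names[symbol];
}

/* Append S to the NUL-terminated string in BUF (capacity SIZE),
   truncating silently when there is no room left.  */
void
demo_append (char *buf, size_t size, const char *s)
{
  size_t len = strlen (buf);
  if (len + 1 < size)
    strncat (buf, s, size - len - 1);
}

/* Build the message in BUF.  ARG[0] is the unexpected token,
   ARG[1] .. ARG[N-1] are the expected ones; N may be 0, in which
   case only "syntax error" is produced.  */
void
demo_format_syntax_error (char *buf, size_t size, const int *arg, int n)
{
  assert (size > 0);
  buf[0] = '\0';
  demo_append (buf, size, "syntax error");
  for (int i = 1; i < n; ++i)
    {
      demo_append (buf, size, i == 1 ? " expected " : " or ");
      demo_append (buf, size, demo_symbol_name (arg[i]));
    }
  if (n)
    {
      demo_append (buf, size, " before ");
      demo_append (buf, size, demo_symbol_name (arg[0]));
    }
}
```

With arg = { 3, 1, 2 } and n = 3, the buffer ends up holding "syntax error expected + or - before number". In a real parser the array would be filled by yysyntax_error_arguments and the names would come from yysymbol_name, as in the patch above.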