bison-3.0 released [stable]


From: Akim Demaille
Subject: bison-3.0 released [stable]
Date: Thu, 1 Aug 2013 11:09:01 +0200

The Bison team is very happy to announce the release of Bison 3.0, which
introduces many new features.  An executive summary would include: (i) deep
overhaul/improvements of the diagnostics, (ii) more versatile means to
describe semantic value types (including the ability to store genuine C++
objects in C++ parsers), (iii) push-parser interface extended to Java, and
(iv) parse-time semantic predicates for GLR parsers.

Here are the compressed sources:
 ftp://ftp.gnu.org/gnu/bison/bison-3.0.tar.gz   (3.1MB)
 ftp://ftp.gnu.org/gnu/bison/bison-3.0.tar.xz   (1.8MB)

Here are the GPG detached signatures[*]:
 ftp://ftp.gnu.org/gnu/bison/bison-3.0.tar.gz.sig
 ftp://ftp.gnu.org/gnu/bison/bison-3.0.tar.xz.sig

Use a mirror for higher download bandwidth:
 http://www.gnu.org/order/ftp.html

[*] Use a .sig file to verify that the corresponding file (without the
.sig suffix) is intact.  First, be sure to download both the .sig file
and the corresponding tarball.  Then, run a command like this:

 gpg --verify bison-3.0.tar.gz.sig

If that command fails because you don't have the required public key,
then run this command to import it:

 gpg --keyserver keys.gnupg.net --recv-keys 0DDCAA3278D5264E

and rerun the 'gpg --verify' command.

This release was bootstrapped with the following tools:
 Autoconf 2.69
 Automake 1.14
 Flex 2.5.37
 Gettext 0.18.3
 Gnulib v0.0-7982-g03e96cc

NEWS

* Noteworthy changes in release 3.0 (2013-07-25) [stable]

** WARNING: Future backward-incompatibilities!

 Like other GNU packages, Bison will start using some of the C99 features
 for its own code, especially the definition of variables after statements.
 The generated C parsers still aim at C90.
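
 For illustration, here is a minimal C sketch of the C99 feature mentioned
 above (a variable defined after a statement) next to its C90-compatible
 equivalent.  The function names are hypothetical and are not taken from
 Bison's sources.

```c
#include <assert.h>
#include <stddef.h>

/* C90 style: every declaration precedes the first statement of its block. */
static int
sum_c90 (const int *values, size_t count)
{
  int total = 0;
  size_t i;
  assert (values != NULL || count == 0);
  for (i = 0; i < count; ++i)
    total += values[i];
  return total;
}

/* C99 style: 'total' is defined after a statement, and 'i' is declared in
   the for-init clause (another C99 addition).  */
static int
sum_c99 (const int *values, size_t count)
{
  assert (values != NULL || count == 0);
  int total = 0;
  for (size_t i = 0; i < count; ++i)
    total += values[i];
  return total;
}
```

 Both functions compute the same sum; only the second one is rejected by a
 strict C90 compiler.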

** Backward incompatible changes

*** Obsolete features

 Support for YYFAIL is removed (deprecated in Bison 2.4.2): use YYERROR.

 Support for yystype and yyltype is removed (deprecated in Bison 1.875):
 use YYSTYPE and YYLTYPE.

 Support for YYLEX_PARAM and YYPARSE_PARAM is removed (deprecated in Bison
 1.875): use %lex-param, %parse-param, or %param.

 Missing semicolons at the end of actions are no longer added (as announced
 in the release 2.5).

*** Use of YACC='bison -y'

 TL;DR: With Autoconf <= 2.69, pass -Wno-yacc to (AM_)YFLAGS if you use
 Bison extensions.

 Traditional Yacc generates 'y.tab.c' whatever the name of the input file.
 Therefore Makefiles written for Yacc expect 'y.tab.c' (and possibly
 'y.tab.h' and 'y.output') to be generated from 'foo.y'.

 To this end, for ages, AC_PROG_YACC, Autoconf's macro to look for an
 implementation of Yacc, was using Bison as 'bison -y'.  While it does
 ensure compatible output file names, it also enables warnings for
 incompatibilities with POSIX Yacc.  In other words, 'bison -y' triggers
 warnings for Bison extensions.

 Autoconf 2.70+ fixes this incompatibility by using YACC='bison -o y.tab.c'
 (which also generates 'y.tab.h' and 'y.output' when needed).
 Alternatively, disable Yacc warnings by passing '-Wno-yacc' to your Yacc
 flags (YFLAGS, or AM_YFLAGS with Automake).

** Bug fixes

*** The epilogue is no longer affected by internal #defines (glr.c)

 The glr.c skeleton uses defines such as #define yylval (yystackp->yyval) in
 generated code.  These weren't properly undefined before the inclusion of
 the user epilogue, so functions such as the following were butchered by the
 preprocessor expansion:

   int yylex (YYSTYPE *yylval);

 This is fixed: yylval, yynerrs, yychar, and yylloc are now valid
 identifiers for user-provided variables.
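
 The mechanism can be reproduced in a few lines of plain C.  The sketch
 below is not the skeleton's code: the struct and the helper demo_peek are
 hypothetical stand-ins, and 258 is an arbitrary token number.  It only
 shows why the shorthand macro has to be undefined before user code such
 as yylex is compiled.

```c
#include <assert.h>
#include <stddef.h>

typedef int YYSTYPE;

/* Hypothetical stand-in for the GLR parser stack. */
struct demo_stack
{
  YYSTYPE yyval;
};

/* Skeleton-internal shorthand, as quoted above. */
#define yylval (yystackp->yyval)

static YYSTYPE
demo_peek (struct demo_stack *yystackp)
{
  assert (yystackp != NULL);
  return yylval;                /* expands to (yystackp->yyval) */
}

/* Undefine the shorthand before any user code is seen. */
#undef yylval

/* With the macro gone, 'yylval' is an ordinary parameter name.  Without
   the #undef this declaration would expand to
   'YYSTYPE *(yystackp->yyval)' and fail to compile.  */
static int
yylex (YYSTYPE *yylval)
{
  *yylval = 42;
  return 258;                   /* hypothetical token number */
}
```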

*** stdio.h is no longer needed when locations are enabled (yacc.c)

 Changes in Bison 2.7 introduced a dependency on FILE and fprintf when
 locations are enabled.  This is fixed.

*** Warnings about useless %pure-parser/%define api.pure are restored

** Diagnostics reported by Bison

 Most of these features were contributed by Théophile Ranquet and Victor
 Santet.

*** Carets

 Version 2.7 introduced caret errors, for a prettier output.  These are now
 activated by default.  The old format can still be used by invoking Bison
 with -fno-caret (or -fnone).

 Some error messages that reproduced excerpts of the grammar are now using
 the caret information only.  For instance on:

   %%
   exp: 'a' | 'a';

 Bison 2.7 reports:

   in.y: warning: 1 reduce/reduce conflict [-Wconflicts-rr]
   in.y:2.12-14: warning: rule useless in parser due to conflicts: exp: 'a' [-Wother]

 Now bison reports:

   in.y: warning: 1 reduce/reduce conflict [-Wconflicts-rr]
   in.y:2.12-14: warning: rule useless in parser due to conflicts [-Wother]
    exp: 'a' | 'a';
               ^^^

 and "bison -fno-caret" reports:

   in.y: warning: 1 reduce/reduce conflict [-Wconflicts-rr]
   in.y:2.12-14: warning: rule useless in parser due to conflicts [-Wother]

*** Enhancements of the -Werror option

 The -Werror=CATEGORY option is now recognized, and will treat specified
 warnings as errors.  The warnings need not have been explicitly activated
 using the -W option; this is similar to what GCC 4.7 does.

 For example, given the following command line, Bison will treat both
 warnings related to POSIX Yacc incompatibilities and S/R conflicts as
 errors (and only those):

   $ bison -Werror=yacc,error=conflicts-sr input.y

 If no categories are specified, -Werror will make all active warnings into
 errors.  For example, the following line does the same as the previous
 example:

   $ bison -Werror -Wnone -Wyacc -Wconflicts-sr input.y

 (By default -Wconflicts-sr,conflicts-rr,deprecated,other is enabled.)

 Note that the categories in this -Werror option may not be prefixed with
 "no-". However, -Wno-error[=CATEGORY] is valid.

 Note that -y enables -Werror=yacc. Therefore it is now possible to require
 Yacc-like behavior (e.g., always generate y.tab.c), but to report
 incompatibilities as warnings: "-y -Wno-error=yacc".

*** The display of warnings is now richer

 The option that controls a given warning is now displayed:

   foo.y:4.6: warning: type clash on default action: <foo> != <bar> [-Wother]

 In the case of warnings treated as errors, the prefix is changed from
 "warning: " to "error: ", and the suffix is displayed, in a manner similar
 to GCC, as [-Werror=CATEGORY].

 For instance, where the previous version of Bison would report (and exit
 with failure):

   bison: warnings being treated as errors
   input.y:1.1: warning: stray ',' treated as white space

 it now reports:

   input.y:1.1: error: stray ',' treated as white space [-Werror=other]

*** Deprecated constructs

 The new 'deprecated' warning category flags obsolete constructs whose
 support will be discontinued.  It is enabled by default.  These warnings
 used to be reported as 'other' warnings.

*** Useless semantic types

 Bison now warns about useless (uninhabited) semantic types.  Since
 semantic types are not declared to Bison (they are defined in the opaque
 %union structure), it is %printer/%destructor directives about useless
 types that trigger the warning:

   %token <type1> term
   %type  <type2> nterm
   %printer    {} <type1> <type3>
   %destructor {} <type2> <type4>
   %%
   nterm: term { $$ = $1; };

   3.28-34: warning: type <type3> is used, but is not associated to any symbol
   4.28-34: warning: type <type4> is used, but is not associated to any symbol

*** Undefined but unused symbols

 Bison used to raise an error for undefined symbols that are not used in
 the grammar.  This is now only a warning.

   %printer    {} symbol1
   %destructor {} symbol2
   %type <type>   symbol3
   %%
   exp: "a";

*** Useless destructors or printers

 Bison now warns about useless destructors or printers.  In the following
 example, the printer for <type1>, and the destructor for <type2> are
 useless: all symbols of <type1> (token1) already have a printer, and all
 symbols of type <type2> (token2) already have a destructor.

   %token <type1> token1
          <type2> token2
          <type3> token3
          <type4> token4
   %printer    {} token1 <type1> <type3>
   %destructor {} token2 <type2> <type4>

*** Conflicts

 The warnings and error messages about shift/reduce and reduce/reduce
 conflicts have been normalized.  For instance on the following foo.y file:

   %glr-parser
   %%
   exp: exp '+' exp | '0' | '0';

 compare the previous version of bison:

   $ bison foo.y
   foo.y: conflicts: 1 shift/reduce, 2 reduce/reduce
   $ bison -Werror foo.y
   bison: warnings being treated as errors
   foo.y: conflicts: 1 shift/reduce, 2 reduce/reduce

 with the new behavior:

   $ bison foo.y
   foo.y: warning: 1 shift/reduce conflict [-Wconflicts-sr]
   foo.y: warning: 2 reduce/reduce conflicts [-Wconflicts-rr]
   $ bison -Werror foo.y
   foo.y: error: 1 shift/reduce conflict [-Werror=conflicts-sr]
   foo.y: error: 2 reduce/reduce conflicts [-Werror=conflicts-rr]

 When %expect or %expect-rr is used, such as with bar.y:

   %expect 0
   %glr-parser
   %%
   exp: exp '+' exp | '0' | '0';

 Former behavior:

   $ bison bar.y
   bar.y: conflicts: 1 shift/reduce, 2 reduce/reduce
   bar.y: expected 0 shift/reduce conflicts
   bar.y: expected 0 reduce/reduce conflicts

 New one:

   $ bison bar.y
   bar.y: error: shift/reduce conflicts: 1 found, 0 expected
   bar.y: error: reduce/reduce conflicts: 2 found, 0 expected

** Incompatibilities with POSIX Yacc

 The 'yacc' category is no longer part of '-Wall'; enable it explicitly
 with '-Wyacc'.

** Additional yylex/yyparse arguments

 The new directive %param declares additional arguments to both yylex and
 yyparse.  The %lex-param, %parse-param, and %param directives support one
 or more arguments.  Instead of

   %lex-param   {arg1_type *arg1}
   %lex-param   {arg2_type *arg2}
   %parse-param {arg1_type *arg1}
   %parse-param {arg2_type *arg2}

 one may now declare

   %param {arg1_type *arg1} {arg2_type *arg2}
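
 As a rough illustration of the scanner side, the C sketch below
 hand-writes a yylex that receives (arg1, arg2).  It assumes a non-pure
 parser without locations, so that yylex takes no other arguments.  The
 definitions of arg1_type and arg2_type, the token number, and demo_drive
 (a stand-in for the generated yyparse, which would forward the same two
 arguments) are all assumptions made for the example.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical types behind '%param {arg1_type *arg1} {arg2_type *arg2}'. */
typedef struct { const char *cursor; } arg1_type;   /* scanner input */
typedef struct { int tokens_seen; } arg2_type;      /* shared statistics */

enum { DEMO_WORD = 258 };       /* hypothetical token number */

/* Scanner: returns DEMO_WORD for each run of non-blank characters, and 0
   at end of input.  */
static int
yylex (arg1_type *arg1, arg2_type *arg2)
{
  assert (arg1 != NULL && arg2 != NULL);
  while (isspace ((unsigned char) *arg1->cursor))
    ++arg1->cursor;
  if (*arg1->cursor == '\0')
    return 0;
  while (*arg1->cursor != '\0' && !isspace ((unsigned char) *arg1->cursor))
    ++arg1->cursor;
  ++arg2->tokens_seen;
  return DEMO_WORD;
}

/* Stand-in for yyparse: forwards both arguments to yylex until the end of
   the input.  */
static int
demo_drive (arg1_type *arg1, arg2_type *arg2)
{
  while (yylex (arg1, arg2) != 0)
    continue;
  return 0;
}
```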

** Types of values for %define variables

 Bison used to make no difference between '%define foo bar' and '%define
 foo "bar"'.  The former is now called a 'keyword value', and the latter a
 'string value'.  A third kind was added: 'code values', such as '%define
 foo {bar}'.

 Keyword variables are used for fixed value sets, e.g.,

   %define lr.type lalr

 Code variables are used for values in the target language, e.g.,

   %define api.value.type {struct semantic_type}

 String variables are used for the remaining cases, e.g., file names.

** Variable api.token.prefix

 The variable api.token.prefix changes the way tokens are identified in
 the generated files.  This is especially useful to avoid collisions
 with identifiers in the target language.  For instance

   %token FILE for ERROR
   %define api.token.prefix {TOK_}
   %%
   start: FILE for ERROR;

 will generate the definition of the symbols TOK_FILE, TOK_for, and
 TOK_ERROR in the generated sources.  In particular, the scanner must
 use these prefixed token names, although the grammar itself still
 uses the short names (as in the sample rule given above).
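
 To see why the prefix matters on the C side, consider the hypothetical
 sketch below.  Without a prefix, a token named FILE would clash with the
 FILE type from <stdio.h>, and a token named for would be a C keyword; the
 prefixed names are ordinary identifiers.  The enum is written by hand for
 illustration, and its numeric values are assumptions rather than Bison's
 actual output.

```c
#include <assert.h>
#include <stdio.h>              /* declares the type FILE */

/* Hand-written stand-in for the generated token definitions. */
enum demo_token
{
  TOK_FILE = 258,               /* assumed value */
  TOK_for,
  TOK_ERROR
};

/* Map a prefixed token code back to the short name used in the grammar. */
static const char *
demo_token_name (int token)
{
  switch (token)
    {
    case TOK_FILE:  return "FILE";
    case TOK_for:   return "for";
    case TOK_ERROR: return "ERROR";
    default:        return "?";
    }
}

/* FILE (the stdio type) and TOK_FILE (the token) coexist in one scope. */
static int
demo_print_token (FILE *out, int token)
{
  assert (out != NULL);
  return fprintf (out, "%s\n", demo_token_name (token));
}
```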

** Variable api.value.type

 This new %define variable supersedes the #define macro YYSTYPE.  The use
 of YYSTYPE is discouraged.  In particular, #defining YYSTYPE *and* either
 using %union or %defining api.value.type results in undefined behavior.

 Either define api.value.type, or use "%union":

   %union
   {
     int ival;
     char *sval;
   }
   %token <ival> INT "integer"
   %token <sval> STRING "string"
   %printer { fprintf (yyo, "%d", $$); } <ival>
   %destructor { free ($$); } <sval>

   /* In yylex().  */
   yylval.ival = 42; return INT;
   yylval.sval = "42"; return STRING;

 The %define variable api.value.type supports both keyword and code values.

 The keyword value 'union' means that the user provides genuine types, not
 union member names such as "ival" and "sval" above (WARNING: will fail if
 -y/--yacc/%yacc is enabled).

   %define api.value.type union
   %token <int> INT "integer"
   %token <char *> STRING "string"
   %printer { fprintf (yyo, "%d", $$); } <int>
   %destructor { free ($$); } <char *>

   /* In yylex().  */
   yylval.INT = 42; return INT;
   yylval.STRING = "42"; return STRING;

 The keyword value 'variant' is somewhat equivalent, but for C++ special
 provision is made to allow classes to be used (more about this below).

   %define api.value.type variant
   %token <int> INT "integer"
   %token <std::string> STRING "string"

 Code values (in braces) denote user defined types.  This is where YYSTYPE
 used to be used.

   %code requires
   {
     struct my_value
     {
       enum
       {
         is_int, is_string
       } kind;
       union
       {
         int ival;
         char *sval;
       } u;
     };
   }
   %define api.value.type {struct my_value}
   %token <u.ival> INT "integer"
   %token <u.sval> STRING "string"
   %printer { fprintf (yyo, "%d", $$); } <u.ival>
   %destructor { free ($$); } <u.sval>

   /* In yylex().  */
   yylval.u.ival = 42; return INT;
   yylval.u.sval = "42"; return STRING;
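
 A scanner using such a type has to keep the 'kind' tag and the union
 member in sync.  The helpers below are a hypothetical sketch of that
 (their names are not part of Bison).  The string is copied to the heap so
 that a %destructor calling free, as above, releases memory the value
 actually owns.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct my_value
{
  enum { is_int, is_string } kind;
  union
  {
    int ival;
    char *sval;
  } u;
};

/* Store an integer and tag the value accordingly. */
static void
demo_set_int (struct my_value *value, int number)
{
  assert (value != NULL);
  value->kind = is_int;
  value->u.ival = number;
}

/* Store a heap copy of TEXT; returns 0 on success, -1 if out of memory. */
static int
demo_set_string (struct my_value *value, const char *text)
{
  char *copy;
  assert (value != NULL && text != NULL);
  copy = malloc (strlen (text) + 1);
  if (copy == NULL)
    return -1;
  strcpy (copy, text);
  value->kind = is_string;
  value->u.sval = copy;
  return 0;
}

/* Release whatever the value owns, mirroring the %destructor above. */
static void
demo_release (struct my_value *value)
{
  assert (value != NULL);
  if (value->kind == is_string)
    {
      free (value->u.sval);
      value->u.sval = NULL;
    }
}
```

 In yylex, one would then write something like
 "demo_set_int (&yylval, 42); return INT;".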

** Variable parse.error

 This variable controls the verbosity of error messages.  The use of the
 %error-verbose directive is deprecated in favor of "%define parse.error
 verbose".

** Renamed %define variables

 The following variables have been renamed for consistency.  Backward
 compatibility is ensured, but upgrading is recommended.

   lr.default-reductions      -> lr.default-reduction
   lr.keep-unreachable-states -> lr.keep-unreachable-state
   namespace                  -> api.namespace
   stype                      -> api.value.type

** Semantic predicates

 Contributed by Paul Hilfinger.

 The new, experimental, semantic-predicate feature allows actions of the
 form "%?{ BOOLEAN-EXPRESSION }", which cause syntax errors (as for
 YYERROR) if the expression evaluates to 0, and are evaluated immediately
 in GLR parsers, rather than being deferred.  The result is that they allow
 the programmer to prune possible parses based on the values of run-time
 expressions.

** The directive %expect-rr is now an error in non GLR mode

 It used to be an error only if used in non GLR mode, _and_ if there are
 reduce/reduce conflicts.

** Tokens are numbered in their order of appearance

 Contributed by Valentin Tolmer.

 With '%token A B', A had a number less than the one of B.  However,
 precedence declarations used to generate a reversed order.  This is now
 fixed, and introducing tokens with any of %token, %left, %right,
 %precedence, or %nonassoc yields the same result.

 When mixing declarations of tokens with a literal character (e.g., 'a')
 or with an identifier (e.g., B) in a precedence declaration, Bison
 numbered the literal characters first.  For example

   %right A B 'c' 'd'

 would lead to the tokens declared in this order: 'c' 'd' A B.  Again, the
 input order is now preserved.

 These changes were made so that one can remove useless precedence and
 associativity declarations (i.e., map %nonassoc, %left or %right to
 %precedence, or to %token) and get exactly the same output.

** Useless precedence and associativity

 Contributed by Valentin Tolmer.

 When developing and maintaining a grammar, useless associativity and
 precedence directives are common.  They can be a nuisance: new ambiguities
 arising are sometimes masked because their conflicts are resolved due to
 the extra precedence or associativity information.  Furthermore, it can
 hinder the comprehension of a new grammar: one will wonder about the role
 of a precedence, where in fact it is useless.  The following changes aim
 at detecting and reporting these extra directives.

*** Precedence warning category

 A new category of warning, -Wprecedence, was introduced. It flags the
 useless precedence and associativity directives.

*** Useless associativity

 Bison now warns about symbols with a declared associativity that is never
 used to resolve conflicts.  In that case, using %precedence is sufficient;
 the parsing tables will remain unchanged.  Solving these warnings may raise
 useless precedence warnings, as the symbols no longer have associativity.
 For example:

   %left '+'
   %left '*'
   %%
   exp:
     "number"
   | exp '+' "number"
   | exp '*' exp
   ;

 will produce a

   warning: useless associativity for '+', use %precedence [-Wprecedence]
    %left '+'
          ^^^

*** Useless precedence

 Bison now warns about symbols with a declared precedence and no declared
 associativity (i.e., declared with %precedence), and whose precedence is
 never used.  In that case, the symbol can be safely declared with %token
 instead, without modifying the parsing tables.  For example:

   %precedence '='
   %%
   exp: "var" '=' "number";

 will produce a

   warning: useless precedence for '=' [-Wprecedence]
    %precedence '='
                ^^^

*** Useless precedence and associativity

 In case of both useless precedence and associativity, the issue is flagged
 as follows:

   %nonassoc '='
   %%
   exp: "var" '=' "number";

 The warning is:

   warning: useless precedence and associativity for '=' [-Wprecedence]
    %nonassoc '='
              ^^^

** Empty rules

 With help from Joel E. Denny and Gabriel Rassoul.

 Empty rules (i.e., with an empty right-hand side) can now be explicitly
 marked by the new %empty directive.  Using %empty on a non-empty rule is
 an error.  The new -Wempty-rule warning reports empty rules without
 %empty.  On the following grammar:

   %%
   s: a b c;
   a: ;
   b: %empty;
   c: 'a' %empty;

 bison reports:

   3.4-5: warning: empty rule without %empty [-Wempty-rule]
    a: {}
       ^^
   5.8-13: error: %empty on non-empty rule
    c: 'a' %empty {};
           ^^^^^^

** Java skeleton improvements

 The constants for token names were moved to the Lexer interface.  Also, it
 is possible to add code to the parser's constructors using "%code init"
 and "%define init_throws".
 Contributed by Paolo Bonzini.

 The Java skeleton now supports push parsing.
 Contributed by Dennis Heimbigner.

** C++ skeletons improvements

*** The parser header is no longer mandatory (lalr1.cc, glr.cc)

 Using %defines is now optional.  Without it, the needed support classes
 are defined in the generated parser, instead of additional files (such as
 location.hh, position.hh and stack.hh).

*** Locations are no longer mandatory (lalr1.cc, glr.cc)

 Both lalr1.cc and glr.cc no longer require %locations.

*** syntax_error exception (lalr1.cc)

 The C++ parser features a syntax_error exception, which can be
 thrown from the scanner or from user rules to raise syntax errors.
 This facilitates reporting errors caught in sub-functions (e.g.,
 rejecting too large integral literals from a conversion function
 used by the scanner, or rejecting invalid combinations from a
 factory invoked by the user actions).

*** %define api.value.type variant

 This is based on a submission from Michiel De Wilde.  With help
 from Théophile Ranquet.

 In this mode, complex C++ objects can be used as semantic values.  For
 instance:

   %token <::std::string> TEXT;
   %token <int> NUMBER;
   %token SEMICOLON ";"
   %type <::std::string> item;
   %type <::std::list<std::string>> list;
   %%
   result:
     list  { std::cout << $1 << std::endl; }
   ;

   list:
     %empty        { /* Generates an empty string list. */ }
   | list item ";" { std::swap ($$, $1); $$.push_back ($2); }
   ;

   item:
     TEXT    { std::swap ($$, $1); }
   | NUMBER  { $$ = string_cast ($1); }
   ;

*** %define api.token.constructor

 When variants are enabled, Bison can generate functions to build the
 tokens.  This guarantees that the token type (e.g., NUMBER) is consistent
 with the semantic value (e.g., int):

   parser::symbol_type yylex ()
   {
     parser::location_type loc = ...;
     ...
     return parser::make_TEXT ("Hello, world!", loc);
     ...
     return parser::make_NUMBER (42, loc);
     ...
     return parser::make_SEMICOLON (loc);
     ...
   }

*** C++ locations

 There are operator- and operator-= for 'location'.  Negative line/column
 increments can no longer underflow the resulting value.
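
 The idea behind the second sentence can be sketched in C as follows (the
 actual skeleton code is C++ and may well differ): a negative increment is
 applied to an unsigned counter, and the result is clamped at a minimum
 instead of wrapping around.  The function name, and the use of 1 as the
 smallest line or column number in the note below, are assumptions.

```c
#include <assert.h>

/* Add DELTA to VALUE, never going below MIN.  Callers are expected to
   pass VALUE >= MIN.  */
static unsigned
demo_add_clamped (unsigned value, int delta, unsigned min)
{
  unsigned magnitude;
  assert (value >= min);
  if (delta >= 0)
    return value + (unsigned) delta;
  /* Magnitude of a negative DELTA, computed in unsigned arithmetic so
     that INT_MIN is handled without signed overflow.  */
  magnitude = 0u - (unsigned) delta;
  return value - min >= magnitude ? value - magnitude : min;
}
```

 With a minimum of 1, moving column 2 back by 7 yields 1 rather than a
 huge wrapped-around value.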




