help-bison
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Exception safety of generated C++ parser


From: Akim Demaille
Subject: Re: Exception safety of generated C++ parser
Date: Wed, 26 Sep 2012 09:20:27 +0200

Hi Oleksii,

Le 24 sept. 2012 à 16:55, Akim Demaille a écrit :

> The more I think about it, the more I believe that I took the wrong
> path, and exception safety should be addressed at the yyparse level:
> put the whole function into a try/catch block.  That way, we are
> also protected from errors in %printer, %destructor, %initial-action,
> yyerror, and maybe other places I didn't think about.  I will provide
> another patch in the near future.

There is a second branch, maint-exception-safety-2, that implements
that.  I'd be interested in feedback.  In particular if someone has
an opinion on which approach is the best one.

I don't enforce exception safety when an exception is thrown from
a %destructor: currently it results in an exception being thrown
while another one is already in flight (because when we catch an
exception, even thrown from a %destructor, we will clean the
stack, which will certainly call again the bad %destructor).

In that situation I could not clean the stack and lookahead.

I'd rather make clear (in the documentation) that %destructors
must not throw exception, just like for genuine C++ destructors.
In master I'm getting to a point where the %destructors would
serve to generate a real C++ destructor for the "symbol" objects,
that wrap kind/value/location together.  So the C++ rule would
simply apply to Bison's symbol: don't throw on destruction.

These are the relevant patches:

From f5c253a0980ca2e6c91d6c052f0357d0412089ab Mon Sep 17 00:00:00 2001
From: Akim Demaille <address@hidden>
Date: Thu, 20 Sep 2012 16:59:29 +0200
Subject: [PATCH 1/4] lalr1.cc: fix exception safety

lalr1.cc does not reclaim its memory when ended by an exception.

Reported by Oleksii Taran:
http://lists.gnu.org/archive/html/help-bison/2012-09/msg00000.html

* data/lalr1.cc (yyparse): Protect the whole yyparse by a try-catch
block that cleans the stack and the lookahead.
---
 THANKS        |  1 +
 data/lalr1.cc | 45 ++++++++++++++++++++++++++++++++++++++-------
 2 files changed, 39 insertions(+), 7 deletions(-)

diff --git a/THANKS b/THANKS
index 95acb1e..ee9ff66 100644
--- a/THANKS
+++ b/THANKS
@@ -80,6 +80,7 @@ Nicolas Tisserand         address@hidden
 Noah Friedman             address@hidden
 Odd Arild Olsen           address@hidden
 Oleg Smolsky              address@hidden
+Oleksii Taran             address@hidden
 Paolo Bonzini             address@hidden
 Pascal Bart               address@hidden
 Paul Eggert               address@hidden
diff --git a/data/lalr1.cc b/data/lalr1.cc
index 846aea1..fe23b3e 100644
--- a/data/lalr1.cc
+++ b/data/lalr1.cc
@@ -533,6 +533,10 @@ do {                                       \
 
     int yyresult;
 
+    // FIXME: This shoud be completely indented.  It is not yet to
+    // avoid gratuitous conflicts when merging into the master branch.
+    try
+      {
     YYCDEBUG << "Starting parse" << std::endl;
 
 ]m4_ifdef([b4_initial_action], [
@@ -573,14 +577,13 @@ b4_dollar_popdef])[]dnl
     /* Read a lookahead token.  */
     if (yychar == yyempty_)
       {
-       YYCDEBUG << "Reading a token: ";
-       yychar = ]b4_c_function_call([yylex], [int],
-                                    [b4_api_PREFIX[STYPE*], [&yylval]][]dnl
+        YYCDEBUG << "Reading a token: ";
+        yychar = ]b4_c_function_call([yylex], [int],
+                                     [b4_api_PREFIX[STYPE*], [&yylval]][]dnl
 b4_locations_if([, [[location*], [&yylloc]]])dnl
 m4_ifdef([b4_lex_param], [, ]b4_lex_param))[;
       }
 
-
     /* Convert token to internal form.  */
     if (yychar <= yyeof_)
       {
@@ -651,17 +654,21 @@ m4_ifdef([b4_lex_param], [, ]b4_lex_param))[;
     else
       yyval = yysemantic_stack_[0];
 
+    // Compute the default @@$.
     {
       slice<location_type, location_stack_type> slice (yylocation_stack_, 
yylen);
       YYLLOC_DEFAULT (yyloc, slice, yylen);
     }
+
+    // Perform the reduction.
     YY_REDUCE_PRINT (yyn);
     switch (yyn)
       {
-       ]b4_user_actions[
-       default:
-          break;
+        ]b4_user_actions[
+      default:
+        break;
       }
+
     /* User semantic actions sometimes alter yychar, and that requires
        that yytoken be updated with the new translation.  We take the
        approach of translating immediately before every use of yytoken.
@@ -831,6 +838,30 @@ m4_ifdef([b4_lex_param], [, ]b4_lex_param))[;
       }
 
     return yyresult;
+    }
+    catch (...)
+      {
+        YYCDEBUG << "Exception caught" << std::endl;
+        if (yychar != yyempty_)
+          {
+            /* Make sure we have latest lookahead translation.  See
+               comments at user semantic actions for why this is
+               necessary.  */
+            yytoken = yytranslate_ (yychar);
+            yydestruct_ ("Cleanup: discarding lookahead", yytoken, &yylval,
+                         &yylloc);
+          }
+
+        while (yystate_stack_.height () != 1)
+          {
+            yydestruct_ ("Cleanup: popping",
+                         yystos_[yystate_stack_[0]],
+                         &yysemantic_stack_[0],
+                         &yylocation_stack_[0]);
+            yypop_ ();
+          }
+        throw;
+      }
   }
 
   // Generate an error message.
-- 
1.7.12.1

From 9acdd4ee734dbeaa1ca268d4f8845a3a7b3205e2 Mon Sep 17 00:00:00 2001
From: Akim Demaille <address@hidden>
Date: Tue, 25 Sep 2012 11:17:55 +0200
Subject: [PATCH 2/4] lalr1.cc: check (and fix) %initial-action exception
 safety

* data/lalr1.cc: Check size > 1, rather than size != 1, when cleaning
the stack, as at the beginning, size is 0.
* tests/c++.at (Exception safety): Check exception safety in
%initial-action.
---
 data/lalr1.cc |  4 ++--
 tests/c++.at  | 24 +++++++++++++++++++-----
 2 files changed, 21 insertions(+), 7 deletions(-)

diff --git a/data/lalr1.cc b/data/lalr1.cc
index fe23b3e..2ec1d0d 100644
--- a/data/lalr1.cc
+++ b/data/lalr1.cc
@@ -828,7 +828,7 @@ m4_ifdef([b4_lex_param], [, ]b4_lex_param))[;
     /* Do not reclaim the symbols of the rule which action triggered
        this YYABORT or YYACCEPT.  */
     yypop_ (yylen);
-    while (yystate_stack_.height () != 1)
+    while (1 < yystate_stack_.height ())
       {
         yydestruct_ ("Cleanup: popping",
                      yystos_[yystate_stack_[0]],
@@ -852,7 +852,7 @@ m4_ifdef([b4_lex_param], [, ]b4_lex_param))[;
                          &yylloc);
           }
 
-        while (yystate_stack_.height () != 1)
+        while (1 < yystate_stack_.height ())
           {
             yydestruct_ ("Cleanup: popping",
                          yystos_[yystate_stack_[0]],
diff --git a/tests/c++.at b/tests/c++.at
index 304a72d..e4a7911 100644
--- a/tests/c++.at
+++ b/tests/c++.at
@@ -226,6 +226,7 @@ AT_DATA_GRAMMAR([[input.yy]],
 %code
 {
   #include <cassert>
+  #include <cstring> // strchr
   #include <stdexcept>
   int yylex (yy::parser::semantic_type *);
   size_t Object::counter = 0;
@@ -237,6 +238,12 @@ AT_DATA_GRAMMAR([[input.yy]],
   Object* obj;
 }
 
+%initial-action
+{
+  if (strchr (input, 'i'))
+    throw std::runtime_error ("initial-action");
+}
+
 %destructor { delete $$; } <obj>;
 %printer { yyo << "counter == " << $$->counter; } <obj>;
 
@@ -260,7 +267,7 @@ item:
 | 's'
   {
     std::swap ($$, $1);
-    throw std::runtime_error ("invalid expression");
+    throw std::runtime_error ("reduction");
   }
 
 %%
@@ -268,11 +275,14 @@ item:
 int
 yylex (yy::parser::semantic_type *lvalp)
 {
-  // 'l': lexical exception, 's': syntactic exception.
+  // 'a': no error.
+  // 'i': initial action throws.
+  // 'l': yylex throws.
+  // 's': reduction throws.
   switch (int res = *input++)
   {
     case 'l':
-      throw std::runtime_error ("invalid character");
+      throw std::runtime_error ("yylex");
     default:
       lvalp->obj = new Object;
       // Fall through.
@@ -312,11 +322,15 @@ AT_BISON_CHECK([[-o input.cc input.yy]])
 AT_COMPILE_CXX([[input]])
 
 AT_PARSER_CHECK([[./input aaaas]], [[2]], [[]],
-[[exception caught: invalid expression
+[[exception caught: reduction
 ]])
 
 AT_PARSER_CHECK([[./input aaaal]], [[2]], [[]],
-[[exception caught: invalid character
+[[exception caught: yylex
+]])
+
+AT_PARSER_CHECK([[./input i]], [[2]], [[]],
+[[exception caught: initial-action
 ]])
 
 AT_BISON_OPTION_POPDEFS
-- 
1.7.12.1

From d4d5705ace1c5da3ff16ea16beac9c731d87137b Mon Sep 17 00:00:00 2001
From: Akim Demaille <address@hidden>
Date: Tue, 25 Sep 2012 11:41:22 +0200
Subject: [PATCH 3/4] lalr1.cc: check (and fix) %printer exception safety

* tests/c++.at (Exception safety): Let the parser support the --debug
option.
On 'p', throw an exception from the %printer.
* data/lalr1.cc (yyparse): Do not display the values we discard, as it
uses %printer, which might have thrown the exception.
---
 data/lalr1.cc | 14 +++++++++-----
 tests/c++.at  | 47 +++++++++++++++++++++++++++++++++++------------
 2 files changed, 44 insertions(+), 17 deletions(-)

diff --git a/data/lalr1.cc b/data/lalr1.cc
index 2ec1d0d..8d5eafa 100644
--- a/data/lalr1.cc
+++ b/data/lalr1.cc
@@ -227,6 +227,7 @@ b4_user_stype
 
     /// \brief Reclaim the memory associated to a symbol.
     /// \param yymsg        Why this token is reclaimed.
+    ///                     If null, do not display the symbol, just free it.
     /// \param yytype       The symbol type.
     /// \param yyvaluep     Its semantic value.
     /// \param yylocationp  Its location.
@@ -446,7 +447,8 @@ do {                                        \
     YYUSE (yymsg);
     YYUSE (yyvaluep);
 
-    YY_SYMBOL_PRINT (yymsg, yytype, yyvaluep, yylocationp);
+    if (yymsg)
+      YY_SYMBOL_PRINT (yymsg, yytype, yyvaluep, yylocationp);
 
     switch (yytype)
       {
@@ -841,20 +843,22 @@ m4_ifdef([b4_lex_param], [, ]b4_lex_param))[;
     }
     catch (...)
       {
-        YYCDEBUG << "Exception caught" << std::endl;
+        YYCDEBUG << "Exception caught: cleaning lookahead and stack"
+                 << std::endl;
+        // Do not try to display the values of the reclaimed symbols,
+        // as their printer might throw an exception.
         if (yychar != yyempty_)
           {
             /* Make sure we have latest lookahead translation.  See
                comments at user semantic actions for why this is
                necessary.  */
             yytoken = yytranslate_ (yychar);
-            yydestruct_ ("Cleanup: discarding lookahead", yytoken, &yylval,
-                         &yylloc);
+            yydestruct_ (NULL, yytoken, &yylval, &yylloc);
           }
 
         while (1 < yystate_stack_.height ())
           {
-            yydestruct_ ("Cleanup: popping",
+            yydestruct_ (NULL,
                          yystos_[yystate_stack_[0]],
                          &yysemantic_stack_[0],
                          &yylocation_stack_[0]);
diff --git a/tests/c++.at b/tests/c++.at
index e4a7911..9a60bfd 100644
--- a/tests/c++.at
+++ b/tests/c++.at
@@ -203,11 +203,14 @@ AT_DATA_GRAMMAR([[input.yy]],
 
   int debug = 0;
 
+  /// A class that counts its number of instances.
   struct Object
   {
     static size_t counter;
+    int val;
 
-    Object ()
+    Object (int v)
+      : val (v)
     {
       ++counter;
       if (debug)
@@ -245,9 +248,14 @@ AT_DATA_GRAMMAR([[input.yy]],
 }
 
 %destructor { delete $$; } <obj>;
-%printer { yyo << "counter == " << $$->counter; } <obj>;
+%printer
+{
+  yyo << "counter == " << $$->counter;
+  if ($$->val == 'p')
+    throw std::runtime_error ("printer");
+} <obj>;
 
-%token <obj> 'a' 's'
+%token <obj> 'a' 'p' 's'
 %type  <obj> list item
 
 %%
@@ -256,14 +264,12 @@ start: list { delete $1; };
 
 list:
   item       { $$ = $1; }
-| item list  { $$ = $1; delete $2; } /* Right recursion to load the stack. */
+| item list  { $$ = $1; delete $2; } // Right recursion to load the stack.
 ;
 
 item:
-  'a'
-  {
-    std::swap ($$, $1);
-  }
+  'a'  { std::swap ($$, $1); }
+| 'p'  { std::swap ($$, $1); }
 | 's'
   {
     std::swap ($$, $1);
@@ -284,7 +290,7 @@ yylex (yy::parser::semantic_type *lvalp)
     case 'l':
       throw std::runtime_error ("yylex");
     default:
-      lvalp->obj = new Object;
+      lvalp->obj = new Object (res);
       // Fall through.
     case 0:
       return res;
@@ -296,10 +302,22 @@ yylex (yy::parser::semantic_type *lvalp)
 int
 main (int argc, const char *argv[])
 {
-  assert (argc == 2);
-  input = argv[1];
+  switch (argc)
+  {
+    case 2:
+      input = argv[1];
+      break;
+    case 3:
+      assert (!strcmp (argv[1], "--debug"));
+      debug = 1;
+      input = argv[2];
+      break;
+    default:
+      abort ();
+  }
+
   yy::parser parser;
-  debug = !!getenv ("YYDEBUG");
+  debug |= !!getenv ("YYDEBUG");
   parser.set_debug_level (debug);
   int res = 2;
   try
@@ -333,6 +351,11 @@ AT_PARSER_CHECK([[./input i]], [[2]], [[]],
 [[exception caught: initial-action
 ]])
 
+AT_PARSER_CHECK([[./input aaaap]])
+
+AT_PARSER_CHECK([[./input --debug aaaap]], [[2]], [[]], [[stderr]])
+AT_PARSER_CHECK([[grep '^exception caught: printer$' stderr]], [], [ignore])
+
 AT_BISON_OPTION_POPDEFS
 
 AT_CLEANUP
-- 
1.7.12.1

From 4b3553b8716c6d3176c64bf3c8821888a3e50180 Mon Sep 17 00:00:00 2001
From: Akim Demaille <address@hidden>
Date: Tue, 25 Sep 2012 14:18:04 +0200
Subject: [PATCH 4/4] lalr1.cc: check exception safety of yyerror

* tests/c++.at: here.
---
 tests/c++.at | 23 +++++++++++++++--------
 1 file changed, 15 insertions(+), 8 deletions(-)

diff --git a/tests/c++.at b/tests/c++.at
index 9a60bfd..98e0bc9 100644
--- a/tests/c++.at
+++ b/tests/c++.at
@@ -255,7 +255,7 @@ AT_DATA_GRAMMAR([[input.yy]],
     throw std::runtime_error ("printer");
 } <obj>;
 
-%token <obj> 'a' 'p' 's'
+%token <obj> 'a' 'e' 'p' 's'
 %type  <obj> list item
 
 %%
@@ -269,13 +269,10 @@ list:
 
 item:
   'a'  { std::swap ($$, $1); }
+| 'e'  { YYERROR ("yyerror"); }
 | 'p'  { std::swap ($$, $1); }
-| 's'
-  {
-    std::swap ($$, $1);
-    throw std::runtime_error ("reduction");
-  }
-
+| 's'  { std::swap ($$, $1); throw std::runtime_error ("reduction"); }
+;
 %%
 
 int
@@ -297,7 +294,13 @@ yylex (yy::parser::semantic_type *lvalp)
   }
 }
 
-]AT_YYERROR_DEFINE[
+/* A C++ error reporting function.  */
+void
+yy::parser::error (const location_type& l, const std::string& m)
+{
+  (void) l;
+  throw std::runtime_error (m);
+}
 
 int
 main (int argc, const char *argv[])
@@ -356,6 +359,10 @@ AT_PARSER_CHECK([[./input aaaap]])
 AT_PARSER_CHECK([[./input --debug aaaap]], [[2]], [[]], [[stderr]])
 AT_PARSER_CHECK([[grep '^exception caught: printer$' stderr]], [], [ignore])
 
+AT_PARSER_CHECK([[./input aaaae]], [[2]], [[]],
+[[exception caught: syntax error
+]])
+
 AT_BISON_OPTION_POPDEFS
 
 AT_CLEANUP
-- 
1.7.12.1







reply via email to

[Prev in Thread] Current Thread [Next in Thread]