bison-patches
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

yacc.c, glr.c: fix crash when reporting errors in consistent states


From: Akim Demaille
Subject: yacc.c, glr.c: fix crash when reporting errors in consistent states
Date: Fri, 29 Nov 2019 18:26:09 +0100

I have been chasing that guys for quite a while now.  _Many_ thanks to Joel for 
forging a grammar that exhibits this bug.


commit 2f7097d1b1414fe8ac28e128c1e12b440f54b2f2
Author: Akim Demaille <address@hidden>
Date:   Wed Nov 27 08:24:58 2019 +0100

    yacc.c, glr.c: fix crash when reporting errors in consistent states
    
    The current code for yysyntax_error for %define parse.error verbose is
    fishy (given that YYEMPTY is -2, invalid argument for yytname[]):
    
        static int
        yysyntax_error ([...])
        {
          YYPTRDIFF_T yysize0 = yytnamerr (YY_NULLPTR, yytname[yytoken]);
        [...]
          if (yytoken != YYEMPTY)
    
    A nearby comment reports
    
        The only way there can be no lookahead present (in yychar) is if
        this state is a consistent state with a default action.  Thus,
        detecting the absence of a lookahead is sufficient to determine
        that there is no unexpected or expected token to report.  In that
        case, just report a simple "syntax error".
    
    So it _is_ possible to call yysyntax_error with yytoken == YYEMPTY,
    albeit quite difficult when meaning to, so virtually impossible by
    accident (after all, there was never a bug report about this).
    
    I failed to produce a test case, but Joel E. Denny provided me with
    one (added to the test suite below).  The yacc.c skeleton fails on
    this, and once fixed dies on a second problem.  The glr.c skeleton was
    also dying, but immediately of this second problem.
    
    Indeed we were not allocating space for the error message's final \0.
    This was hidden by the fact that we only had error messages with at
    least an unexpected token displayed, so with at least one "%s" in the
    format string, whose size (2) was included (incorrectly) in the final
    size of the message (where the %s have been replaced by the actual
    content).
    
    * data/skeletons/glr.c, data/skeletons/yacc.c (yysyntax_error):
    Do not invoke yytnamerr on YYEMPTY.
    Clarify the computation of the length of the _final_ error message,
    with the NUL terminator but without the '%s's.
    * tests/conflicts.at (Syntax error in consistent error state):
    New, contributed by Joel E. Denny.

diff --git a/data/skeletons/glr.c b/data/skeletons/glr.c
index a3a9a711..8603d51f 100644
--- a/data/skeletons/glr.c
+++ b/data/skeletons/glr.c
@@ -2071,18 +2071,18 @@ yyreportSyntaxError (yyGLRStack* 
yystackp]b4_user_formals[)
 #else
   {
   yySymbol yytoken = yychar == YYEMPTY ? YYEMPTY : YYTRANSLATE (yychar);
-  ptrdiff_t yysize0 = yytnamerr (YY_NULLPTR, yytokenName (yytoken));
-  ptrdiff_t yysize = yysize0;
   yybool yysize_overflow = yyfalse;
   char* yymsg = YY_NULLPTR;
   enum { YYERROR_VERBOSE_ARGS_MAXIMUM = 5 };
   /* Internationalized format string. */
   const char *yyformat = YY_NULLPTR;
-  /* Arguments of yyformat. */
+  /* Arguments of yyformat: reported tokens (one for the "unexpected",
+     one per "expected"). */
   char const *yyarg[YYERROR_VERBOSE_ARGS_MAXIMUM];
-  /* Number of reported tokens (one for the "unexpected", one per
-     "expected").  */
+  /* Actual size of YYARG. */
   int yycount = 0;
+  /* Cumulated lengths of YYARG.  */
+  ptrdiff_t yysize = 0;
 
   /* There are many possibilities here to consider:
      - If this state is a consistent state with a default action, then
@@ -2110,6 +2110,8 @@ yyreportSyntaxError (yyGLRStack* 
yystackp]b4_user_formals[)
   if (yytoken != YYEMPTY)
     {
       int yyn = yypact[yystackp->yytops.yystates[0]->yylrState];
+      ptrdiff_t yysize0 = yytnamerr (YY_NULLPTR, yytokenName (yytoken));
+      yysize = yysize0;
       yyarg[yycount++] = yytokenName (yytoken);
       if (!yypact_value_is_default (yyn))
         {
@@ -2160,7 +2162,9 @@ yyreportSyntaxError (yyGLRStack* 
yystackp]b4_user_formals[)
     }
 
   {
-    ptrdiff_t yysz = YY_CAST (ptrdiff_t, strlen (yyformat));
+    /* Don't count the "%s"s in the final size, but reserve room for
+       the terminator.  */
+    ptrdiff_t yysz = YY_CAST (ptrdiff_t, strlen (yyformat)) - 2 * yycount + 1;
     if (YYSIZEMAX - yysize < yysz)
       yysize_overflow = yytrue;
     else
@@ -2183,8 +2187,8 @@ yyreportSyntaxError (yyGLRStack* 
yystackp]b4_user_formals[)
             }
           else
             {
-              yyp++;
-              yyformat++;
+              ++yyp;
+              ++yyformat;
             }
         }
       yyerror (]b4_lyyerror_args[yymsg);
diff --git a/data/skeletons/yacc.c b/data/skeletons/yacc.c
index e970189c..3eff66dd 100644
--- a/data/skeletons/yacc.c
+++ b/data/skeletons/yacc.c
@@ -1137,16 +1137,16 @@ yysyntax_error (YYPTRDIFF_T *yymsg_alloc, char **yymsg,
                 ]b4_lac_if([[yy_state_t *yyesa, yy_state_t **yyes,
                 YYPTRDIFF_T *yyes_capacity, ]])[yy_state_t *yyssp, int yytoken)
 {
-  YYPTRDIFF_T yysize0 = yytnamerr (YY_NULLPTR, yytname[yytoken]);
-  YYPTRDIFF_T yysize = yysize0;
   enum { YYERROR_VERBOSE_ARGS_MAXIMUM = 5 };
   /* Internationalized format string. */
   const char *yyformat = YY_NULLPTR;
-  /* Arguments of yyformat. */
+  /* Arguments of yyformat: reported tokens (one for the "unexpected",
+     one per "expected"). */
   char const *yyarg[YYERROR_VERBOSE_ARGS_MAXIMUM];
-  /* Number of reported tokens (one for the "unexpected", one per
-     "expected"). */
+  /* Actual size of YYARG. */
   int yycount = 0;
+  /* Cumulated lengths of YYARG.  */
+  YYPTRDIFF_T yysize = 0;
 
   /* There are many possibilities here to consider:
      - If this state is a consistent state with a default action, then
@@ -1178,7 +1178,9 @@ yysyntax_error (YYPTRDIFF_T *yymsg_alloc, char **yymsg,
   */
   if (yytoken != YYEMPTY)
     {
-      int yyn = yypact[*yyssp];]b4_lac_if([[
+      int yyn = yypact[*yyssp];
+      YYPTRDIFF_T yysize0 = yytnamerr (YY_NULLPTR, yytname[yytoken]);
+      yysize = yysize0;]b4_lac_if([[
       YYDPRINTF ((stderr, "Constructing syntax error message\n"));]])[
       yyarg[yycount++] = yytname[yytoken];
       if (!yypact_value_is_default (yyn))
@@ -1217,8 +1219,8 @@ yysyntax_error (YYPTRDIFF_T *yymsg_alloc, char **yymsg,
                   }
                 yyarg[yycount++] = yytname[yyx];
                 {
-                  YYPTRDIFF_T yysize1 = yysize + yytnamerr (YY_NULLPTR,
-                                                            yytname[yyx]);
+                  YYPTRDIFF_T yysize1
+                    = yysize + yytnamerr (YY_NULLPTR, yytname[yyx]);
                   if (yysize <= yysize1 && yysize1 <= YYSTACK_ALLOC_MAXIMUM)
                     yysize = yysize1;
                   else
@@ -1249,7 +1251,9 @@ yysyntax_error (YYPTRDIFF_T *yymsg_alloc, char **yymsg,
     }
 
   {
-    YYPTRDIFF_T yysize1 = yysize + yystrlen (yyformat);
+    /* Don't count the "%s"s in the final size, but reserve room for
+       the terminator.  */
+    YYPTRDIFF_T yysize1 = yysize + (yystrlen (yyformat) - 2 * yycount) + 1;
     if (yysize <= yysize1 && yysize1 <= YYSTACK_ALLOC_MAXIMUM)
       yysize = yysize1;
     else
@@ -1279,8 +1283,8 @@ yysyntax_error (YYPTRDIFF_T *yymsg_alloc, char **yymsg,
         }
       else
         {
-          yyp++;
-          yyformat++;
+          ++yyp;
+          ++yyformat;
         }
   }
   return 0;
diff --git a/tests/actions.at b/tests/actions.at
index 34325c7a..c702373c 100644
--- a/tests/actions.at
+++ b/tests/actions.at
@@ -473,7 +473,7 @@ AT_BISON_OPTION_POPDEFS
 AT_CLEANUP
 ])
 
-## FIXME: test Java.
+## FIXME: test Java and D.
 m4_map_args([AT_TEST], [yacc.c], [glr.c], [lalr1.cc], [glr.cc])
 
 m4_popdef([AT_TEST])
diff --git a/tests/conflicts.at b/tests/conflicts.at
index e857bf20..92050fbe 100644
--- a/tests/conflicts.at
+++ b/tests/conflicts.at
@@ -1019,6 +1019,66 @@ input.y:12.3-18: warning: rule useless in parser due to 
conflicts [-Wother]
 AT_CLEANUP
 
 
+## ---------------------------------------- ##
+## Syntax error in consistent error state.  ##
+## ---------------------------------------- ##
+
+# AT_TEST(SKELETON-NAME)
+# ----------------------
+# Make sure yysyntax_error does nothing silly when called on yytoken
+# == YYEMPTY.
+
+m4_pushdef([AT_TEST],
+[AT_SETUP([Syntax error in consistent error state: $1])
+
+AT_BISON_OPTION_PUSHDEFS([%skeleton "$1"])
+
+AT_DATA_GRAMMAR([input.y],
+[[%define parse.error verbose
+%skeleton "$1"
+%%
+%nonassoc 'a';
+
+start: 'a' consistent-error-on-a-a 'a';
+
+consistent-error-on-a-a:
+    'a' default-reduction
+  | 'a' default-reduction 'a'
+  ;
+
+default-reduction: %empty;
+
+%code {
+  #include <stdio.h>
+  ]AT_YYERROR_DECLARE[
+  ]AT_YYLEX_DECLARE[
+};
+%%
+]AT_YYERROR_DEFINE[
+]AT_YYLEX_DEFINE("aa")[
+]AT_MAIN_DEFINE[
+]])
+
+AT_BISON_CHECK([-o input.AT_LANG_EXT input.y], 0, [],
+[[input.y:17.5-25: warning: rule useless in parser due to conflicts [-Wother]
+input.y:18.5-29: warning: rule useless in parser due to conflicts [-Wother]
+]])
+AT_LANG_COMPILE([input])
+AT_PARSER_CHECK([[input]], 1, [],
+[[syntax error
+]])
+
+AT_BISON_OPTION_POPDEFS
+AT_CLEANUP
+])
+
+## FIXME: test Java and D.
+m4_map_args([AT_TEST], [yacc.c], [glr.c], [lalr1.cc], [glr.cc])
+
+m4_popdef([AT_TEST])
+
+
+
 ## -------------------------------- ##
 ## Defaulted Conflicted Reduction.  ##
 ## -------------------------------- ##




reply via email to

[Prev in Thread] Current Thread [Next in Thread]