Ye oldest bug in Bison?

bison-patches
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Ye oldest bug in Bison?

From:	Akim Demaille
Subject:	Ye oldest bug in Bison?
Date:	Thu, 8 Nov 2018 22:02:07 +0100
I was working on RR conflicts in the context of Paul Hilfinger’s proposal 
(https://lists.gnu.org/archive/html/bison-patches/2018-11/msg00010.html) when I 
fell on this:

> $ cat /tmp/foo.y
> %%
> exp: "num" | "num" | "num"
> 
> $ LC_ALL=C bison /tmp/foo.y
> /tmp/foo.y: warning: 1 reduce/reduce conflict [-Wconflicts-rr]
> /tmp/foo.y:2.14-18: warning: rule useless in parser due to conflicts [-Wother]
>  exp: "num" | "num" | "num"
>               ^^^^^
> /tmp/foo.y:2.22-26: warning: rule useless in parser due to conflicts [-Wother]
>  exp: "num" | "num" | "num"
>                       ^^^^^

Wow!  *One* RR conflict!  Of course, the *.output file properly shows that 
there are two.

> Rules useless in parser due to conflicts
> 
>     2 exp: "num"
>     3    | "num"
> 
> 
> State 1 conflicts: 1 reduce/reduce
> …
> State 1
> 
>     1 exp: "num" .  [$end]
>     2    | "num" .  [$end]
>     3    | "num" .  [$end]
> 
>     $end      reduce using rule 1 (exp)
>     $end      [reduce using rule 2 (exp)]      <===========
>     $end      [reduce using rule 3 (exp)]      <===========
>     $default  reduce using rule 1 (exp)

I was amazed to see that the code is actually even tailored to feature that 
bug: it has a Boolean to decide whether to compute correctly or not:

> /*----------------------------------------------------------------.
> | Count the number of reduce/reduce conflicts.  If ONE_PER_TOKEN, |
> | count one conflict for each token that has any reduce/reduce    |
> | conflicts.  Otherwise, count one conflict for each reduction    |
> | after the first for a given token.                              |
> `----------------------------------------------------------------*/
> 
> static size_t
> count_state_rr_conflicts (state *s, bool one_per_token)
> {
>   reductions *reds = s->reductions;
>   size_t res = 0;
> 
>   for (symbol_number i = 0; i < ntokens; ++i)
>     {
>       int count = 0;
>       for (int j = 0; j < reds->num; ++j)
>         count += bitset_test (reds->lookahead_tokens[j], i);
>       if (count >= 2)
>         res += one_per_token ? 1 : count-1; // <==============
>     }
> 
>   return res;
> }

That Boolean was introduced by Paul Hilfinger when he introduced GLR to Bison 
to _preserve_ the value that was computed by Bison before.  He needed to also 
be able to count the correct value to be able to compute the proper size for 
the GLR tables.

> commit 676385e29c4aedfc05d20daf1ef20cd4ccc84856
> Author: Paul Hilfinger <address@hidden>
> Date:   Fri Jun 28 02:26:44 2002 +0000
> 
>     Initial check-in introducing experimental GLR parsing.  See entry in
>     ChangeLog dated 2002-06-27 from Paul Hilfinger for details.
> 
> -/*----------------------------------------------.
> -| Count the number of reduce/reduce conflicts.  |
> -`----------------------------------------------*/
> +/*----------------------------------------------------------------.
> +| Count the number of reduce/reduce conflicts.  If ONE_PER_TOKEN, |
> +| count one conflict for each token that has any reduce/reduce    |
> +| conflicts.  Otherwise, count one conflict for each pair of      |
> +| conflicting reductions.                                         |
> ++`----------------------------------------------------------------*/
>  
>  static int
> -count_rr_conflicts (state_t *state)
> +count_rr_conflicts (state_t *state, int one_per_token)
>  {
>    int i;
>    int rrc_count = 0;
> @@ -353,7 +356,7 @@ count_rr_conflicts (state_t *state)
>           count++;
>  
>        if (count >= 2)
> -       rrc_count++;
> +       rrc_count += one_per_token ? 1 : count-1;
>      }
>  
>    return rrc_count;

I venture that Paul thought that this behavior was on purpose, hence he wanted 
to preserve it.  But my take is difference: that was a genuine bug!

That bug is there since the oldest revisions that we have in git:

> commit 08089d5d35ece0c7d41659cc1bc09638e2abb151
> Author: David MacKenzie <address@hidden>
> Date:   Tue Apr 20 05:42:52 1993 +0000
> 
>     Initial revision

(I didn’t remember djm had participated in Bison!)

> …
> +void
> +count_rr_conflicts(state)
> +int state;
> +{
> +  register int i;
> +  register int j;
> +  register int count;
> +  register unsigned mask;
> +  register unsigned *baseword;
> +  register unsigned *wordp;
> +  register int m;
> +  register int n;
> +
> +  rrc_count = 0;
> +
> +  m = lookaheads[state];
> +  n = lookaheads[state + 1];
> +
> +  if (n - m < 2) return;
> +
> +  mask = 1;
> +  baseword = LA + m * tokensetsize;
> +  for (i = 0; i < ntokens; i++)
> +    {
> +      wordp = baseword;
> +
> +      count = 0;
> +      for (j = m; j < n; j++)
> +       {
> +         if (mask & *wordp)
> +           count++;
> +
> +         wordp += tokensetsize;
> +       }
> +
> +      if (count >= 2) rrc_count++;   // <====================
> +
> +      mask <<= 1;
> +      if (mask == 0)
> +       {
> +         mask = 1;
> +         baseword++;
> +       }
> +    }
> +}

(OMG, K&R C was soooo ugly in those days.  The current code is vastly improved.)

I think this is the oldest bug in Bison!  :)


So I propose the patch below to address this.  Note that our own test suite was 
affected.  So I have a problem now: existing grammars out there might have some 
%expect-rr which are incorrect, but matching the expected value by existing 
versions of Bison.  So maybe we can’t apply this patch as is, we need to help 
users transitioning.

We could keep both counts and depend on %require to decide which one to use.  
We could also decide that users should just update their grammars and put 
%require 3.3 to make sure that only fixed versions of Bison are used.  After 
all, there are probably very few (none?) people hit by this bug: only those 
that have several RR conflicts in the same state for the same lookahead, and 
only if they are in GLR, since otherwise %expect-rr is ignored.  Our testsuite 
is affected, yes, but it was asking quite explicitly for trouble:

> start:
>     ambig0 'a'   { destructors += 2; USE ($2); }
>   | ambig1 start { destructors += 1; }
>   | ambig2 start { destructors += 1; }
>   ;
> 
> ambig0: 'a' ;
> ambig1: 'a' ;
> ambig2: 'a' ;




What do people think?  Can we apply as is, or we need to have a transition for 
%expect-rr?

commit a55c73f8e638192051f3e9b859ba124fd5f4a8e4
Author: Akim Demaille <address@hidden>
Date:   Thu Nov 8 21:46:16 2018 +0100

    fix the computation of the RR conflicts
    
    On a grammar such as
    
       exp: "num" | "num" | "num"
    
    we currently report only one RR conflict, instead of two.
    
    * src/conflicts.h, src/conflicts.c (count_state_rr_conflicts)
    (count_rr_conflicts): Use only the correct count of conflicts.
    * tests/glr-regression.at: Fix expectations.

diff --git a/src/conflicts.c b/src/conflicts.c
index 2838243e..af67aaea 100644
--- a/src/conflicts.c
+++ b/src/conflicts.c
@@ -467,15 +467,13 @@ count_sr_conflicts (void)
 
 
 
-/*----------------------------------------------------------------.
-| Count the number of reduce/reduce conflicts.  If ONE_PER_TOKEN, |
-| count one conflict for each token that has any reduce/reduce    |
-| conflicts.  Otherwise, count one conflict for each reduction    |
-| after the first for a given token.                              |
-`----------------------------------------------------------------*/
+/*-----------------------------------------------------------------.
+| Count the number of reduce/reduce conflicts.  Count one conflict |
+| for each reduction after the first for a given token.            |
+`-----------------------------------------------------------------*/
 
 static size_t
-count_state_rr_conflicts (state *s, bool one_per_token)
+count_state_rr_conflicts (state *s)
 {
   reductions *reds = s->reductions;
   size_t res = 0;
@@ -486,20 +484,20 @@ count_state_rr_conflicts (state *s, bool one_per_token)
       for (int j = 0; j < reds->num; ++j)
         count += bitset_test (reds->lookahead_tokens[j], i);
       if (count >= 2)
-        res += one_per_token ? 1 : count-1;
+        res += count-1;
     }
 
   return res;
 }
 
 static size_t
-count_rr_conflicts (bool one_per_token)
+count_rr_conflicts (void)
 {
   size_t res = 0;
   /* Conflicts by state.  */
   for (state_number i = 0; i < nstates; ++i)
     if (conflicts[i])
-      res += count_state_rr_conflicts (states[i], one_per_token);
+      res += count_state_rr_conflicts (states[i]);
   return res;
 }
 
@@ -591,7 +589,7 @@ conflicts_output (FILE *out)
       if (conflicts[i])
         {
           int src = count_state_sr_conflicts (s);
-          int rrc = count_state_rr_conflicts (s, true);
+          int rrc = count_state_rr_conflicts (s);
           fprintf (out, _("State %d "), i);
           if (src && rrc)
             fprintf (out,
@@ -608,17 +606,14 @@ conflicts_output (FILE *out)
     fputs ("\n\n", out);
 }
 
-/*--------------------------------------------------------.
-| Total the number of S/R and R/R conflicts.  Unlike the  |
-| code in conflicts_output, however, count EACH pair of   |
-| reductions for the same state and lookahead as one      |
-| conflict.                                               |
-`--------------------------------------------------------*/
+/*--------------------------------------------.
+| Total the number of S/R and R/R conflicts.  |
+`--------------------------------------------*/
 
 int
 conflicts_total_count (void)
 {
-  return count_sr_conflicts () + count_rr_conflicts (false);
+  return count_sr_conflicts () + count_rr_conflicts ();
 }
 
 /*------------------------------.
@@ -692,7 +687,7 @@ conflicts_print (void)
   }
 
   {
-    int total = count_rr_conflicts (true);
+    int total = count_rr_conflicts ();
     /* If %expect-rr is not used, but %expect is, then expect 0 rr.  */
     int expected =
       (expected_rr_conflicts == -1 && expected_sr_conflicts != -1)
diff --git a/src/conflicts.h b/src/conflicts.h
index 22a177cb..7524ef4e 100644
--- a/src/conflicts.h
+++ b/src/conflicts.h
@@ -44,4 +44,5 @@ void conflicts_free (void);
 /* Were there conflicts? */
 extern int expected_sr_conflicts;
 extern int expected_rr_conflicts;
+
 #endif /* !CONFLICTS_H_ */
diff --git a/tests/glr-regression.at b/tests/glr-regression.at
index bb03a15a..bd6ed267 100644
--- a/tests/glr-regression.at
+++ b/tests/glr-regression.at
@@ -342,7 +342,7 @@ AT_BISON_OPTION_POPDEFS
 
 AT_BISON_CHECK([[-o glr-regr3.c -rall glr-regr3.y]], 0, [],
 [[glr-regr3.y: warning: 1 shift/reduce conflict [-Wconflicts-sr]
-glr-regr3.y: warning: 1 reduce/reduce conflict [-Wconflicts-rr]
+glr-regr3.y: warning: 2 reduce/reduce conflict [-Wconflicts-rr]
 ]])
 AT_COMPILE([glr-regr3])
 
@@ -437,7 +437,7 @@ merge (YYSTYPE s1, YYSTYPE s2)
 AT_BISON_OPTION_POPDEFS
 
 AT_BISON_CHECK([[-o glr-regr4.c -rall glr-regr4.y]], 0, [],
-[[glr-regr4.y: warning: 1 reduce/reduce conflict [-Wconflicts-rr]
+[[glr-regr4.y: warning: 2 reduce/reduce conflict [-Wconflicts-rr]
 ]])
 AT_COMPILE([glr-regr4])
 
@@ -799,7 +799,7 @@ main (void)
 AT_BISON_OPTION_POPDEFS
 
 AT_BISON_CHECK([[-o glr-regr9.c -rall glr-regr9.y]], 0, [],
-[[glr-regr9.y: warning: 1 reduce/reduce conflict [-Wconflicts-rr]
+[[glr-regr9.y: warning: 2 reduce/reduce conflict [-Wconflicts-rr]
 ]])
 AT_COMPILE([glr-regr9])
 
@@ -1363,7 +1363,7 @@ main (void)
 AT_BISON_OPTION_POPDEFS
 
 AT_BISON_CHECK([[-o glr-regr14.c -rall glr-regr14.y]], 0, [],
-[[glr-regr14.y: warning: 3 reduce/reduce conflicts [-Wconflicts-rr]
+[[glr-regr14.y: warning: 5 reduce/reduce conflicts [-Wconflicts-rr]
 ]])
 AT_COMPILE([glr-regr14])
[Prev in Thread]
Current Thread
[Next in Thread]
Ye oldest bug in Bison?, Akim Demaille <=
- Re: Ye oldest bug in Bison?, Paul Eggert, 2018/11/08
Prev by Date: Re: [RFC] Allow %expect and %expect annotations on rules
Next by Date: Re: Ye oldest bug in Bison?
Previous by thread: Re: [RFC] Allow %expect and %expect annotations on rules
Next by thread: Re: Ye oldest bug in Bison?
Index(es):
- Date
- Thread