Re: [RFC] Allow %expect and %expect annotations on rules

bison-patches
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [RFC] Allow %expect and %expect annotations on rules

From:	Akim Demaille
Subject:	Re: [RFC] Allow %expect and %expect annotations on rules
Date:	Wed, 27 Feb 2013 14:42:57 +0100
Hi Paul!

Le 27 févr. 2013 à 01:28, Paul Hilfinger <address@hidden> a écrit :

> Especially when writing GLR parsers, I'd like to be able to document
> (and check) rules where I'm depending on GLR to resolve conflicts.  The
> enclosed patch provides that.  I'd appreciate your comments.

Wow, how on Earth did we never think about that before!
That's a very nice idea!

> At the moment, the patch allows you to document just SOME of the rules
> involved in conflicts, and makes no additional comment on rules where
> you don't.  It also still requires one to give an overall %expect or
> %expect-rr declaration for total conflicts.  Perhaps if one comments any
> one specific rule, all others should be considered as marked with 0
> expected conflicts, and the overall declarations should not be
> required.  Again, I'd appreciate your thoughts.

If the information is just redundant between both, I would make them
exclusive: use either global or local annotations, but not both.  And
yes, if used once, it is enabled for everywhere.  Something like what
Joel suggested for %empty: _unless_ -Wno-empty-rule was specified, any
use of %empty enables -Wempty-rule.

Actually, why did you chose to count the number of conflicted states
by rule rather than the number of conflicts themselves (I think
about the number of conflicted reductions)?

Even so, I would say that in the following example there are two
states for '+' and two states for '*' (the same states) that are
involved in the conflicts, not 1, as Bison seems to be expecting.

$ cat /tmp/f.y
%%
exp:
  exp '+' exp %expect 2
| exp '*' exp %expect 2
| "number"
;
$ LC_ALL=C ./_build/48d-debug/tests/bison -Wall /tmp/f.y -v
/tmp/f.y:3.3-23: error: shift/reduce conflicts for rule 1: 1 found, 2 expected
   exp '+' exp %expect 2
   ^^^^^^^^^^^^^^^^^^^^^
/tmp/f.y:4.3-23: error: shift/reduce conflicts for rule 2: 1 found, 2 expected
 | exp '*' exp %expect 2
   ^^^^^^^^^^^^^^^^^^^^^
/tmp/f.y: warning: 4 shift/reduce conflicts [-Wconflicts-sr]

[…]

State 6

    1 exp: exp . '+' exp
    1    | exp '+' exp .
    2    | exp . '*' exp

    '+'  shift, and go to state 4
    '*'  shift, and go to state 5

    '+'       [reduce using rule 1 (exp)]
    '*'       [reduce using rule 1 (exp)]
    $default  reduce using rule 1 (exp)


State 7

    1 exp: exp . '+' exp
    2    | exp . '*' exp
    2    | exp '*' exp .

    '+'  shift, and go to state 4
    '*'  shift, and go to state 5

    '+'       [reduce using rule 2 (exp)]
    '*'       [reduce using rule 2 (exp)]
    $default  reduce using rule 2 (exp)


> 
>    In GLR parsers, we can use %expect-rr in a rule for reduce/reduce
>    conflicts.  In this case, we mark each of the conflicting rules.  For
>    example,
> 
>        %glr-parser
>        %expect-rr 1
> 
>        %%
> 
>        stmt:
>          target_list '=' expr ';'
>        | expr_list ';'
>        ;
> 
>        target_list:
>          target
>        | target ',' target_list
>        ;
> 
>        target:
>          ID %expect-rr 1

Wouldn't it be better to "name" the conflict and flag each rule
that participates in an R/R conflict to be named after it?

>        ;
> 
>        expr_list:
>          expr
>        | expr ',' expr_list
>        ;
> 
>        expr:
>          ID %expect-rr 1
>        | ...
>        ;
> 
>    In a statement such as
> 
>        x, y = 3, 4;
> 
>    the parser must reduce x to a target or an expr, but does not know
>    which until it sees the '='.  So we notate the two possible reductions

to "notate"?

>    to indicate that each conflicts in one rule.
> 
> 
>    * src/parse-gram.c: Regenerate.
>    * src/parse-gram.h: Regenerate.

Could you separate them from this commit and make a second
commit just for them?

> diff --git a/doc/bison.texi b/doc/bison.texi
> index 328b88b..0f36c9e 100644
> --- a/doc/bison.texi
> +++ b/doc/bison.texi
> @@ -5070,6 +5070,53 @@ in GLR parsers, using the declaration:
> %expect-rr @var{n}
> @end example
> 
> +You may wish to be more specific in your
> +specification of expected conflicts.  To this end, you can also attach
> address@hidden and @code{%expect-rr} modifiers to individual rules.
> +The interpretation of these modifiers differs from their use as
> +declarations.  When attached to rules, they indicate the number of states
> +in which the rule is involved in a conflict.  You will need to consult the
> +output resulting from @samp{-v} to determine appropriate numbers to use.

@option{-v}

> +For example, for the following grammar fragment, the first rule for
> address@hidden appears in two states in which the @samp{[} token is a
> +lookahead.  Having determined that, you can document this fact with an
> address@hidden modifier as follows:
> +
> address@hidden
> +dims:
> +  empty_dims
> +| '[' expr ']' dims
> +;
> +
> +empty_dims:
> +  %empty   %expect 2
> +| empty_dims '[' ']'
> +;
> address@hidden example
> +
> +Mid-rule actions generate implicit rules that are also subject to conflicts
> +(@pxref{Mid-Rule Conflicts,, Conflicts due to Mid-Rule Actions}). To attach
> +an @code{%expect} or @code{%expect-rr} annotation to an implicit
> +mid-rule action's rule, put it before the action.  For example,
> +
> address@hidden
> +%glr-parser
> +%expect-rr 1
> +
> +%%
> +
> +clause:
> +  "condition" %expect-rr 1 @{ value_mode(); @} '(' exprs ')'
> +| "condition" %expect-rr 1 @{ class_mode(); @} '(' types ')'
> +;
> address@hidden example
> +
> address@hidden
> +Here, the appropriate mid-rule action will not be determined until after 
> +the @samp{(} token is shifted.  Thus,
> +the two actions will clash with each other, and we should expect one
> +reduce/reduce conflict for each.
> +
> In general, using @code{%expect} involves these steps:
> 
> @itemize @bullet
> @@ -5085,8 +5132,17 @@ go back to the beginning.
> 
> @item
> Add an @code{%expect} declaration, copying the number @var{n} from the
> -number which Bison printed.  With GLR parsers, add an
> +number that Bison printed.  With GLR parsers, add an
> @code{%expect-rr} declaration as well.
> +
> address@hidden
> +Optionally, count up the number of states in which one or more
> +conflicted reductions for particular rules appear and add these numbers
> +to the affected rules as @code{%expect-rr} or @code{%expect} modifiers
> +as appropriate.  Rules that are in conflict appear in the output listing
> +surrounded by square brackets or, in the case of reduce/reduce conflicts,
> +as reductions having the same lookahead symbol as a square-bracketed
> +reduction in the same state.
> @end itemize
> 
> Now Bison will report an error if you introduce an unexpected conflict,
> @@ -5310,7 +5366,14 @@ Start-Symbol}).
> @end deffn
> 
> @deffn {Directive} %expect
> -Declare the expected number of shift-reduce conflicts
> +Declare the expected number of shift-reduce conflicts, either overall or
> +for a given rule
> +(@pxref{Expect Decl, ,Suppressing Conflict Warnings}).
> address@hidden deffn
> +
> address@hidden {Directive} %expect-rr
> +Declare the expected number of reduce-reduce conflicts, either overall or
> +for a given rule
> (@pxref{Expect Decl, ,Suppressing Conflict Warnings}).
> @end deffn
> 
> diff --git a/src/conflicts.c b/src/conflicts.c
> index 1840473..3363270 100644
> --- a/src/conflicts.c
> +++ b/src/conflicts.c
> @@ -473,8 +473,8 @@ count_sr_conflicts (void)
> /*----------------------------------------------------------------.
> | Count the number of reduce/reduce conflicts.  If ONE_PER_TOKEN, |
> | count one conflict for each token that has any reduce/reduce    |
> -| conflicts.  Otherwise, count one conflict for each pair of      |
> -| conflicting reductions.                                         |
> +| conflicts.  Otherwise, count one conflict for each reduction    |
> +| after the first for a given token.                              |
> `----------------------------------------------------------------*/
> 
> static size_t
> @@ -511,6 +511,86 @@ count_rr_conflicts (bool one_per_token)
> }
> 
> 
> +/*----------------------------------------------------------------------.
> +| For a given rule, count the number of states for which it is involved |
> +| in shift/reduce conflicts.                                               |
> +`----------------------------------------------------------------------*/
> +
> +static bool
> +rule_has_state_sr_conflicts (rule *r, state *s)
> +{
> +  int i;
> +  int j;
> +  transitions *trans = s->transitions;
> +  reductions *reds = s->reductions;
> +
> +  for (i = 0; i < reds->num; i++)
> +    if (reds->rules[i] == r)
> +      break;
> +  if (i >= reds->num)
> +    return false;
> +
> +  FOR_EACH_SHIFT (trans, j)
> +    if (bitset_test (reds->lookahead_tokens[i], TRANSITION_SYMBOL (trans, 
> j)))
> +      return true;
> +
> +  return false;
> +}
> +
> +static size_t
> +count_rule_sr_conflicts (rule *r)
> +{
> +  state_number i;
> +  size_t res;
> +
> +  res = 0;

I'd prefer

        size_t res = 0;
        state_number i;
        for (i…)

> +  for (i = 0; i < nstates; i++)
> +    if (conflicts[i] && rule_has_state_sr_conflicts (r, states[i]))
> +      res++;
> +  return res;
> +}
> +
> +/*-----------------------------------------------------------------.
> +| For a given rule, count the number of states in which it is      |
> +| involved in reduce/reduce conflicts.                             |
> +`-----------------------------------------------------------------*/
> +
> +static bool
> +rule_has_state_rr_conflicts (rule *r, state *s)
> +{
> +  int i;
> +  reductions *reds = s->reductions;
> +  size_t res;

Useless res.

> +  bitset lookaheads;
> +  
> +  for (i = 0; i < reds->num; i++)
> +    if (reds->rules[i] == r)
> +      break;
> +  if (i >= reds->num)
> +    return 0;

Wrong type.

> +  lookaheads = reds->lookahead_tokens[i];
> +
> +  for (i = 0; i < reds->num; i++)
> +    if (reds->rules[i] != r &&
> +        !bitset_disjoint_p (lookaheads, reds->lookahead_tokens[i]))
> +      return true;
> +
> +  return false;
> +}
> +
> +static size_t
> +count_rule_rr_conflicts (rule *r)
> +{
> +  state_number i;
> +  size_t res;
> +
> +  res = 0;

Likewise.

> +  for (i = 0; i < nstates; i++)
> +    if (conflicts[i] && rule_has_state_rr_conflicts (r, states[i]))
> +     res++;
> +  return res;
> +}
> +
> /*-----------------------------------------------------------.
> | Output the detailed description of states with conflicts.  |
> `-----------------------------------------------------------*/
> @@ -556,14 +636,46 @@ conflicts_total_count (void)
>   return count_sr_conflicts () + count_rr_conflicts (false);
> }
> 
> +/*------------------------------.
> +| Reporting per-rule conflicts. |
> +`------------------------------*/
> 
> -/*------------------------------------------.
> -| Reporting the total number of conflicts.  |
> -`------------------------------------------*/
> +static void
> +rule_conflicts_print (void)
> +{
> +  rule_number i;
> +
> +  for (i = 0; i < nrules; i += 1)
> +    {
> +      rule *r = &rules[i];
> +      int expected_sr = r->expected_sr_conflicts;
> +      int expected_rr = r->expected_rr_conflicts;
> +
> +      if (expected_sr != -1 || expected_rr != -1) 
> +        {
> +          int sr = count_rule_sr_conflicts (r);
> +          int rr = count_rule_rr_conflicts (r);
> +          if (sr != expected_sr && (sr != 0 || expected_sr != -1))
> +            complain (&r->location, complaint, _("\
> +shift/reduce conflicts for rule %d: %d found, %d expected"),
> +                      r->user_number, sr, expected_sr);

            complain (&r->location, complaint,
                      _("shift/reduce conflicts for rule %d: "
                        "%d found, %d expected"),
                      r->user_number, sr, expected_sr);


> +          if (rr != expected_rr && (rr != 0 || expected_rr != -1))
> +            complain (&r->location, complaint, _("\
> +reduce/reduce conflicts for rule %d: %d found, %d expected"),
> +                      r->user_number, rr, expected_rr);

Likewise.

> +        }
> +    }
> +}
> +
> +/*---------------------------------.
> +| Reporting numbers of conflicts.  |
> +`---------------------------------*/
> 
> void
> conflicts_print (void)
> {
> +  rule_conflicts_print ();
> +
>   if (! glr_parser && expected_rr_conflicts != -1)
>     {
>       complain (NULL, Wother, _("%%expect-rr applies only to GLR parsers"));
> @@ -617,7 +729,6 @@ conflicts_print (void)
>   }
> }
> 
> -
> void
> conflicts_free (void)
> {
> diff --git a/src/gram.h b/src/gram.h


> diff --git a/src/reader.c b/src/reader.c
> index dbf4e95..ab5fc8a 100644
> --- a/src/reader.c
> +++ b/src/reader.c
> @@ -405,6 +405,11 @@ grammar_midrule_action (void)
>                                current_rule->action_props.is_predicate);
>   code_props_none_init (&current_rule->action_props);
> 
> +  midrule->expected_sr_conflicts = current_rule->expected_sr_conflicts;
> +  midrule->expected_rr_conflicts = current_rule->expected_rr_conflicts;
> +  current_rule->expected_sr_conflicts = -1;
> +  current_rule->expected_rr_conflicts = -1;
> +
>   if (previous_rule_end)
>     previous_rule_end->next = midrule;
>   else
> @@ -535,6 +540,26 @@ grammar_current_rule_action_append (const char *action, 
> location loc,
>                                current_rule, name, is_predicate);
> }
> 
> +/* Set the expected number of shift-reduce (reduce-reduce) conflicts for
> + * the current rule.  If a midrule is encountered later, the count
> + * is transferred to it and reset in the current rule to -1. */
> +
> +void
> +grammar_current_rule_expect_sr (int count, location loc)
> +{
> +  current_rule->expected_sr_conflicts = count;

Unused loc.  Could you use the same pattern as the other
similar directives?  With duplicate_directive.

> +}
> +
> +void
> +grammar_current_rule_expect_rr (int count, location loc)
> +{

Likewise.

> +  if (! glr_parser)
> +    complain (&loc, Wother, _("%s affects only GLR parsers"),
> +              "%expect-rr");
> +  else
> +    current_rule->expected_rr_conflicts = count;
> +}
> +
> 
> /*---------------------------------------------------------------.
> | Convert the rules into the representation using RRHS, RLHS and |


> diff --git a/src/reader.h b/src/reader.h
> index ba6ffe6..c7987f9 100644
> --- a/src/reader.h
> +++ b/src/reader.h
> @@ -52,6 +52,8 @@ void grammar_current_rule_empty_set (location loc);
> void grammar_current_rule_prec_set (symbol *precsym, location loc);
> void grammar_current_rule_dprec_set (int dprec, location loc);
> void grammar_current_rule_merge_set (uniqstr name, location loc);
> +void grammar_current_rule_expect_sr (int count, location loc);
> +void grammar_current_rule_expect_rr (int count, location loc);

The others use "*_set".

> void grammar_current_rule_symbol_append (symbol *sym, location loc,
>                                          named_ref *nref);
> void grammar_current_rule_action_append (const char *action, location loc,


> diff --git a/tests/conflicts.at b/tests/conflicts.at
> index 07ff178..0ad395f 100644
> --- a/tests/conflicts.at
> +++ b/tests/conflicts.at
> @@ -1140,6 +1140,61 @@ AT_BISON_CHECK([-o input.c input.y], 1, [],
> AT_CLEANUP
> 
> 
> +## ------------------------------------ ##
> +## %expect in grammar rule not enough.  ##
> +## ------------------------------------ ##
> +
> +AT_SETUP([%expect in grammar rule not enough])
> +
> +AT_DATA([input.y],
> +[[%token NUM OP
> +%expect 1
> +%%
> +exp: exp OP exp %expect 0 | NUM;
> +]])
> +
> +AT_BISON_CHECK([-o input.c input.y], 1, [],
> +[[input.y:4.6-25: error: shift/reduce conflicts for rule 1: 1 found, 0 
> expected

We should point to the %directive rather than the rule, no?
And if there is no directive, the whole rule, indeed.

> +]])
> +AT_CLEANUP
> +
> +
> +## ------------------------------- ##
> +## %expect in grammar rule right.  ##
> +## ------------------------------- ##
> +
> +AT_SETUP([%expect in grammar rule right])
> +
> +AT_DATA([input.y],
> +[[%token NUM OP
> +%expect 1
> +%%
> +exp: exp OP exp %expect 1 | NUM;
> +]])
> +
> +AT_BISON_CHECK([-o input.c input.y])
> +AT_CLEANUP

These three tests could be easily be factored to take
the 0, 1, 2 argument to pass to the first rule, and $2
would be the expected error message.  This I can do :)

> +
> +
> +## ---------------------------------- ##
> +## %expect in grammar rule too much.  ##
> +## ---------------------------------- ##
> +
> +AT_SETUP([%expect in grammar rule too much])
> +
> +AT_DATA([input.y],
> +[[%token NUM OP
> +%expect 1
> +%%
> +exp: exp OP exp | NUM %expect 1;
> +]])
> +
> +AT_BISON_CHECK([-o input.c input.y], 1, [],
> +[[input.y:4.19-31: error: shift/reduce conflicts for rule 2: 0 found, 1 
> expected
> +]])
> +AT_CLEANUP

You don't seem to have any test for R/R conflicts.  It would
probably be also useful to check the case of _resolved_ conflicts.
[Prev in Thread]
Current Thread
[Next in Thread]
[RFC] Allow %expect and %expect annotations on rules, Paul Hilfinger, 2013/02/26
- Re: [RFC] Allow %expect and %expect annotations on rules, Akim Demaille <=
Prev by Date: [RFC] Allow %expect and %expect annotations on rules
Next by Date: Re: get -Wyacc out of -Wall (was: RFC: explicit empty right-hand side of a rule)
Previous by thread: [RFC] Allow %expect and %expect annotations on rules
Index(es):
- Date
- Thread