[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [RFC] Allow %expect and %expect annotations on rules
From: |
Akim Demaille |
Subject: |
Re: [RFC] Allow %expect and %expect annotations on rules |
Date: |
Wed, 27 Feb 2013 14:42:57 +0100 |
Hi Paul!
Le 27 févr. 2013 à 01:28, Paul Hilfinger <address@hidden> a écrit :
> Especially when writing GLR parsers, I'd like to be able to document
> (and check) rules where I'm depending on GLR to resolve conflicts. The
> enclosed patch provides that. I'd appreciate your comments.
Wow, how on Earth did we never think about that before!
That's a very nice idea!
> At the moment, the patch allows you to document just SOME of the rules
> involved in conflicts, and makes no additional comment on rules where
> you don't. It also still requires one to give an overall %expect or
> %expect-rr declaration for total conflicts. Perhaps if one comments any
> one specific rule, all others should be considered as marked with 0
> expected conflicts, and the overall declarations should not be
> required. Again, I'd appreciate your thoughts.
If the information is just redundant between both, I would make them
exclusive: use either global or local annotations, but not both. And
yes, if used once, it is enabled for everywhere. Something like what
Joel suggested for %empty: _unless_ -Wno-empty-rule was specified, any
use of %empty enables -Wempty-rule.
Actually, why did you chose to count the number of conflicted states
by rule rather than the number of conflicts themselves (I think
about the number of conflicted reductions)?
Even so, I would say that in the following example there are two
states for '+' and two states for '*' (the same states) that are
involved in the conflicts, not 1, as Bison seems to be expecting.
$ cat /tmp/f.y
%%
exp:
exp '+' exp %expect 2
| exp '*' exp %expect 2
| "number"
;
$ LC_ALL=C ./_build/48d-debug/tests/bison -Wall /tmp/f.y -v
/tmp/f.y:3.3-23: error: shift/reduce conflicts for rule 1: 1 found, 2 expected
exp '+' exp %expect 2
^^^^^^^^^^^^^^^^^^^^^
/tmp/f.y:4.3-23: error: shift/reduce conflicts for rule 2: 1 found, 2 expected
| exp '*' exp %expect 2
^^^^^^^^^^^^^^^^^^^^^
/tmp/f.y: warning: 4 shift/reduce conflicts [-Wconflicts-sr]
[…]
State 6
1 exp: exp . '+' exp
1 | exp '+' exp .
2 | exp . '*' exp
'+' shift, and go to state 4
'*' shift, and go to state 5
'+' [reduce using rule 1 (exp)]
'*' [reduce using rule 1 (exp)]
$default reduce using rule 1 (exp)
State 7
1 exp: exp . '+' exp
2 | exp . '*' exp
2 | exp '*' exp .
'+' shift, and go to state 4
'*' shift, and go to state 5
'+' [reduce using rule 2 (exp)]
'*' [reduce using rule 2 (exp)]
$default reduce using rule 2 (exp)
>
> In GLR parsers, we can use %expect-rr in a rule for reduce/reduce
> conflicts. In this case, we mark each of the conflicting rules. For
> example,
>
> %glr-parser
> %expect-rr 1
>
> %%
>
> stmt:
> target_list '=' expr ';'
> | expr_list ';'
> ;
>
> target_list:
> target
> | target ',' target_list
> ;
>
> target:
> ID %expect-rr 1
Wouldn't it be better to "name" the conflict and flag each rule
that participates in an R/R conflict to be named after it?
> ;
>
> expr_list:
> expr
> | expr ',' expr_list
> ;
>
> expr:
> ID %expect-rr 1
> | ...
> ;
>
> In a statement such as
>
> x, y = 3, 4;
>
> the parser must reduce x to a target or an expr, but does not know
> which until it sees the '='. So we notate the two possible reductions
to "notate"?
> to indicate that each conflicts in one rule.
>
>
> * src/parse-gram.c: Regenerate.
> * src/parse-gram.h: Regenerate.
Could you separate them from this commit and make a second
commit just for them?
> diff --git a/doc/bison.texi b/doc/bison.texi
> index 328b88b..0f36c9e 100644
> --- a/doc/bison.texi
> +++ b/doc/bison.texi
> @@ -5070,6 +5070,53 @@ in GLR parsers, using the declaration:
> %expect-rr @var{n}
> @end example
>
> +You may wish to be more specific in your
> +specification of expected conflicts. To this end, you can also attach
> address@hidden and @code{%expect-rr} modifiers to individual rules.
> +The interpretation of these modifiers differs from their use as
> +declarations. When attached to rules, they indicate the number of states
> +in which the rule is involved in a conflict. You will need to consult the
> +output resulting from @samp{-v} to determine appropriate numbers to use.
@option{-v}
> +For example, for the following grammar fragment, the first rule for
> address@hidden appears in two states in which the @samp{[} token is a
> +lookahead. Having determined that, you can document this fact with an
> address@hidden modifier as follows:
> +
> address@hidden
> +dims:
> + empty_dims
> +| '[' expr ']' dims
> +;
> +
> +empty_dims:
> + %empty %expect 2
> +| empty_dims '[' ']'
> +;
> address@hidden example
> +
> +Mid-rule actions generate implicit rules that are also subject to conflicts
> +(@pxref{Mid-Rule Conflicts,, Conflicts due to Mid-Rule Actions}). To attach
> +an @code{%expect} or @code{%expect-rr} annotation to an implicit
> +mid-rule action's rule, put it before the action. For example,
> +
> address@hidden
> +%glr-parser
> +%expect-rr 1
> +
> +%%
> +
> +clause:
> + "condition" %expect-rr 1 @{ value_mode(); @} '(' exprs ')'
> +| "condition" %expect-rr 1 @{ class_mode(); @} '(' types ')'
> +;
> address@hidden example
> +
> address@hidden
> +Here, the appropriate mid-rule action will not be determined until after
> +the @samp{(} token is shifted. Thus,
> +the two actions will clash with each other, and we should expect one
> +reduce/reduce conflict for each.
> +
> In general, using @code{%expect} involves these steps:
>
> @itemize @bullet
> @@ -5085,8 +5132,17 @@ go back to the beginning.
>
> @item
> Add an @code{%expect} declaration, copying the number @var{n} from the
> -number which Bison printed. With GLR parsers, add an
> +number that Bison printed. With GLR parsers, add an
> @code{%expect-rr} declaration as well.
> +
> address@hidden
> +Optionally, count up the number of states in which one or more
> +conflicted reductions for particular rules appear and add these numbers
> +to the affected rules as @code{%expect-rr} or @code{%expect} modifiers
> +as appropriate. Rules that are in conflict appear in the output listing
> +surrounded by square brackets or, in the case of reduce/reduce conflicts,
> +as reductions having the same lookahead symbol as a square-bracketed
> +reduction in the same state.
> @end itemize
>
> Now Bison will report an error if you introduce an unexpected conflict,
> @@ -5310,7 +5366,14 @@ Start-Symbol}).
> @end deffn
>
> @deffn {Directive} %expect
> -Declare the expected number of shift-reduce conflicts
> +Declare the expected number of shift-reduce conflicts, either overall or
> +for a given rule
> +(@pxref{Expect Decl, ,Suppressing Conflict Warnings}).
> address@hidden deffn
> +
> address@hidden {Directive} %expect-rr
> +Declare the expected number of reduce-reduce conflicts, either overall or
> +for a given rule
> (@pxref{Expect Decl, ,Suppressing Conflict Warnings}).
> @end deffn
>
> diff --git a/src/conflicts.c b/src/conflicts.c
> index 1840473..3363270 100644
> --- a/src/conflicts.c
> +++ b/src/conflicts.c
> @@ -473,8 +473,8 @@ count_sr_conflicts (void)
> /*----------------------------------------------------------------.
> | Count the number of reduce/reduce conflicts. If ONE_PER_TOKEN, |
> | count one conflict for each token that has any reduce/reduce |
> -| conflicts. Otherwise, count one conflict for each pair of |
> -| conflicting reductions. |
> +| conflicts. Otherwise, count one conflict for each reduction |
> +| after the first for a given token. |
> `----------------------------------------------------------------*/
>
> static size_t
> @@ -511,6 +511,86 @@ count_rr_conflicts (bool one_per_token)
> }
>
>
> +/*----------------------------------------------------------------------.
> +| For a given rule, count the number of states for which it is involved |
> +| in shift/reduce conflicts. |
> +`----------------------------------------------------------------------*/
> +
> +static bool
> +rule_has_state_sr_conflicts (rule *r, state *s)
> +{
> + int i;
> + int j;
> + transitions *trans = s->transitions;
> + reductions *reds = s->reductions;
> +
> + for (i = 0; i < reds->num; i++)
> + if (reds->rules[i] == r)
> + break;
> + if (i >= reds->num)
> + return false;
> +
> + FOR_EACH_SHIFT (trans, j)
> + if (bitset_test (reds->lookahead_tokens[i], TRANSITION_SYMBOL (trans,
> j)))
> + return true;
> +
> + return false;
> +}
> +
> +static size_t
> +count_rule_sr_conflicts (rule *r)
> +{
> + state_number i;
> + size_t res;
> +
> + res = 0;
I'd prefer
size_t res = 0;
state_number i;
for (i…)
> + for (i = 0; i < nstates; i++)
> + if (conflicts[i] && rule_has_state_sr_conflicts (r, states[i]))
> + res++;
> + return res;
> +}
> +
> +/*-----------------------------------------------------------------.
> +| For a given rule, count the number of states in which it is |
> +| involved in reduce/reduce conflicts. |
> +`-----------------------------------------------------------------*/
> +
> +static bool
> +rule_has_state_rr_conflicts (rule *r, state *s)
> +{
> + int i;
> + reductions *reds = s->reductions;
> + size_t res;
Useless res.
> + bitset lookaheads;
> +
> + for (i = 0; i < reds->num; i++)
> + if (reds->rules[i] == r)
> + break;
> + if (i >= reds->num)
> + return 0;
Wrong type.
> + lookaheads = reds->lookahead_tokens[i];
> +
> + for (i = 0; i < reds->num; i++)
> + if (reds->rules[i] != r &&
> + !bitset_disjoint_p (lookaheads, reds->lookahead_tokens[i]))
> + return true;
> +
> + return false;
> +}
> +
> +static size_t
> +count_rule_rr_conflicts (rule *r)
> +{
> + state_number i;
> + size_t res;
> +
> + res = 0;
Likewise.
> + for (i = 0; i < nstates; i++)
> + if (conflicts[i] && rule_has_state_rr_conflicts (r, states[i]))
> + res++;
> + return res;
> +}
> +
> /*-----------------------------------------------------------.
> | Output the detailed description of states with conflicts. |
> `-----------------------------------------------------------*/
> @@ -556,14 +636,46 @@ conflicts_total_count (void)
> return count_sr_conflicts () + count_rr_conflicts (false);
> }
>
> +/*------------------------------.
> +| Reporting per-rule conflicts. |
> +`------------------------------*/
>
> -/*------------------------------------------.
> -| Reporting the total number of conflicts. |
> -`------------------------------------------*/
> +static void
> +rule_conflicts_print (void)
> +{
> + rule_number i;
> +
> + for (i = 0; i < nrules; i += 1)
> + {
> + rule *r = &rules[i];
> + int expected_sr = r->expected_sr_conflicts;
> + int expected_rr = r->expected_rr_conflicts;
> +
> + if (expected_sr != -1 || expected_rr != -1)
> + {
> + int sr = count_rule_sr_conflicts (r);
> + int rr = count_rule_rr_conflicts (r);
> + if (sr != expected_sr && (sr != 0 || expected_sr != -1))
> + complain (&r->location, complaint, _("\
> +shift/reduce conflicts for rule %d: %d found, %d expected"),
> + r->user_number, sr, expected_sr);
complain (&r->location, complaint,
_("shift/reduce conflicts for rule %d: "
"%d found, %d expected"),
r->user_number, sr, expected_sr);
> + if (rr != expected_rr && (rr != 0 || expected_rr != -1))
> + complain (&r->location, complaint, _("\
> +reduce/reduce conflicts for rule %d: %d found, %d expected"),
> + r->user_number, rr, expected_rr);
Likewise.
> + }
> + }
> +}
> +
> +/*---------------------------------.
> +| Reporting numbers of conflicts. |
> +`---------------------------------*/
>
> void
> conflicts_print (void)
> {
> + rule_conflicts_print ();
> +
> if (! glr_parser && expected_rr_conflicts != -1)
> {
> complain (NULL, Wother, _("%%expect-rr applies only to GLR parsers"));
> @@ -617,7 +729,6 @@ conflicts_print (void)
> }
> }
>
> -
> void
> conflicts_free (void)
> {
> diff --git a/src/gram.h b/src/gram.h
> diff --git a/src/reader.c b/src/reader.c
> index dbf4e95..ab5fc8a 100644
> --- a/src/reader.c
> +++ b/src/reader.c
> @@ -405,6 +405,11 @@ grammar_midrule_action (void)
> current_rule->action_props.is_predicate);
> code_props_none_init (¤t_rule->action_props);
>
> + midrule->expected_sr_conflicts = current_rule->expected_sr_conflicts;
> + midrule->expected_rr_conflicts = current_rule->expected_rr_conflicts;
> + current_rule->expected_sr_conflicts = -1;
> + current_rule->expected_rr_conflicts = -1;
> +
> if (previous_rule_end)
> previous_rule_end->next = midrule;
> else
> @@ -535,6 +540,26 @@ grammar_current_rule_action_append (const char *action,
> location loc,
> current_rule, name, is_predicate);
> }
>
> +/* Set the expected number of shift-reduce (reduce-reduce) conflicts for
> + * the current rule. If a midrule is encountered later, the count
> + * is transferred to it and reset in the current rule to -1. */
> +
> +void
> +grammar_current_rule_expect_sr (int count, location loc)
> +{
> + current_rule->expected_sr_conflicts = count;
Unused loc. Could you use the same pattern as the other
similar directives? With duplicate_directive.
> +}
> +
> +void
> +grammar_current_rule_expect_rr (int count, location loc)
> +{
Likewise.
> + if (! glr_parser)
> + complain (&loc, Wother, _("%s affects only GLR parsers"),
> + "%expect-rr");
> + else
> + current_rule->expected_rr_conflicts = count;
> +}
> +
>
> /*---------------------------------------------------------------.
> | Convert the rules into the representation using RRHS, RLHS and |
> diff --git a/src/reader.h b/src/reader.h
> index ba6ffe6..c7987f9 100644
> --- a/src/reader.h
> +++ b/src/reader.h
> @@ -52,6 +52,8 @@ void grammar_current_rule_empty_set (location loc);
> void grammar_current_rule_prec_set (symbol *precsym, location loc);
> void grammar_current_rule_dprec_set (int dprec, location loc);
> void grammar_current_rule_merge_set (uniqstr name, location loc);
> +void grammar_current_rule_expect_sr (int count, location loc);
> +void grammar_current_rule_expect_rr (int count, location loc);
The others use "*_set".
> void grammar_current_rule_symbol_append (symbol *sym, location loc,
> named_ref *nref);
> void grammar_current_rule_action_append (const char *action, location loc,
> diff --git a/tests/conflicts.at b/tests/conflicts.at
> index 07ff178..0ad395f 100644
> --- a/tests/conflicts.at
> +++ b/tests/conflicts.at
> @@ -1140,6 +1140,61 @@ AT_BISON_CHECK([-o input.c input.y], 1, [],
> AT_CLEANUP
>
>
> +## ------------------------------------ ##
> +## %expect in grammar rule not enough. ##
> +## ------------------------------------ ##
> +
> +AT_SETUP([%expect in grammar rule not enough])
> +
> +AT_DATA([input.y],
> +[[%token NUM OP
> +%expect 1
> +%%
> +exp: exp OP exp %expect 0 | NUM;
> +]])
> +
> +AT_BISON_CHECK([-o input.c input.y], 1, [],
> +[[input.y:4.6-25: error: shift/reduce conflicts for rule 1: 1 found, 0
> expected
We should point to the %directive rather than the rule, no?
And if there is no directive, the whole rule, indeed.
> +]])
> +AT_CLEANUP
> +
> +
> +## ------------------------------- ##
> +## %expect in grammar rule right. ##
> +## ------------------------------- ##
> +
> +AT_SETUP([%expect in grammar rule right])
> +
> +AT_DATA([input.y],
> +[[%token NUM OP
> +%expect 1
> +%%
> +exp: exp OP exp %expect 1 | NUM;
> +]])
> +
> +AT_BISON_CHECK([-o input.c input.y])
> +AT_CLEANUP
These three tests could be easily be factored to take
the 0, 1, 2 argument to pass to the first rule, and $2
would be the expected error message. This I can do :)
> +
> +
> +## ---------------------------------- ##
> +## %expect in grammar rule too much. ##
> +## ---------------------------------- ##
> +
> +AT_SETUP([%expect in grammar rule too much])
> +
> +AT_DATA([input.y],
> +[[%token NUM OP
> +%expect 1
> +%%
> +exp: exp OP exp | NUM %expect 1;
> +]])
> +
> +AT_BISON_CHECK([-o input.c input.y], 1, [],
> +[[input.y:4.19-31: error: shift/reduce conflicts for rule 2: 0 found, 1
> expected
> +]])
> +AT_CLEANUP
You don't seem to have any test for R/R conflicts. It would
probably be also useful to check the case of _resolved_ conflicts.