Hi Domingo,
Le 3 nov. 2021 à 19:27, Domingo Alvarez Duarte <mingodad@gmail.com> a écrit :
Hello !
I'm trying to build the CFRONT parser
Wow... I was unaware cfront still existed.
with bison instead of byacc and I've got bison to accept the grammar
https://github.com/mingodad/cfront-3/blob/master/src/gram.y but the generated
parser by bison has a small difference in the conflicts found in the grammar
(see bellow)
That's an interesting finding, thanks for reporting it!
FWIW, I believe we will change the way conflicts are counted in Bison. I don't
know when, and backward compatibility will have to be taken into account, but
it was found that Bison disagrees with Bison. Indeed, when looking for
counterexamples and when merely counting the number of conflicts, we don't get
the same results, and the latter should comply with the former. See
https://github.com/akimd/bison/issues/73#issuecomment-792282225.
cfront-3/src$ bison -y gram.y
gram.y: warning: 20 shift/reduce conflicts [-Wconflicts-sr]
gram.y: warning: 4 reduce/reduce conflicts [-Wconflicts-rr]
cfront-3/srv$ byacc gram.y
byacc: 21 shift/reduce conflicts, 3 reduce/reduce conflicts.
I have not tried to compare the SR conflicts (the automata are very similar,
but the state numbers are different and I don't feel like tracking the
correspondance right now). But the RR conflicted state where Bison and Byacc
differ is:
In Bison (very verbose mode, including counterexamples)
State 101
57 tname: . qualified_tname
58 | . qualified_tname lt temp_inst_parms gt
59 | . NAME lt temp_inst_parms gt
174 scope_qualifiers: . tn_list
175 tn_list: . tscope
176 | . tn_list tscope
177 qualified_tname: . tn_list TNAME
178 | . TNAME
188 decl: tname LP . MUL ID RP arg_list
189 | tname LP . AND ID RP arg_list
190 | tname LP . elist RP
191 | tname LP . RP fct_attributes
192 | tname LP . MEMPTR decl RP arg_list
266 elist: . ex_list
267 ex_list: . initializer
268 | . ex_list CM initializer
269 initializer: . e
270 | . LC elist RC
296 e: . e ASSIGN e
297 | . e PLUS e
298 | . e MINUS e
299 | . e MUL e
300 | . e AND e
301 | . e OR e
302 | . e ER e
303 | . e SHIFTOP e
304 | . e EQUOP e
305 | . e DIVOP e
306 | . e RELOP e
307 | . e LT e
308 | . e GT e
309 | . e ANDAND e
310 | . e OROR e
311 | . e ASOP e
312 | . e CM e
313 | . e QUEST e COLON e
314 | . e REFMUL e
315 | . DELETE term
316 | . DELETE LB e RB term
317 | . MEM DELETE term
318 | . MEM DELETE LB e RB term
319 | . term
320 | . THROW term
321 | %empty . [RP, LT, GT, ER, OR, ANDAND, OROR, QUEST, ASSIGN, CM, ASOP,
RELOP, EQUOP, DIVOP, SHIFTOP, REFMUL]
322 term: . NEW cast_type
323 | . NEW new_type
324 | . MEM NEW cast_type
325 | . MEM NEW new_type
326 | . term ICOP
327 | . cast_type term
328 | . MUL term
329 | . AND term
330 | . MINUS term
331 | . PLUS term
332 | . NOT term
333 | . COMPL term
334 | . ICOP term
335 | . SIZEOF term
336 | . SIZEOF cast_type
337 | . term LB e RB
338 | . term REF prim
339 | . term REF MEMQ prim
340 | . term REF MEMQ TNAME
341 | . term REF dtorspec
342 | . term REF scope_qualifiers prim
343 | . term REF scope_qualifiers TNAME
344 | . term DOT prim
345 | . term DOT MEMQ prim
346 | . term DOT MEMQ TNAME
347 | . term DOT dtorspec
348 | . term DOT scope_qualifiers prim
349 | . term DOT scope_qualifiers TNAME
350 | . prim
351 | . scope_qualifiers prim
352 | . tn_list COMPL tag
353 | . tn_list COMPL TYPE
354 | . term_elist
355 | . term_lp e RP
356 | . ZERO
357 | . ICON
358 | . FCON
359 | . STRING
360 | . CCON
361 | . THIS
370 term_elist: . TYPE LP elist RP
371 | . tname LP elist RP
372 | . NEW term_lp elist RP cast_type
373 | . NEW term_lp elist RP new_type
374 | . MEM NEW term_lp elist RP cast_type
375 | . MEM NEW term_lp elist RP new_type
376 | . term LP elist RP
377 ptname: . PTNAME lt temp_inst_parms gt
378 tscope: . TSCOPE
379 | . MEM
380 | . ptname TSCOPE
381 prim: . ID
382 | . OPERATOR oper
383 | . OPERATOR c_type
384 cast_type: . term_lp type cast_decl RP
385 term_lp: . LP
395 arg_lp: LP . [ENUM, RP, CM, NAME, TYPE, TNAME, ELLIPSIS, AGGR, MEM,
TSCOPE, DECL_MARKER, LINKAGE, PTNAME]
DELETE shift, and go to state 154
NEW shift, and go to state 155
OPERATOR shift, and go to state 156
SIZEOF shift, and go to state 157
THIS shift, and go to state 158
LP shift, and go to state 159
RP shift, and go to state 195
NOT shift, and go to state 160
COMPL shift, and go to state 161
MUL shift, and go to state 196
AND shift, and go to state 197
PLUS shift, and go to state 164
MINUS shift, and go to state 165
LC shift, and go to state 198
ID shift, and go to state 166
STRING shift, and go to state 167
ICON shift, and go to state 168
FCON shift, and go to state 169
CCON shift, and go to state 170
NAME shift, and go to state 12
ZERO shift, and go to state 171
ICOP shift, and go to state 172
THROW shift, and go to state 174
MEMPTR shift, and go to state 200
PTNAME shift, and go to state 22
ENUM reduce using rule 395 (arg_lp)
RP [reduce using rule 321 (e)]
RP [reduce using rule 395 (arg_lp)]
CM reduce using rule 321 (e)
CM [reduce using rule 395 (arg_lp)]
NAME [reduce using rule 395 (arg_lp)]
TYPE reduce using rule 395 (arg_lp)
TNAME reduce using rule 395 (arg_lp)
ELLIPSIS reduce using rule 395 (arg_lp)
AGGR reduce using rule 395 (arg_lp)
MEM reduce using rule 395 (arg_lp)
TSCOPE reduce using rule 395 (arg_lp)
DECL_MARKER reduce using rule 395 (arg_lp)
LINKAGE reduce using rule 395 (arg_lp)
PTNAME [reduce using rule 395 (arg_lp)]
$default reduce using rule 321 (e)
tname go to state 201
scope_qualifiers go to state 182
tn_list go to state 202
qualified_tname go to state 39
elist go to state 203
ex_list go to state 204
initializer go to state 205
e go to state 206
term go to state 185
term_elist go to state 186
ptname go to state 53
tscope go to state 42
prim go to state 187
cast_type go to state 188
term_lp go to state 189
Conflict between rule 321 and token MUL resolved as shift (NO_EXPR < MUL).
Conflict between rule 321 and token AND resolved as shift (NO_EXPR < AND).
Conflict between rule 321 and token PLUS resolved as shift (NO_EXPR <
PLUS).
Conflict between rule 321 and token MINUS resolved as shift (NO_EXPR <
MINUS).
Conflict between rule 395 and token TYPE resolved as reduce (TYPE < LP).
Conflict between rule 395 and token TNAME resolved as reduce (TNAME < LP).
Conflict between rule 395 and token MEM resolved as reduce (%left MEM).
Conflict between rule 395 and token TSCOPE resolved as reduce (TSCOPE <
LP).
shift/reduce conflict on token RP:
321 e: %empty .
191 decl: tname LP . RP fct_attributes
Example: tname LP . RP
Shift derivation
decl
`-> 191: tname LP . RP fct_attributes
`-> 191: %empty
Example: tname LP . RP
Reduce derivation
decl
`-> 190: tname LP elist RP
`-> 266: ex_list
`-> 267: initializer
`-> 269: e
`-> 321: %empty .
reduce/reduce conflict on tokens RP, CM:
321 e: %empty .
395 arg_lp: LP .
Example: tname LP . RP
First reduce derivation
decl
`-> 190: tname LP elist RP
`-> 266: ex_list
`-> 267: initializer
`-> 269: e
`-> 321: %empty .
Example: tname LP . RP
Second reduce derivation
decl
`-> 186: tname arg_list
`-> 396: arg_lp arg_type_list ellipsis_opt
RP fct_attributes
`-> 395: LP . `-> 396: %empty `-> 396: %empty
`-> 396: %empty
shift/reduce conflict on token RP:
395 arg_lp: LP .
191 decl: tname LP . RP fct_attributes
Example: tname LP . RP
Shift derivation
decl
`-> 191: tname LP . RP fct_attributes
`-> 191: %empty
Example: tname LP . RP
Reduce derivation
decl
`-> 186: tname arg_list
`-> 396: arg_lp arg_type_list ellipsis_opt
RP fct_attributes
`-> 395: LP . `-> 396: %empty `-> 396: %empty
`-> 396: %empty
shift/reduce conflict on token NAME:
395 arg_lp: LP .
59 tname: . NAME lt temp_inst_parms gt
First example: tname LP . NAME lt temp_inst_parms gt LP elist RP RP
ASSIGN initializer SM EOFTOK
Shift derivation
$accept
`-> 0: ext_def
EOFTOK
`-> 1: external_def
`-> 21: fct_dcl
`-> 23: decl
ASSIGN initializer SM
`-> 190: tname LP elist
RP
`-> 266: ex_list
`-> 267:
initializer
`->
269: e
`-> 319: term
`-> 354: term_elist
`-> 371: tname LP elist RP
`-> 59: . NAME lt temp_inst_parms gt
Second example: tname LP . NAME lt temp_inst_parms gt arg_decl
ellipsis_opt RP fct_attributes arg_dcl_list check_inline @5 base_init block
EOFTOK
Reduce derivation
$accept
`-> 0: ext_def
EOFTOK
`-> 1: external_def
`-> 20: fct_def
`-> 30: decl
arg_dcl_list check_inline @5 base_init block
`-> 186: tname arg_list
`-> 396: arg_lp
arg_type_list
ellipsis_opt RP fct_attributes
`-> 395: LP .
`-> 398: at
`-> 399: arg_type
`-> 392: type arg_decl
`-> 67: tp
`-> 62: tname
`-> 59: NAME lt temp_inst_parms gt
shift/reduce conflict on token PTNAME:
395 arg_lp: LP .
377 ptname: . PTNAME lt temp_inst_parms gt
First example: tname LP . PTNAME lt temp_inst_parms gt TSCOPE COMPL tag
RP ASSIGN initializer SM EOFTOK
Shift derivation
$accept
`-> 0: ext_def
EOFTOK
`-> 1: external_def
`-> 21: fct_dcl
`-> 23: decl
ASSIGN initializer SM
`-> 190: tname LP elist
RP
`-> 266: ex_list
`-> 267:
initializer
`->
269: e
`-> 319: term
`-> 352: tn_list
COMPL tag
`-> 175: tscope
`-> 380: ptname
TSCOPE
`-> 377: . PTNAME lt temp_inst_parms gt
Second example: tname LP . PTNAME lt temp_inst_parms gt TSCOPE
DECL_MARKER arg_decl ellipsis_opt RP fct_attributes arg_dcl_list check_inline
@5 base_init block EOFTOK
Reduce derivation
$accept
`-> 0: ext_def
EOFTOK
`-> 1: external_def
`-> 20: fct_def
`-> 30: decl
arg_dcl_list check_inline @5 base_init block
`-> 186: tname arg_list
`-> 396: arg_lp
arg_type_list
ellipsis_opt RP fct_attributes
`-> 395: LP .
`-> 398: at
`-> 399: arg_type
`-> 392: type
arg_decl
`-> 67: tp
`-> 63: tn_list
DECL_MARKER
`-> 175: tscope
`-> 380: ptname
TSCOPE
`-> 377: PTNAME lt
temp_inst_parms gt
The disable reductions are:
RP [reduce using rule 321 (e)]
RP [reduce using rule 395 (arg_lp)]
CM [reduce using rule 395 (arg_lp)]
NAME [reduce using rule 395 (arg_lp)]
PTNAME [reduce using rule 395 (arg_lp)]
If we focus on the actions on these lookaheads, we have:
RP shift, and go to state 195
RP shift, and go to state 195
NAME shift, and go to state 12
PTNAME shift, and go to state 22
RP [reduce using rule 321 (e)]
RP [reduce using rule 395 (arg_lp)]
CM reduce using rule 321 (e)
CM [reduce using rule 395 (arg_lp)]
NAME [reduce using rule 395 (arg_lp)]
PTNAME [reduce using rule 395 (arg_lp)]
So we can see an "easy" RR conflict here:
CM reduce using rule 321 (e)
CM [reduce using rule 395 (arg_lp)]
But I agree with Bison that there is a second conflict:
RP [reduce using rule 321 (e)]
RP [reduce using rule 395 (arg_lp)]
It turns out that both of them are also involved in a SR conflict with the
shift action on RP, but I fail to see how this would not be a RR conflict.
BTW, counterexample generation "factors" these two conflicts into a single item:
reduce/reduce conflict on tokens RP, CM:
Byacc's verbose output is:
131: shift/reduce conflict (shift 231, reduce 321) on RP
131: shift/reduce conflict (shift 231, reduce 395) on RP
131: reduce/reduce conflict (reduce 321, reduce 395) on CM
131: shift/reduce conflict (shift 12, reduce 395) on NAME
131: shift/reduce conflict (shift 22, reduce 395) on PTNAME
state 131
decl : tname LP . MUL ID RP arg_list (188)
decl : tname LP . AND ID RP arg_list (189)
decl : tname LP . elist RP (190)
decl : tname LP . RP fct_attributes (191)
decl : tname LP . MEMPTR decl RP arg_list (192)
arg_lp : LP . (395)
e : . (321)
DELETE shift 153
NEW shift 154
OPERATOR shift 155
SIZEOF shift 156
THIS shift 157
LP shift 158
RP shift 231
NOT shift 159
COMPL shift 160
MUL shift 232
AND shift 233
PLUS shift 163
MINUS shift 164
LC shift 214
ID shift 165
STRING shift 166
ICON shift 167
FCON shift 168
CCON shift 169
NAME shift 12
ZERO shift 170
ICOP shift 171
THROW shift 173
MEMPTR shift 234
PTNAME shift 22
ENUM reduce 395
LT reduce 321
GT reduce 321
ER reduce 321
OR reduce 321
ANDAND reduce 321
OROR reduce 321
QUEST reduce 321
ASSIGN reduce 321
CM reduce 321
ASOP reduce 321
RELOP reduce 321
EQUOP reduce 321
DIVOP reduce 321
SHIFTOP reduce 321
TYPE reduce 395
TNAME reduce 395
ELLIPSIS reduce 395
AGGR reduce 395
MEM reduce 395
TSCOPE reduce 395
DECL_MARKER reduce 395
REFMUL reduce 321
LINKAGE reduce 395
initializer goto 223
ex_list goto 224
elist goto 235
e goto 216
term goto 178
prim goto 179
term_elist goto 180
cast_type goto 181
tscope goto 37
tn_list goto 196
scope_qualifiers goto 184
qualified_tname goto 40
tname goto 197
ptname goto 53
term_lp goto 188
Since it does not display the disabled actions, I can't tell how it's counting.
But
131: reduce/reduce conflict (reduce 321, reduce 395) on CM
should also apply to RP, and that would make two RR conflicts, not just one.
A scaled down grammar showing a similar issue is
%%
exp: a 'x' | b 'x' | a 'y' | b 'y' | 'a' 'x' 'y'
a: 'a'
b: 'a'
where Bison diagnoses 1 SR and 2 RR but generates counterexamples for 2 SR and
1 RR (on two tokens):
State 1
5 exp: 'a' . 'x' 'y'
6 a: 'a' . ['x', 'y']
7 b: 'a' . ['x', 'y']
'x' shift, and go to state 5
'x' [reduce using rule 6 (a)]
'x' [reduce using rule 7 (b)]
'y' reduce using rule 6 (a)
'y' [reduce using rule 7 (b)]
$default reduce using rule 6 (a)
shift/reduce conflict on token 'x':
6 a: 'a' .
5 exp: 'a' . 'x' 'y'
First example: 'a' . 'x' 'y' $end
Shift derivation
$accept
`-> 0: exp $end
`-> 5: 'a' . 'x' 'y'
Second example: 'a' . 'x' $end
Reduce derivation
$accept
`-> 0: exp $end
`-> 1: a 'x'
`-> 6: 'a' .
reduce/reduce conflict on tokens 'x', 'y':
6 a: 'a' .
7 b: 'a' .
Example: 'a' . 'x'
First reduce derivation
exp
`-> 1: a 'x'
`-> 6: 'a' .
Example: 'a' . 'x'
Second reduce derivation
exp
`-> 2: b 'x'
`-> 7: 'a' .
shift/reduce conflict on token 'x':
7 b: 'a' .
5 exp: 'a' . 'x' 'y'
First example: 'a' . 'x' 'y' $end
Shift derivation
$accept
`-> 0: exp $end
`-> 5: 'a' . 'x' 'y'
Second example: 'a' . 'x' $end
Reduce derivation
$accept
`-> 0: exp $end
`-> 2: b 'x'
`-> 7: 'a' .
and byacc generates
1: shift/reduce conflict (shift 5, reduce 6) on 'x'
1: shift/reduce conflict (shift 5, reduce 7) on 'x'
1: reduce/reduce conflict (reduce 6, reduce 7) on 'y'
state 1
exp : 'a' . 'x' 'y' (5)
a : 'a' . (6)
b : 'a' . (7)
'x' shift 5
'y' reduce 6
It does not count the RR conflict on 'x'.
Cheers!