[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Wed, 04 Jul 2001 07:12:51 +1000
I got a copy of the paper on DR grammars (Fortes) yesterday. His
idea is to reuse some of the concepts of precendence grammers to
make parse tables smaller.
An LR(1) parse table is typically about 98% empty space ie
200,000 entries of which 4,000 are used, for a typical language
like C. The size is number of states times number of token types.
LR(1) has many more states than LALR(1) thus the problem with the
On the other hand, LALR(1) tends to create spurious S/R and/or
R/R conflicts that are often quite hard to understand and often
quite tedious to remove. Your grammar code ends up being bigger
and uglier. The bison manual refers to two hacks to avoid these
spurious conflicts a) Add dummy tokens to help bison understand
that the two situations are different b) Explode the productions
do that the spaces of of the two confused productions
The basic idea of DR is that you start with the lookahead and
walk down the stack until you have certainlty about what to do.
The parse tables tell you when to stop looking for the next
action. Fortes claims that typically you only have to go 0,1,2,3
levels into the stack so in practice the parser is fast. And the
main benefit is that the tables are much smaller.
However it is not just a different way of generating the bison
tables - the parse code will have to be different. Error handling
workks somewhat differently (you may find errors later than with
At this stage I tend to think that the best solution for the
table size in LR(1) is just to put the entries into a hash table.
This will be slower but should often be fast enough. Eg if you
look at GCC the parse time is negligible as a percentage.
- DR Grammars,
Tim Josling <=