help-bison

Re: Changing the lookahead token


From: David Kastrup
Subject: Re: Changing the lookahead token
Date: Mon, 17 Sep 2012 11:15:03 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.2.50 (gnu/linux)

Hans Aberg <address@hidden> writes:

> Hi David!
>
> On 16 Sep 2012, at 10:51, David Kastrup wrote:
>
>> But that's just guessing.  Are there any hard or soft criteria about
>> when it may or may not be allowed to pull the lookahead token out from
>> under Bison and put something else there?
>
> You might look at the push parser, which allows one to call it
> whenever a token is available. Then LALR(1) compacts the states so
> that additional reductions may be called before an error is issued;
> possibly something similar may happen in your case.

I am not saying that anything is happening in my case: I am _asking_
what will happen.  I am not keen on investing weeks of work reworking
the grammar only to find out that my approach was doomed from the
start.  It is clear that the lookahead token does not creep into the
stack.  It is also clear that it is responsible for causing reductions,
and that those reductions have actions which I can use to swap the
lookahead token.  The documentation section "Action features" states:

 -- Variable: yychar
     Variable containing either the lookahead token, or `YYEOF' when the
     lookahead is the end of the input stream, or `YYEMPTY' when no
     lookahead has been performed so the next token is not yet known.
     Do not modify `yychar' in a deferred semantic action (*note GLR
     Semantic Actions::).  *Note Lookahead Tokens: Lookahead.

"Do not modify `yychar' in a deferred semantic action." implies that it
is possible to modify yychar in a normal semantic action.  However, it
is quite unclear what the effects of such a modification will be.

In particular, it is not clear whether the value of yychar has already
had some lasting effect on the parser state that would preclude some
modifications.

> One way around it is to try to rewrite the grammar, so that the tokens
> are not needed.

Not possible since the respective arguments are not delimited and may
continue indefinitely.

c   c-.   c-.-^

are all valid music arguments.  I therefore need the lookahead token to
decide where the argument ends, while the lexical class of the result is
only decided by executing user-definable code.  In effect, I need to
push information in front of the lookahead token.  I have a stack for
that in the lexer and already use it in cases where I don't have a
lookahead token.  The question is whether I can also push the lookahead
token there when it has already triggered reductions.
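
For concreteness, the lexer-side stack could look roughly like the
following C sketch.  All names here (struct lexer, lexer_push,
lexer_next, the TOK_* codes) are hypothetical and chosen for
illustration only; none of them come from the real lexer or from Bison.
The sketch shows only the pushback mechanism, and says nothing about
whether Bison tolerates having its lookahead replaced.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token codes, for illustration only. */
enum { TOK_EOF = 0, TOK_NOTE = 1, TOK_DASH = 2, TOK_DOT = 3, TOK_MUSIC_ARG = 4 };

#define PUSHBACK_MAX 16

struct lexer {
    const int *input;           /* underlying tokens, terminated by TOK_EOF */
    size_t pos;                 /* index of the next unread input token */
    int pushback[PUSHBACK_MAX]; /* tokens to hand out before reading input */
    size_t depth;               /* number of tokens currently pushed back */
};

static void lexer_init(struct lexer *lx, const int *input)
{
    lx->input = input;
    lx->pos = 0;
    lx->depth = 0;
}

/* Queue a token so that it is returned before any unread input.
   The stack is LIFO: the token pushed last is returned first. */
static void lexer_push(struct lexer *lx, int tok)
{
    assert(lx->depth < PUSHBACK_MAX);
    lx->pushback[lx->depth++] = tok;
}

/* Return the next token: pushed-back tokens first, then the input.
   Once the input is exhausted, TOK_EOF is returned on every call. */
static int lexer_next(struct lexer *lx)
{
    if (lx->depth > 0)
        return lx->pushback[--lx->depth];

    int tok = lx->input[lx->pos];
    if (tok != TOK_EOF)
        lx->pos++;
    return tok;
}
```

With such a stack, an action that wanted to insert a reclassified token
in front of a token it had already read would first push that token and
then push the reclassified one, so that the reclassified token comes out
first.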

> Another might be to use LR(1) which makes sure actions are applied as
> dictated by the tokens.

I want to avoid going there.  There is also the possibility of calling
the parser recursively and keeping the outer instance free from
lookahead.  However, the YYLTYPE and YYSTYPE variables have non-zero
initialization cost, so setting up and obliterating automatic arrays of
them (the stack) frequently is costly.

So it would have been nice to know when one can modify yychar and with
what effect.  And since it is documented when you should _not_ modify
it, apparently the person writing the documentation had some idea about
the circumstances under which modifying it is feasible.

-- 
David Kastrup


