Re: Two pass scanning

help-flex

From:	Hans Aberg
Subject:	Re: Two pass scanning
Date:	Mon, 24 Nov 2003 00:29:50 +0100

(Please reply to the Help-Flex list.)
At 21:01 +0100 2003/11/23, Henrik Sorensen wrote:
>For the gnu pl1 project, I am investigating various approaches to perform two
>pass scanning and parsing.
>
>Besides the obvious choice of reading the source files twice, have other
>approaches for two pass scanning been tried?
>
>This is what I had in mind:
>First pass scans the source code and stores tokens and strings away, before
>returning to the parser. The second pass reads the stored tokens instead of
>the actual source code.

This only works in languages in which the token types do not depend on the
grammar context, which is not the case in many languages. One example is a
language that lets one define identifiers of different grammatical types,
types which must later be returned to the parser. The grammar rule then
determines the token type of the identifier and puts it in a lookup table,
which the lexer reads. There are also more complicated context switches in
use (which in Flex can be handled by start conditions).
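
As a rough illustration of the lookup table feedback, here is a minimal
sketch in C. All names in it (declare_symbol, classify_identifier, the
TOK_* codes, the table sizes) are hypothetical and chosen only for this
example; a real Bison grammar would generate its own token codes.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical token codes; Bison would normally generate these. */
enum token_kind { TOK_IDENTIFIER = 258, TOK_TYPE_NAME, TOK_PROC_NAME };

#define MAX_SYMBOLS 64
#define MAX_NAME_LEN 32

struct symbol {
    char name[MAX_NAME_LEN];
    enum token_kind kind;
};

static struct symbol symbols[MAX_SYMBOLS];
static int symbol_count = 0;

/* Called from a grammar action once a declaration has been recognized. */
void declare_symbol(const char *name, enum token_kind kind)
{
    assert(symbol_count < MAX_SYMBOLS);
    assert(strlen(name) < MAX_NAME_LEN);
    strcpy(symbols[symbol_count].name, name);
    symbols[symbol_count].kind = kind;
    symbol_count++;
}

/* Called from the lexer's identifier rule to choose the token type. */
enum token_kind classify_identifier(const char *name)
{
    /* Search newest first so a later declaration shadows an earlier one. */
    for (int i = symbol_count - 1; i >= 0; i--) {
        if (strcmp(symbols[i].name, name) == 0)
            return symbols[i].kind;
    }
    return TOK_IDENTIFIER; /* not declared yet: plain identifier */
}
```

In a Flex identifier rule the action would then be something like
"return classify_identifier(yytext);". The same spelling gets a different
token type before and after its declaration has been parsed, which is why
tokens stored away in a first pass cannot simply be replayed.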

>Would the YY_BREAK macro be the right place to add code that saves the
>tokens and strings during the first pass before returning to the parser?

I have used limited token lookahead in order to hand over something to the
parser that looks like LALR(1) (which Bison uses to generate parsers). The
tokens that the lexer discovers are then put in a FIFO (first in, first out)
pipe, and scanning continues with no "return" in the Flex rules until the
lookahead has finished. The Flex lexer must then have a code segment right
after the %% that starts the rules section, which checks whether the pipe is
non-empty. If it is non-empty, the first token is taken off the pipe and
returned to the parser; if it is empty, token scanning proceeds normally. If
one uses this technique, one must make sure it does not screw up any context
switches.
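
The pipe itself can be sketched in plain C, roughly as below. This is only
an illustration under stated assumptions: raw_scan() is a hypothetical
stand-in for the Flex rules that reads tokens from a fixed array, and
look_ahead(), next_token() and the token_queue_* helpers are names invented
for this example, not part of Flex or Bison.

```c
#include <assert.h>

#define QUEUE_CAPACITY 16

/* Fixed-size ring buffer holding token codes in FIFO order. */
static int queue[QUEUE_CAPACITY];
static int head = 0, count = 0;

int token_queue_empty(void) { return count == 0; }

void token_queue_push(int token)
{
    assert(count < QUEUE_CAPACITY);
    queue[(head + count) % QUEUE_CAPACITY] = token;
    count++;
}

int token_queue_pop(void)
{
    assert(count > 0);
    int token = queue[head];
    head = (head + 1) % QUEUE_CAPACITY;
    count--;
    return token;
}

/* Hypothetical stand-in for the Flex rules: yields tokens from a
   fixed array and keeps returning 0 once the end is reached. */
static const int demo_input[] = { 10, 11, 12, 0 };
static int demo_pos = 0;

static int raw_scan(void)
{
    int token = demo_input[demo_pos];
    if (token != 0)
        demo_pos++;
    return token;
}

/* Scan up to n tokens ahead, queueing them instead of returning them. */
void look_ahead(int n)
{
    for (int i = 0; i < n; i++) {
        int token = raw_scan();
        if (token == 0)
            break;
        token_queue_push(token);
    }
}

/* What the parser calls: drain the pipe first, then scan normally. */
int next_token(void)
{
    if (!token_queue_empty())
        return token_queue_pop();
    return raw_scan();
}
```

In an actual Flex file, the check at the top of next_token() would sit in
the indented code right after the first %%, since Flex runs that code on
every entry to the scanning routine. Any start condition changed during the
lookahead would still need separate care.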

  Hans Aberg
