Re: Parsing a language with optional spaces

From: Christian Schoenebeck
Subject: Re: Parsing a language with optional spaces
Date: Wed, 08 Jul 2020 12:15:49 +0200

On Wednesday, 8 July 2020 06:24:13 CEST Akim Demaille wrote:
> > As I don't speak BASIC, let me rephrase this problem in FORTRAN IV which
> > is also "blank agnostic":
> > 
> > DO <number> <variable> = <expression> , <expression> [, <expression>]
> > 
> > It is not until you reach the comma after the first expression that you
> > know whether the statement is the beginning of a loop or it is an
> > assignment.  And the expression can contain commas in function calls,
> > which defeats any trivial lookahead scanning.  E.g.,
> > 
> > D O 17 6PQ R=FUN X(1 4, V 8)
> > 
> > is an assignment to variable DO176PQR.  The function arguments can also be
> > expressions that contain function calls.
> > 
> > As you can see, this more or less defeats any attempt to write a lex
> > scanner.  And you cannot just squeeze out all blanks in a front end
> > because "Hollerith fields" can contain blanks that are significant (must
> > remain).
> I still think you can address this case with Flex, but I agree it's
> going to be painful.  I would go for something like
> sp   [ \t]*
> do   D{sp}O
> id   [a-zA-Z]({sp}[a-zA-Z_0-9]+)*

do 10 i = 1, n

would then be interpreted as an assignment to the variable 'do10i',
although it is actually a loop definition.
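
To make the problem concrete, here is a rough sketch in Python rather than
Flex. It translates the quoted 'do' and 'id' definitions into regular
expressions by hand and picks the longest match, as Flex does (the earlier
rule wins a tie). The helper name longest_match is made up for this
illustration, and matching 'do' case-insensitively is my assumption so that
the lowercase example applies:

```python
import re

SP = r"[ \t]*"
# Hand translation of the quoted definitions; {sp} is inlined.
# Assumption: keywords are matched case-insensitively.
DO = re.compile(rf"D{SP}O", re.IGNORECASE)
ID = re.compile(rf"[a-zA-Z](?:{SP}[a-zA-Z_0-9]+)*")

def longest_match(line):
    """Return (rule, text) for the rule with the longest match at the
    start of the line. Assumes the line starts with a letter."""
    candidates = [("DO", DO.match(line)), ("ID", ID.match(line))]
    matched = [(name, m) for name, m in candidates if m]
    # max() keeps the first maximal entry, so DO wins a tie with ID.
    name, m = max(matched, key=lambda pair: pair[1].end())
    return name, m.group(0)

print(longest_match("do 10 i = 1, n"))  # ('ID', 'do 10 i')
```

The 'id' rule consumes 'do 10 i' as one token, so the 'do' rule never gets
a chance on this line.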

So yes, you could certainly make this work correctly with Flex by taking
additional measures, but I think both the Fortran and BASIC examples could
be solved much more easily (with less complexity) and more elegantly with a
monolithically combined parser-scanner, as the parser could then detect
keywords out of the box depending on the grammar context.
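
As a very rough sketch of that idea (plain Python, not an actual combined
parser-scanner, and not Bison or Flex code), the decision "is DO a keyword
here?" can be made from the structure of the whole statement instead of by
a separate tokenizer. The function names classify and has_top_level_comma
are made up for this illustration, and the sketch assumes statements
without Hollerith fields or string literals, so that blanks can simply be
dropped:

```python
import re

def has_top_level_comma(expr):
    """True if expr contains a comma outside of any parentheses."""
    depth = 0
    for ch in expr:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            return True
    return False

def classify(stmt):
    """Return ('loop', label, var) or ('assign', var).

    Assumption: no Hollerith fields or string literals, so all blanks
    are insignificant and can be removed up front.
    """
    s = re.sub(r"[ \t]", "", stmt).upper()
    # DO <number> <variable> = <expression> , <expression> ...
    m = re.fullmatch(r"DO(\d+)([A-Z][A-Z0-9]*)=(.+)", s)
    if m and has_top_level_comma(m.group(3)):
        return ("loop", m.group(1), m.group(2))
    # Otherwise everything left of '=' is the assigned variable.
    name, _, _ = s.partition("=")
    return ("assign", name)

print(classify("do 10 i = 1, n"))                # ('loop', '10', 'I')
print(classify("D O 17 6PQ R=FUN X(1 4, V 8)"))  # ('assign', 'DO176PQR')
```

Here 'DO' only becomes a keyword once the rest of the statement fits the
loop form; the comma inside the function call does not count because it is
nested in parentheses.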

Best regards,
Christian Schoenebeck
