
Re: [Axiom-developer] Lazyness in lexing Boot and Spad


From: Waldek Hebisch
Subject: Re: [Axiom-developer] Lazyness in lexing Boot and Spad
Date: Sun, 25 Feb 2007 22:08:02 +0100 (CET)

> 
> Hello,
> 
>   Axiom's implementation of the Boot and Spad source includer
> (src/boot/btincl2.boot, src/interp/cstream.boot) uses laziness (faked
> through the function "Delay").  This is also variously known as a zipper.
> My question is: how much does the complexity introduced by faking
> laziness actually outweigh the benefits?  Is the laziness actually
> beneficial?
> 

AFAICS the main potential benefit of laziness is that it allows
writing the algorithm as if it operated on the whole file, while still
having an "interactive" parser -- that is, a parser which stops reading
its input when it sees the end of a syntactic construct.
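
To make this concrete, here is a minimal Python sketch (illustrative
only -- the names delay, lazy_lines, read_one_construct and the
blank-line rule are made up, not Axiom's actual API) of the kind of
Delay-faked laziness cstream.boot implements: the tail of the stream is
a thunk, so a consumer that stops at the end of a construct never reads
the rest of the file.

import io

def delay(thunk):
    # Wrap a zero-argument function so evaluation is postponed and cached.
    cell = {"forced": False, "value": None}
    def force():
        if not cell["forced"]:
            cell["value"] = thunk()
            cell["forced"] = True
        return cell["value"]
    return force

def lazy_lines(fileobj):
    # A lazy stream of lines: (head, delayed-tail) pairs, or None at EOF.
    line = fileobj.readline()
    if not line:
        return None
    return (line.rstrip("\n"), delay(lambda: lazy_lines(fileobj)))

def read_one_construct(stream):
    # An "interactive" consumer: stops forcing the stream as soon as it
    # sees the end of a construct, so later input is never read.
    collected = []
    while stream is not None:
        head, tail = stream
        collected.append(head)
        if head.strip() == "":     # pretend a blank line ends a construct
            break
        stream = tail()            # force the next cell only now
    return collected

demo = lazy_lines(io.StringIO("a := 1\nb := 2\n\nnever_read := 3\n"))
print(read_one_construct(demo))   # ['a := 1', 'b := 2', ''] -- the rest stays unread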

This potential benefit is completely lost in Axiom: the command
line parser requires that the whole construct be on a single logical
line (that is, a single line after joining all continued lines).
I can guess that the command line parser is non-interactive due to
backtracking, and also because the whitespace syntax lacks explicit
end-of-construct markers (for example, the parser sees that a procedure
has ended only when it sees the next procedure or the end of the file).
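
For illustration, a small sketch (assuming the trailing-underscore
continuation convention; the helper name is made up) of what joining
continued physical lines into a single logical line means:

def join_continuations(physical_lines):
    # Join physical lines into logical lines, treating a trailing '_'
    # as a continuation marker.
    logical = []
    buffer = ""
    for line in physical_lines:
        stripped = line.rstrip()
        if stripped.endswith("_"):
            buffer += stripped[:-1]      # drop the marker, keep accumulating
        else:
            logical.append(buffer + line)
            buffer = ""
    if buffer:
        logical.append(buffer)
    return logical

print(join_continuations(["f(x) == x +_", "      1", "g(x) == x"]))
# ['f(x) == x +      1', 'g(x) == x']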

I must say that this whole pipeline:

scan.boot.pamphlet pile.boot.pamphlet pf2sex.boot.pamphlet
pf2atree.boot.pamphlet parini.boot.pamphlet packtran.boot.pamphlet
macex.boot.pamphlet intfile.boot.pamphlet incl.boot.pamphlet
dq.boot.pamphlet cstream.boot.pamphlet cparse.boot.pamphlet
astr.boot.pamphlet

is on my list of candidates for removal.  I would like to reuse
parts of this code, but I believe that what we need can be done
more simply.  However, rewriting this part is of low priority for me,
and I probably will not get to it before summer.

My vision of the future compiler is (a schematic sketch follows the list):
- a classical lexer which generates basic syntax tree nodes (including
  line number info)
- a first-stage parser (building the surface syntax tree) accepting a very
  loose grammar
- a macroexpander
- a second-stage parser (flattening trees and checking syntactic
  constraints which can be checked only after macro expansion)
- a typechecker, probably multipass.  In the first few passes I would only
  consider interface parts; the final pass would also check bodies
  (and resolve overloading)
- final code generation
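
Schematically (every name below is made up and the stage bodies are
placeholders -- only the staging is the point), the phases would
compose like this:

def lex(text):            return text.split()       # tokens (would carry line info)
def parse_surface(toks):  return ("surface", toks)   # stage 1: very loose grammar
def macroexpand(tree):    return ("expanded", tree)  # macros before strict checks
def parse_strict(tree):   return ("strict", tree)    # constraints checkable only
                                                     # after macro expansion
def typecheck(tree):      return ("typed", tree)     # multipass: interfaces first,
                                                     # bodies/overloading last
def emit_code(tree):      return repr(tree)          # final code generation

def compile_file(source_text):
    tree = parse_surface(lex(source_text))
    tree = parse_strict(macroexpand(tree))
    return emit_code(typecheck(tree))

print(compile_file("f x == x + 1"))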

Some remarks on why this structure: I want macro expansion rather
early, because before macro expansion it is impossible to check
for syntactic correctness.  Also, full syntactic information helps
with typechecking (the current parser has it backwards: it uses
type information for parsing, which sometimes leads to weird errors).

Also, the current compiler couples code generation with type checking,
which is potentially wasteful.  Namely, in some cases the compiler
has to do a backtracking search over possible type assignments, so
it generates code and then discards it during backtracking (I am
not sure how serious this problem is -- most of the time backtracking
is very limited).

During compilation, in some cases (IIRC in chaseInferences) the compiler
uses a recursive traversal of a DAG (ignoring the identity of nodes),
which leads to about 3 minutes spent on a single file (somewhat
surprisingly, the particular procedure I found shows pathological
behaviour on only one file, taking only a little time on the others).
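
Illustrative only (this is not the actual chaseInferences code): a
naive recursive walk re-visits shared subtrees on every path that
reaches them, while keying a memo table on node identity visits each
node once.

class Node:
    def __init__(self, children=()):
        self.children = children

def size_naive(node):
    # Exponential on DAGs with heavy sharing: shared nodes are re-walked
    # as if they were fresh subtrees.
    return 1 + sum(size_naive(c) for c in node.children)

def size_memoized(node, seen=None):
    # Linear: each node is visited once, keyed on its identity.
    if seen is None:
        seen = set()
    if id(node) in seen:
        return 0
    seen.add(id(node))
    return 1 + sum(size_memoized(c, seen) for c in node.children)

# A small DAG with heavy sharing: each level points twice at the one below.
level = Node()
for _ in range(25):
    level = Node((level, level))

print(size_memoized(level))   # 26 distinct nodes, answered instantly
# size_naive(level) would make roughly 2**26 recursive calls on those
# same 26 nodes.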

I believe that having typechecking as a separate phase operating
essentially on the whole program would allow better search techniques
and would avoid many time sinks (currently many global data structures
are invalidated after the compilation of any constructor, which is
probably responsible for the nonlinear compile times of multifile
compilations).

-- 
                              Waldek Hebisch
address@hidden 



