[tern] greetings everyone -- rewriting, rewriting, rewriting.
From: david
Date: Thu, 21 Nov 2002 17:33:32 -0600
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.1) Gecko/20021003

I've changed the introductory message from the mailing list to be more
positive. It is better to define TERN in terms of what it may become
than to define it in opposition to something, especially something
peachy like the perl6 project.

It is better to define ANYTHING in terms of what it is rather than what
it is not. That's generally true.

The idea here is to develop a general purpose rewriting system that can
be used to prototype procedural computer language features. Ideally,
the target of the rewrites might end up as something low-level enough
that TERN could become a competitively efficient compiler system. If
this doesn't happen at first, that is okay.

We (including you, since you joined the mailing list or are reading
this in the archive) are developing a general purpose language
prototyping environment based on the linguistics or AI concept of
"rewriting."

The TERN rewriting system will hopefully allow any procedural computer
language to be described and implemented.

What is rewriting? Rewriting is when you take a statement and transform
it through "rewrite rules" until it is different -- simpler, usually --
and then you work with the result of applying the rule set. It is what
makes the e-mail transfer agent "sendmail" so tricky to configure.
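
To make "transform it through rewrite rules until it is different"
concrete, here is a minimal sketch in Python. Nothing in it is TERN
code: the rule format (pairs of token sequences), the two sample rules,
and the names RULES, rewrite_once and rewrite are all assumptions made
up for illustration.

```python
# Hypothetical rules: (pattern, replacement) pairs over token sequences.
RULES = [
    (("unless",), ("if", "not")),   # "unless X" becomes "if not X"
    (("not", "not"), ()),           # a double negation disappears
]


def rewrite_once(tokens, rules):
    """Apply the first rule that matches anywhere in the token list.

    Returns (new_tokens, True) if a rule fired, else (tokens, False).
    """
    for pattern, replacement in rules:
        n = len(pattern)
        for i in range(len(tokens) - n + 1):
            if tuple(tokens[i:i + n]) == pattern:
                return tokens[:i] + list(replacement) + tokens[i + n:], True
    return tokens, False


def rewrite(tokens, rules, max_steps=1000):
    """Keep rewriting until no rule applies or the step limit is reached."""
    tokens = list(tokens)
    for _ in range(max_steps):
        tokens, changed = rewrite_once(tokens, rules)
        if not changed:
            return tokens
    raise RuntimeError("rule set did not settle; it may not terminate")


print(rewrite(["unless", "not", "ready"], RULES))  # ['if', 'ready']
```

The step limit is there because nothing guarantees that an arbitrary
rule set ever stops firing, which is probably part of the sendmail
experience too.
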
Chomskian linguists maintain that the human mind interprets human
language through rewrite rules, eventually mapping speech to "deep
structures" which represent reality. The TERN project will attempt to
eventually deliver a system for experimenting with computer language
features by altering rewrite rules.

To get there we need to pass several milestones and answer several
questions, such as "what is reality?" That sounds far, far too heavy,
but following the metaphor of Chomskian theory (as alleged in the
previous paragraph) we know that reality is what the deep structures
map to. For TERN's purposes, that reality will be a virtual machine
with a flat coding architecture, key-value mapping, a flexible scalar
data type, and stacks. Or some other small set of features. The
important thing about the target virtual machine is that the feature
set is very small.
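
As a rough picture of how small such a target could be, here is a
sketch of a machine with flat code (a list of instructions), one
key-value store, one stack, and ordinary Python values standing in for
the flexible scalar. The instruction names and the run function are
invented for this example; no TERN instruction set has been decided.

```python
def run(program):
    """Execute a flat list of (opcode, *args) tuples; return the store."""
    stack, store, pc = [], {}, 0
    while pc < len(program):
        op, *args = program[pc]
        pc += 1
        if op == "push":              # push a literal scalar
            stack.append(args[0])
        elif op == "load":            # push the value stored under a key
            stack.append(store[args[0]])
        elif op == "store":           # pop a value and store it under a key
            store[args[0]] = stack.pop()
        elif op == "add":             # pop two values, push their sum
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "jump":            # unconditional jump to an index
            pc = args[0]
        elif op == "jump_if_zero":    # pop a value, jump if it equals 0
            if stack.pop() == 0:
                pc = args[0]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return store


print(run([("push", 2), ("push", 3), ("add",), ("store", "x")]))  # {'x': 5}
```

Six opcodes are already enough for loops and arithmetic, which is the
point: everything richer would have to arrive here through rewriting.
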
Then there are the blocking primitives, which are the syntax of the
language being implemented with the rewrite rules. Most languages in
the Algol family use curly braces; Pascal uses matched "BEGIN/END"
tokens because it was optimized to make it easy to grade. Python uses
indentation. We need to come up with a generic way to describe blocking
-- collecting strings of tokens into blocks -- or at least a standard
place to plug the blocking action into the TERN process. I've been
reading XML documentation for a contract I'm working on, and thinking
that XML makes a perfectly good stupidest-possible, internal-use-only
intermediate code for parsing things into before looking at them in any
greater detail.
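
One possible shape for a pluggable blocking step, sketched in Python
with the standard xml.etree.ElementTree module. The function name
block_tokens, the element names "block" and "tok", and the idea of
passing the open and close tokens as parameters are all assumptions for
illustration. Indentation-based blocking would need a different
plug-in and is not covered here.

```python
import xml.etree.ElementTree as ET


def block_tokens(tokens, open_tok="{", close_tok="}"):
    """Group a flat token list into nested <block> elements."""
    root = ET.Element("block")
    stack = [root]                      # innermost open block is last
    for tok in tokens:
        if tok == open_tok:
            stack.append(ET.SubElement(stack[-1], "block"))
        elif tok == close_tok:
            if len(stack) == 1:
                raise SyntaxError(f"unmatched {close_tok!r}")
            stack.pop()
        else:
            ET.SubElement(stack[-1], "tok").text = tok
    if len(stack) != 1:
        raise SyntaxError(f"unclosed {open_tok!r}")
    return root


# Curly-brace style and BEGIN/END style yield the same kind of tree.
c_style = block_tokens(["if", "x", "{", "y", "}"])
pascal_style = block_tokens(["if", "x", "BEGIN", "y", "END"], "BEGIN", "END")
print(ET.tostring(c_style, encoding="unicode"))
```

Both calls produce the same tree, so whatever comes after the blocking
step does not need to know which surface syntax was used.
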
Once we have grouped our input into blocks of expressions of tokens, or
something else, we are faced with the question of what these blocks
instruct our virtual machine to perform. This is answered by applying
rewrite rules until there is nothing left but low-level primitives
(down to the target level). These are then processed by giving them to
the virtual machine that understands the language they have been
rewritten to.
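
A sketch of "rewrite until only primitives are left", again in Python
and again hypothetical throughout: the nested-tuple tree format, the
PRIMITIVES set, and the two sample rules (for "unless" and "inc") are
made up here and are not a proposal for TERN's actual rule language.

```python
# Node heads the imagined target machine understands directly.
PRIMITIVES = {"if", "not", "assign", "add", "var", "num"}


def rule_unless(node):
    """("unless", cond, body) -> ("if", ("not", cond), body)"""
    if node[0] == "unless":
        _, cond, body = node
        return ("if", ("not", cond), body)
    return None


def rule_inc(node):
    """("inc", name) -> ("assign", name, ("add", ("var", name), ("num", 1)))"""
    if node[0] == "inc":
        _, name = node
        return ("assign", name, ("add", ("var", name), ("num", 1)))
    return None


RULES = [rule_unless, rule_inc]


def lower(node, rules=RULES):
    """Rewrite a tree until every node head is a primitive."""
    if not isinstance(node, tuple):
        return node                     # leaf: a name or a number
    changed = True
    while changed:                      # rewrite this node to a fixed point
        changed = False
        for rule in rules:
            new = rule(node)
            if new is not None:
                node, changed = new, True
                break
    if node[0] not in PRIMITIVES:
        raise ValueError(f"no rule lowers {node[0]!r} to primitives")
    # Then lower the children of the rewritten node.
    return (node[0],) + tuple(lower(child, rules) for child in node[1:])


tree = ("unless", ("var", "done"), ("inc", "n"))
print(lower(tree))
```

Adding a language feature in this picture means adding one more
function to RULES; the target machine is left alone.
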
It is also possible to defer understanding of what something is
supposed to mean until the knowledge is required. It is also possible
to tag additional meaning (such as "only ever used in numeric
operations" or "this will always be a method call on an object of type
socket") onto tokens and do some bindings earlier, or later, than is
done with other languages. How to do this is still up in the air.

How This All Interacts With Perl:
Early TERN implementations will be written in Perl, as "source filters"
that interpret Perl and produce an equivalent subset of Perl as a
target virtual language. Slightly later implementations might produce
Inlineable C blocks. Or simply long C programs.

The idea of "compiler as a set of rewrite rules" provides a flexible
compiler paradigm, and TERN might become a front-end to the GCC
compiler system.

Once you have a compiler that uses an external set of explicit rewrite
rules rather than procedural compilation, modifying aspects of the
modeled languages may be easy. Want a new feature? Write a rewrite rule
to provide it.

How We Got Here
Consider the problem of implementing coroutines in Perl. A coroutine is
a special kind of subroutine that saves state within it between calls.
In OO systems, you can set this kind of thing up explicitly by
initializing a new object of some kind and then repeatedly calling a
result-generating method of that object. This works fine and there is
no problem with it, but sometimes intrepid individuals like Damian
Conway or Uri Guttman would actually find it preferable to just throw
"yield $x" in there instead of "return $x" and have the state of all
variables local to the routine saved somehow and have execution pick up
at the next statement following the "yield" the next time the routine
is called.

"Action at a distance!" the critics wail, those that understand the
implications at least, and they are right. Well, you only have
action-at-a-distance (AAAD) problems if there's one stash per routine,
because a routine might get tickled from two unrelated threads. So the
AAAD problems can be resolved by keeping a set of stashes keyed by the
source of the invocation. The problem is, in order to get all that
information in Perl, you have to wrap your routine in a closure,
including an exit/entry point at each yield statement. In effect, you
have to provide the initialize-and-later-generate paradigm, but hide it
all. And there were some other negative ramifications of using closures
that I don't recall right now, but the end result was that in order to
provide a "yield" (via source filter) that works correctly, the source
filter has to understand and reimplement a considerable set of blocking
and conditional statements.
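
To show what "hide the initialize-and-generate paradigm in a closure"
might look like, here is a hand-written Python sketch of roughly what a
filter would have to generate for a routine that yields 0, 1, 2, ... on
successive calls. It is not output from any real source filter. The
names make_counter and resume_at are invented, and passing caller_id
explicitly is an assumption standing in for however the real thing
would identify the source of the invocation.

```python
def make_counter():
    """Closure standing in for a routine whose body contains a yield."""
    stashes = {}   # saved local state, one stash per invocation source

    def routine(caller_id):
        # Fetch this caller's stash, creating it on the first call.
        state = stashes.setdefault(caller_id, {"resume_at": 0, "n": 0})
        if state["resume_at"] == 0:
            # Entry point 0: first call, run up to the yield.
            state["resume_at"] = 1
            return state["n"]
        # Entry point 1: resume just after the yield, loop back to it.
        state["n"] += 1
        return state["n"]

    return routine


counter = make_counter()
print(counter("thread-a"), counter("thread-a"))  # 0 1
print(counter("thread-b"))                       # 0
print(counter("thread-a"))                       # 2
```

Because each caller gets its own stash, the call from "thread-b" does
not disturb the count seen by "thread-a". Even this toy needs an
explicit entry point per yield; a routine with yields inside loops and
conditionals needs the control flow around them rewritten as well.
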

So if you have to parse and rewrite to provide one feature, why not
make the general problem Parse And Rewrite and see what other problems
can be solved from that perspective?

My ride is waiting so I have to run along, but that's more or less what
it's about; awaiting comments and critiques. Eventually we'll all have
enough spare time to do EVERYTHING.