Re: [MIT-Scheme-users] associating environments with Scheme file buffers

From: Taylor Campbell
Subject: Re: [MIT-Scheme-users] associating environments with Scheme file buffers in Edwin
Date: Fri, 13 Jan 2006 00:02:09 -0000
User-agent: IMAIL/1.21; Edwin/3.116; MIT-Scheme/7.7.90.+

   Date: Wed, 14 Sep 2005 16:16:20 -0400
   From: Chris Hanson <address@hidden>

      Date: Wed, 14 Sep 2005 05:36:05 +0000 (UTC)
      From: Taylor Campbell <address@hidden>

      Can you elaborate on the ideas you have?

   There are a few principles I have in mind.  These seem to be almost
   orthogonal to what you have been concerned about, though I'll have to
   read your document more carefully and think about it.

   * The primary function of the module system is to link code together,
     by associating names in different code fragments.


   * Names are linked together by sharing a value cell.

What about macros?  Also, when you say 'sharing a value cell,' do you
imply that mutation is shared, and that it can be effected by either
the defining or using modules?  I think that prohibiting assignment to
imported variables *dramatically* improves reasoning about code, and
enables, among other things, compiler optimizations that would
otherwise be disallowed because the constancy of variables' values and
their properties could not be relied upon.  (Actually, I'm of the
opinion that variables should be immutable anyway and that mutation
should be only of data, i.e. 'mutable variables' would be expressed by
variables whose values are mutable cells.  But this is not a popular
opinion in the Scheme world, I think.)
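
To illustrate that last point, here is a minimal sketch in portable
Scheme (SRFI 9 records) of a variable that is never assigned but whose
value is a mutable cell.  The names MCELL, COUNTER, and
INCREMENT-COUNTER! are invented for illustration and belong to no
existing system.

```scheme
;; Hypothetical cell type; it stands in for whatever cells a system provides.
(define-record-type mcell
  (make-mcell contents)
  mcell?
  (contents mcell-ref mcell-set!))

;; "Defining module": COUNTER is bound once and never assigned with SET!.
(define counter (make-mcell 0))

;; "Using module": it mutates the datum held by the variable, not the
;; variable's binding, so COUNTER itself can be treated as a constant.
(define (increment-counter!)
  (mcell-set! counter (+ (mcell-ref counter) 1))
  (mcell-ref counter))
```

Under this discipline a compiler may assume that COUNTER always denotes
the same cell, while the cell's contents remain free to change.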

   * Top-level environments are an internal artifact of the module
     system.  They need not exist at run time, and the debugger may or
     may not choose to present them to the user.

Right.  Of course, they should still be provided for meta-code, i.e.
code that reasons about other code, such as code that invokes the
syntaxer or compiler.

   * An "interface" is essentially an annotated set of names.  (See the
     attached document "modules.html".)  The annotation provides
     information about the values associated with the names, for the use
     of the compiler and other tools.


   * A code fragment has an associated interface that is derived from the
     source code.  This interface can be transformed according to some
     simple rules (see "modules.html"), either by declarations in the
     source or externally.  The code fragment, along with its (possibly
     transformed) interface is an "implementation".

I don't entirely agree with this design here.  I have found it useful,
in Scheme48, to provide interfaces with multiple implementations (i.e.
define some interface with DEFINE-INTERFACE and then define several
modules (called structures in Scheme48) with that interface) and also
to offer multiple interfaces to a single implementation (using the
DEFINE-STRUCTURES (plural) form).  In Scheme48, then, there is no
unique interface associated with a code fragment (which corresponds, I
think, with what Scheme48 calls packages); rather, a structure is a
duple of an interface and a package, and there may be any number of
structures with a shared interface & varying package or a shared
package & varying interface.
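
As a sketch of both arrangements in Scheme48's configuration language
(the interface, structure, and file names are invented for
illustration):

```scheme
;; One interface...
(define-interface queue-interface
  (export make-queue enqueue! dequeue! queue-empty?))

;; ...with two implementations: two packages behind the same interface.
(define-structure list-queues queue-interface
  (open scheme)
  (files list-queue))

(define-structure vector-queues queue-interface
  (open scheme)
  (files vector-queue))

;; One package offered through two different interfaces.
(define-interface queue-internals-interface
  (export queue-contents))

(define-structures ((queues queue-interface)
                    (queue-internals queue-internals-interface))
  (open scheme)
  (files queue))
```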

   * Specification, transform, and other manipulations of interfaces are
     a function of the linker.  There should be a linking language that
     supports these activities.

Yes; this corresponds, I believe, with Scheme48's config language.

   Here is my quick take on your document:

   (2) Environment control

           yes, in general, though I don't agree with the specifics

Can you elaborate on this a bit?

   (3) Compilation control & macro semantics

           I see this as orthogonal to the problem of naming/linking

I no longer remember what I was referring to with compilation control,
but phase-separated macro semantics are very, very important in the
module system, since the module system controls lexical environments,
and the visibility of names and their values for run-time code, macro
code, macro's macro code, &c., must be clearly specified.  Matthew
Flatt wrote a fairly influential paper on this, though I disagree with
the specifics of his module system, whereby every module must be
instantiated for every phase at which it is used.  Scheme48 gets this
right, in my opinion: there is only one instantiation of every module,
but the environment and evaluation of each storey in the tower of
phases of any single module is clearly isolated.  I'm not sure whether
Scheme48's official manual describes the mechanism, but mine, which is
at <>, does; I recommend that
you read the node 'Macros in concert with modules' if you're curious
about Scheme48's design & rationale in this area.
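
For a concrete flavour of the separation, a Scheme48 structure declares
what its run-time code may see separately from what its macro
transformers may see, using a FOR-SYNTAX clause.  A sketch, with
invented structure and file names:

```scheme
(define-structure frob-macros (export (define-frob :syntax))
  ;; Names visible to the run-time code of this package.
  (open scheme)
  ;; Names visible to the macro transformers, one storey up the tower.
  (for-syntax (open scheme frob-helpers))
  (files frob-macros))
```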

   (4) Feature control/system distribution

           I don't entirely understand this one

Since I wrote module.text, I've realized that this is a separate
matter which can be designed in isolation, so never mind about this.

   (7) Abstraction at the module level: parameterized modules

           umm, I think so.  But I'd probably deal with this by
           programming in the linking language.

Yes, this would be provided by the module language.  For example,
Scheme48 provides a DEFINE-MODULE form in its configuration language:

  (define-module (make-foo bar)
    (define-structure foo foo-interface
      (open scheme bar ...)
      (files foo))
    foo)

One can then instantiate foos with (DEF FOO/FROB (MAKE-FOO FROB)).

   (8) Mutual reference

           mutual reference is certainly required, but I'm not sure a
           theory is needed.

I think it's not really straightforward how mutual reference would
interact with a hygienic macro system, particularly one with clear
phase semantics.  I haven't thought very hard about the issue, though,
and I usually try to avoid mutually referential modules anyway (even
where it is supported, like in T or MIT Scheme; though it's not in
Scheme48), which is why I just tossed in the vague comment 'a theory
must be developed.'

   (9) Purely declarative language

           maybe.  I'd have to decide this one after playing with it for
           a while.

I think it is important to be able to describe the organization of a
program in a declarative manner.  This, as Jonathan Rees & Richard
Kelsey have attested, has helped to improve the organization of the
Scheme48 system, and it also allows for tools outside of the Scheme
system to analyze large systems' organizations.  MIT Scheme's package
system already satisfies this, by the way, I believe, at least in the
.pkg files, though information about syntaxation & compilation is not
declarative in the same way.
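
For instance, a .pkg description along the following lines states each
package's files, parent, and exports without any executable code.  The
package and file names are invented; only the form names are MIT
Scheme's.

```scheme
;; Hypothetical top-level package for an application.
(define-package (my-app)
  (files "main")
  (parent ()))

;; Hypothetical subpackage that exports three names to its parent.
(define-package (my-app queues)
  (files "queue")
  (parent (my-app))
  (export (my-app)
          make-queue
          enqueue!
          dequeue!))
```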

   (11) Language generality & reader customization

           more or less.  the linker operates on binary objects, not
           source, so some of this is moot.

I'm not really sure what I had in mind when I wrote that one, so I
have nothing further to say about it, other than that the reader
customization bit relates to the integration between the module system
and code processors.

   (13) File system interaction

           I don't have a problem with modules consisting only of whole
           files, as long as they can contain multiple files.  But since
           the linker deals with binary objects, that's a separable
           issue.  If the compiler wants to support multiple binaries
           from a single file, the linker doesn't need to know.

What I meant by this is simply that the module system should not be
imposed upon by the layout of the file system: some modules might be
implemented in multiple files, while others might have no files at
all.  This is already the case in MIT Scheme's & Scheme48's module
systems.  However, the module system should also be integrated with
code transducers like compilers, i.e. it shouldn't be necessary to
manually invoke the compiler (or syntaxer) on files, but it should
rather accept modules (or packages, or whatever the term is) and
process the source pointed to by their associated filenames.  This is
the case in Scheme48, but not, as far as I can tell, MIT Scheme.

   (14) No meta-modular complexity

           I don't understand this.

By 'meta-modular complexity,' I mean writing something like this in
Scheme48's module system:

  (define-structure foo-structure (export (foo :structure))
    (open module-system built-in-structures srfi-structures)
    (begin (define-interface foo-interface (export ...))
           (define-structure foo foo-interface
             ...)))

For small systems, this makes for rather a lot of boilerplate
overhead; it should generally be avoided if possible.

...and now for something completely different:

      Sorry, perhaps I ought to have clarified there: I was referring to a
      general -*- line parser for any Edwin library to utilize, not just one
      more specialized local variable (mode) that Edwin internally happens
      to recognize.  E.g., there might be a (DEFINE-*-LINE-HANDLER name
      procedure) procedure, with which one might implement the 'package' (or
      'environment') local variable like so: (DEFINE-*-LINE-HANDLER 'PACKAGE

   That sounds reasonable.

Various things have distracted me about this since I brought it up on
the list; today I finally got around to writing an implementation of
this, which is at <>,
lightly tested.  I changed it from 'package' to 'mit-scheme-package'
because, as I learned recently, giving buffer-local variables names
like 'package' is a bad idea, at least for Emacs, in case you want to
write code like

  (defun foo (buffer package)
    (with-current-buffer buffer ...package...))

This doesn't affect Edwin, but it makes things painful for elisp-based
Emacsen.  (Actually, I imagine -*- Mode: Scheme; Package: (EDWIN) -*-
& the like would be OK for Emacs, because (except for the mode) all
buffer-local variables in Emacs are case-sensitive, and it looks
nicer, too.  But that is a minor matter.)
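
To show the shape of the idea, here is a rough sketch in portable
Scheme.  It is not the implementation mentioned above and uses none of
Edwin's actual procedures: DEFINE-*-LINE-HANDLER is the hypothetical
interface proposed earlier, and every helper below is invented for the
example.  Handlers receive the raw value string; a real handler would
presumably READ it and look up the package.

```scheme
;; Hypothetical registry: an alist of (name . procedure).
(define *handlers* '())

(define (define-*-line-handler name procedure)
  (set! *handlers* (cons (cons name procedure) *handlers*)))

;; Index of the first occurrence of PATTERN in S at or after START, or #f.
(define (find-substring pattern s start)
  (let ((plen (string-length pattern))
        (slen (string-length s)))
    (let loop ((i start))
      (cond ((> (+ i plen) slen) #f)
            ((string=? pattern (substring s i (+ i plen))) i)
            (else (loop (+ i 1)))))))

;; Split S at each occurrence of the character DELIMITER.
(define (split-at-char s delimiter)
  (let loop ((chars (string->list s)) (field '()) (fields '()))
    (cond ((null? chars)
           (reverse (cons (list->string (reverse field)) fields)))
          ((char=? (car chars) delimiter)
           (loop (cdr chars) '()
                 (cons (list->string (reverse field)) fields)))
          (else (loop (cdr chars) (cons (car chars) field) fields)))))

;; Remove leading and trailing spaces from S.
(define (trim-spaces s)
  (let loop ((start 0) (end (string-length s)))
    (cond ((and (< start end) (char=? (string-ref s start) #\space))
           (loop (+ start 1) end))
          ((and (< start end) (char=? (string-ref s (- end 1)) #\space))
           (loop start (- end 1)))
          (else (substring s start end)))))

(define (downcase s)
  (list->string (map char-downcase (string->list s))))

;; Parse "-*- Name: value; ... -*-" into an alist of (name . value)
;; strings with names downcased; return '() if there is no such section.
(define (parse-*-line line)
  (let* ((start (find-substring "-*-" line 0))
         (end (and start (find-substring "-*-" line (+ start 3)))))
    (if (not end)
        '()
        (let loop ((fields (split-at-char (substring line (+ start 3) end)
                                          #\;))
                   (result '()))
          (if (null? fields)
              (reverse result)
              (let* ((field (car fields))
                     (colon (find-substring ":" field 0)))
                (loop (cdr fields)
                      (if colon
                          (cons (cons (downcase
                                       (trim-spaces
                                        (substring field 0 colon)))
                                      (trim-spaces
                                       (substring field (+ colon 1)
                                                  (string-length field))))
                                result)
                          result))))))))

;; Call the registered handler, if any, for each variable on the line.
(define (run-*-line-handlers line)
  (for-each (lambda (entry)
              (let ((handler (assq (string->symbol (car entry)) *handlers*)))
                (if handler ((cdr handler) (cdr entry)))))
            (parse-*-line line)))
```

Registering a handler with (DEFINE-*-LINE-HANDLER 'MIT-SCHEME-PACKAGE
procedure) and then running RUN-*-LINE-HANDLERS on a buffer's first
line would pass that procedure the text after "MIT-Scheme-Package:".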

I have found it useful, also, to patch EVAL-EXPRESSION & EVAL-REGION
as I do in <>, which
is my .edwin file, so that it doesn't send anything to the inferior
REPL if the current buffer's evaluation environment differs from that
of the REPL buffer.

      Also, can you elaborate a little on the URI-based naming scheme you
      had in mind for the module system?  Having HTTP URLs name packages
      seems a little strange to me; I like the system of lists of names,
      but perhaps there's a more general reason.

   I've been working on persistence in the context of web services.  I'd
   like to have a naming structure that scales to shared "environments"
   all over the world.

Do you have anything concrete to expound on here?
