guile-devel

Re: [PATCH] add language/wisp to Guile?


From: Marc Nieper-Wißkirchen
Subject: Re: [PATCH] add language/wisp to Guile?
Date: Mon, 27 Feb 2023 08:26:47 +0100

On Mon, 27 Feb 2023 at 00:22, Philip McGrath
<philip@philipmcgrath.com> wrote:
>
> Hi,
>
> On Sunday, February 26, 2023 6:02:04 AM EST Marc Nieper-Wißkirchen wrote:
> > On Sun, 26 Feb 2023 at 08:46, <guile-devel-request@gnu.org> wrote:
> > > Message: 1
> > > Date: Sun, 26 Feb 2023 02:45:12 -0500
> > > From: "Philip McGrath" <philip@philipmcgrath.com>
> > > To: "Maxime Devos" <maximedevos@telenet.be>, Ludovic Courtès
> > >
> > >         <ludo@gnu.org>, "Matt Wette" <matt.wette@gmail.com>,
> > >         guile-devel@gnu.org
> > >
> > > Cc: "Christine Lemmer-Webber" <cwebber@dustycloud.org>
> > > Subject: Re: [PATCH] add language/wisp to Guile?
> > > Message-ID: <981b0e74-96c0-4430-b693-7fc8026e3ead@app.fastmail.com>
> > > Content-Type: text/plain;charset=utf-8
> >
> > [...]
> >
> > I would like to make two remarks, which I think are essential to get
> > the semantics right.
> >
> > The R6RS comments of the form "#!r6rs" are defined to modify the
> > lexical syntax of the reader; possibly, they don't change the language
> > semantics (after reading).  In particular, "#!r6rs" also applies to
> > data files but does not affect the interpretation of the data after it
> > is read. It cannot because the reader otherwise ignores and does not
> > report comments.
> >
> > Thus a comment of the form "#!r6rs" may be suitable for Wisp, but it
> > is not a substitute for Racket's "#lang" (or a similar mechanism).
> > Guile shouldn't confuse these two different levels of meaning.
> >
>
> I agree that it's important to distinguish between lexical syntax (`read`) and
> the semantics of what is read.
>
> However, Racket's `#lang` in fact operates entirely at the level of `read`.
> (Racketeers contribute to confusion on this point by using `#lang` as a
> shorthand for Racket's entire language-creation infrastructure, when in fact
> `#lang` specifically has a fairly small, though important, role.) When `read`
> encounters `#lang something`, it looks up a reader extension procedure in the
> module indicated by `something` and uses that procedure to continue parsing
> the input stream into data. Importantly, while syntax objects may be used to
> attach source location information, there is no "lexical context" or binding
> information at this stage, as one familiar with syntax objects from macro
> writing might expect: those semantics come after `read` has finished parsing
> the input stream from bytes to values.

[...]

Thank you for the reminder about Racket's #lang mechanism; it has been
a long time since I wrote some #lang extensions myself when
experimenting with Racket.

Nevertheless, I am not sure whether it is relevant to the point I
tried to make.  The "#!r6rs" does not indicate a particular language
(so tools scanning for "#!r6rs" cannot assume that the file is indeed
an R6RS program/library).  In an implementation that supports, say,
R6RS and R7RS, "#!r6rs" can only switch the lexical syntax but cannot
introduce forms that make the implementation change the semantics from
R7RS to R6RS, e.g., in the case of unquoted vector literals.

(It must be compatible with calling the procedures "read" and "eval"
directly, so "#!r6rs" must not wrap everything in some module form,
say.)
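
To illustrate the compatibility with a direct call to the reader, here
is a minimal sketch in R6RS Scheme.  The helper name
"read-from-string" is made up for this example; everything else comes
from (rnrs).  On a reader that treats "#!r6rs" as a comment, the datum
read with the marker is the same as the datum read without it.

```scheme
#!r6rs
(import (rnrs))

;; Hypothetical helper: read one datum from a string.
(define (read-from-string str)
  (get-datum (open-string-input-port str)))

;; "#!r6rs" is skipped like any other comment, so the returned
;; datum carries no trace of it.
(define with-marker    (read-from-string "#!r6rs (a #(1 2) b)"))
(define without-marker (read-from-string "(a #(1 2) b)"))

(display (equal? with-marker without-marker))
(newline)
```

Whatever later consumes the datum (for example "eval") therefore
cannot tell whether the marker was present.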

Racket's "#lang" mechanism has more freedom (regardless of how it is
implemented).

Of course, R6RS gives implementations the freedom to modify the reader
in whatever way after, say, "#!foo-baz" was read.  Thus, "#!foo-baz"
could be defined to work like Racket's "#lang foo-baz", reading the
rest of the source as "(module ...)".  But as long as we stay within
the confines of R6RS, this will only raise an undefined exception
because, in general, "module" is not globally bound.

I don't want to contradict you; I just mean that a plain "#!r6rs"
without a top-level language where "module" is bound is not equivalent
to "#lang" and that trying to switch to, say, Elisp mode with
"#!elisp" would leave the boundaries of the Scheme reports (and when
this is done, this specific discussion is moot).

[...]

> > The second comment concerns the shebang line in R6RS scripts (as
> > described in the non-normative appendices).  The shebang line is not a
> > comment in the R6RS lexical syntax; it does not even reach the reader
> > - at least, conceptually.  The Scheme reader only sees the lines
> > following the shebang line.
> >
> > For example, a conforming R6RS implementation must raise an exception
> > when trying to read (using get-datum, for example) a file that begins
> > with a shebang line.
> >
> > Thus, the shebang line doesn't need to be considered when discussing
> > comment formats in lexical syntax.
> >
>
> This is a very persuasive account of the R6RS appendices. I just find the
> approach somewhat unsatisfying. An R6RS implementation with script support
> must have a procedure `not-quite-read` that handles a potential shebang line
> before calling `read`. I wish this `not-quite-read` procedure were made
> available from some Scheme library (and perhaps somewhat more explicitly
> specified), and I'd probably find it most beautiful for this `not-quite-read`
> to be unified with `read`. But that's not really relevant per se.

The R6RS approach is the sound one.  The shebang line is interpreted
by the kernel, which only sees the binary file.  The Scheme reader, on
the other hand, operates on a textual file.  So the logically correct
way to implement script support is to open a binary port, check
whether the file starts with the bytes (!) corresponding to "#!/" or
"#! /" and, if so, skip bytes until a newline is seen.  Only then
should the binary port be converted into a textual port (using
whatever encoding the user may have specified) and handed to the
Scheme reader.

This doesn't mean that there is no room for a procedure "read-script"
that takes a binary port/filename.  It just cannot and shouldn't be
merged with "read".  Another reason why this is not possible is that
Scheme's lexical syntax as defined in the R6RS does not include
shebangs as possible tokens ("#!<delimiter>" is not a valid token, for
example).
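
A minimal sketch of what such a "read-script" could look like in R6RS
Scheme, under a few assumptions: the helper names "shebang-prefix?"
and "skip-shebang-line!" are made up for this example, UTF-8 stands in
for whatever encoding the user may have specified, and the binary port
is assumed to support port-position and set-port-position!.

```scheme
#!r6rs
(import (rnrs))

;; Hypothetical helper: does the bytevector start with the bytes
;; for "#!/" or "#! /"?  (#x23 = #, #x21 = !, #x20 = space, #x2F = /)
(define (shebang-prefix? bv)
  (define (byte-at? i b)
    (and (< i (bytevector-length bv))
         (= (bytevector-u8-ref bv i) b)))
  (and (byte-at? 0 #x23)
       (byte-at? 1 #x21)
       (or (byte-at? 2 #x2F)
           (and (byte-at? 2 #x20) (byte-at? 3 #x2F)))))

;; Hypothetical helper: discard bytes up to and including the
;; first linefeed (#x0A), or up to end of file.
(define (skip-shebang-line! port)
  (let loop ()
    (let ((b (get-u8 port)))
      (unless (or (eof-object? b) (= b #x0A))
        (loop)))))

;; PORT is a binary input port.  The shebang check happens on raw
;; bytes; only afterwards is the port transcoded and given to the
;; Scheme reader.  Returns the list of data in the script.
(define (read-script port)
  (let* ((start (port-position port))
         (head  (get-bytevector-n port 4)))
    (set-port-position! port start)          ; rewind after peeking
    (when (and (bytevector? head) (shebang-prefix? head))
      (skip-shebang-line! port))
    (let ((text (transcoded-port port (make-transcoder (utf-8-codec)))))
      (let loop ((data '()))
        (let ((datum (get-datum text)))
          (if (eof-object? datum)
              (reverse data)
              (loop (cons datum data))))))))

;; Usage: a bytevector port stands in for a script file here.
(define demo
  (open-bytevector-input-port
   (string->utf8 "#!/usr/bin/env scheme-script\n(display \"hi\")\n")))

(write (read-script demo))   ; the shebang line never reaches get-datum
(newline)
```

For a real script, the port would come from (open-file-input-port
filename) instead of a bytevector.  The point of the design is that
"get-datum" itself stays untouched.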


>
> >
> > Best,
> >
> > Marc
>
> Thank you for these thought-provoking remarks!
>
> Philip
>
> [1]: https://www-old.cs.utah.edu/plt/publications/macromod.pdf
> [2]: https://jeapostrophe.github.io/home/static/icfp065-mccarthy.pdf


