emacs-devel
Re: Update 1 on Bytecode Offset tracking


From: Stefan Monnier
Subject: Re: Update 1 on Bytecode Offset tracking
Date: Fri, 17 Jul 2020 18:08:34 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux)

>> While waiting for the paperwork to go through, you can prepare the patch
>> and we can start discussing it.
> Sure, does that just mean the 'git format-patch -1' emailed to
> bug-gnu-emacs@gnu.org, as mentioned in CONTRIBUTE? If that's the gist of
> it then I can do that shortly.

Pretty much, yes.  You can add some text to give extra background on the
design, the motivation for some of the choices, or ask questions about
particular details, but that's not indispensable.

You can also send an email that just refers to a branch in emacs.git.
But for the discussion to work well, it's usually better to make sure
this branch is "small" so people aren't discouraged from reading the
large diff ;-)

> I was able to speed that function up to the point that it's about the
> same as one using `read`. Those functions are doing a whole lot of IO
> (reading and writing hundreds of files) so it's not really a fair
> comparison. I've done more tests with functions that just read a whole
> buffer, collecting what they read into a list. In a 9600 line file with
> just over 500 sexps, the `read` version took about ~.02-.04 seconds
> (according to `benchmark-run-compiled`), and the `source-map-read`
> version took ~.08 seconds when it didn't GC, but unlike with `read` it
> did cause a GC 10-20% of the time.

IME when the time is in the sub-second range the measurements are very
imprecise, so better measure the time to repeat the same `read` N times
so the total time is a few seconds (and since it's the same `read`,
it won't suffer from extra IO overhead).
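
Something along these lines is what I have in mind; treat it as a rough
sketch. The names `my-read-all-sexps` and `my-bench-read` are made up
for illustration, and you'd swap `read` for `source-map-read` to time
the other side of the comparison.

```elisp
(require 'benchmark)

(defun my-read-all-sexps (buffer)
  "Read every sexp in BUFFER with `read' and return them as a list."
  (with-current-buffer buffer
    (save-excursion
      (goto-char (point-min))
      (let (forms)
        ;; `read' signals `end-of-file' once nothing is left to read.
        (condition-case nil
            (while t
              (push (read (current-buffer)) forms))
          (end-of-file nil))
        (nreverse forms)))))

(defun my-bench-read (buffer repetitions)
  "Time REPETITIONS full passes of `my-read-all-sexps' over BUFFER.
Return (ELAPSED GC-COUNT GC-ELAPSED), as `benchmark-run' does."
  (benchmark-run repetitions (my-read-all-sexps buffer)))
```

Pick the repetition count so that the elapsed time lands at a few
seconds, then divide by it to get the per-pass cost. The second and
third elements of the result tell you how much of that was GC.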

>> For macros, OTOH, it's really fundamentally hard (or impossible, in
>> general).
> Helmut Eller mentioned before that most macros do use at least some of
> the original code in their expansion.

We can definitely hope to use some heuristics that will preserve "most"
source info for "most" existing macros, yes.
But it's still a fundamentally impossible problem in general ;-)
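
To illustrate the kind of heuristic I mean, here is a minimal sketch
which assumes positions are kept in a side table keyed by cons cell
identity. Everything named `my-...` is hypothetical, not existing code.

```elisp
(defvar my-source-positions (make-hash-table :test #'eq :weakness 'key)
  "Hypothetical side table mapping cons cells to source positions.")

(defun my-note-position (form pos)
  "Record that FORM was read at position POS, and return FORM."
  (puthash form pos my-source-positions)
  form)

(defun my-position-of (form)
  "Return the recorded position of FORM, or nil if unknown."
  (gethash form my-source-positions))

(defmacro my-when (test &rest body)
  "Like `when'; reuses TEST and BODY unchanged in the expansion."
  (list 'if test (cons 'progn body)))

(defun my-demo ()
  "Expand a `my-when' form and look up positions in the result."
  (let* ((test (my-note-position (list 'foo) 10))
         (form (my-note-position (list 'my-when test (list 'bar)) 1))
         (expansion (macroexpand-1 form)))
    (list (my-position-of (nth 1 expansion)) ; the reused (foo) cell
          (my-position-of expansion))))      ; the fresh (if ...) cell
```

The reused `(foo)` cell is still found in the table after expansion,
but the freshly consed `(if ...)` is not, and neither are symbols or
any code the macro rebuilds from scratch. That's the part no heuristic
can recover in general.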

>> We could/should introduce some new way to define macros which
>> knows about "source code annotated with locations".
> I've wondered about this too but don't know what the right approach
> would be.

The first step is to define a `defmacro2` which works like `defmacro`
but is defined to take as arguments (and to return) annotated-sexps
instead of "bare sexps".  It'll be less convenient to use, but

In Scheme "annotated sexps" are called "syntax objects".

> I doubt anyone would want to use something like macro-cons/list/append
> etc. functions,

Scheme avoids the problem by defining additional higher-level layers,
where macros are defined in a more restrictive way using templates, so
for most macros the programmer doesn't need to care very much about
the difference between bare sexps and syntax objects.

The main motivation for it was hygiene (the framework takes care of
adding the needed `gensym`s where applicable) rather than tracking
source-location, but fundamentally the issue is the same: an AST node is
not just some random sexp.

IOW "code and data aren't quite the same, after all" ;-)

See for example `syntax-case` 
https://www.gnu.org/software/guile/manual/html_node/Syntax-Case.html
Note that Scheme uses the #' notation for syntax objects.  Adapting the
example for `when` to an Elisp syntax could look like:

    (defmacro2 when (form)
      (elisp-case form
        ((_ test e e* ...) (elisp (if test (progn e e* ...))))))

[ Where I used `elisp` instead of Scheme's `syntax` since we already use
  the prefix "syntax-" for things related to syntax-tables.  ]

Notice how it's `elisp-case` which extracts `test`, `e`, and `e*` and
then it's `elisp` which builds the new chunk of code, so all the
replacement of `car` with `elisp-car` can be hidden within the definition
of `elisp-case` and `elisp`.
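
For comparison, here is roughly what the same expander looks like when
written by hand against annotated sexps, i.e. what `elisp-case` and
`elisp` would be hiding. This is only a sketch under assumptions: the
`my-stx` struct (a datum plus a position, where a list's datum is a
list of annotated children) and all the `my-...` functions are invented
for illustration.

```elisp
(require 'cl-lib)

;; Hypothetical annotated sexp: a datum paired with its source position.
(cl-defstruct (my-stx (:constructor my-stx-make (datum pos)))
  datum pos)

(defun my-stx-strip (stx)
  "Return the bare sexp underlying STX, discarding positions.
Only handles atoms and proper lists."
  (let ((d (if (my-stx-p stx) (my-stx-datum stx) stx)))
    (if (consp d) (mapcar #'my-stx-strip d) d)))

(defun my-when-expand (form)
  "Expand annotated (when TEST BODY...) into (if TEST (progn BODY...))."
  (let* ((parts (my-stx-datum form))    ; list of annotated children
         (test (nth 1 parts))
         (body (nthcdr 2 parts))
         (pos (my-stx-pos form)))
    ;; Newly built nodes inherit the position of the whole `when' form;
    ;; TEST and BODY are passed through with their own positions.
    (my-stx-make
     (list (my-stx-make 'if pos)
           test
           (my-stx-make (cons (my-stx-make 'progn pos) body) pos))
     pos)))
```

Every node here has to be taken apart and rebuilt explicitly, which is
exactly the inconvenience the template layer is meant to remove.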

>> There's a lot of work on Scheme macros we could leverage for that.
> Interesting, so far I've had some difficulty finding documentation about
> how other Lisps track source locations.

It's not really discussed, but the distinction between "sexp" and
"syntax object" is the key.  It's largely not discussed because Scheme
macros have never officially included the equivalent of `defmacro`
operating on raw sexps, so they've never really had to deal with the
issue (tho Gambit does provide a `define-macro` which operates like our
`defmacro` but it's rarely used so Gambit just punts on the
source-location issue in that case).


        Stefan



