emacs-devel

From: Alan Mackenzie
Subject: Re: scratch/accurate-warning-pos: Solid progress: the branch now bootstraps.
Date: Mon, 26 Nov 2018 09:48:00 +0000
User-agent: Mutt/1.10.1 (2018-07-13)

Hello, Paul.

On Sun, Nov 25, 2018 at 17:41:39 -0800, Paul Eggert wrote:
> Alan Mackenzie wrote:

> > Because of macros.  These macros are typically already compiled.

> Even a compiled macro operates via the interpreter. So we could have a
> separate interpreter used only by the byte compiler. The byte-compiler
> interpreter would operate more slowly than the normal interpreter, but
> that's OK. The main and the byte-compiler interpreter could mostly be
> written with shared code, without slowing down the main interpreter.
> Admittedly this would not be a project for the fainthearted.

Indeed not.  Where and how would this help with getting accurate source
code positions?

> > If you could come up with a solid proposal which would fix the bug
> > without slowing down Emacs at all, we'd all be most appreciative.

> I'm afraid I don't understand the bug well enough yet to know whether
> any proposal I can come up with would be "solid". For one thing, any
> method of outputting source-code locations will founder in the
> presence of macros.

scratch/accurate-warning-pos seems to do rather well in this regard.

> Even GCC, which tries to do a reasonably good job of this and isn't
> limited by the Lisp reader, doesn't do well with the sort of C macros
> I tend to write. My admittedly uninformed guess is that there is no
> such thing as a "solid" solution here, only solutions that work better
> and/or worse on various example sets.

The example set scratch/accurate-warning-pos works well on is the Lisp
code comprising Emacs.

> That being said, here's another possibility: don't bother attaching
> source-code positions to symbols, since duplicate symbols can appear
> in the input and the source-code positions can't be retrieved
> reliably.

The source code positions are attached not to symbols, but to symbol
_occurrences_.

> Instead, attach positions to input objects that are guaranteed to be
> unique so that retrieval is trivial.

I think you mean conses here.  I've tried this approach, spending a lot
of time on it but not getting very far.  The problem is, Lisp objects
flow through lots of different conses as they are transformed by the
compiler.  Have a look at cconv-convert, which processes every function.
I'm not sure that even a single cons in the input form survives through
to the output.  The symbols do survive, though, in the main.
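
To illustrate the point about conses, here is a minimal sketch.  The
function toy-rebuild is hypothetical: it merely stands in for a pass
that reconstructs its input, and is not how cconv-convert is written.
It shows that a rebuilt form can be `equal' to the input while sharing
none of its conses, whereas the symbols inside are the same objects.

```elisp
;; Hypothetical toy transformation, standing in for a compiler pass.
(defun toy-rebuild (form)
  "Return a structurally equal copy of FORM built from fresh conses."
  (if (consp form)
      (cons (toy-rebuild (car form)) (toy-rebuild (cdr form)))
    form))                              ; atoms are passed through as-is

(let* ((input (list 'let (list 'a) 'a))
       (output (toy-rebuild input)))
  (list (equal input output)                 ; same structure
        (eq input output)                    ; the top-level cons is new
        (eq (car input) (car output))        ; the symbol `let' is shared
        (eq (nth 2 input) (nth 2 output))))  ; the symbol `a' is shared
;; => (t nil t t)
```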

> Do the attachment via a hash table so that the input objects are
> unchanged and we don't need to change much of anything except the
> byte-compiler's diagnostic code (plus a read function that fills in
> the hash table as it reads).

Using conses as keys?  See previous paragraph.  The approach I tried
before to implement this was to ensure that after any source
transformation, the result was written back to the original cons using
setcar and setcdr.  This rapidly became unwieldy, with, for example,
versions of setq and mapcar which had extra parameters indicating the
result cons, and so on.  This involved extensive amendment of large
portions of the compiler.
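
A rough sketch of that difficulty, under stated assumptions: the names
toy-write-back and toy-demo are made up, the position 16 is arbitrary,
and copy-sequence stands in for a real source transformation.  With an
`eq' hash table keyed by conses, a rebuilt form no longer finds its
position, unless the result is written back into the original cons.

```elisp
;; Hypothetical helper: copy RESULT's car and cdr into ORIGINAL, so that
;; ORIGINAL (the hash key) remains the cons that holds the form.
;; RESULT must itself be a cons.
(defun toy-write-back (original result)
  "Overwrite the cons ORIGINAL with the contents of RESULT; return ORIGINAL."
  (setcar original (car result))
  (setcdr original (cdr result))
  original)

(defun toy-demo ()
  "Return the positions found for a form before and after rebuilding."
  (let* ((positions (make-hash-table :test 'eq)) ; keyed by cons identity
         (form (list 'let (list 'a) 'a))
         (rebuilt (copy-sequence form)))         ; fresh top-level conses
    (puthash form 16 positions)                  ; 16 is a made-up position
    (list (gethash form positions)               ; found: the original cons
          (gethash rebuilt positions)            ; lost: a different cons
          ;; Written back into the original cons, so found again.
          (gethash (toy-write-back form (list 'progn 'a)) positions))))

(toy-demo)
;; => (16 nil 16)
```

Every transformation in the compiler would need the equivalent of
toy-write-back, which is where the extra parameters came from.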

> When the byte compiler needs the source code location corresponding to
> a symbol, it looks for the closest unique object nearby and uses its
> location.

"Nearby"?  Warning messages are typically associated with symbol
occurrences (not conses), and are found when a recursive compiler routine
is presented with a symbol rather than a cons.  Not all the time, but a
lot of the time.  Hence the scratch/accurate-warning-pos approach of
attaching source positions to symbol occurrences.
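
A toy sketch of the situation, not taken from the byte compiler:
toy-free-vars is a hypothetical walker which understands only variable
references, `let' with plain-symbol bindings, and ordinary function
calls.  At the point where it detects the problem, its argument is the
bare symbol; the cons which held that symbol belongs to the caller.

```elisp
(defun toy-free-vars (form bound)
  "Return the symbols in FORM used as variables but not in BOUND.
Hypothetical toy walker, far simpler than the real compiler."
  (cond
   ((and (symbolp form) form (not (keywordp form)) (not (eq form t)))
    ;; Only the symbol itself is in hand here, not the cons it came from.
    (if (memq form bound) nil (list form)))
   ((eq (car-safe form) 'let)
    ;; Bindings are visible only inside the body of the `let'.
    (let ((inner (append (nth 1 form) bound)))
      (apply #'append
             (mapcar (lambda (f) (toy-free-vars f inner))
                     (nthcdr 2 form)))))
   ((consp form)
    ;; Ordinary call: skip the function name, walk the arguments.
    (apply #'append
           (mapcar (lambda (f) (toy-free-vars f bound))
                   (cdr form))))))

;; The body of the bug#22288 test case, wrapped in `progn':
(toy-free-vars '(progn (let (a)) a) nil)
;; => (a)
```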

> For example, the source expression for the bug#22288 test case:

>    (defun test () (let (a)) a)

> has five conses in its top level list, two conses at the top of its
> second level list (let (a)), and one cons in its third level list (a).
> Each cons corresponds to a source code position (or if you prefer more
> accuracy, multiple positions for the start and end of the
> corresponding source-code and/or for the starts and ends of the source
> code corresponding to the cons's car and cdr). This will let the byte
> compiler narrow down where every subexpression lies, with
> significantly more accuracy than what we have now. In the bug#22288
> example, the last cons in the top-level list should be attached to the
> precise source code location for the 'a' that we want to issue a
> diagnostic about.

Yes, in theory.  In practice, as already said, the source code flows
between lots of cons cells as it is transformed by functions like
cconv-convert and those in byte-opt.el.  Countering this would involve
lots of tedious amendment to the compiler, with the emphasis on "lots".
I've already tried this, and given up.

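For reference, the cons counts quoted above for the bug#22288 form can
be checked with the reader.  The helper toy-cons-counts is a made-up
name, and it assumes the fourth element of the form is a `let' with a
binding list.  The last cons of the top-level list is the one whose car
is the `a' the diagnostic is about.

```elisp
(defun toy-cons-counts (source)
  "Read SOURCE and return cons counts for a form shaped like bug#22288.
Hypothetical helper: assumes the fourth element is a `let' form."
  (let* ((form (car (read-from-string source)))
         (let-form (nth 3 form)))
    (list (length form)               ; conses in the top-level list
          (length let-form)           ; conses at the top of (let (a))
          (length (nth 1 let-form))   ; conses in (a)
          (car (last form)))))        ; car of the last top-level cons

(toy-cons-counts "(defun test () (let (a)) a)")
;; => (5 2 1 a)
```
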
-- 
Alan Mackenzie (Nuremberg, Germany).
