Re: Using incremental parsing in Emacs

From: Stephen Leake
Subject: Re: Using incremental parsing in Emacs
Date: Fri, 03 Jan 2020 11:39:50 -0800
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.2 (windows-nt)

Eli Zaretskii <address@hidden> writes:

> Would someone like to try to figure out how we could use the
> incremental parsing technology in Emacs for making our
> programming-language support more accurate and efficient?  

GNU ELPA ada-mode is an existing example; it has a full language parser
(error-correcting generalized LR), that supports some advanced
navigation. It could be extended to do some code completion.

Instead of "incremental parsing" (which updates an existing syntax tree
given source changes) it uses "partial parsing" (parsing only part of a
file) and very robust error handling. It works very well on very large
Ada files (it is in production use by Eurocontrol and others).

Error correction is critical, since buffers are normally not syntactically
correct during editing.

I've tried using the same parser generator on Java and Python; the
results are not as good as for Ada (apparently Ada lends itself to LR
parsing better than those languages). That might be improved by
massaging the grammar, but that risks implementing not-quite-Java.

Others mentioned LSP (https://langserver.org/); that method supports
incremental parsing, since it is centered on sending source edits from
the editor to the language server (after sending the full text once). It
also supports algorithms that require more than one source file, since
all files involved in a project can be loaded into the same language
server instance (the ada-mode parser is strictly one file). That allows
providing completion on parameters for functions declared in other files,
for example.
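
To make the "full text once, then edits" exchange concrete, here is a
minimal sketch in Python. The method names and parameter shapes
(textDocument/didOpen, textDocument/didChange with a ranged
contentChanges entry) follow the LSP specification; the helper names,
the file URI, and the Ada snippet are hypothetical, chosen only for
illustration. It assumes ASCII text, so LSP's default UTF-16 character
offsets coincide with Python string indices.

```python
def did_open(uri, language_id, text):
    """Notification sent once, carrying the full buffer text."""
    return {
        "jsonrpc": "2.0",
        "method": "textDocument/didOpen",
        "params": {"textDocument": {"uri": uri, "languageId": language_id,
                                    "version": 1, "text": text}},
    }


def did_change(uri, version, start, end, new_text):
    """Notification carrying one ranged edit; start/end are (line, character)."""
    return {
        "jsonrpc": "2.0",
        "method": "textDocument/didChange",
        "params": {
            "textDocument": {"uri": uri, "version": version},
            "contentChanges": [{
                "range": {
                    "start": {"line": start[0], "character": start[1]},
                    "end": {"line": end[0], "character": end[1]},
                },
                "text": new_text,
            }],
        },
    }


def apply_change(text, change):
    """Server side: apply one ranged change to the stored copy of the text.

    Assumes "\n" line endings and ASCII content (see the lead-in).
    """
    lines = text.split("\n")

    def offset(pos):
        # Length of all preceding lines (plus their newlines), then the column.
        return sum(len(l) + 1 for l in lines[:pos["line"]]) + pos["character"]

    start = offset(change["range"]["start"])
    end = offset(change["range"]["end"])
    return text[:start] + change["text"] + text[end:]


if __name__ == "__main__":
    uri = "file:///tmp/foo.adb"  # hypothetical file
    source = "procedure Foo is\nbegin\n   null;\nend Foo;\n"
    opened = did_open(uri, "ada", source)
    # The user replaces "null" (line 2, columns 3..7) with "Bar".
    edit = did_change(uri, 2, (2, 3), (2, 7), "Bar")
    print(apply_change(source, edit["params"]["contentChanges"][0]))
```

A real server would hand the changed range to its parser rather than
just patching a string; the sketch only shows that the editor never has
to resend the whole buffer.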

Many editors are moving to support LSP; that allows them to take
advantage of any parser technology developed independently.

ada-mode has its own protocol between elisp and the external parser,
provided by the GNU ELPA wisi package (the ada-mode parser was started
before LSP). The parser in ada-mode could be used in an LSP language
server.

So I think the short answer to your post is "GNU ELPA eglot", with
possibly some work importing some of that into core to make it more
efficient. eglot is currently listed as "incompat" in *Packages* (in
both emacs 27 and 26); I don't know why. I have not tried eglot; I don't
know how complete it is. There is also

The syntax used for expressing the grammar is usually fairly tightly
tied to the language and/or the parser generator; trying to generalize
that for all languages supported by Emacs is a huge task, not worth
doing. With LSP, building a grammar for a language is done once for each
language server.

Whether the language server is implemented as an external process, or as
a loadable module, is an implementation detail. ada-mode uses an
external process, mostly because it was started before modules were
stabilized. Communication between the language server and elisp
(whether ada-mode style or LSP) involves sending text, not binary data
(and _not_ pointers into the emacs buffer!). Doing that via the module
interface vs pipes to a process is a wash for speed. Using a process
fully isolates the server code from emacs, eliminating any possible
third-party library version conflicts.
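
As an illustration of "sending text" over a pipe, the following Python
sketch frames messages the way the LSP base protocol does: a
Content-Length header, a blank line, then a UTF-8 JSON body. This is
the LSP framing, not the wisi protocol used by ada-mode, which is not
shown here. The function names are hypothetical, and an in-memory
io.BytesIO stands in for the pipe to the server process.

```python
import io
import json


def encode_message(payload):
    """Serialize a JSON-RPC payload with an LSP-style Content-Length header."""
    body = json.dumps(payload).encode("utf-8")
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


def read_message(stream):
    """Read one framed message from a binary stream; return None at EOF."""
    length = None
    while True:
        line = stream.readline()
        if not line:
            return None  # stream closed before a complete header arrived
        if line == b"\r\n":
            break  # blank line ends the header section
        name, _, value = line.decode("ascii").partition(":")
        if name.strip().lower() == "content-length":
            length = int(value.strip())
    if length is None:
        raise ValueError("missing Content-Length header")
    return json.loads(stream.read(length).decode("utf-8"))


if __name__ == "__main__":
    # Two messages written back to back, as they would be on a pipe.
    pipe = io.BytesIO(
        encode_message({"jsonrpc": "2.0", "method": "initialized", "params": {}})
        + encode_message({"jsonrpc": "2.0", "method": "exit"})
    )
    while (message := read_message(pipe)) is not None:
        print(message["method"])
```

Nothing in the framing depends on how the bytes travel, which is
consistent with the point above that a module interface and a process
pipe carry the same text.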

It could be possible to implement an LSP language server in elisp, running
in a separate thread (or even the same thread; it can be used
synchronously). That might be an interesting exercise, and would
eliminate other language dependencies. ada-mode used to support an elisp
parser generated from the same grammar, but that never supported error
correction; implementing very complex algorithms is just easier in a
more advanced language (and certainly faster at run time; critical for
error correction).

-- Stephe
