emacs-devel

Re: Using incremental parsing in Emacs


From: Stephen Leake
Subject: Re: Using incremental parsing in Emacs
Date: Sat, 04 Jan 2020 11:26:38 -0800
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.2 (windows-nt)

Eli Zaretskii <address@hidden> writes:

>> From: Stephen Leake <address@hidden>
>> Date: Fri, 03 Jan 2020 15:53:45 -0800
>> 
>> The interface should look like LSP; it aims to support everything an IDE
>> needs from a "language server" (ie parser), and allows for custom
>> extensions where it falls short.
>
> Maybe I'm the odd one out, but I don't think I have a clear idea of
> what the "LSP interface" entails.  Would you (or someone else) post a
> summary, or point to some place where this is described succinctly
> enough to not require a long study?

The full description is at
https://microsoft.github.io/language-server-protocol/specifications/specification-3-14/
However, that document apparently only describes commands sent to the
server, not the responses sent from the server.

My attempt at a summary, in the form of a description of how LSP is used
in a typical editing session:

User visits a file whose major mode supports LSP. Emacs starts or
connects to a language server for that language (this can be customized
in eglot to be per-project, and in other ways).

Emacs sends the entire file contents to the server. For every edit the
user makes after that, the edit is sent to the server; the message
contains deleted and inserted text. It is up to Emacs how much
insert/delete to include in each message to the server; I assume it is
not every character. Sending that message from after-change-hook would
be a natural choice, but it might be better to cache the information in
order to send fewer messages.
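
To make the caching idea concrete, here is a minimal sketch of what
goes over the wire, in Python only for brevity (Emacs would do this in
elisp via jsonrpc). The message shapes follow my reading of the 3.14
spec; the class name ChangeBatcher, the helper names, and the file URI
are hypothetical, made up for this illustration.

```python
import json


def frame(message):
    """Wrap a JSON-RPC message in the LSP base-protocol header."""
    body = json.dumps(message).encode("utf-8")
    # Content-Length counts the bytes of the body, not its characters.
    return b"Content-Length: %d\r\n\r\n" % len(body) + body


def did_open(uri, language_id, text):
    """Initial notification carrying the entire file contents."""
    return frame({
        "jsonrpc": "2.0",
        "method": "textDocument/didOpen",
        "params": {"textDocument": {"uri": uri, "languageId": language_id,
                                    "version": 1, "text": text}},
    })


class ChangeBatcher:
    """Hypothetical helper: collect edits, send them in one message."""

    def __init__(self, uri):
        self.uri = uri
        self.version = 1      # version 1 was sent by did_open
        self.pending = []

    def record(self, start, end, new_text):
        """Remember one edit; start/end are (line, character) pairs.

        A pure insertion has start == end; a pure deletion has
        new_text == "".
        """
        self.pending.append({
            "range": {
                "start": {"line": start[0], "character": start[1]},
                "end": {"line": end[0], "character": end[1]},
            },
            "text": new_text,
        })

    def flush(self):
        """Return one framed didChange message, or None if idle."""
        if not self.pending:
            return None
        self.version += 1
        message = {
            "jsonrpc": "2.0",
            "method": "textDocument/didChange",
            "params": {
                "textDocument": {"uri": self.uri, "version": self.version},
                # Changes are applied by the server in list order.
                "contentChanges": self.pending,
            },
        }
        self.pending = []
        return frame(message)
```

An after-change hook would call record for each edit, and flush would
run from an idle timer or just before the next request, so several
edits cost one message instead of one each.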

When font-lock is triggered, Emacs sends a request for formatting
a range to the server (LSP command ‘textDocument/rangeFormatting’); the
server sends back new text for that range, with proper indentation and
capitalization. I assume it also supports faces via some markup in the
JSON, but I have not seen that in the docs.

Similarly, when the user requests indentation (via TAB or some other
command), a format request is sent.
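
A rough sketch of that round trip, again in Python for illustration
only. As far as I can tell from the spec, the server's reply is a list
of text edits (each a range plus its replacement text) rather than one
block of new text; the function names, the URI, and the tab size of 3
are assumptions of mine. Positions are (line, character) pairs, and I
assume the text is plain ASCII so that character offsets match string
indexes.

```python
def range_formatting_request(request_id, uri, start, end, tab_size=3):
    """Build a textDocument/rangeFormatting request as a dict."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "textDocument/rangeFormatting",
        "params": {
            "textDocument": {"uri": uri},
            "range": {
                "start": {"line": start[0], "character": start[1]},
                "end": {"line": end[0], "character": end[1]},
            },
            "options": {"tabSize": tab_size, "insertSpaces": True},
        },
    }


def apply_text_edits(text, edits):
    """Apply non-overlapping edits from a formatting response."""
    # Offset of the first character of each line.
    line_starts = [0]
    for index, char in enumerate(text):
        if char == "\n":
            line_starts.append(index + 1)

    def offset(position):
        return line_starts[position["line"]] + position["character"]

    def start_key(edit):
        start = edit["range"]["start"]
        return (start["line"], start["character"])

    # Work from the end of the buffer backwards, so offsets computed
    # against the original text stay valid for the remaining edits.
    for edit in sorted(edits, key=start_key, reverse=True):
        begin = offset(edit["range"]["start"])
        end = offset(edit["range"]["end"])
        text = text[:begin] + edit["newText"] + text[end:]
    return text
```

For example, a reply that capitalizes an identifier on line 0 and
indents line 2 is just two small edits, so the response can be much
smaller than the range that was requested.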

When the user starts typing a function call (or otherwise requests
completion), a textDocument/completion request is sent to the server; it
responds with the possible completions of the function name, and then
the parameter list.
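
As a sketch of the completion exchange (Python for illustration; the
function names and URI are hypothetical): the request names a document
and a cursor position, and my understanding of the 3.14 spec is that
the result may be null, a plain list of items, or an object with an
"items" field, so the client has to handle all three.

```python
def completion_request(request_id, uri, line, character):
    """Build a textDocument/completion request as a dict."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "textDocument/completion",
        "params": {
            "textDocument": {"uri": uri},
            "position": {"line": line, "character": character},
        },
    }


def completion_labels(result):
    """Extract the display labels from a completion result."""
    if result is None:
        return []
    # A dict is the "completion list" form; otherwise it is a list.
    items = result["items"] if isinstance(result, dict) else result
    return [item["label"] for item in items]
```

The labels are what would be offered to the user; each item can carry
more (documentation, text to insert), which I have left out here.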

> We did learn one important thing from using LSP servers: that
> processing the JSON stuff back and forth adds non-trivial overhead and
> slows down the application enough to annoy users, even after we did
> all we can to speed up the translation.  

Ok. I did not follow that in detail. Do we have any speed comparisons
with other editors?

> So I think it makes sense to take one more look at the issue and see
> if we can come up with better interfaces, which will suit Emacs
> applications better and allow faster processing. 

There is always a tradeoff between speed and flexibility. The ada-mode
interface to the external process is highly optimized to do exactly what
ada-mode currently needs, and is very fast. But it is also brittle;
adding new features may require large changes, and can cause version
incompatibility. LSP is much more flexible, allowing expansion to new
features easily, and allowing feature negotiation.
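
Feature negotiation works roughly like this, as I read the spec: the
server's reply to the initial "initialize" request lists its
capabilities, and the client only uses what is advertised. A minimal
sketch in Python (the function name is hypothetical; the capability
names are the ones I believe the 3.14 spec uses):

```python
def server_supports(capabilities, name):
    """Return True if the server advertised the named capability.

    A capability may be a boolean or an options object; an empty
    options object still means "supported", so only a missing entry
    or an explicit False counts as unsupported.
    """
    value = capabilities.get(name)
    return value is not None and value is not False


# Example capabilities, as they might appear in an initialize reply.
example_capabilities = {
    "completionProvider": {},
    "documentRangeFormattingProvider": True,
    "renameProvider": False,
}
```

A client would check server_supports(caps,
"documentRangeFormattingProvider") before sending a range formatting
request, and fall back to its own indentation code otherwise.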

Other editors seem to cope well with the JSON approach, so it should be
possible for Emacs as well.

> Using a library that processes stuff locally would then allow us to
> implement such interfaces more easily, since we will be free from the
> restrictions imposed by the need to communicate with external
> processes.

I gather you are suggesting that the language server could be an Emacs
module (or even an elisp package), with function calls for the various
features. That is certainly possible, but loses the ability to use any
server developed external to the Emacs project.

It might be possible to refactor some servers to work that way
(replacing the JSON interface with a direct function call interface),
but it would be a lot of work.


> we'll most probably want some combination of LSP-based and local
> parsers-based features.  E.g., it's quite possible that LSP servers
> could be better for some complex jobs, where speed matters less.
>
> My point is that we shouldn't lock up our minds, not yet anyway.  A
> fresh look at these issues, taking the incremental parsing into
> account, could benefit us in the long run.

Ok.

I will work on adding LSP support for ada-mode (reusing eglot and/or
lsp-mode), and see what might be done about the speed issues. I need to
do that anyway to support a customer request.

I can also look at moving the current Ada parser into an Emacs module,
to see if that helps with speed.

>> LSP language servers are implemented in some compiled language, not
>> elisp; eglot/lsp-mode is just the elisp side of the protocol. The elisp
>> sends edits and info requests (ie, "insert/delete this text at this
>> point", "fontify/format this range") to the server, and handles the
>> responses.
>
> I'm saying we should look into this and see whether there are better
> ways than that.  Suppose such a server had direct access to buffer
> text: would that allow a more efficient interface than the above?  

No; lexing the actual text is not where the time is spent.

> We should definitely support LSP.  We already do, albeit in
> third-party packages.  We added native JSON support and jsonrpc for
> doing this better.  If there's anything else we can do in that
> direction, people should speak up.

Ok.

> But my point is that LSP is not necessarily the only game in town we
> should support.  For example, font-lock doesn't use LSP, and probably
> never will, due to performance issues; 

Ada-mode uses the external process to compute faces for identifiers.
That works well, although I do (setq jit-lock-defer-time 1.0) so it only
fontifies when I pause typing; otherwise there can be an annoying delay
after each character.

However, doing correct font-lock for Ada without a parser is pretty much
impossible (for anything beyond language keywords), and there is very
little that can be done to speed up the parsing. Migrating the parser
into a module might help, but only a little. Adding a JSON interface
would slow it down, of course.

> should we improve font-lock using infrastructure that's based on
> language parsing? 

ada-mode builds on the current font-lock infrastructure; the font-lock
timer triggers a parse on a range, and the parse actions set
font-lock-face text properties.

> And there are other features that could benefit, I've mentioned them.
> If you are saying they all should just use LSP, then I don't think I
> agree.

I'm saying they all could use LSP in principle, but I have not had any
experience actually doing that, so it may not work very well in
practice.

I don't think you are objecting to LSP in principle, but you do have a
problem with the speed penalty due to using JSON. Since other editors
are succeeding with that, perhaps there is more Emacs could do here.

-- 
-- Stephe


