Re: Using incremental parsing in Emacs
From: Dmitry Gutov
Subject: Re: Using incremental parsing in Emacs
Date: Fri, 10 Jan 2020 00:56:38 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.9.0
On 04.01.2020 16:46, arthur miller wrote:
> There is a very good presentation of tree-sitter on YT by its author:
> https://www.youtube.com/watch?v=Jes3bD6P0To
> Looks much better then what I got a picture by just reading on the
> website:
It was a good watch.
Some takeaways from me:
It implements a GLR parser. One that can update the existing AST quickly
for an arbitrary edit in the middle of a file. (*)
But it parses a new file quickly as well: a 20,000-line JS file in 54 ms.
To be able to reach that speed, they went the traditional
compiler-writer route of having a separate compilation step that turns
a grammar into C code for a parser program (which relies on a shared
runtime). (**)
Some of it seems to be by necessity. Every run returns a full AST, not
just an "AST up to this position". I suppose the author didn't want the
problems that come with unfinished parse trees when code relies on that
returned value. (***)
The generated parser, in addition to being incremental, is
error-tolerant, which is a necessity for use in editors.
As a result, they have features like fast semantic syntax highlighting,
as well as code folding that accurately detects where a function body
begins and ends (previously, Atom and other editors used guessing based
on indentation levels, apparently). And an "extend selection" command
based on the AST as well. (****)
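
To make the "extend selection" idea concrete, here is a minimal sketch in Python (chosen only for illustration, it is neither Tree-Sitter's API nor Emacs Lisp). The `Node` type, the `expand_selection` function and the hand-built `TREE` are all hypothetical; a real parser would produce the tree. The point is only that, given nodes with character ranges, growing the selection means finding the smallest node that strictly contains it.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    """A syntax-tree node covering the half-open range [start, end)."""
    kind: str
    start: int
    end: int
    children: List["Node"] = field(default_factory=list)


def expand_selection(root: Node, sel: Tuple[int, int]) -> Tuple[int, int]:
    """Return the range of the smallest node strictly containing `sel`.

    If no node is strictly larger than the selection, return `sel` unchanged.
    """
    start, end = sel
    best: Optional[Node] = None
    node: Optional[Node] = root
    while node is not None and node.start <= start and end <= node.end:
        if (node.start, node.end) != (start, end):
            best = node  # deepest strictly larger node seen so far
        # Descend into the child that still covers the selection, if any.
        node = next(
            (c for c in node.children if c.start <= start and end <= c.end),
            None,
        )
    return (best.start, best.end) if best else sel


# Hand-built tree for illustration; the node kinds are made up.
SRC = "function f() { return a + b; }"
TREE = Node("function", 0, 30, [
    Node("block", 13, 30, [
        Node("return_statement", 15, 28, [
            Node("binary_expression", 22, 27, [
                Node("identifier", 22, 23),
                Node("identifier", 26, 27),
            ]),
        ]),
    ]),
])
```

Starting with point on `a` (the empty range `(22, 22)`), repeated calls select `a`, then `a + b`, then the return statement, the block, and finally the whole function.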
Tree-Sitter is also used inside GitHub for various features, including
their Semantic library (which implements code navigation on the web).
In the meantime, our current answer to all of the above is syntax-ppss
plus local regexp-based parsing around the visible part of the buffer.
To compare:
(*) syntax-ppss is also fully incremental, although the returned value
is a very simplistic substitute for an AST. But we've been using it for
a while and have done solid things with it.
(**) Which means that if we try to use Tree-Sitter as-is, our current
practice of defining the language grammar in Lisp would go out of the
window. https://github.com/ubolonton/emacs-tree-sitter demonstrates this
as well: language grammars have to be compiled into a shared library (or
libraries). We would have lots of grammars supplied by third parties,
which is kind of good, but we would lose the ease of experimenting with
them that we have now, as well as the ability to write support for a new
up-and-coming language very quickly. A certain fraction of our users
enjoys that, AFAIK.
(***) Whereas syntax-ppss stops at the requested position, and saves
CPU cycles that way. Similarly, if a new system that we transition to
someday also does this, its absolute performance/throughput would
matter less, because it would usually only have to parse about a
screenful of the file at a time.
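
A rough sketch of that "scan only up to the requested position, and remember where you have been" idea, again in Python purely for illustration. This is not how syntax-ppss is implemented and not its API: the `BoundedScanner` class, its checkpoint spacing and its two-field state (paren depth, inside-string flag) are all assumptions made for the example.

```python
from bisect import bisect_right
from typing import List, Tuple

State = Tuple[int, bool]  # (paren depth, inside a string?)


class BoundedScanner:
    """Hypothetical position-bounded scanner with cached checkpoints."""

    def __init__(self, text: str, step: int = 64) -> None:
        self.text = text
        self.step = step                  # checkpoint spacing, arbitrary
        self.positions: List[int] = [0]   # sorted checkpoint positions
        self.states: List[State] = [(0, False)]
        self.scanned = 0                  # work counter, for demonstration

    def state_at(self, pos: int) -> State:
        """Return the state just before `pos`, scanning no further than `pos`."""
        pos = min(pos, len(self.text))
        # Resume from the nearest checkpoint at or before `pos`.
        i = bisect_right(self.positions, pos) - 1
        at = self.positions[i]
        depth, in_string = self.states[i]
        while at < pos:
            ch = self.text[at]
            if ch == '"':
                in_string = not in_string
            elif not in_string and ch == "(":
                depth += 1
            elif not in_string and ch == ")":
                depth -= 1
            at += 1
            self.scanned += 1
            if at % self.step == 0 and at > self.positions[-1]:
                self.positions.append(at)
                self.states.append((depth, in_string))
        return depth, in_string

    def edit(self, pos: int, new_text: str) -> None:
        """Install `new_text`, assumed identical to the old text before `pos`.

        Checkpoints after `pos` may be stale, so they are dropped.
        """
        keep = bisect_right(self.positions, pos)
        del self.positions[keep:]
        del self.states[keep:]
        self.text = new_text
```

A query near the top of a large buffer touches only the text before it, and a later query further down resumes from the last checkpoint instead of rescanning from the beginning. An edit only discards the checkpoints that follow it.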
(****) We've been managing surprisingly well with syntax-ppss,
forward-sexp, etc. So code folding works quite well in Emacs already,
and the easy-kill package in GNU ELPA does the "expand selection" thing
very successfully as well. But we could use some improvement in having
some more complex syntax supported or handled more easily, in certain
languages. Having a "proper AST" available is nothing to sneeze at
either, and would likely help a lot in indentation code.
My personal takeaway is that we could really benefit from a lispier
version of this technology, and Someone(tm) should start working on that.