emacs-devel

Re: How to add pseudo vector types


From: Yuan Fu
Subject: Re: How to add pseudo vector types
Date: Thu, 15 Jul 2021 14:23:02 -0400


> On Jul 15, 2021, at 12:48 PM, Eli Zaretskii <eliz@gnu.org> wrote:
> 
>> From: Yuan Fu <casouri@gmail.com>
>> Date: Thu, 15 Jul 2021 12:19:31 -0400
>> Cc: monnier@iro.umontreal.ca,
>> emacs-devel@gnu.org
>> 
>>> Why do you need to do this when a buffer is updated? why not use
>>> display as the trigger?  Large portions of a buffer will never be
>>> displayed, and some buffers will not be displayed at all.  Why waste
>>> cycles on them?  Redisplay is perfectly equipped to tell you when some
>>> chunk of buffer text is going to be redrawn, and it already knows to
>>> do nothing if the buffer hasn't changed.
>> 
>> Tree-sitter expects you to tell it every single change to the parsed text.
> 
> That cannot be true, because the parsed text could be in a state where
> parsing it will fail.  When you are in the middle of writing the code,
> this is what will happen many times, even if you pass the whole buffer
> to the parser.  And since tree-sitter _must_ be able to deal with this
> problem, it also must be able to receive incomplete parts of the
> buffer text, and do the best it can with it.
> 
>> Say you have a buffer with some content and scrolled through it, so 
>> tree-sitter has parsed the whole buffer. Then some elisp edited some text 
>> outside the visible portion. Redisplay doesn’t happen, we don’t tell this 
>> edit to tree-sitter. Then I scroll to the place that has been edited. What 
>> now?
> 
> Now you call tree-sitter passing it the part of the buffer that needs
> to be parsed (e.g., the chunk that is about to be displayed).  If
> tree-sitter needs to look back, it will.
> 
>> I’ve lost the change information, and tree-sitter’s tree is out-dated.
> 
> No information is lost because the updated buffer text is available.
> 
>> We can fontify on-demand, but we can’t parse on-demand.
> 
> Sorry, I don't believe this is true.  tree-sitter _must_ be able to
> deal with these situations, because it must be able to deal with
> incomplete text that cannot be parsed without parse errors.
> 
I think my assertion was too strong. By “can’t parse on-demand” I mean that we 
can’t easily hand tree-sitter an arbitrary chunk of text without having it 
parse from BOB. 

> In addition, Emacs records (for redisplay purposes) two places in each
> buffer related to changes: the minimum buffer position before which no
> changes were done since last redisplay, and the maximum buffer
> position beyond which there were no changes.  This can also be used to
> pass only a small part of the buffer to the parser, because the rest
> didn't change.
> 
>> What we can do is to only parse the portion from BOB to the visible portion. 
>> So we won’t parse the whole buffer unless you scroll to the bottom.
> 
> My primary worry is the fact that you want to use buffer-change hooks
> (and will soon enough want to use post-command-hook as well).  They
> slow down editing, sometimes tremendously, so I'd very much prefer not
> to use those hooks for fontification/parsing.  The original font-lock
> mechanism in Emacs 19 used these hooks; we switched to jit-lock and
> its redisplay-triggered fontifications because the original design had
> problems which couldn't be solved reliably and with reasonable
> performance.  I hope we will not make the mistake of going back to
> that sub-optimal design.

I understand. I want to point out that parsing is separate from fontification, 
and that syntax-ppss flushes its cache in before-change-functions. I was hoping 
to use the parse tree for more than fontification, e.g., motion commands like 
forward-sexp/backward-sexp or structural editing commands like expand-region. 
Another scenario: some elisp edits text before the visible portion and the 
tree is not updated. Now I want to select the node at point (like 
expand-region does), so I look for the leaf node that contains the byte 
position of point. But because the tree is out of date, the byte position of 
point no longer corresponds to the node I want.
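
To make the stale-position problem concrete, here is a minimal sketch in plain C. It does not use tree-sitter; `struct edit`, `struct range`, `pos_before_edit` and `range_contains` are names I made up for illustration. The point is only that once an edit has been recorded, a current byte position can be translated back to the coordinates the old tree was built in, and that without the record the lookup hits the wrong range.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical edit record: bytes [start, old_end) were replaced by
   bytes [start, new_end).  Illustration only, not a real API.  */
struct edit
{
  size_t start;
  size_t old_end;
  size_t new_end;
};

/* Byte range of a node as stored in the (possibly stale) tree.  */
struct range
{
  size_t beg;
  size_t end;
};

/* Map POS, a byte position in the current text, back to the position
   the same byte had before edit E.  Positions inside the replaced
   region are clamped to the start of that region.  */
static size_t
pos_before_edit (size_t pos, const struct edit *e)
{
  assert (e->start <= e->old_end && e->start <= e->new_end);
  if (pos < e->start)
    return pos;                              /* before the edit: unchanged */
  if (pos >= e->new_end)
    return pos - e->new_end + e->old_end;    /* after the edit: shifted */
  return e->start;                           /* inside the new text */
}

/* Non-zero if POS lies inside range R (end is exclusive).  */
static int
range_contains (const struct range *r, size_t pos)
{
  return r->beg <= pos && pos < r->end;
}
```

For example, take a node recorded at bytes [20, 30) and an insertion of 5 bytes at byte 10. The node's text now lives at [25, 35), so point at byte 32 is inside it, but `range_contains` on the stale range says no. Mapping 32 through `pos_before_edit` gives 27, which the stale range does contain.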

We can still fontify with jit-lock; it’s just that parsing cannot easily work 
like fontification. I expect tree-sitter to work more like syntax-ppss than 
like jit-lock.
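
What I mean by “similar to syntax-ppss” is roughly the following, as a sketch under my own assumptions. `struct parse_cache`, `cache_note_change` and `cache_ensure` are hypothetical names, and the actual parser call is left as a comment. The change hook only lowers a “valid up to here” limit, which is cheap, and the expensive work happens later, when some command asks for the tree up to a given position.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical parser state: the cached result is trusted for bytes
   [0, valid_end) only.  */
struct parse_cache
{
  size_t valid_end;
  unsigned reparses;   /* counts how often we really had to reparse */
};

/* Meant to be called from a buffer-change hook.  It does no parsing,
   it only remembers the earliest position that was touched.  */
static void
cache_note_change (struct parse_cache *c, size_t change_beg)
{
  assert (c != NULL);
  if (change_beg < c->valid_end)
    c->valid_end = change_beg;
}

/* Meant to be called by a consumer (motion command, fontification)
   that needs a valid result up to POS.  */
static void
cache_ensure (struct parse_cache *c, size_t pos)
{
  assert (c != NULL);
  if (pos <= c->valid_end)
    return;                      /* cache is still good, nothing to do */
  /* A real implementation would run the parser here over the text
     between valid_end and POS.  */
  c->reparses++;
  c->valid_end = pos;
}
```

With this shape, a change at byte 30 followed by a request for byte 20 costs nothing, while a later request for byte 50 triggers one reparse. Whether tree-sitter can be driven this lazily is the open question in this thread; the sketch only shows the bookkeeping.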

> 
>>>> And, for tree-sitter to take the buffer’s content directly, we need to 
>>>> tell it to skip the gap.
>>> 
>>> AFAIR, tree-sitter allows the calling package to provide a function to
>>> access the text, isn't that so?  If so, you could write a function
>>> that accesses buffer text via BYTE_POS_ADDR etc., and that knows how
>>> to skip the gap already.
>> 
>> Yes, that function returns a char*. But what if the gap is in the middle of 
>> the portion that tree-sitter wants to read?
> 
> If you provide the function that returns text one character at a time,
> as AFAIR tree-sitter allows, you will be able to skip the gap
> automagically by using BYTE_POS_ADDR.  If that's not possible for some
> reason, or not performant enough, we could ask tree-sitter developers
> to add an API that accesses buffer text in two chunks, in which case it
> will be called first with text before the gap, and then with text
> after the gap.  Like we do when we call regex search functions.

Yes, I made a mistake reading the API. Indeed we can read one character at a 
time, so the gap is no longer an issue.
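
For the record, here is a minimal sketch of such a read function over a gap buffer. This is plain C with a made-up `struct gap_buffer` and `gap_read`, not Emacs’s buffer macros and not tree-sitter’s actual callback type. It returns a pointer plus the number of contiguous bytes available, so a caller can take one byte at a time or the whole run up to the gap.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical gap buffer: the text is data[0, gap_beg) followed by
   data[gap_end, size).  SIZE counts the gap as well.  */
struct gap_buffer
{
  const char *data;
  size_t gap_beg;
  size_t gap_end;
  size_t size;
};

/* Return a pointer to the text at logical byte offset POS and store
   in *LEN how many contiguous bytes can be read from there.  At or
   past the end of the text, *LEN is 0.  */
static const char *
gap_read (const struct gap_buffer *b, size_t pos, size_t *len)
{
  assert (b->gap_beg <= b->gap_end && b->gap_end <= b->size);
  size_t gap_len = b->gap_end - b->gap_beg;
  size_t text_len = b->size - gap_len;

  if (pos >= text_len)
    {
      *len = 0;                       /* end of text */
      return "";
    }
  if (pos < b->gap_beg)
    {
      *len = b->gap_beg - pos;        /* run ends where the gap begins */
      return b->data + pos;
    }
  size_t phys = pos + gap_len;        /* skip over the gap */
  *len = b->size - phys;
  return b->data + phys;
}
```

A reader that wants single characters just uses the first byte of each returned run and asks again at POS + 1; the gap never shows up in what it sees.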

Yuan

