Re: How to add pseudo vector types

emacs-devel

From:	Eli Zaretskii
Subject:	Re: How to add pseudo vector types
Date:	Thu, 15 Jul 2021 19:48:26 +0300

> From: Yuan Fu <casouri@gmail.com>
> Date: Thu, 15 Jul 2021 12:19:31 -0400
> Cc: monnier@iro.umontreal.ca,
>  emacs-devel@gnu.org
> 
> > Why do you need to do this when a buffer is updated? why not use
> > display as the trigger?  Large portions of a buffer will never be
> > displayed, and some buffers will not be displayed at all.  Why waste
> > cycles on them?  Redisplay is perfectly equipped to tell you when some
> > chunk of buffer text is going to be redrawn, and it already knows to
> > do nothing if the buffer hasn't changed.
> 
> Tree-sitter expects you to tell it every single change to the parsed text.

That cannot be true, because the parsed text could be in a state where
parsing it will fail.  When you are in the middle of writing the code,
this is what will happen many times, even if you pass the whole buffer
to the parser.  And since tree-sitter _must_ be able to deal with this
problem, it also must be able to receive incomplete parts of the
buffer text, and do the best it can with it.

> Say you have a buffer with some content and scrolled through it, so 
> tree-sitter has parsed the whole buffer. Then some elisp edited some text 
> outside the visible portion. Redisplay doesn’t happen, so we don’t report this 
> edit to tree-sitter. Then I scroll to the place that has been edited. What 
> now?

Now you call tree-sitter passing it the part of the buffer that needs
to be parsed (e.g., the chunk that is about to be displayed).  If
tree-sitter needs to look back, it will.

> I’ve lost the change information, and tree-sitter’s tree is outdated.

No information is lost because the updated buffer text is available.

> We can fontify on-demand, but we can’t parse on-demand.

Sorry, I don't believe this is true.  tree-sitter _must_ be able to
deal with these situations, because it must be able to deal with
incomplete text that cannot be parsed without parse errors.

In addition, Emacs records (for redisplay purposes) two places in each
buffer related to changes: the minimum buffer position before which no
changes were made since the last redisplay, and the maximum buffer
position beyond which there were no changes.  This can also be used to
pass only a small part of the buffer to the parser, because the rest
didn't change.
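
To make the idea concrete, here is a minimal sketch of how those two
recorded positions could be turned into the range that needs
re-parsing.  The struct and function names are hypothetical, invented
for illustration; they are not Emacs's actual identifiers, and the
real bookkeeping may differ in detail.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the bookkeeping described above.
   Positions are 1-based, like Emacs buffer positions.  */
struct change_bounds
{
  ptrdiff_t beg;            /* first position in the buffer */
  ptrdiff_t end;            /* position just past the last character */
  ptrdiff_t first_changed;  /* no changes before this position */
  ptrdiff_t last_changed;   /* no changes at or after this position */
};

struct span { ptrdiff_t start, end; };

/* Return the half-open range [start, end) that would have to be
   handed to the parser again.  An empty range means that nothing
   changed since the bounds were last reset.  */
static struct span
region_to_reparse (const struct change_bounds *b)
{
  assert (b->beg <= b->end);
  struct span s = { b->first_changed, b->last_changed };
  /* Clamp to the buffer, and collapse an inverted range to empty.  */
  if (s.start < b->beg)
    s.start = b->beg;
  if (s.end > b->end)
    s.end = b->end;
  if (s.end < s.start)
    s.end = s.start;
  return s;
}
```

With a 1000-character buffer and edits confined to positions 400..449,
only that 50-character range comes back; everything outside it is
known to be unchanged.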

> What we can do is to only parse the portion from BOB to the visible portion. 
> So we won’t parse the whole buffer unless you scroll to the bottom.

My primary worry is the fact that you want to use buffer-change hooks
(and will soon enough want to use post-command-hook as well).  They
slow down editing, sometimes tremendously, so I'd very much prefer not
to use those hooks for fontification/parsing.  The original font-lock
mechanism in Emacs 19 used these hooks; we switched to jit-lock and
its redisplay-triggered fontifications because the original design had
problems which couldn't be solved reliably and with reasonable
performance.  I hope we will not make the mistake of going back to
that sub-optimal design.

> >> And, for tree-sitter to take the buffer’s content directly, we need to 
> >> tell it to skip the gap.
> > 
> > AFAIR, tree-sitter allows the calling package to provide a function to
> > access the text, isn't that so?  If so, you could write a function
> > that accesses buffer text via BYTE_POS_ADDR etc., and that knows how
> > to skip the gap already.
> 
> Yes, that function returns a char*. But what if the gap is in the middle of 
> the portion that tree-sitter wants to read?

If you provide the function that returns text one character at a time,
as AFAIR tree-sitter allows, you will be able to skip the gap
automagically by using BYTE_POS_ADDR.  If that's not possible for some
reason, or not performant enough, we could ask tree-sitter developers
to add an API that accesses buffer text in two chunks, in which case it
will be called first with text before the gap, and then with text
after the gap.  Like we do when we call regex search functions.
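
As a rough illustration of the chunked approach, the sketch below
models a read function that hands out contiguous text and never lets a
chunk cross the gap.  It is loosely shaped like a reader callback that
returns a pointer plus a byte count, but the signature is simplified
and the struct is a hypothetical miniature gap buffer, not Emacs's
buffer representation or tree-sitter's actual API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature gap buffer; not Emacs's struct buffer.  */
struct gap_buffer
{
  const char *text;   /* start of storage, gap included */
  size_t gap_start;   /* byte offset where the gap begins */
  size_t gap_size;    /* bytes occupied by the gap */
  size_t text_size;   /* bytes of real text, gap excluded */
};

/* Return a pointer to the text at logical byte offset BYTE_INDEX and
   store in *BYTES_READ how many contiguous bytes are readable there.
   A chunk never crosses the gap, so a reader that keeps asking with
   increasing offsets sees the text before the gap first and the text
   after the gap next.  *BYTES_READ == 0 signals end of text.  */
static const char *
read_chunk (const struct gap_buffer *b, size_t byte_index,
            size_t *bytes_read)
{
  assert (b->gap_start <= b->text_size);
  if (byte_index >= b->text_size)
    {
      *bytes_read = 0;
      return "";
    }
  if (byte_index < b->gap_start)
    {
      *bytes_read = b->gap_start - byte_index;
      return b->text + byte_index;
    }
  *bytes_read = b->text_size - byte_index;
  return b->text + b->gap_size + byte_index;
}

/* Example consumer: count newlines by walking the chunks in order,
   without copying any text.  */
static size_t
count_newlines (const struct gap_buffer *b)
{
  size_t count = 0, pos = 0, n;
  for (const char *p = read_chunk (b, pos, &n); n > 0;
       p = read_chunk (b, pos, &n))
    {
      for (size_t i = 0; i < n; i++)
        count += p[i] == '\n';
      pos += n;
    }
  return count;
}
```

The consumer needs at most two calls to cover the whole text, and no
memory is allocated along the way.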

> Alternatively, we can copy the text out and pass it to tree-sitter, but you 
> don’t like that, IIRC.

Yes, because it means memory allocation, which could be slow,
especially for large buffers.  It could even fail if the buffer is
large enough and the system is under memory pressure.

> >> I only need to modify gap_left, gap_right, make_gap_smaller and 
> >> make_gap_larger, right?
> > 
> > Why would you need to _modify_ any of these?
> 
> Because I want to let tree-sitter know where the gap is so it can avoid it 
> when reading text.

Knowing where the gap is doesn't need any changes to these functions.
See GPT_BYTE, GAP_SIZE, BUF_GPT_BYTE, and BUF_GAP_SIZE.  And the gap
cannot move while tree-sitter accesses the buffer, because no other
part of the Lisp machine can run at that time.
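
For illustration, here is a sketch of the address computation that
only needs the gap's byte position and size, in the spirit of
BYTE_POS_ADDR.  The struct and its field names are hypothetical; the
point is merely that a reader can skip the gap from two values it can
query, without touching the functions that move or resize the gap.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical snapshot of what a reader needs to know.  Byte
   positions are 1-based.  */
struct gap_info
{
  char *beg_addr;      /* address of the first byte of storage */
  ptrdiff_t gpt_byte;  /* byte position at which the gap sits */
  ptrdiff_t gap_size;  /* size of the gap in bytes */
  ptrdiff_t z_byte;    /* byte position just past the text */
};

/* Map byte position BYTEPOS to its address: positions at or after
   the gap are shifted by the gap size, the rest are not.  */
static char *
byte_pos_to_addr (const struct gap_info *g, ptrdiff_t bytepos)
{
  assert (1 <= bytepos && bytepos < g->z_byte);
  ptrdiff_t skip = bytepos >= g->gpt_byte ? g->gap_size : 0;
  return g->beg_addr + (bytepos - 1) + skip;
}
```

Since the gap cannot move during the call, a snapshot like this stays
valid for as long as the parser is reading.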
