bug-gnu-emacs

From: Yuan Fu
Subject: bug#59574: 29.0.50; Emacs crashes when using tree-sitter-based mode in an empty buffer
Date: Fri, 25 Nov 2022 19:18:09 -0800


> On Nov 25, 2022, at 7:04 AM, Eli Zaretskii <eliz@gnu.org> wrote:
> 
> To reproduce:
> 
>  emacs -Q
>  C-x C-f foo.c RET
>  M-x c-ts-mode RET
>  Type "in"

Thanks for finding this out! 

> 
> Make sure foo.c doesn't exist, so you start from an empty buffer.  As soon
> as you type the second character of "in", there's an assertion violation:
> 
> treesit.c:1383: Emacs fatal error: assertion failed: end_byte <= BUF_ZV_BYTE (buffer)
> 
>  Thread 1 hit Breakpoint 1, terminate_due_to_signal (sig=22, backtrace_limit=2147483647) at emacs.c:427
>  427       signal (sig, SIG_DFL);
>  (gdb) up
>  #1  0x01230802 in die (
>      msg=0x18e6778 <DEFAULT_REHASH_SIZE+3288> "end_byte <= BUF_ZV_BYTE (buffer)", file=0x18e5fcc <DEFAULT_REHASH_SIZE+1324> "treesit.c", line=1383)
>      at alloc.c:7697
>  7697      terminate_due_to_signal (SIGABRT, INT_MAX);
>  (gdb)
>  #2  0x01355636 in treesit_make_ranges (ranges=0x856a778, len=1,
>      buffer=0x7fe94b0) at treesit.c:1383
>  1383          eassert (end_byte <= BUF_ZV_BYTE (buffer));
>  (gdb) p end_byte
>  $1 = 4
>  (gdb) p BUF_ZV_BYTE(buffer)
>  $2 = 3
> 
> Interestingly, this only happens once, when the buffer includes exactly 1
> byte and an additional character is inserted.  If you get past this
> assertion, further characters can be inserted without any problems, and
> end_byte always equals BUF_ZV_BYTE.
> 
> The backtrace is below, if it is interesting.
> 
> I couldn't figure out where did tree-sitter take the range it returns to us.
> Yuan, can you describe how does the parser get the range it needs to
> consider?  If I put a breakpoint in treesit-parser-set-included-ranges, the
> breakpoint never breaks, so this doesn't seem to be how the range is set in
> this scenario.

After we parse the buffer (in treesit_ensure_parsed), we compute the ranges that 
have changed since the last parse by calling ts_tree_get_changed_ranges, and pass 
those ranges to the notifier functions (those added by treesit-parser-add-notifier). 
These ranges are different from the range within which a parser operates. That 
range is set by treesit-parser-set-included-ranges, and has nothing to do with 
the parsing, treesit_record_change, and visible_beg/end machinery.

Both features happen to use treesit_make_ranges as a helper function, but the 
similarity ends there.
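
To make the distinction concrete, here is a toy model in C. It is only a sketch under loud assumptions: every toy_* name is hypothetical, none of this is the code in treesit.c, and the changed range is found by comparing two strings byte by byte, whereas ts_tree_get_changed_ranges compares syntax trees. It shows the included range as an input stored on the parser, and the changed range as an output handed to notifiers after a reparse.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy types; none of these exist in Emacs or tree-sitter.  */
struct toy_range { size_t beg, end; };   /* half-open byte range */
typedef void (*toy_notifier) (struct toy_range changed, void *data);

struct toy_parser
{
  struct toy_range included;   /* input: where the parser may operate */
  toy_notifier notifiers[4];   /* callbacks told about changed ranges */
  void *notifier_data[4];
  size_t n_notifiers;
};

/* Output side: a crude stand-in for ts_tree_get_changed_ranges that
   compares two texts byte by byte instead of comparing syntax trees.  */
static struct toy_range
toy_changed_range (const char *old_text, const char *new_text)
{
  size_t old_len = strlen (old_text), new_len = strlen (new_text);
  size_t beg = 0;
  /* Skip the common prefix.  */
  while (beg < old_len && beg < new_len && old_text[beg] == new_text[beg])
    beg++;
  /* Skip the common suffix, without crossing the prefix.  */
  size_t old_end = old_len, new_end = new_len;
  while (old_end > beg && new_end > beg
         && old_text[old_end - 1] == new_text[new_end - 1])
    {
      old_end--;
      new_end--;
    }
  return (struct toy_range) { beg, new_end };
}

/* After a reparse, report the changed range to every notifier.  The
   included range is neither consulted nor modified here.  */
static void
toy_reparse (struct toy_parser *p, const char *old_text, const char *new_text)
{
  struct toy_range changed = toy_changed_range (old_text, new_text);
  assert (changed.end <= strlen (new_text));   /* must stay inside the text */
  for (size_t i = 0; i < p->n_notifiers; i++)
    p->notifiers[i] (changed, p->notifier_data[i]);
}

/* Example notifier: remember the most recent changed range.  */
static void
toy_remember (struct toy_range changed, void *data)
{
  *(struct toy_range *) data = changed;
}
```

Registering toy_remember and calling toy_reparse with the old text "i" and the new text "in" reports the range covering the inserted "n", while the parser's included range stays exactly as it was set.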

> There's also something strange in treesit_record_change: when it is called
> for the first time in a buffer which was empty and you insert one character,
> we bypass the updating of visible_beg and visible_end fields of the Lisp
> parser object, because XTS_PARSER (lisp_parser)->tree is NULL.  But it looks
> to me that we should still update these two fields regardless, no?  Only the
> call to treesit_tree_edit_1 needs the tree.  (I thought that maybe this lack
> of update explains the assertion, but even if I move the condition to guard
> only treesit_tree_edit_1, the assertion still happens, so I guess my
> hypothesis eats dust.)

We don’t need to update visible_beg/end in treesit_record_change if tree is 
NULL, because visible_beg/end represent the range of the buffer that the tree 
sees, so if there is no tree, visible_beg/end can be considered uninitialized. 
However, you are right that visible_beg/end need to be updated, just in 
treesit_ensure_position_synced (I renamed it to treesit_sync_visible_region): 
that’s where we ensure visible_beg/end equal BUF_BEGV_BYTE and friends. 

The problem is that we don’t update visible_beg/end for the very first parse, 
when tree is NULL.
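
The idea can be sketched with a small self-contained C model. This is an illustration under stated assumptions, not the actual change to treesit.c: the toy_* types and functions are hypothetical stand-ins, and the model only captures the rule that the visible range is synced with the buffer before every parse, including the first one when tree is still NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the real Emacs structures.  */
struct toy_buffer { ptrdiff_t begv_byte, zv_byte; };

struct toy_parser
{
  void *tree;              /* NULL until the first parse */
  ptrdiff_t visible_beg;   /* byte range the tree is assumed to cover */
  ptrdiff_t visible_end;
};

/* Make the parser's visible range equal the buffer's accessible
   region.  Done unconditionally, even when tree is NULL.  */
static void
toy_sync_visible_region (struct toy_parser *p, const struct toy_buffer *b)
{
  p->visible_beg = b->begv_byte;
  p->visible_end = b->zv_byte;
}

/* Record an edit.  Without a tree there is nothing to keep in step,
   so visible_beg/end are left alone (treated as uninitialized).  */
static void
toy_record_change (struct toy_parser *p, ptrdiff_t old_end, ptrdiff_t new_end)
{
  if (p->tree == NULL)
    return;
  p->visible_end += new_end - old_end;
}

/* Parse (here: just mark the tree as present) after syncing, so the
   very first parse also starts from an up-to-date visible range.  */
static void
toy_ensure_parsed (struct toy_parser *p, const struct toy_buffer *b)
{
  static int dummy_tree;   /* placeholder for a real syntax tree */
  toy_sync_visible_region (p, b);
  p->tree = &dummy_tree;
  assert (p->visible_beg == b->begv_byte);
  assert (p->visible_end == b->zv_byte);
}
```

In this model, edits made before the first parse leave visible_beg/end untouched, and the first call to toy_ensure_parsed brings them in line with the buffer; later edits then keep visible_end in step through toy_record_change.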

I also added some comments; hopefully they explain everything sufficiently.

Yuan
