bug-gnu-emacs



From: Theodor Thornhill
Subject: bug#59415: 29.0.50; [feature/tree-sitter] c-ts-mode fails to fontify a portion of a large C file
Date: Sun, 20 Nov 2022 21:33:06 +0100

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Theodor Thornhill <theo@thornhill.no>
>> Cc: Yuan Fu <casouri@gmail.com>
>> Date: Sun, 20 Nov 2022 20:54:05 +0100
>> 
>> > Observe that fontifications stop at this line for some reason.
>> > Fontification reappears on line 209271.  Maybe it's because of the many
>> > braces that appear in warning face?  Why does TS think there are syntax
>> > errors here?  The C++ TS parser doesn't have that problem, btw.
>> 
>> It seems the C parser definitely can't handle what it's seeing.
>
> Yes, but do you have any clue why it gives up at that line?
>

No, not yet.


> One thing that I see is that many braces around there are shown in warning
> face, so perhaps the parser is overwhelmed by the amount of parsing errors?
>

Yeah, that's my first guess, but that shouldn't be an issue; it should
be able to font-lock _something_.

>> > P.S. Btw, isn't the treesit-max-buffer-size limit too low?  4 MiB?
>> 
>> It might be!  IIRC treesit uses 10x the buffer size to store the ast, so
>> it'll be some more memory usage.
>
> After lifting the limit to allow visiting the file, this file causes Emacs
> to go up to 350 MiB.  Which is significant, but definitely not outrageous
> enough to prevent using TS with this file.  And I'm sure "normal" C files
> (as opposed to ones written by a program) will need less memory.  So 4 MiB
> sounds too restrictive to me.  We should maybe increase that to 15 MiB on
> 32-bit systems and say 40 MiB on 64-bit?
>

I think it should probably be the same as the limit at the C level, as
I mentioned in the other mail.
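
To put rough numbers on the limits floated above, here is a
back-of-the-envelope sketch in Python.  It assumes the "10x the buffer
size" rule of thumb recalled earlier in this thread (not a measured
constant) and uses the 15 MiB / 40 MiB figures only because they were
proposed above; the function names are made up for illustration and
nothing here is taken from the Emacs sources.

```python
MIB = 1024 * 1024

# Assumption: the "10x" factor is the rule of thumb recalled in this
# thread, not a measured constant.
AST_OVERHEAD_FACTOR = 10


def proposed_limit_bytes(pointer_bits):
    """Return the buffer-size limit floated above for a given word size."""
    return (15 if pointer_bits == 32 else 40) * MIB


def estimated_ast_bytes(buffer_bytes, factor=AST_OVERHEAD_FACTOR):
    """Rough estimate of the memory needed to hold the syntax tree."""
    return buffer_bytes * factor


for label, limit in [("current", 4 * MIB),
                     ("32-bit", proposed_limit_bytes(32)),
                     ("64-bit", proposed_limit_bytes(64))]:
    print(f"{label}: limit {limit // MIB} MiB, "
          f"tree up to ~{estimated_ast_bytes(limit) // MIB} MiB")
```

Under that assumption a 40 MiB buffer would need on the order of
400 MiB for the tree, which is in the same ballpark as the 350 MiB
reported above for this file.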

>> I'll do some more digging, but in the
>> meantime I attach this profiler report that shows font-locking as the
>> culprit:
>
> Culprit for what?  For slow performance?

Yeah.

> Don't get me wrong: from my POV, TS works here better than CC Mode, in
> many use cases which are much more important than scrolling through
> the entire humongous file top to bottom.  For example, just visiting
> the file takes 3 times as long with CC Mode as with c-ts-mode; going
> to EOB with CC Mode takes more than 1 min 20 sec, whereas TS does it in 2.5
> sec.  And likewise jumping into a random point in the file.  Instead
> of Alan's 150 sec for a full scroll by CC Mode I get 27 min.  The
> number of GC cycles with CC Mode is 10 times as large as with TS.
> (Caveat: my Emacs is built without optimizations, whereas Tree-sitter
> and the language support libraries are, of course, fully optimized.)
>

Ok, that's good to know!

>> In this profile I followed your repro, and did some more movement around
>> the buffer after.  This isn't from emacs -Q, but I believe the results
>> will be just the same, considering where the slowness seems to be:
>> 
>> 
>>        16695  85% - redisplay_internal (C function)
>>        16695  85%  - jit-lock-function
>>        16695  85%   - jit-lock-fontify-now
>>        16695  85%    - jit-lock--run-functions
>>        16695  85%     - run-hook-wrapped
>>        16695  85%      - #<compiled -0x156eddb48a262583>
>>        16695  85%       - font-lock-fontify-region
>>        16695  85%        - font-lock-default-fontify-region
>>        16679  84%         - treesit-font-lock-fontify-region
>
> Yes, treesit-font-lock-fontify-region takes the lion's share.  If you or
> Yuan can speed this up, please do.  But I see no reason to consider this a
> catastrophe, quite to the contrary.

I think it boils down to getting the root too many times.  In an
unmodified buffer I think getting the root node should be instant, and
it seems to take some time.  I'll try to figure out why.
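
To illustrate what I mean by "instant", here is a minimal Python
sketch of the idea: keep the last root node together with the
modification tick it was computed for, and only reparse when the tick
changes.  The class and its names are hypothetical and purely for
illustration; this is not how treesit.c is implemented, and I have not
yet confirmed that repeated root lookups are really the cause.

```python
class CachedRoot:
    """Hypothetical parser wrapper; not the treesit.c implementation."""

    def __init__(self, parse_fn):
        self._parse_fn = parse_fn      # expensive: text -> root node
        self._cached_tick = None       # modification tick of cached tree
        self._cached_root = None
        self.parse_count = 0           # for demonstration only

    def root(self, text, modified_tick):
        # Reparse only when the buffer changed since the last call.
        if modified_tick != self._cached_tick:
            self._cached_root = self._parse_fn(text)
            self._cached_tick = modified_tick
            self.parse_count += 1
        return self._cached_root


# Stand-in parse function; a real parser would build a syntax tree.
parser = CachedRoot(lambda text: ("root", len(text)))

for _ in range(1000):                  # many font-lock calls, no edits
    parser.root("int main(void) { return 0; }", modified_tick=1)
parser.root("int main(void) { return 1; }", modified_tick=2)  # one edit

print(parser.parse_count)
```

With a cache like this, a thousand lookups in an unmodified buffer
cost one parse, and only an actual edit triggers another.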

Theo




