Re: Post release texi2any performance regression
From: Patrice Dumas
Subject: Re: Post release texi2any performance regression
Date: Sun, 29 Oct 2023 00:09:00 +0200
On Sat, Oct 28, 2023 at 05:42:50PM +0100, Gavin Smith wrote:
> I managed to disable a lot of the new XS code and get the test suite
> to pass. I had to leave the XS translation module active due to the
> coupling that now exists between it and the XS parser.
Also, I doubt that any slowdown could come from doing in C what was
previously done in Perl in Parsetexi.pm. To me, having this code in
XS is both more logical and (probably) faster.
> As you can see, my attempt at disabling the new modules reverses most of,
> but not all, of the slowdown.
I think that you can also comment out rebuild_document when none of the
XS code is overriding the Perl code, but I have not tested this.
One thing that comes to mind is that I removed simple_parser:
https://git.savannah.gnu.org/cgit/texinfo.git/commit/?id=4a3d02c0fc1932350d925fb957e0758a5290436c
That could explain some increase in the time used by gdt.
> I'm still trying to find causes for the remaining slowdown. I profiled
> with NYTProf and think that build_document is one possibility, as it
> may do more than build_texinfo_tree did.
I do not think so. The only additional thing it does (storing
identifiers_target) corresponds to the fact that
set_labels_identifiers_target is now done in C instead of in Perl
code as it was previously, and I doubt that it requires much more time.
However, even if it does not do more, doing it twice could be a reason
for the slowdown if the time spent in build_texinfo_tree and in passing
the other parser results to Perl code is significant.
> For the glibc manual, it is
> called 2412 times (at least once per parser object). As you know,
> there is a new parser for every @def* command in the Texinfo sources,
> so per-parser overhead can be significant.
I do not get it. If you are speaking about the translation happening in
complete_indices, calls of gdt_tree -> gdt -> replace_convert_substrings
do not require a new parser; the current parser is reused. There is
still parsing and storing of a document that is later
removed, plus substitutions in the tree. But this should still be
faster than the same code in Perl.
> I see there are also
> changes to index sorting, but haven't investigated them enough to
> understand if this would have a performance impact.
Hopefully this should have a positive impact, by caching some regexp
results.
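
To illustrate what caching regexp results during sorting can look like,
here is a minimal sketch in Python (for brevity). It is not the texi2any
index sorting code; the regexp and the names sort_key and sort_entries
are made up for the example.

```python
import re
from functools import lru_cache

# Hypothetical normalisation: drop leading non-alphanumeric characters.
_LEADING_NON_ALNUM = re.compile(r"^[^0-9A-Za-z]+")

@lru_cache(maxsize=None)
def sort_key(entry):
    # The regexp runs once per distinct entry string; repeated
    # entries are answered from the cache instead.
    return _LEADING_NON_ALNUM.sub("", entry).lower()

def sort_entries(entries):
    # sorted() calls sort_key once per element.
    return sorted(entries, key=sort_key)
```

The gain depends on how often the same string is seen again; with mostly
unique entries the cache only adds memory use.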
> It was important to be able to disable these new modules in order to see
> this remaining slowdown. I still argue for making it easy to cleanly
> disable these new modules unless or until they do not slow down the program
> as much.
It seems like this could be relatively easy, by adding a variable which
is tested when loading XS code and that's it, unless I missed something?
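
A minimal sketch of that idea, written in Python for brevity rather than
as the actual Perl loading code. The variable name EXAMPLE_XS_STRUCTURE
and all function names are hypothetical, chosen only for the example.

```python
import os

def xs_enabled(env=os.environ, name="EXAMPLE_XS_STRUCTURE"):
    # Hypothetical switch: the override is on unless the variable is "0".
    return env.get(name, "1") != "0"

def pure_implementation(tree):
    # Stand-in for the pure-Perl code path.
    return ("pure", tree)

def xs_implementation(tree):
    # Stand-in for the XS (C) override.
    return ("xs", tree)

def select_implementation(env=os.environ):
    # The variable is tested once, when the implementation is chosen;
    # callers never look at it themselves.
    return xs_implementation if xs_enabled(env) else pure_implementation
```

With the variable unset the override is selected; setting it to 0 falls
back to the pure code path without touching any caller.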
> If the
> promised benefits of the new development never materialised, it would
> mean that the post-7.1 development of texi2any was not worth pursuing.
I would be very surprised if there was no speedup of the HTML converter.
Right now it is very slow, with the main loop in C it should be much
faster.
> This is from my perspective of somebody who is not familiar with the
> new code and doesn't understand how it all works. I've spent hours
> trying to work this out over the last few days because I view it as a
> threat to the future development of the program.
The slowdown is not that big. That being said, I agree that it would be
nice to understand why it is slower with XS for
structure/transformations than with Perl.
> If the Perl object for the parse tree is built twice, this is a definite
> problem, and something that needs to be remedied before the new XS code
> can be considered to be in a finished state.
To me it is not in a finished state before the HTML converter main loop
is fully in C when there is no user customization.
--
Pat