bug-texinfo

Re: Post release texi2any performance regression


From: Gavin Smith
Subject: Re: Post release texi2any performance regression
Date: Fri, 27 Oct 2023 11:57:12 +0100

On Fri, Oct 27, 2023 at 01:22:01AM +0200, Patrice Dumas wrote:
> On Thu, Oct 26, 2023 at 06:45:10PM +0100, Gavin Smith wrote:
> > 
> > > For timing optimization, it seems to me that we should have only one
> > > target, the combination were everything is done in XS.  To me there is
> > > little point in trying to optimize other combinations, because they are
> > > very unlikely to be used.  The only other combination I can imagine to
> > > be used is the perl only combination triggered by TEXINFO_XS=omit that
> > > could be used to workaround a bug in the full XS combination at the cost
> > > of performance.
> > 
> > What about the case of when an XS converter hasn't been written yet?
> 
> The perl data is always rebuilt for now, so perl converters will
> work as usual.

I wasn't talking about whether they would work or not, I was talking about
how fast they would run.
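For reference, the pure-Perl fallback mentioned above can be selected like this (a sketch; "doc.texi" stands in for any input file):

```shell
# Default run: the XS parser and XS helper functions are used
# where they are available.
texi2any --plaintext doc.texi

# Force the pure-Perl implementation, e.g. to work around a bug in
# the XS code, or to compare run times against the XS path.
TEXINFO_XS=omit texi2any --plaintext doc.texi
```

Timing both invocations with `time` is a quick way to see how much of the total the XS code actually saves for a given manual.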

At the moment, performance improvements from the new code seem hypothetical.
For example, for Info/plaintext output, it is slower than before,
presumably due to new processing that didn't take place previously.  In this
case, there is the XS parser, other XS processing, and then a plain Perl
converter with a few XS functions.  This is not "everything is done in XS",
but it is a case worth being concerned about.

It seems like it should be possible with a bit of tweaking to avoid this
slowdown without rewriting the whole Plaintext converter in C.


> As a side note, it is not clear to me for which converter the work
> needed to translate perl to XS/C is worth the speed increase (besides
> HTML).  For Plaintext/Info there is already XS code for the critical
> code, we can decide to do more at any time now that the C/XS structures
> are ready to use, but the gains will probably be lower.  For Texinfo
> XML, I think that it is not useful as it is probably not used much.  For
> Docbook and LaTeX, I have no idea whether those converters are used by
> more than a few people, I would think that they are not actually used
> much either.

Info and HTML are the most important targets, certainly.  The code to
sort indices is another part of the processing that takes significant
time.

If you profile with NYTProf, you will see that for Info output, very
considerable time is still spent in the converter, operating on the tree
recursively.

It might get to the point where texi2any is usually fast enough and it is
not worth the trouble to rewrite further code in C.

Here's the way I think about it:

t is run time.

          t < 100 ms - short enough

100 ms <= t < 500 ms - probably short enough but making it shorter would
                       make the user happier

500 ms <= t < 10 s   - important to improve it to avoid user frustration

 10 s  <= t          - not so important to improve as by this point the
                       user isn't sitting watching the program run and
                       has probably gone and done something else

These thresholds may not be exactly right, but they give the idea.
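As a sketch, the same thresholds expressed as a small shell helper (the
function name and category wording are just illustrative):

```shell
# Classify a run time, given in milliseconds, according to the
# rough thresholds above.
classify_runtime() {
  ms=$1
  if   [ "$ms" -lt 100 ];   then echo "short enough"
  elif [ "$ms" -lt 500 ];   then echo "probably short enough"
  elif [ "$ms" -lt 10000 ]; then echo "important to improve"
  else                           echo "user has moved on"
  fi
}

classify_runtime 250     # → probably short enough
classify_runtime 3000    # → important to improve
```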

(Incidentally I find the vocabulary around this difficult as the
computer isn't running "fast" or "slow", it's just running the same speed
but doing more or less processing.)



