
Re: [Texmacs-dev] Cache profiling of TeXmacs 1.0.3.9


From: David Allouche
Subject: Re: [Texmacs-dev] Cache profiling of TeXmacs 1.0.3.9
Date: Fri, 21 May 2004 01:25:53 +0200
User-agent: Mutt/1.5.5.1+cvs20040105i

On Thu, May 20, 2004 at 04:17:17PM +0200, David MENTRE wrote:
> David MENTRE <address@hidden> writes:
> 
> > Following David advice, I have done some cache profiling of texmacs
> > using OProfile.
> 
> I just realized that what I provided was not what David expected. He
> wanted to know where CPU is spent. Using CPU_CLK_UNHALTED event gives a
> more detailed event of that.

Thanks for that good work.

Here is my interpretation:

  1. Guile is crap, but we already knew that.

     It seems that most of the cache misses occur in the garbage
     collector. I'd bet that Guile is doing complete collections,
     instead of using a generational/incremental collection, which is
     much nicer on the cache. The quality of the garbage collector of a
     Scheme implementation is a point that should be studied very
     carefully.

     But TeXmacs does not help it much either. When working with the
     hide-show plugin, which uses big trees, I discovered that the
     size of tree objects reported to the garbage collector is wrong
     (only the root node, not the whole tree), and that this can cause
     Guile to spend all its time collecting, bringing TeXmacs to a
     halt. I believe that happens because it does run out of memory,
     but does not increase the heap size appropriately, since it
     believes it manages much less memory than it actually does.

  2. TeXmacs has many cache misses, but they seem to happen all over the
     code. So the issues are either general design problems or faults
     in commonly used data structures.

     Here, my usual suspects are needless indirections. Removing them
     might not make a big difference, but it is something that should
     be tried.

  3. The top faulters are counted pointer constructors and destructors
     (not real objects ctors and dtors, just pesky counted pointers)
     and the memory allocator.
     
     It makes sense to try to remove needless pointer counting,
     especially in foundation routines, then see what is left of the
     problem.
     
     Regarding the allocator, there is probably a lot of good
     literature on that very non-trivial issue. The TeXmacs allocator
     is reasonable, but it seems naive. Using pools for related objects
     might help, but that may be difficult to implement and the results
     are uncertain.

  4. Another remarkable faulter is the string comparison.

     The names of all typesetter variables should be interned, and
     string objects could be either interned or self-contained. That
     will add a boolean to string_rep, but that's probably worth the
     cost. To make the best use of cache loads, interned strings should
     be stored in a separate pool, whose size can be known
     at compilation time.
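
To illustrate the interning idea in point 4, here is a minimal sketch. It
is not the TeXmacs string_rep: InternPool and interning_demo are
hypothetical names, and the pool is a plain std::unordered_set rather than
a fixed-size pool sized at compilation time. It only shows that, once names
are interned, comparing two names is a pointer comparison instead of a
character-by-character scan.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>

// Hypothetical intern pool, for illustration only.
class InternPool {
public:
  // Returns a pointer to the single stored copy of s.
  // Pointers to elements of std::unordered_set stay valid across rehashing,
  // so the returned pointer can be kept for the lifetime of the pool.
  const std::string* intern(const std::string& s) {
    return &*pool_.insert(s).first;
  }
  std::size_t size() const { return pool_.size(); }

private:
  std::unordered_set<std::string> pool_;
};

// Small usage check: equal names share one pointer, distinct names do not.
bool interning_demo() {
  InternPool pool;
  const std::string* a = pool.intern("font-size");
  const std::string* b = pool.intern("font-size");
  const std::string* c = pool.intern("par-width");
  // Equality of interned names is now a single pointer comparison.
  return a == b && a != c && pool.size() == 2;
}
```

The string lookup is paid once, when a name enters the pool; every later
comparison touches only the pointer, not the string data.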

Finally, anything that reduces memory consumption (like removing pointer
counting bloat from the code or interning strings) can yield performance
increases by reducing the cache misses that are spread all over the code.
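
As a rough illustration of "needless pointer counting" (point 3), the
sketch below uses a toy counted pointer, not the TeXmacs one; node,
node_rep and the refcount_ops counter are all hypothetical. It counts how
many reference-count updates happen when a handle is passed by value
versus by const reference to a function that only reads it.

```cpp
#include <cassert>

// Global counter of reference-count updates (instrumentation for the sketch).
static long refcount_ops = 0;

struct node_rep {
  int ref_count = 0;
  int value = 0;
};

// Toy counted pointer: each copy and each destruction touches ref_count,
// which means a write to the pointed-to object's cache line.
class node {
  node_rep* rep;

public:
  explicit node(int v) : rep(new node_rep) {
    rep->value = v;
    rep->ref_count = 1;
    ++refcount_ops;
  }
  node(const node& other) : rep(other.rep) {
    ++rep->ref_count;
    ++refcount_ops;
  }
  node& operator=(const node&) = delete;  // not needed for this sketch
  ~node() {
    ++refcount_ops;
    if (--rep->ref_count == 0) delete rep;
  }
  int value() const { return rep->value; }
};

// Pass by value: one increment on entry, one decrement on exit.
int read_by_value(node n) { return n.value(); }

// Pass by const reference: no reference-count traffic at all.
int read_by_ref(const node& n) { return n.value(); }

long ops_for_by_value(int calls) {
  node n(7);
  long before = refcount_ops;
  for (int i = 0; i < calls; ++i) read_by_value(n);
  return refcount_ops - before;
}

long ops_for_by_ref(int calls) {
  node n(7);
  long before = refcount_ops;
  for (int i = 0; i < calls; ++i) read_by_ref(n);
  return refcount_ops - before;
}
```

In this toy setup each by-value call costs two count updates and each
by-reference call costs none. Whether that matters in TeXmacs itself is
something only profiling the real code can tell.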

Surprisingly, this interpretation agrees with the changes I just suggested
in my previous message :) So, either I was remarkably insightful, or I'm
being very biased, or both.

Other interpretations welcome.

Good night to people near GMT.

-- 
                                                            -- ddaa



