[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Texmacs-dev] experimental garbage collector patch
[Texmacs-dev] experimental garbage collector patch
Thu, 4 Mar 2004 17:20:58 +0100
For people which have been wondering what I have been doing for a week,
I have some new stuff: an experimental patch to use Hans Boehm & al.
conservative garbage collector in texmacs as a replacement for the
current pointer counting memory management scheme.
The expect result of the change is to get better performance for
operations which operate on highly localized memory areas, that is the
bulk of the day to do editing operations.
As one can expect, the patch is quite heavywheight but not very much
intrusive. It also includes a number of cleanups and fixes for bugs
which have been made apparent by the use of garbage collection.
The bad news is this patch cause texmacs to use a lot more memory and
The immediate issue is that the partial display invalidation code
relied on deterministic destruction of objects and this assumption is
broken by the garbage collectior. As as consequence, any modification to
the edit tree cause a full display invalidation.
There are other issues, especially with the "resource" system and many
opportunities to experiment. The full comment of the patch follows.
---- Patch comment ----------------------------------------
This patch allows building texmacs with libgc (Hans Boehm). When GC is
used, all reference counting is disabled.
Currently, GC does not seem to provide a significant gain in time
performance and can increase the memory consumption a lot. However,
there is still a lot of room for tuning and fixing internal problems.
The benchmark used during developement was starting up texmacs and
generating the full documentation. Since GC was expected to yield
performance gains in code paths exhibiting a high locality of memory
access (because of better cache behaviour and lesser gc times) this is
a very unfavorable test. Yet the CPU usage is only very slightly
increased by GC.
Increased memory usage may, or may not, be caused by the leaky nature
of conservative gc. It is natural for GC to require more memory than
pointer counting, because some slack is required to reduce the number
of collections, but it seems that texmacs _does_ leak a significant
(but bounded) amount of memory because of conservative collection.
More empirical evidence would be needed to show the leak stays
Currently two issues are clearly identified:
1. Partial display invalidation is broken.
The partial display invalidation feature relies on non-trivial
logic in the phrase_box_rep destructor. This has been worked
around by sending an invalidate_all when the edit tree is
Since destruction time is (generally) not deterministic with this
GC, this design cannot work in any sane solution way. Anyway,
destructors should _only_ be used to release resources, the
invalid region compution is just an awful plate of spoiled
spaghetti. Anyway, that's what I think.
2. The resource system is inefficient.
The use of the garbage collector has made it possible to release
all inaccessible resources (save for the weak pointer in the
"instances" dictionnary). It obviously caused some previously
unused code paths to be used, since that required a fix in the
loading of virtual fonts. It also causes many resources to be
loaded multiple times when generating the full documentation.
It seems that the good performance of the pointer-counting
implementation relied on memory leaks...
There are open areas for experimentation:
-- Responsiveness benchmark
The "build full manual" benchmark is instructive and helps catch
a lot of bugs. However it gives not information on the
interactive behavior of texmacs, which is what users really see.
In order to assess the performance of GC compared to pointer
counting, we need benchmarks measuring the interactive reaction
time in realistic editing and browsing situations.
-- Remove uneeded indirections
The whole application use senseless indirection for all objects,
likely causing important memory locality issues. For example, in
most cases, it would make sense for objects to include bodies of
compound objects (arrays, hashmaps) instead of pointers, because
the compound object cannot have a significantly longer lifetime
than its including object.
Elimination of unedded indirection should focus on objects which
are created in great numbers: basic data structures and
This cleanup may, or may not, need to rely on GC.
-- Mutation hints for the collector
Use GC_change_stubborn and GC_end_stubborn_change when the
pointers included in an object are changed. That might improve
the collector performance.
-- Other design changes
TeXmacs was designed from the ground up for pointer counting.
This has some obvious consequences (already pointed out) but it
may include other designs which were required for pointer
counting but have a negative impact on GC performance.
When the outstanding issues have been fixed, and if further
experimentation demonstrates better performance of garbage collection,
texmacs could be modified to use a precise GC. Since implementing a
good GC is _really_ a non-trivial issue, a third party library should
The only precise garbage collector for C++ I have found:
David Allouche | GNU TeXmacs -- Writing is a pleasure
Free software engineer | http://www.texmacs.org
http://ddaa.net | http://savannah.gnu.org/projects/texmacs
address@hidden | address@hidden
TeXmacs is NOT a LaTeX front-end and is unrelated to emacs.
|[Prev in Thread]
||[Next in Thread]|
- [Texmacs-dev] experimental garbage collector patch,
David Allouche <=