|
From: | Juergen Sauermann |
Subject: | Re: [Bug-apl] Boehm garbage collector |
Date: | Sat, 31 May 2014 15:25:39 +0200 |
User-agent: | Mozilla/5.0 (X11; Linux i686; rv:17.0) Gecko/20130330 Thunderbird/17.0.5 |
Hi,
in GNU APL the only performance relevant data structures are APL values. There are two kinds of values: small values and large values. small and large refer to the number of ravel elements, i.e. ⍴,VALUE. The borderline between small and large is ./configurable (SHORT_VALUE_LENGTH_WANTED, current default 12). Every value (small or large) has a header (of class Value). For small values the header provides space for the entire ravel while for large values an extra Cell* is allocated with new/malloc for the ravel of the value. All headers have the same size (= sizeof(Value), currently 260 bytes for a SHORT_VALUE_LENGTH_WANTED, of 12). When values are passed as function arguments then a shared pointer (of class Value_P or else const Value &) is used and no allocation/deallocation takes place. The shared pointer is basically the reference counter that deletes the value when the last Value_P pointer is deleted. Unlike shared_ptr<> in C++ (resp. boost) the reference count itself is contained in Value and not in Value_P which simplifies (and speeds up) things a little. For memory allocation from the heap this means: 1.all values are allocated with one new/malloc() call of fixed size (sizeof(Value)). 2. large values will have a second new/malloc allocation with a varying size according to the ravel. Obviously, the optimal method for 1. is a linked list of deleted/free()ed headers with a configurable limit. If we make the limit 5000 or so (which is huge even for complex APL programs) then the space for headers is about 1 megabyte, which is close to nothing these days. Less obviously, the initialization of a ravel of length N requires O(N) cycles while the allocation of that ravel takes between O(1) and O(log N) cycles depending on the allocation method. From that follows that optimization of the O(1) ... O(log N) term cannot have a noticeable speedup and in that case increasing SHORT_VALUE_LENGTH_WANTED will save more cycles than another memory allocator for ravels. And the goodies of the Boehm collector, such as more efficient realloc(), are not used in GNU APL. I will implement the linked list for 1. as soon as time allows. Best Regards, /// Jürgen On 05/29/2014 10:10 PM, Peter Teeson wrote: FWIW Apple moved away from GC and instead uses ARC <=> Automatic Reference Counting. In my experience it has radically reduced the MRR <=> Manual Retain Release effort for the programmer. If anyone is interested here is a link that may be of interest. <https://developer.apple.com/library/mac/documentation/Cocoa/Conceptual/MemoryMgmt/Articles/MemoryMgmt.html> Peter On 2014-05-29, at 1:14 PM, "David B. Lamkins" <address@hidden> wrote:I'll add to that: The Boehm collector is in the family of "conservative" collectors. Briefly, since there's no way to give the collector a precise list of roots to the data structures in the heap (this is true, for example, of pointers on the stack and in structures), the collector instead uses the hueristic of testing the value of a machine word to see whether it contains an address in the heap; if the word passes this test, it's assumed to be a pointer. A conservative GC therefore won't collect any object that's still live, but it *may* hold onto chunks of memory that aren't actually live. (You might have a scalar value, for example, that "looks like" a pointer.) The end result is a space leak. In practice, this is *usually* not a problem. I have seen production systems, though, in which a conservative GC caused catastrophic space leaks under the right (wrong) conditions. A reference-counting collector won't have these problems. Additionally, the work done by a reference-counting collector is evenly amortized over the execution of the program, whereas a mark-sweep collector (like Boehm) causes a pause (sometimes significant, depending upon circumstances of the size of the heap, state of the heap, speed of the machine, etc.) at unpredictable intervals for collection. A proper GC can be an attractive alternative to reference counting in the case where there may be circularity in references. I don't believe that circularity of references is an issue in APL, unless the implementer has explicitly tried to do clever sharing optimizations. On Thu, 2014-05-29 at 10:38 -0500, Blake McBride wrote:Dear Elias, I have experience with the Boehm collector and thought I would share some opinions. 1. Since C/C++ was not designed with GC in mind, the Boehm collector is forced to take into account many very system specific factors. These factors vary widely from system to system, and from system release to system release. This means that, rather than portable code, you are maximally system dependent. While that may be fine as long as the Boehm team continues to support their collector, all of your code will fall on its face the day they don't. In other words, anything built with the Boehm collector will from then on be completely dependent on them. The source code they provide, while complete, is pretty useless unless you understand all of the very system specific nuances of each system they support. In other words, you can't support it yourself. 2. A GC like Boehm would never be as fast as reference counting. Reference counting, when possible, only does what it needs to do at the time it needs to do it. A generic GC has to use up a specified amount of memory and then check everything. I know generational GC's and concurrent GC's minimize some of this but it still takes time and their assumptions are rarely completely accurate so their function is reduced, i.e. _conservative_ collector. The Boehm collector is a great solution when you want to add GC to a system after-the-fact, or if you are unwilling to do the hard work yourself. It seems to me that Jurgen has already done the hard work in a portable way. I'd prefer not to mess with it, nor pepper the system with Boehm ifdefs. Just an opinion (from someone who has used the Boehm collector and has written my own collectors). Thanks. Blake On Thu, May 29, 2014 at 9:24 AM, Elias Mårtenson <address@hidden> wrote: I've been playing around with the Boehm (Böhm, I suppose) garbage collector, and it's pretty amazing. Jürgen, do you have any experience with it? Updating the GNU APL code to (conditionally, with a compiler option) to support using the collector for at least Value objects should be fairly simple. This would provide several benefits: * A performance improvement since there is no need to deal with the reference counter. * Improved parallelism if one wants to expand threading beyond simply paralellising simple loops (no need to deal with race conditions when accessing the reference count) * Provide a more solid ground when attempting to reduce excessive cloning (with the GC, once can simply share everything and clone before modification) I was thinking of starting to implement this, but it would take more than a few hours which means that I first would like to hear your input on it before wasting my time. Regards, Elias |
[Prev in Thread] | Current Thread | [Next in Thread] |