
Re: [Bug-apl] Boehm garbage collector


From: Juergen Sauermann
Subject: Re: [Bug-apl] Boehm garbage collector
Date: Sat, 31 May 2014 19:47:25 +0200
User-agent: Mozilla/5.0 (X11; Linux i686; rv:17.0) Gecko/20130330 Thunderbird/17.0.5

Hi,

I have implemented the linked list of deleted values as discussed below; see SVN 305.

If you want to benchmark it, change line 509 of Value.hh to enable (#if 1) or disable
(#if 0) the new memory management.
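Purely as an illustration of the kind of switch meant here (a hypothetical sketch; the actual code around line 509 of Value.hh may look different):

/* hypothetical sketch only -- not the actual contents of Value.hh */
#if 1    /* 1: use the new linked list of deleted Value headers */
# define VALUE_USE_DELETED_LIST 1
#else    /* 0: plain new/delete for every Value */
# define VALUE_USE_DELETED_LIST 0
#endif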

/// Jürgen


On 05/31/2014 03:25 PM, Juergen Sauermann wrote:
Hi,

in GNU APL the only performance-relevant data structures are APL values.
There are two kinds of values: small values and large values. Small and large
refer to the number of ravel elements, i.e. ⍴,VALUE. The borderline between
small and large is configurable via ./configure (SHORT_VALUE_LENGTH_WANTED, current default 12).

Every value (small or large) has a header (of class Value). For small values the header
provides space for the entire ravel, while for large values an extra Cell* is allocated with
new/malloc for the ravel of the value.

All headers have the same size (= sizeof(Value), currently 260 bytes for a SHORT_VALUE_LENGTH_WANTED of 12).
When values are passed as function arguments, a shared pointer (of class Value_P, or else const Value &) is
used and no allocation/deallocation takes place. The shared pointer is basically the reference counter that deletes the
value when the last Value_P pointing to it is deleted. Unlike shared_ptr<> in C++ (resp. Boost), the reference count itself
is contained in Value and not in Value_P, which simplifies (and speeds up) things a little.
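To make that layout concrete, here is a minimal sketch (not the actual GNU APL classes; member names are illustrative) of a fixed-size value header with an inline short ravel and the reference count stored in the value itself, plus the thin Value_P wrapper:

// Minimal sketch only -- illustrates the layout described above, not the
// real GNU APL code.
#include <cstddef>

enum { SHORT_VALUE_LENGTH_WANTED = 12 };   // ./configure default

struct Cell { double value; };             // stand-in for the real Cell

class Value
{
public:
   explicit Value(std::size_t len)
   : ravel_length(len),
     owner_count(0),
     ravel(len <= SHORT_VALUE_LENGTH_WANTED ? short_ravel : new Cell[len])
   {}

   ~Value()
      { if (ravel != short_ravel)   delete[] ravel; }   // large values only

   std::size_t ravel_length;   // number of ravel elements, i.e. ⍴,VALUE
   int owner_count;            // the reference count lives in the value itself
   Cell * ravel;               // points at short_ravel or at a heap block
   Cell short_ravel[SHORT_VALUE_LENGTH_WANTED];
};

// Value_P is then just a thin smart pointer that bumps owner_count on copy
// and deletes the Value when the count drops to zero.
class Value_P
{
public:
   Value_P(Value * v = 0) : val(v)                 { if (val)   ++val->owner_count; }
   Value_P(const Value_P & other) : val(other.val) { if (val)   ++val->owner_count; }
   ~Value_P()                                      { release(); }

   Value_P & operator =(const Value_P & other)
      {
        if (other.val)   ++other.val->owner_count;   // also handles self-assignment
        release();
        val = other.val;
        return *this;
      }

   Value * operator->() const   { return val; }

private:
   void release()   { if (val && --val->owner_count == 0)   delete val; }

   Value * val;
};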

For memory allocation from the heap this means:

1. all values are allocated with one new/malloc() call of fixed size (sizeof(Value)).
2. large values will have a second new/malloc allocation with a varying size according to the ravel.

Obviously, the optimal method for 1. is a linked list of deleted/free()d headers with a configurable
limit. If we make the limit 5000 or so (which is huge even for complex APL programs) then the space for
headers is about 1 megabyte, which is close to nothing these days.
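For illustration, a minimal sketch of such a capped free list (all names and sizes are made up; the actual code in SVN 305 may differ):

// Sketch of a capped free list for the fixed-size Value headers.
#include <cstdlib>
#include <new>

enum { MAX_FREE_HEADERS = 5000 };     // about 1.3 MB of parked headers at 260 bytes each

struct FreeNode { FreeNode * next; }; // overlays the memory of a deleted header

class Value
{
public:
   static void * operator new(std::size_t size)
      {
        if (free_list)                // reuse a previously deleted header
           {
             void * ret = free_list;
             free_list = free_list->next;
             --free_count;
             return ret;
           }

        void * ret = malloc(size);    // otherwise fall back to malloc()
        if (ret == 0)   throw std::bad_alloc();
        return ret;
      }

   static void operator delete(void * ptr)
      {
        if (ptr == 0)   return;

        if (free_count < MAX_FREE_HEADERS)   // park the header for reuse
           {
             FreeNode * node = static_cast<FreeNode *>(ptr);
             node->next = free_list;
             free_list = node;
             ++free_count;
           }
        else free(ptr);               // list is full: really release it
      }

private:
   static FreeNode * free_list;
   static int        free_count;

   char payload[256];                 // stand-in for the real header members
};

FreeNode * Value::free_list = 0;
int        Value::free_count = 0;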

Less obviously, the initialization of a ravel of length N requires O(N) cycles, while the allocation of that
ravel takes between O(1) and O(log N) cycles, depending on the allocation method. It follows that
optimizing the O(1) ... O(log N) term cannot give a noticeable speedup; in that case increasing
SHORT_VALUE_LENGTH_WANTED will save more cycles than another memory allocator for ravels.
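To put illustrative (made-up) numbers on that: for a ravel of N = 1000 cells the initialization costs on the order of 1000 cell stores, while even a slow allocator spends maybe a few dozen cycles finding the block. Cutting those few dozen cycles down to a handful turns 1000 + 30 into 1000 + 5, well under a 3 percent difference. Raising SHORT_VALUE_LENGTH_WANTED, on the other hand, removes the entire second new/malloc + free pair for every value that then fits into the header.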

And the goodies of the Boehm collector, such as more efficient realloc(), are not used in GNU APL.

I will implement the linked list for 1. as soon as time allows.

Best Regards,
/// Jürgen


On 05/29/2014 10:10 PM, Peter Teeson wrote:
FWIW Apple moved away from GC and instead uses ARC (Automatic Reference Counting).
In my experience it has radically reduced the Manual Retain Release (MRR) effort for the programmer.
For anyone interested, here is a link:
<https://developer.apple.com/library/mac/documentation/Cocoa/Conceptual/MemoryMgmt/Articles/MemoryMgmt.html>

Peter

On 2014-05-29, at 1:14 PM, "David B. Lamkins" <address@hidden> wrote:

I'll add to that:

The Boehm collector is in the family of "conservative" collectors.
Briefly, since there's no way to give the collector a precise list of
roots to the data structures in the heap (this is true, for example, of
pointers on the stack and in structures), the collector instead uses the
heuristic of testing the value of a machine word to see whether it
contains an address in the heap; if the word passes this test, it's
assumed to be a pointer.

A conservative GC therefore won't collect any object that's still live,
but it *may* hold onto chunks of memory that aren't actually live. (You
might have a scalar value, for example, that "looks like" a pointer.)
The end result is a space leak.
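A tiny, contrived illustration of that failure mode (a sketch, not tied to any particular GC library; the comments describe what a conservative collector would conclude):

// Contrived illustration of a "false pointer" under a conservative GC.
#include <cstdint>
#include <cstdio>

int main()
{
   int * big = new int[1000000];      // a large heap block

   // A plain integer that merely *happens* to hold the block's address.
   // A conservative collector scanning the stack cannot tell this word
   // from a real pointer, so the block is treated as live even after the
   // last real pointer to it is gone.
   std::uintptr_t looks_like_a_pointer = reinterpret_cast<std::uintptr_t>(big);

   big = 0;                           // drop the only real pointer

   // An exact (or reference-counting) scheme could reclaim the array here;
   // a conservative collector must keep it for as long as
   // looks_like_a_pointer is in scope.  That is the space leak.
   std::printf("%lx\n", (unsigned long)looks_like_a_pointer);
   return 0;
}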

In practice, this is *usually* not a problem. I have seen production
systems, though, in which a conservative GC caused catastrophic space
leaks under the right (wrong) conditions.

A reference-counting collector won't have these problems. 

Additionally, the work done by a reference-counting collector is evenly
amortized over the execution of the program, whereas a mark-sweep
collector (like Boehm) causes a pause (sometimes significant, depending
upon the size of the heap, the state of the heap, the speed of the
machine, etc.) at unpredictable intervals for collection.

A proper GC can be an attractive alternative to reference counting in
the case where there may be circularity in references. I don't believe
that circularity of references is an issue in APL, unless the
implementer has explicitly tried to do clever sharing optimizations.
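For contrast, the textbook case where plain reference counting does break down is a cycle, e.g. (a generic C++ sketch, nothing APL-specific):

// Classic reference-counting leak: a cycle of shared_ptr-owned nodes.
#include <memory>

struct Node
{
   std::shared_ptr<Node> other;   // owning link to the peer
};

int main()
{
   auto a = std::make_shared<Node>();
   auto b = std::make_shared<Node>();
   a->other = b;
   b->other = a;                  // cycle: a and b keep each other alive

   // When a and b go out of scope each node still has use_count() == 1
   // because of the cycle, so neither destructor runs: a leak that a
   // tracing GC would reclaim but reference counting cannot.
   return 0;
}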

On Thu, 2014-05-29 at 10:38 -0500, Blake McBride wrote:
Dear Elias,


I have experience with the Boehm collector and thought I would share
some opinions.


1.  Since C/C++ was not designed with GC in mind, the Boehm collector
is forced to take into account many very system-specific factors.
These factors vary widely from system to system, and from system
release to system release.  This means that, rather than being portable,
your code becomes maximally system-dependent.  That may be fine as
long as the Boehm team continues to support their collector, but all of
your code will fall on its face the day they don't.  In other words,
anything built with the Boehm collector will from then on be
completely dependent on them.  The source code they provide, while
complete, is pretty useless unless you understand all of the very
system-specific nuances of each system they support; you can't
support it yourself.


2.  A GC like Boehm will never be as fast as reference counting.
Reference counting, when possible, does only what it needs to do, at
the time it needs to do it.  A generic GC has to use up a specified
amount of memory and then check everything.  I know generational GCs
and concurrent GCs minimize some of this, but it still takes time, and
their assumptions are rarely completely accurate, so their effectiveness
is reduced (hence the name _conservative_ collector).


The Boehm collector is a great solution when you want to add GC to a
system after-the-fact, or if you are unwilling to do the hard work
yourself.  It seems to me that Jürgen has already done the hard work
in a portable way.  I'd prefer not to mess with it, nor pepper the
system with Boehm ifdefs.


Just an opinion (from someone who has used the Boehm collector and has
written his own collectors).


Thanks.


Blake








On Thu, May 29, 2014 at 9:24 AM, Elias Mårtenson <address@hidden>
wrote:
       I've been playing around with the Boehm (Böhm, I suppose)
       garbage collector, and it's pretty amazing. Jürgen, do you
       have any experience with it?


       Updating the GNU APL code (conditionally, with a compiler
       option) to support using the collector for at least Value
       objects should be fairly simple. This would provide several
       benefits:
             * A performance improvement, since there is no need to
               deal with the reference counter.
             * Improved parallelism, if one wants to expand threading
               beyond simply parallelising simple loops (no need to
               deal with race conditions when accessing the reference
               count).
             * A more solid ground when attempting to reduce
               excessive cloning (with the GC, one can simply share
               everything and clone before modification).
       I was thinking of starting to implement this, but it would
       take more than a few hours which means that I first would like
       to hear your input on it before wasting my time.


       Regards,
       Elias



        



