Re: [Bug-apl] About the reduction of clone() calls

On 28 April 2014 22:24, Juergen Sauermann <address@hidden> wrote:

Hi,

generally speaking unnecessary clone() of values should of course be avoided.

In GNU APL 1.0 and 1.1 there was a flag-based system of value ownership where the last owner
would delete the value when giving up its interest in the value. This system began like the
tmp flag, but then caused stale values on one hand and segfaults for values released too early on the
other hand all over the place.

I then changed to Value_P which is pretty much a shared pointer with reference counting. Initially
I tried the standard shared_ptr<> class of C++ but that caused some compiler/portability issues.
That change has brought a lot of stability into GNU APL,

As you have noticed yourself, going through the code and changing is all over the place is bad and
error-prone, and it would kind of undo the change from the flag-system to Value_P (which was also
a huge effort as you may figure from SVN).

So the only solution remaining is rewriting the Value (or. more likely, the Value_P) class.
Now, the Value class knows how many owners it has (the share_count below already exists, it is called owner_count).
But it does not know who they are. If someone does X[Y]←Z then symbol X would need to clone its current value
when Value::index(Z) os performed, but Value::index(Z) does not know anything about Symbol X and other owners
of the value,

I will send you the 170+ GNU APL testcases so that you can quickly check if some change you make is breaking
something else.

Having several Values pointing to the same ravel is also a bit obscure because the Cells of the ravel need to be
released (sub values and complex numbers need to be deleted) which is almost impossible if you have nested
ravels or ravels of of different sizes.

I also doubt that you can gain 3-4 orders of magnitude because a value is normally only cloned very few times.
That does not mean that you can't prove otherwise, I can speed up +/⍳N by 6 orders of magnitude but that does
not prove that this is a valuable optimization,

/// Jürgen

On 04/28/2014 02:57 PM, Elias Mårtenson wrote:

Hello Jürgen,

I don't know if you have given this issue any thought, but it has certainly occupied my mind for the last few days.

It's clear that heavy array processing does far too much cloning than should be necessary. Especially in cases where you have lots of operations on smaller arrays (as opposed to few operations on large arrays).

This is because the code always performs a clone prior to a destructive operation because at that point it can never be sure that the array will not be used again. The (very few) exceptions to this is handled by the "temp" system, which is really only used in the ravel function.

The way I see it, there are two approaches to solving this: The first one being to go through the code with a fine-toothed comb and implement the temp system everywhere. This is lots of work and is error-prone.

The other solution would be to re-engineer the Value class so that it can share underlying data with other Value instances. The Value class would then get a counter called share_count or something, indicating how many other references there are to the same data. When clone() is called, it will simply create a new Value instance, share the underlying data and increase share_count. Any destructive operations would only copy the content if it's not already shared.

Now, Value_P already implements some of these semantics. Could it be reused for this?

While appealing, this copy-on-write solution might not be perfect though. The assumption is that the caller would have decremented the share count (effectively "releasing" the Value) before the called function tries to modify it. Now, this "release" is similar to the the "temp" marking. Could there be a better way?

I've been experimenting with this on and off, and my interim results (as I've mentioned before) show that the potential performance boosts are massive. We're talking about 3-4 orders of magnitude. Definitely worth quite a lot of effort, IMHO.

I'm likely going to continue pursuing this, because I personally feel frustrated when I do something and it's not instantaneous, knowing that I'm waiting for the interpreter to perform unnecessary operations. :-)

What are your thoughts on this? What would be the best approach?

Regards,

Elias

From:	Elias Mårtenson
Subject:	Re: [Bug-apl] About the reduction of clone() calls
Date:	Mon, 28 Apr 2014 22:28:27 +0800