gnash-dev

[Gnash-dev] Re: Garbage Collection


From: Chad Musick
Subject: [Gnash-dev] Re: Garbage Collection
Date: Thu, 20 Sep 2007 17:08:17 +0900

> What do you mean by 'unify' ?
> We are currently using the GC for a set of objects and RC (ref-counting)
> for another set.
> 
I mean that rather than using ref-counting and GC, we only use GC.

> > GC -- the garbage collector itself, a singleton accessed only through
> > static functions.
> 
> Would this be appropriate even in case we'll support multiple VMs ?
> 
I think even with multiple VMs this is still appropriate.  If the GC is
permanently incomplete (that is, if there is a leak) when a VM stops
being used, having multiple GCs may be able to eliminate these leaks,
but I would suggest we just fix the leak in the first place.

> You mean you want to have *everything* managed by the GC ?
> The choice to keep RC for things w/out circular refs problem was to
> reduce overhead of the marking phase. Of course if your solution happens
> to run faster no problem here. I guess it's hard to compare w/out an
> implementation though...

Yes, I want *everything* managed by the GC.  The problem with the
overhead of the marking phase is that it has to all happen at once.
Marking is not itself inherently expensive -- you incur the same expense
every time you change the reference count for a refcounted object.  The
problem is that the marking phase stops everything else.  Since I'm
proposing concurrent marking, this is not a problem.

The issue I have with ref-counting is this: how do we know which things
are free of circular references? Especially if the choice is made on a
per-class basis, the chance of a class ever being involved in a cycle is
not easily calculable. Getting the decision right for now doesn't mean
it will be right for the future.
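
To make the cycle problem concrete, here is a minimal sketch using
std::shared_ptr as a stand-in for a ref-counted pointer. The Node type and
the leakedAfterScope() helper are hypothetical and are not taken from the
Gnash sources; the point is only that a class which looks cycle-free today
leaks as soon as someone links two instances back to each other.

```cpp
#include <cassert>
#include <memory>

// Hypothetical node type, used only to illustrate the cycle problem.
struct Node {
    static int alive;              // number of Node objects currently alive
    std::shared_ptr<Node> other;   // strong (counted) reference
    Node() { ++alive; }
    ~Node() { --alive; }
};
int Node::alive = 0;

// Builds two nodes, optionally links them into a cycle, drops every outside
// reference, and reports how many nodes are still alive afterwards.
int leakedAfterScope(bool makeCycle) {
    Node::alive = 0;
    {
        std::shared_ptr<Node> a = std::make_shared<Node>();
        std::shared_ptr<Node> b = std::make_shared<Node>();
        a->other = b;                  // a -> b
        if (makeCycle) b->other = a;   // b -> a closes the cycle
    }   // a and b go out of scope here
    return Node::alive;
}
```

Without the back link both nodes are destroyed at the end of the scope; with
it, each node keeps the other's count above zero and neither destructor ever
runs.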

> > Allocating new GcResources
> > ==========================
> > A GcInvoke object should be on the stack when new GcResource objects are
> > heap allocated.  The GcInvoke object is either 'on' or 'off'. ('on' is the
> > default). If it is on, the newly allocated memory will be managed by the
> > Gc. If it is off, the newly allocated memory must be managed by hand.
> 
> So, to recap, a developer can choose when to leave management of a GcResource
> to the garbage collector and when to manage outside of it ?

Exactly. And the choice to leave something out of it is not permanent --
you can later decide to put it in.

> > When all GcInvoke objects have gone out of scope (they may be nested),
> > any GcResource objects which were heap allocated in the scope of an 'on'
> > GcInvoke are added to the list of managed objects.
> > 
> > By constructing in this way, the newly allocated resources don't face
> > the possibility of collection until they have had a chance to be placed
> > into a reachable state.
> 
> What if they get registered when assigned to a GC-aware smart pointer ?
> Doing so any dumb-pointer will never be registered with the GC and we'd use
> gc_ptr as class members.

In a GC which marks on a stop-the-world basis, this works well. However,
the problem that the GcInvoke is designed to solve is not 'how do we
ever get the object into the GC', it is 'when?'. This is simply not an
issue if you're never marking and doing something else at the same time.
If you are marking concurrently (or incrementally, even), manually
deciding is not a viable solution.
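
As a rough illustration of the 'when?' part, the following is a minimal
single-threaded sketch of how a GcInvoke-style scope guard could defer
registration until the outermost scope closes. Everything here is an
assumption made for the example: the Gc::managed() list, the
GcInvoke::allocate<T>() helper, and the Thing type are hypothetical names,
and the proposal itself hooks allocation of GcResource objects rather than
using an explicit allocate call.

```cpp
#include <cassert>
#include <vector>

// Hypothetical base class for anything the collector may own.
struct GcResource {
    virtual ~GcResource() {}
};

struct Thing : GcResource {};   // example resource type

// Hypothetical collector front end: only the list of managed objects.
struct Gc {
    static std::vector<GcResource*>& managed() {
        static std::vector<GcResource*> list;
        return list;
    }
};

// Scope guard: allocations made while an 'on' guard is innermost are held
// in a per-thread pending list and handed to the Gc only when the
// outermost guard on this thread is destroyed.
class GcInvoke {
public:
    explicit GcInvoke(bool on = true) : previousOn_(on_) {
        on_ = on;
        ++depth_;
    }
    ~GcInvoke() {
        on_ = previousOn_;                 // restore the enclosing guard's mode
        if (--depth_ == 0) {               // outermost guard: publish now
            std::vector<GcResource*>& m = Gc::managed();
            m.insert(m.end(), pending_.begin(), pending_.end());
            pending_.clear();
        }
    }
    // Hypothetical allocation helper standing in for an operator new hook.
    template <class T>
    static T* allocate() {
        T* p = new T;
        if (depth_ > 0 && on_) pending_.push_back(p);
        return p;                          // 'off' or no guard: caller owns p
    }
private:
    GcInvoke(const GcInvoke&);             // non-copyable
    GcInvoke& operator=(const GcInvoke&);
    bool previousOn_;
    static thread_local bool on_;
    static thread_local int depth_;
    static thread_local std::vector<GcResource*> pending_;
};

thread_local bool GcInvoke::on_ = true;
thread_local int GcInvoke::depth_ = 0;
thread_local std::vector<GcResource*> GcInvoke::pending_;
```

The pending list is the important part: nothing allocated inside the guard
is visible to the collector until the code that allocated it has had a
chance to store it somewhere reachable.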

I'll explain why in the suggested interface:
> In the suggested interface above the code would be something like:
> 
> struct MyMarkableResourceContainer {
> 
>       GcManagedPtr<SomeGcClass> markable_member;
> 
>       void doSomething1()
>       {
>               SomeGcClass* p = new SomeGcClass; // manually managed
>               p->do_something();
>               markable_member = p; // safe from now on ?
This can fail in two ways, and these failures are common to _any_ Gc
which does not stop every thread which can allocate before starting a
mark cycle:
1) The object which contains doSomething1() is not yet markable itself,
so even if it will appropriately mark 'markable_member' when called upon
to do so, it will not be called upon to do so.  doSomething1() has no
way to know whether or not *this is reachable yet, and so no method of
making the decision can escape using some context information. _Any_ Gc
which does not stop all threads which can allocate before it marks will
face this issue.
2) doSomething1() is inside of a reachable object, but it has already
been marked for this cycle, and so markable_member is not marked and
gets deleted at the collection phase of the Gc.
doSomething2() has the same issues.
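
Failure 2) can be reproduced deterministically with a toy mark-and-sweep
collector in which the "concurrent" mutator is simply run between two
marking steps. This is a sketch under stated assumptions, not the proposed
Gc: ToyGc, Obj, and freedWhileReachable() are hypothetical names, and the
interleaving is simulated in a single thread.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Obj {
    bool marked;
    Obj* child;
    Obj() : marked(false), child(nullptr) {}
};

// Toy collector: marks along the child chain, never revisits a marked object.
struct ToyGc {
    std::vector<Obj*> managed;

    void mark(Obj* o) {
        while (o && !o->marked) {
            o->marked = true;
            o = o->child;
        }
    }

    // Deletes every unmarked managed object and returns how many were freed.
    std::size_t sweep() {
        std::size_t freed = 0;
        std::vector<Obj*> survivors;
        for (Obj* o : managed) {
            if (o->marked) { o->marked = false; survivors.push_back(o); }
            else           { delete o; ++freed; }
        }
        managed.swap(survivors);
        return freed;
    }
};

// Returns how many reachable objects were freed by one simulated cycle.
std::size_t freedWhileReachable(bool registerImmediately) {
    ToyGc gc;
    Obj* root = new Obj;
    gc.managed.push_back(root);

    gc.mark(root);                  // marker visits root; child is still null

    Obj* p = new Obj;               // mutator runs in the middle of the cycle
    root->child = p;                // p is now reachable from root
    if (registerImmediately) gc.managed.push_back(p);

    gc.mark(root);                  // marker resumes: root is already marked,
                                    // so p is never visited
    std::size_t freed = gc.sweep();

    if (freed == 0) delete p;       // p survived; clean up by hand
    delete root;
    return freed;
}
```

When the new object is handed to the collector immediately, it is swept even
though root points at it. When registration is deferred past the cycle, it
survives.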

>       void doSomething3()
>       {
>               std::auto_ptr<SomeGcClass> p ( new SomeGcClass ); // manually managed
>               if ( p->do_something() )
>               {
>                       markable_member = p.release();  // or = p, implementing assignment op from auto_ptr..
>               }
>               // else, SomeGcClass released by auto_ptr destructor
>       }

This is an interesting case.  The assignment to markable_member has
exactly the same issues as doSomething1() and doSomething2().
Why the assignment to auto_ptr? What do you think of this:
void doSomething3()
{
        GcInvoke myInvoke; // GC managed
        SomeGcClass* p = new SomeGcClass;
        if (p->do_something())
                member = p;
        // Let the Gc worry about p
}
Or this, which uses the auto_ptr:
void doSomething3()
{
        GcInvoke myInvoke(false); // manually managed
        std::auto_ptr<SomeGcClass> p(new SomeGcClass);
        if (p->do_something())
        {
                SomeGcClass* raw = p.release(); // p is null after release()
                member = raw;
                GC::manage(raw); // Now Gc managed.
        }
}
(This example made me change the 'manage' function in the GC to use
GcInvoke, since I realized that the problems of doSomething1() and
doSomething2() make it unsafe for 'manage' to immediately cause
management -- I was properly handling 2), but not 1).)

> > A Caveat: It is NOT safe to spawn a new thread inside the scope of a
> > GcInvoke. (Strictly speaking, any dynamic resources allocated in a
> > spawned thread must provide for their own safety, and not rely on
> > 'this'.)
> > 
> > Marking GcResources
> > ===================
> > Objects do not need to be immutable while they are being marked, except
> > that the _structure_ of any containers which hold GcResource objects
> > must not be changed while the object is being marked.
> 
> I don't understand this, can you provide an example, and explain why ?

I'll explain both -- they're different issues, and I'm not sure which
you mean.

The Caveat --
GcInvoke uses thread local storage, and so the 'manage' state is
per-thread.  Spawning a new thread and allocating from there isn't safe,
because the new thread does not see the spawning thread's GcInvoke.
Example:
class func_obj { public: void operator()() { member = new GcObject; } .... };
Assuming that func_obj properly marks all of its members and such, this
is still not safe, because the allocation of member is not done within a
GcInvoke scope.  Doing this:
void someFunction1() { GcInvoke myInvoke; boost::thread T((func_obj())); }
does not make it safe -- the func_obj cannot benefit from the GcInvoke.
This is bad practice anyway.
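
The thread-local behaviour behind this caveat can be seen in a few lines of
standard C++. This is only a sketch: gcInvokeActive and ScopedInvoke are
hypothetical stand-ins for whatever per-thread state GcInvoke keeps, and
std::thread is used in place of boost::thread.

```cpp
#include <cassert>
#include <thread>

// Hypothetical per-thread flag standing in for GcInvoke's state.
thread_local bool gcInvokeActive = false;

// Sets the flag for the current thread only, for the lifetime of the guard.
struct ScopedInvoke {
    ScopedInvoke()  { gcInvokeActive = true; }
    ~ScopedInvoke() { gcInvokeActive = false; }
};

// The thread that created the guard sees it.
bool sameThreadSeesInvoke() {
    ScopedInvoke invoke;
    return gcInvokeActive;
}

// A thread spawned inside the guard's scope gets its own copy of the flag,
// initialised to false, so it does not see the guard.
bool spawnedThreadSeesInvoke() {
    ScopedInvoke invoke;
    bool seen = true;
    std::thread worker([&seen] { seen = gcInvokeActive; });
    worker.join();
    return seen;
}
```

An allocation made in the worker therefore behaves as if no GcInvoke existed
at all, which is why the spawned code has to provide its own.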

Not changing structure --

This is okay, without a need to lock:
std::list<GcObject*> myList;
// 'something' is a GcObject* assumed to be present in the list
std::list<GcObject*>::iterator i = std::find(myList.begin(), myList.end(), something);
*i = new GcObject;
// Because this does not modify the structure -- i.e. does not invalidate
// any iterators of the list -- this is okay.

This is not okay, unless you lock:
std::list<GcObject*> myList;
// 'something' is a GcObject* assumed to be present in the list
std::list<GcObject*>::iterator i = std::find(myList.begin(), myList.end(), something);
myList.erase(i);
// This invalidates i, so the list must be locked to do this.

The reason is that the GC uses the iterators to mark containers and
invalidating the iterators may make the Gc de-reference or increment an
invalid iterator.
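
A single-threaded sketch of the difference, assuming a marker that keeps an
iterator into the list between marking steps. ListMarker and its erase()
method are hypothetical; routing the erase through the marker so it can step
its cursor off the doomed node is a simple stand-in here for "lock the list
before changing its structure", not the locking scheme itself.

```cpp
#include <cassert>
#include <list>

struct GcObject {
    bool marked;
    GcObject() : marked(false) {}
};

// Hypothetical incremental marker that saves its position between steps.
class ListMarker {
public:
    explicit ListMarker(std::list<GcObject*>& l)
        : list_(l), cursor_(l.begin()) {}

    // Marks one element and advances; returns false when the pass is done.
    bool step() {
        if (cursor_ == list_.end()) return false;
        (*cursor_)->marked = true;
        ++cursor_;
        return true;
    }

    // Structural change coordinated with the marker: the saved cursor is
    // moved off the victim before the node is erased.
    void erase(std::list<GcObject*>::iterator victim) {
        if (victim == cursor_) ++cursor_;
        list_.erase(victim);
    }

private:
    std::list<GcObject*>& list_;
    std::list<GcObject*>::iterator cursor_;
};

bool inPlaceReplaceAndGuardedErase() {
    GcObject a, b, c, replacement;
    std::list<GcObject*> objs;
    objs.push_back(&a);
    objs.push_back(&b);
    objs.push_back(&c);

    ListMarker marker(objs);

    *objs.begin() = &replacement;   // in-place write: no iterator invalidated
    marker.step();                  // marks 'replacement', cursor now at b

    std::list<GcObject*>::iterator second = objs.begin();
    ++second;
    marker.erase(second);           // cursor stepped to c, then b's node erased

    while (marker.step()) {}        // finish the pass
    return replacement.marked && c.marked && !a.marked && !b.marked;
}
```

Calling objs.erase(second) directly at the same point would leave the
marker's saved iterator referring to an erased node, and the next step()
would have undefined behaviour.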

I hope that helps,
Chad






