bug-gnustep
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Another multithreading bug (I think)


From: Richard Frith-Macdonald
Subject: Re: Another multithreading bug (I think)
Date: Thu, 7 Sep 2006 10:34:42 +0100


On 6 Sep 2006, at 18:40, Wim Oudshoorn wrote:

I suggested usiing the refGate, not the connection table lock,
bug using the refGate is horribly wrong and I was quite stupid to suggest that.
Glad you didn't follow my suggestion.

But using the connnection table lock, might not be wise, because:

1 - in order to make the access to the retaincount safe, both
    retain and release need to be guarded by the same lock.
2 - the connection_table_gate is a non-recursive lock.

Changed to be recursive.

3 - the connection_table_gate should NOT change to a recursive lock
    because it guards a hash table.

That's not a problem because the table does not retain its contents, so removing/adding an object has no side effects.

4 - if in -release the final call holds
    the connection_table_gate, it most likely will block on:
    either a recursive retain/release or on the invalidate method
    who will also try to obtain the lock.

Doesn't happen with a recursive lock.

5 - release should hold the lock during the whole release
    method because otherwise there is still a chance of
    failing to finalize the nsconnection.

Yep.

I actually missed point 1 ... so my fix would not work until/unless I added a lock on the table in the retain method.

I'm fairly sure that, with locking in retain, the code would be OK ... but it's very inelegant to add extra locking in every retain and release, so i thought about it some more.

I believe the current mechanism dates from ancient times when NSDecrtementExtraRefCountWasZero() allowed the reference count to go negative ... which made it hard to recover from the point when - dealloc is called. Now it is safe for a -dealloc implementation to decide that the actual deallocation should be cancelled. So, I think a more elegant solution here would be to scrap the special processing in -release and have -dealloc do the work:

1. lock the connection table lock
2. check the current retain count
3a. ... if it is greater than 1, then something else has retained the object ... so we cancel the dealloc and unlock 3b. if it is 1, we remove the connection from the table, unlock, and proceed to finalize/deallocate

But ... I think this reveals a bug in NSObject ... the -retain and - release implementations are not currently entirely thread-safe. They handle the testing and changing of the reference counts in a thread- safe manner, but during release there is a gap between testing/ decrementing the reference count and calling the -dealloc method. Normally this is not an issue since, when the reference count goes to zero, there is only one thread owning the object. However, in the case where we are storing pointers to non-retained objects, we can get a thread retaining the object via such a pointer in the gap before -dealloc is called.

I'm going to look into reorganising the code a little to close that loophole.

Having said this,  I am not able to check your fix right now because
I don't have svn on my computer.  (Need to install that.)

Also, I tried adding a new lock just for retain and release, but
this seems to make the application hang with a garbled stack trace :-(

Friday I think I will be able to see your diff and test it in our application.
And thank you for looking into this.

Wim Oudshoorn.

P.S.:  Did you look at my remark about the win32 message port locking?

I saw the email, but haven't had a chance to reboot in windows and look at it properly yet.
I hope to be able to do that this weekend.






reply via email to

[Prev in Thread] Current Thread [Next in Thread]