certi-devel

Re: [certi-dev] Cleanup in RTIA when federate crashes


From: Timi Tuohenmaa
Subject: Re: [certi-dev] Cleanup in RTIA when federate crashes
Date: Wed, 13 Aug 2014 15:30:49 +0300

2014-08-12 16:56 GMT+03:00 Eric Noulard <address@hidden>:
>
>
>
> 2014-08-05 15:38 GMT+02:00 Timi Tuohenmaa <address@hidden>:
>>
>> Hi,
>
>
> Hi Timi,
>
> Just coming back from holiday leave.

I assumed you would be on holiday :)

>>
>>
>> I sent you a patch (https://savannah.nongnu.org/patch/?8494) to make
>> RTIA.exe exit when federate crashes. Now I noticed that while it does
>> make RTIA act similarly in Linux and Windows it does not actually
>> solve the problem.
>>
>> Reason is that since RTIA does quit gracefully it also makes RTIG
>> think that federate exited normally. And therefore it does not do
>> killFederate stuff (like unsubscribe, unpublish and object attribute
>> releases).
>
>
> OK ... I see.
>
>>
>>
>> To be honest I think killFederate is something that should be done
>> even in normal federate exit, but I am not sure if HLA specs say
>> anything about automatic unpublish and unsubscribe etc. At least in
>> Portico unsubscribe etc is done automatically every time federate
>> quits for what ever reason.
>
>
> Automatic unsubscribe seems reasonable.
> Automatic unpublish wouldn't seem wise though since
> a failover federate may want to acquire ownership of such object instance
> through the "Attribute Ownership Acquisition service".

I don't think I understand.
If a federate resigns, it should not exist in the RTI in any way. In CERTI
it leaves some "trash" in RTIG (unless I manually unpublish), and that causes
many kinds of problems (I can't subscribe or publish anything if I join a
federation that an exiting federate left without unpublishing everything).
I fail to see how automatic unpublish changes ownership acquisitions later on.

Attributes that a federate owns when it unpublishes become unowned, so
any federate can acquire ownership of them. I couldn't immediately find what
the standard says about implicitly unpublishing privilegeToDelete, i.e.
whether the object would be deleted or released.

Simply put, the federation breaks unless the federate is cleaned up correctly,
either by manually calling unpublish etc. or by doing something that triggers
killFederate. All publishes and subscribes will fail after one invalid exit.
If I remember correctly, unsubscribe was not needed in practice, but unpublish
was critical for keeping RTIG working correctly.

Or am I the only one who has this problem?

>>
>>
>> I'm not able to test in Linux just now, but I think simple program
>> that subscribes some attributes and then crashes (and then RTIA
>> exiting automatically) is not working well either.
>
>
> In which sense you think it's not working?

In the sense that RTIA will send a clean exit to RTIG, so RTIG does not execute
killFederate, and the subscriptions and publications are not cleaned up.

And if I am right, the federate can't join again AND subscribe again.
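
To make the failure mode concrete, here is a minimal sketch of the RTIG-side
behaviour as I describe it above. It is a simplified model, not CERTI code:
FederateState, Disconnect and onDisconnect are hypothetical names, and clearing
the two sets only stands in for what killFederate does.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical, simplified model of what RTIG remembers about one federate.
struct FederateState {
    std::set<std::string> publications;
    std::set<std::string> subscriptions;
};

// How the RTIA-to-RTIG connection ended, as seen by RTIG (assumed split).
enum class Disconnect { CleanClose, ConnectionLost };

// Behaviour as described in this mail: cleanup runs only on a lost connection.
void onDisconnect(FederateState& f, Disconnect how) {
    if (how == Disconnect::ConnectionLost) {
        // Stand-in for killFederate: drop everything the federate registered.
        f.publications.clear();
        f.subscriptions.clear();
    }
    // CleanClose: nothing is removed, so stale entries stay behind.
}

// Small demonstration: RTIA closes cleanly on behalf of a crashed federate.
inline void demo() {
    FederateState f;
    f.publications.insert("SomeClass.someAttribute");
    onDisconnect(f, Disconnect::CleanClose);
    assert(!f.publications.empty());  // the "trash" is still there
}
```

With the patch from https://savannah.nongnu.org/patch/?8494 the crashed
federate ends up in the CleanClose branch, which is why nothing gets cleaned.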

>>
>>
>> There would be two alternative solutions for this problem.
>>
>> One would be to add flag to Communications class that makes sure that
>> NM_Close_Connexion is not sent in destructor in case federate crash
>> has occurred.
>>
>> Other would be to change RTIG to do full cleanup in every case when
>> federate quits (normally or by crash). I don't see reason why it is
>> not done in all cases. The federation becomes broken if federate exits
>> without unsubscribing and unpublishing correctly anyways.
>
>
> I don't think ownership may be [theoretically] requested by another
> federate.
> I'll think more about that but I shall confess it's not too high on my
> priority list
> for now.

In my understanding it is fine that no federate owns privilegeToDelete (at
first it is owned by the federate that created the object, but if that federate
unconditionally releases it, ownership of that attribute can be requested from
the RTI). Also, any federate can request ownership of that attribute and
therefore "ownership of the object".

Anyway, killFederate does remove owned objects, and that is OK too.

The basic point is that unless a federate exits exactly as expected, RTIG
ends up somewhat broken.

I can see whether I can come up with some patches for this, but there is no
use unless we are on the same page that there is a problem. At least the first
solution is really needed to make CERTI handle crashes correctly. The second
way to patch would solve other issues too.
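
To show what I mean by the first solution, here is a rough sketch. It is not
the real Communications class: CommunicationsSketch, FakeLink and
markFederateCrashed are hypothetical names, and the message is recorded as a
string instead of being sent on a socket. Only the name NM_Close_Connexion is
taken from CERTI.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the RTIA-to-RTIG link; it only records messages.
struct FakeLink {
    std::vector<std::string> sent;
};

// Sketch only: illustrates the proposed flag, not the real CERTI class.
class CommunicationsSketch {
public:
    explicit CommunicationsSketch(FakeLink& link) : link_(link) {}

    // Assumed hook: RTIA calls this when it detects that the federate died.
    void markFederateCrashed() { federateCrashed_ = true; }

    ~CommunicationsSketch() {
        if (!federateCrashed_) {
            // Normal shutdown: tell RTIG the connection closes cleanly.
            link_.sent.push_back("NM_Close_Connexion");
        }
        // After a crash nothing is sent, so RTIG would only see the
        // connection drop and could treat it as an abnormal exit.
    }

private:
    FakeLink& link_;
    bool federateCrashed_ = false;
};

// Small demonstration of the crash path.
inline void demoCrash() {
    FakeLink link;
    {
        CommunicationsSketch comm(link);
        comm.markFederateCrashed();
    }
    assert(link.sent.empty());  // no clean-close message after a crash
}
```

The idea is just that RTIG should see a dropped connection instead of a clean
close, so that its existing killFederate path would run.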

>
>
> --
> Erk
> L'élection n'est pas la démocratie -- http://www.le-message.org
>
> --
> CERTI-Devel mailing list
> address@hidden
> https://lists.nongnu.org/mailman/listinfo/certi-devel
>


-Timi


