Re: [Fab-user] trivial put test case fails?

From: Christian Vest Hansen
Subject: Re: [Fab-user] trivial put test case fails?
Date: Mon, 23 Feb 2009 00:37:58 +0100

On Sun, Feb 22, 2009 at 11:23 AM, Niklas Lindström <address@hidden> wrote:
> Hi!
> Christian got it! I moved the joins so it ends up like::
>    # Close channel when done
>    status = chan.recv_exit_status()
>    chan.close()
>    # Wait for threads to exit before returning (otherwise we will occasionally
>    # end up returning before the threads have fully wrapped up)
>    out_th.join()
>    err_th.join()
>    return ("".join(capture).strip(), status == 0)
> And it worked. It hasn't stalled during my runs. (And seems like a
> logical order of things.)
> * The fabric process just hangs there without churning the CPU.

As expected. Fabric called Channel.recv (or recv_stderr), which blocks
when no data is available; closing the channel causes these calls to
return empty strings, and an empty string makes an outputter thread
halt. Our trouble was that the code (in sudo) would never close the
channel until the outputter threads had completed. This left a
chan.exit_status_ready() check as the only means by which an outputter
thread could terminate, and this check was placed *after* the recv()
call, thus creating a dead-lock opportunity.
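
To make the ordering issue concrete, here is a minimal sketch. It does not
use Fabric or paramiko at all: FakeChannel, outputter and demo are
hypothetical names invented for illustration, and the only behaviour assumed
from the real channel is the one described above (recv blocks while no data
is available and returns an empty string once the channel is closed).

```python
import queue
import threading


class FakeChannel:
    """Hypothetical stand-in for an SSH channel (not paramiko's Channel)."""

    def __init__(self):
        self._q = queue.Queue()

    def feed(self, data):
        # Simulate output arriving from the remote command.
        self._q.put(data)

    def recv(self, nbytes):
        # Blocks while nothing is queued; nbytes is ignored in this sketch.
        data = self._q.get()
        if data == "":
            # Keep the "closed" marker around for any other reader.
            self._q.put("")
        return data

    def close(self):
        # After close, recv returns an empty string instead of blocking.
        self._q.put("")


def outputter(chan, capture):
    # Reads until recv returns an empty string, i.e. the channel is closed.
    while True:
        data = chan.recv(1024)
        if not data:
            break
        capture.append(data)


def demo(close_first):
    chan = FakeChannel()
    capture = []
    th = threading.Thread(target=outputter, args=(chan, capture), daemon=True)
    th.start()
    chan.feed("hello ")
    chan.feed("world\n")
    if close_first:
        chan.close()
    # A timeout is used only so the stalled case can be observed here.
    th.join(timeout=0.5)
    stalled = th.is_alive()
    chan.close()  # clean up so the reader thread can exit in both cases
    th.join()
    return "".join(capture).strip(), stalled


good = demo(close_first=True)
bad = demo(close_first=False)
print(good, bad)
```

With close_first=False the join times out because the reader is still parked
in recv; a join without a timeout would wait forever, which matches the
"hangs without churning the CPU" symptom. Closing before joining lets the
reader see the empty string and finish.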

I have pushed some changes that should prevent this type of dead-lock
from happening, while making sure that all of the output from the
commands gets processed.

Please confirm that these changes do indeed fix it.

> * Most stuff (mv, rm, untar etc.) worked just fine. So did doing a
> restart (using stop+start) with apache2ctl (and the
> "etc/init.d/apache2" version). Maybe it was about the length of the
> operation (restarting tomcat takes a couple of seconds)?..

I am mildly surprised that it apparently worked most of the time. From
looking at the code, I would expect the probability of dead-locking to
be rather high.

> Anyway, moving the joins after `chan.close()` seems to have done it,
> so I'm happy. :)

This, the `chan.close()` part, was very useful to know when debugging. Thanks.

> Best regards,
> Niklas

Venlig hilsen / Kind regards,
Christian Vest Hansen.
