Possible bug -- how to trace the dead-lock?
From: Maciej Pilichowski
Subject: Possible bug -- how to trace the dead-lock?
Date: Mon, 13 Dec 2010 09:01:16 +0100
User-agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.12) Gecko/20101027 Thunderbird/3.1.6
Hello,
My setup: one local machine (2 cores) and one remote machine (4 cores).
By "job" I mean launching parallel to process N files. Processing a
single file looks like this:
echo -n "${HOST} Processing file ${filename} "
# here the actual processing happens, one file at a time
echo done
Statistically, about every fifth job ends in a dead-lock. I see the echo
for the last file with the "done" message, but the job as a whole never
finishes, because the bash prompt does not come back. When I check what
is going on, I only find that some file from the middle of the list has
not been processed.
So how can I trace the problem? To me it looks like a parallel issue,
because the processing of each file is independent of the others, and I
ran these jobs sequentially (without parallel) for several years and
never _once_ had such a problem.
Also note that I never _once_ had this problem when running parallel
only locally. It shows up only when the jobs run in a distributed manner.
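One way to narrow this down might be to log a START and a DONE marker for each file and then list the files that started but never finished. The sketch below is only an illustration under assumptions: the names process_one, unfinished, TRACE_LOG and trace.log are hypothetical, the log is written on whichever machine runs the function, and file names are assumed to contain no whitespace.

```bash
#!/usr/bin/env bash
# Hypothetical tracing wrapper; none of these names come from parallel itself.
LOG="${TRACE_LOG:-trace.log}"   # assumed log location, override via TRACE_LOG

# Process one file, writing a marker before and after the real work.
process_one() {
    local filename="$1"
    echo "START ${HOSTNAME:-unknown} ${filename}" >> "$LOG"
    # here the actual processing of one file would run
    echo "DONE ${HOSTNAME:-unknown} ${filename}" >> "$LOG"
}

# Print every file that has a START line but no matching DONE line.
# Assumes file names without whitespace (the name is awk field 3).
unfinished() {
    awk '$1 == "START" { started[$3] = 1 }
         $1 == "DONE"  { delete started[$3] }
         END { for (f in started) print f }' "$LOG"
}
```

After a hung job, calling unfinished on each machine's log would show which file was started there without completing, which at least tells on which host the stall happened.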
Kind regards,