From: | Ido Tal |
Subject: | Re: GNU Parallel Bug Reports [parallel] parallel-20110205 hangs in "make check" step |
Date: | Sat, 12 Feb 2011 23:18:26 -0800 |
On Fri, Feb 11, 2011 at 1:19 AM, Ido Tal <address@hidden> wrote:
Yes. In this situation we may see the bug.
> * A user submits many jobs to a queue.
> * The queuing system distributes the jobs between the different computer
> nodes that make up the supercomputer.
> * Each node has its own CPU, but the file system is shared between the
> nodes.
>
> Now, if the jobs sent were making use of parallel with the sem option, then
> it seems the bug would occur. In this case, I think it would make sense for
> each host to have its own file.
First of all, I would prefer to have this bug fixed rather than add
a workaround. For that I need help building a lock that works
across multiple machines over NFS.
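For illustration only, here is a minimal sketch of one well-known way to build such a lock: the hard-link technique described in the Linux open(2) man page, which relies on link() being atomic over NFS. It is written in Python rather than Perl, it is not how sem currently works, and the names try_acquire, acquire and release are made up for this example. It only covers the -j1 (mutex) case and does nothing about stale locks left behind by a crashed holder.

```python
import os
import socket
import time


def try_acquire(lock_path):
    """Try once to take the lock at lock_path; return True on success."""
    # Hypothetical naming scheme: a file unique to this host and process,
    # created on the same filesystem as the lock file itself.
    unique = "%s.%s.%d" % (lock_path, socket.gethostname(), os.getpid())
    fd = os.open(unique, os.O_CREAT | os.O_WRONLY, 0o644)
    os.close(fd)
    try:
        try:
            # link() either creates lock_path or fails if it already exists.
            os.link(unique, lock_path)
        except OSError:
            # Over NFS the reply can be lost, so do not trust the error;
            # the link count checked below is the authoritative answer.
            pass
        # Two links (unique + lock_path) means our link() went through.
        return os.stat(unique).st_nlink == 2
    finally:
        os.unlink(unique)


def acquire(lock_path, poll=0.5, timeout=None):
    """Poll until the lock is taken; return False if timeout expires."""
    deadline = None if timeout is None else time.monotonic() + timeout
    while not try_acquire(lock_path):
        if deadline is not None and time.monotonic() >= deadline:
            return False
        time.sleep(poll)
    return True


def release(lock_path):
    """Give the lock back by removing the lock file."""
    os.unlink(lock_path)
```

A caller on any node would wrap the critical section in acquire(path) and release(path), with path pointing at a file on the shared filesystem. Whether this behaves well on a particular NFS setup (attribute caching, server version) is something that would have to be tested there.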
In my opinion, 'sem -j1 --id myname myprg' should guarantee that only
one myprg is running at a time among all processes using the
filesystem of $HOME. The point is to let myprg do something that would
f*ck up if multiple myprgs were running at the same time. If we change
the behaviour so that we only guarantee a single myprg PER MACHINE, then
sem would no longer do what I would expect as a user.
That is why I am reluctant to just implement a PER MACHINE lock.
It would be helpful to me if you could provide a setup and an example
that fails every time (or at least most of the time).
/Ole