parallel

Re: Uses of GNU Parallel


From: Prince Sibanda
Subject: Re: Uses of GNU Parallel
Date: Tue, 14 Feb 2017 18:02:23 +0200

I think tools like Slurm put more emphasis on system resource management: you manage system resources using job queues, time limits, and so on. My use case is related, which is probably why somebody thought of Slurm and the like. However, the management I need is data-oriented: I want the data to determine the jobs, rather than job queues, quotas, and other system resources. That is, I want jobs determined by the kind of data I have, not jobs determined or directly limited by the system resources (CPUs, GPUs, time or disk quota, etc.). The system resources are used only indirectly.

What I need is a tool that helps me synthesize jobs from rules I describe to it. Once the jobs are described, the system resources begin to matter. The jobs are dynamic: they can change depending on what data comes into the tool. As data comes in, jobs are created according to the rules provided to the tool, but some incoming data can cause existing jobs to be cancelled, or to be processed differently than originally planned.

It sounds like something you could probably do with a combination of Bash and Parallel, but expressing this in Bash can be very hard to get right, in the same sense that writing your code in 0s and 1s is hard, though not impossible.
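
To make the idea of data-determined jobs concrete, here is a minimal Python sketch. Everything in it is hypothetical: the rule functions, the file suffixes (".log", ".done", ".big"), and the commands are made up for illustration and are not features of GNU Parallel or Slurm. Each rule looks at one incoming data item and may add, cancel, or change a pending job.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical rule type: a rule sees the table of pending jobs
# (data item -> command line) and one incoming data item.
Rule = Callable[[Dict[str, str], str], None]


def compress_logs(pending: Dict[str, str], item: str) -> None:
    # A new "*.log" item creates a job for it.
    if item.endswith(".log"):
        pending[item] = f"gzip {item}"


def cancel_when_done(pending: Dict[str, str], item: str) -> None:
    # An "x.done" marker cancels the pending job for "x", if any.
    if item.endswith(".done"):
        pending.pop(item[: -len(".done")], None)


def use_xz_when_big(pending: Dict[str, str], item: str) -> None:
    # An "x.big" marker changes how a still-pending "x" is processed.
    if item.endswith(".big"):
        target = item[: -len(".big")]
        if target in pending:
            pending[target] = f"xz {target}"


def plan(items: Iterable[str], rules: List[Rule]) -> List[str]:
    """Apply every rule to every incoming item; return the surviving jobs."""
    pending: Dict[str, str] = {}
    for item in items:
        for rule in rules:
            rule(pending, item)
    return list(pending.values())


if __name__ == "__main__":
    incoming = ["a.log", "b.log", "a.log.done", "b.log.big"]
    rules = [compress_logs, cancel_when_done, use_xz_when_big]
    # One command per line; only the job for b.log survives, now using xz.
    for command in plan(incoming, rules):
        print(command)
```

The printed lines are plain command lines, so one option would be to pipe them to GNU Parallel, which runs each input line as a command when no command is given on its own command line. Note that this sketch only handles jobs that have not started yet.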

Regards,
Prince

On Mon, Feb 13, 2017 at 6:38 PM, Rob Sargent <robjsargent@gmail.com> wrote:


On 02/13/2017 05:45 AM, Ole Tange wrote:
On Mon, Feb 13, 2017 at 11:11 AM, Prince Sibanda
<1princesibanda@gmail.com> wrote:

However, once one of these two cases starts running, I want to be able to
interactively issue a command to stop feeding certain types of files from
the joblist. I also want to be able to prioritise jobs in the joblist so
that those are run first. I would also like to be able to insert new jobs
into the joblist with a certain priority level, so that if the inserted job
has high priority, for example, it is run next, as soon as any of the
currently running jobs has finished. I would like to be able to skip a
certain job, repeat a certain job, take a certain job out of the joblist,
etc. All this I want to be able to do when one of those two cases has
already started running.
You are describing a job queue system.

GNU Parallel was not built as a job queue system, but can be used as a
very minimal queue.

GNU Parallel is not designed for interactivity - it has very few
interactive features. It is not designed for removing jobs from the
queue, and it has no concept of a priority level.
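
For illustration only, the operations requested above (priority levels, inserting jobs late, removing jobs) could be sketched outside GNU Parallel roughly as follows. This is a minimal Python sketch with a hypothetical `JobQueue` class built on the standard `heapq` module; it is not part of GNU Parallel or Slurm, and it only orders job names, it does not run them.

```python
import heapq
import itertools
from typing import Dict, List, Optional


class JobQueue:
    """Hypothetical in-memory priority queue of job names."""

    def __init__(self) -> None:
        self._heap: List[list] = []
        self._live: Dict[str, list] = {}
        # Tie-breaker so equal priorities run in insertion order.
        self._counter = itertools.count()

    def add(self, name: str, priority: int = 0) -> None:
        """Queue a job; a lower priority number runs sooner."""
        self.remove(name)  # re-adding a job replaces its old entry
        entry = [priority, next(self._counter), name, True]
        self._live[name] = entry
        heapq.heappush(self._heap, entry)

    def remove(self, name: str) -> bool:
        """Take a job out of the queue; return False if it was not queued."""
        entry = self._live.pop(name, None)
        if entry is None:
            return False
        entry[3] = False  # mark dead; it is dropped lazily in next_job()
        return True

    def next_job(self) -> Optional[str]:
        """Return the next job to run, or None when the queue is empty."""
        while self._heap:
            _priority, _order, name, alive = heapq.heappop(self._heap)
            if alive:
                del self._live[name]
                return name
        return None


if __name__ == "__main__":
    queue = JobQueue()
    queue.add("convert a.wav", priority=5)
    queue.add("convert b.wav", priority=5)
    queue.add("convert urgent.wav", priority=1)  # inserted late, runs first
    queue.remove("convert b.wav")                # taken out of the joblist
    while (job := queue.next_job()) is not None:
        print(job)
```

A worker would call `next_job()` each time a running job finishes, which gives the "run next as soon as a slot frees up" behaviour described in the quoted request.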

Extending GNU Parallel to a proper job queue system is outside the
scope of GNU Parallel, and even if someone made a patch for this, I
would probably be reluctant to include it - as it would have to
re-write huge sections of GNU Parallel.

Some GNU Parallel users use Slurm. I would imagine GNU Parallel is
useful for generating and submitting jobs to Slurm, and I would be
open to making a few changes to make GNU Parallel interface better
with Slurm, if there are obvious improvement ideas.

Slurm already has the concept of priority, and it is possible to
remove jobs from the queue, so my guess is that it will be easier for
you to extend Slurm to meet your needs, and I encourage you to see if
Slurm or some of the alternatives meet your needs already.

Other alternatives include Torque and Rocks.


/Ole


We use Slurm and Parallel, but perhaps in the reverse of what Ole has in mind. Here Slurm is the queue and job manager for clusters of anonymous machines. Our Slurm jobs start Parallel jobs on the available hardware, which in turn occupy all available processors.
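
As a rough sketch of that arrangement (not the actual setup described above), a script running inside a Slurm job could build a GNU Parallel command line sized to the allocation. The helper name `parallel_command` is hypothetical. `SLURM_CPUS_ON_NODE` is an environment variable Slurm sets inside a job, and `-j` and `:::` are standard GNU Parallel options. The function only builds the argument list; running it is left to the caller.

```python
import os
from typing import List, Mapping, Optional


def parallel_command(
    command: List[str],
    inputs: List[str],
    env: Optional[Mapping[str, str]] = None,
) -> List[str]:
    """Build an argv for GNU Parallel using the CPU count Slurm reports."""
    env = os.environ if env is None else env
    slots = env.get("SLURM_CPUS_ON_NODE", "")
    argv = ["parallel"]
    if slots.isdigit():
        # Run as many jobs at once as Slurm allocated CPUs on this node.
        argv += ["-j", slots]
    # Without the variable, GNU Parallel falls back to its own default.
    argv += list(command) + [":::"] + list(inputs)
    return argv


if __name__ == "__main__":
    # Simulated Slurm environment, for illustration only.
    fake_env = {"SLURM_CPUS_ON_NODE": "8"}
    print(parallel_command(["gzip"], ["a.txt", "b.txt"], fake_env))
    # The resulting list could be passed to subprocess.run() inside the job.
```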



