
Re: Problem with thousands of small jobs.


From: Ole Tange
Subject: Re: Problem with thousands of small jobs.
Date: Fri, 6 Feb 2015 21:42:38 +0100

>> On Fri, 6 Feb 2015 23:20 xmoon 2000 <xmoon2000@googlemail.com> wrote:
>>>
>>> I need to run about 4,000 jobs that each take around 20 seconds to
>>> complete
:
>>> On my 32-core machine this works OK, BUT there is a "lull" in
>>> processing every few seconds as new jobs are started, once the
>>> current crop have completed. I assume this is due to an overhead in
>>> starting jobs that is only noticeable because my jobs are so short.
>>>
>>> Is there any way I could make this more efficient, so my cores are
>>> fully utilised and getting through the whole process is faster?

On Fri, Feb 6, 2015 at 3:56 PM, xmoon 2000 <xmoon2000@googlemail.com> wrote:
> I am running this in cygwin. I call parallel from the standard
> window/shell supplied by cygwin.

My experience with Cygwin is limited, but I seem to remember that
spawning is expensive (on the order of 0.1 seconds), which is one of
the reasons for recommending MSYS. So if your jobs all finish at the
same time, GNU Parallel will need 3.2 seconds to start 32 new jobs.
Maybe you will have a better experience if you delay starting jobs by
0.1 seconds:

  cat /tmp/parList | parallel -j 28 --delay 0.1 --eta;

What is your reason for not using all 32 cores?

If it is not due to the slow spawning, it could simply be due to disk
IO when printing. The standard reply for that is to increase the
number of parallel jobs until all CPUs are 100% loaded (or until disk
IO is killing your performance):

  cat /tmp/parList | parallel -j 40 --delay 0.1 --eta;
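If you want to check whether spawn overhead really is the bottleneck, a rough (hypothetical) benchmark is to run a batch of no-op jobs; since `true` does no work, the elapsed time is almost entirely start-up cost:

```shell
# Rough spawn-overhead check (assumes GNU Parallel is on PATH).
# 'true' does nothing, so the wall-clock time for these 64 jobs is
# essentially the cost of starting 64 jobs.
time seq 64 | parallel -j 32 true
```

If this takes several seconds under Cygwin but well under a second on a native GNU/Linux box, the slow-spawning theory holds and --delay (or switching to MSYS) is the way to go.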


/Ole


