
Re: Problem with thousands of small jobs.


From: xmoon 2000
Subject: Re: Problem with thousands of small jobs.
Date: Sat, 7 Feb 2015 09:11:57 +0000

On 6 February 2015 at 20:42, Ole Tange <ole@tange.dk> wrote:
>>> On Fri, 6 Feb 2015 23:20 xmoon 2000 <xmoon2000@googlemail.com> wrote:
>>>>
>>>> I need to run about 4,000 jobs that each take around 20 seconds to
>>>> complete
> :
>>>> On my 32 core machine this works OK, BUT there is a "lull" in processing
>>>> every few seconds as new jobs are started, once the current crop have
>>>> completed. I assume this is due to an overhead in starting jobs that
>>>> is only noticeable because my jobs are so short.
>>>>
>>>> Is there any way I could make this more efficient, so my cores are
>>>> fully utilised and getting through the whole process is faster?
>
> On Fri, Feb 6, 2015 at 3:56 PM, xmoon 2000 <xmoon2000@googlemail.com> wrote:
>> I am running this in cygwin. I call parallel from the standard
>> window/shell supplied by cygwin.
>
> My experience with Cygwin is limited, but I seem to remember that
> spawning is expensive (in the order of 0.1 second), which is one of
> the reasons for recommending MSYS. So if your jobs all finish at the
> same time, GNU Parallel will need 3.2 seconds to start 32 new jobs.
> Maybe you will have a better experience if you delay starting jobs by
> 0.1 seconds:
>
>   cat /tmp/parList | parallel -j 28 --delay 0.1 --eta;
>
> What is your reason for not using all 32 cores?
>
> If it is not due to the slow spawning, it could simply be due to disk
> IO when printing. The standard reply for that is simply increase the
> number of parallel jobs until all CPUs are 100% loaded (or disk IO is
> killing your performance):
>
>   cat /tmp/parList | parallel -j 40 --delay 0.1 --eta;
>
>
> /Ole

Ole,

1. The reason for not using all 32 cores is that I then can't do
anything else. I need to do email, check the web, simple stuff whilst
waiting for the jobs to finish.

2. I will try your delay and see if that helps.

3. When I run my own bash script, which takes a very simple approach to
spawning jobs, I do start up to 50 jobs at a time. However, counting
how many of my processes are active on cygwin/windows is time-expensive
and not very accurate! (Well, for me anyway, because I am starting so
many little jobs.)
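
For illustration, one way a simple bash spawner could avoid counting
processes altogether is to keep its own counter and block on `wait -n`
whenever the limit is reached. This is only a rough sketch, not the script
described above: it assumes bash 4.3 or newer (for `wait -n`), one complete
command per input line, and the names `run_throttled` and `max_jobs` are
made up for the example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read one command per line from stdin and keep at
# most $1 of them running at once. Assumes bash >= 4.3 for `wait -n`.
run_throttled() {
    local max_jobs=$1 running=0 cmd
    while IFS= read -r cmd || [ -n "$cmd" ]; do
        if [ "$running" -ge "$max_jobs" ]; then
            # Block until any one background job exits; no need to
            # list or count processes.
            wait -n
            running=$((running - 1))
        fi
        # Start the job in the background, detached from our stdin so
        # it cannot consume the remaining command lines.
        bash -c "$cmd" </dev/null &
        running=$((running + 1))
    done
    # Wait for the jobs that are still running at the end of the input.
    wait
}
```

It could then be fed the same job list, for example
`run_throttled 28 < /tmp/parList`. Each job still pays the process start-up
cost, so this only removes the counting overhead, not the spawning overhead.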

Moon


