Re: Slow start to cope with load

parallel

From:	Ole Tange
Subject:	Re: Slow start to cope with load
Date:	Mon, 19 Mar 2012 11:25:52 +0100

On Mon, Mar 19, 2012 at 10:20 AM, Matt Oates (Home) <mattoates@gmail.com> wrote:
>
> On 16 March 2012 00:32, Ole Tange <tange@gnu.org> wrote:
> > One of the problems with --load is that it only limits how many jobs
> > are started. So you may start way too many. This will give you a load
> > of 100:
> >
> >  seq 100 | nice parallel -j0 --load 2.00 burnP6
> >
> > and that is most likely not what you want.
>
> Am I wrong in thinking you can just do -j 100% so that you never spawn
> more than maxload processes assuming one process load 1.0 on a single
> core? Can you not use -j 100% in conjunction with --load to prevent
> the overload on startup?

For CPU-hungry programs like 'burnP6' that would be true. But if the
program only uses 10% CPU (because it is waiting for network or disk
I/O), then we should be able to spawn more jobs, preferably by
figuring out the "right" number automatically.
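
A sketch of that combination: the command line quoted above with -j0
swapped for -j 100% (one job per CPU core) and --load 2.00 kept. The
wrapper name capped_run is made up for illustration, and burnP6 is only
the example program from this thread:

```shell
# capped_run is a hypothetical wrapper, not something shipped with GNU Parallel.
#   -j 100%     : run at most one job per CPU core
#   --load 2.00 : only start new jobs while the load is below 2.00
capped_run() {
    seq 100 | nice parallel -j 100% --load 2.00 "$@"
}

# Example (needs GNU Parallel and burnP6 installed):
#   capped_run burnP6
```

This only bounds the start-up burst for jobs that each keep a core busy;
it does not help with the I/O-bound case.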

> > While some programs run multiple threads (and thus can give a load > 1
> > each) that is the exception. So in general I think we can assume one
> > job will at most give a load of 1.
>
> It would be nice to explicitly state the likely load per process
> though especially if you are the one setting it. I frequently run hmm
> building with concurrent threading per process and just do the maths
> myself, and am lucky that all the hosts have the same number of CPUs.
> Perhaps a flag like --is-threaded=4  or something to indicate the
> likely load per job?

I am not too happy about that. I would much prefer some automated way
of doing-the-right-thing.

> > Currently load is only computed every 10 seconds. So we could
> > recompute every 10 seconds:
> >
> >    number_of_concurrent_jobs = max_load - current_load + number_of_concurrent_jobs
>
> Looks good, though I have a couple of questions: If this is negative
> are you going to kill processes rather than start them? What if it's
> always 0 even from the start are you just never going to run on this
> host?

As a user I would be very surprised if GNU Parallel started to kill my
jobs, and I try to design GNU Parallel to adhere to POLA:
http://en.wikipedia.org/wiki/Principle_of_least_astonishment

So if the computed number is < 1 it would mean: do not spawn new jobs,
but wait for running jobs to complete.
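
Spelled out, the recomputation plus the no-kill rule could look roughly
like this. jobs_to_start is a made-up helper name, not anything in GNU
Parallel, and rounding the limit down to whole jobs is an assumption.
The load values are passed in as arguments rather than read from the
system:

```shell
# Hypothetical helper, not part of GNU Parallel.
# Usage: jobs_to_start MAX_LOAD CURRENT_LOAD RUNNING_JOBS
# Prints how many new jobs may be started right now.
jobs_to_start() {
    awk -v max="$1" -v cur="$2" -v run="$3" 'BEGIN {
        target = max - cur + run     # proposed new limit on concurrent jobs
        free = int(target) - run     # whole slots left for new jobs (assumed rounding)
        # Never go negative: running jobs are left alone, we just wait.
        if (target < 1 || free < 0) free = 0
        print free
    }'
}

jobs_to_start 8.00 3.50 2   # prints 4
jobs_to_start 2.00 5.00 4   # prints 0
```

In the second call the machine is already over the limit, so the answer
is 0 new jobs and the 4 running jobs are left to finish.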

> > I believe it would be better than the current, but I am very open to
> > even better ideas.
>
> You are starting to get into the realm of needing to understand
> scheduling per host... Load might be reported for something with a
> different nice value than what you want to submit. So 100% load for
> something with <0 nice and you want to put something in for +19. In
> your equation above I would just add in something looking at the
> difference between parallel's jobs that are running and those that are
> ready/waiting. If all our jobs are running even under high load who
> cares, we have priority here so keep up with the max load. If half of
> our jobs are waiting then we might as well reduce spawning by half.

I did not understand this part.

> Best,
> Matt.

/Ole
