parallel

Re: Slow start to cope with load


From: Jay Hacker
Subject: Re: Slow start to cope with load
Date: Thu, 22 Mar 2012 11:48:12 -0400

Perhaps this is a bit simplistic, but what if you took your idea and
also kept a running estimate of the amount of load added by each job?
Start out assuming each job adds 1 unit of load, and then measure:
"Okay, I started 4 jobs last time, and the load went up by 8, so I
estimate each job causes 2 units of load."  Then when you sample again
and the current load is, say, 12 with 16 procs (so a target load of
16), you'll only add (16 - 12) / 2 = 2 jobs, and the load doesn't go
over the max.

I'm not sure exactly how to calculate it, but a first stab might be:

    load_per_job = current_load / job_slots
    job_slots += (desired_load - current_load) / load_per_job

Really you probably want a moving average.  But something like that
could let you learn how your jobs affect the system.
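A minimal sketch of that estimator with a moving average folded in
(the function and variable names are hypothetical, not GNU parallel's
actual code; `alpha` controls how fast the estimate adapts):

```python
def update_slots(current_load, desired_load, job_slots,
                 load_per_job, alpha=0.5):
    """One sampling step: re-estimate load per job, then add jobs
    only up to the remaining load headroom."""
    if job_slots > 0:
        # Fresh sample of how much load each running job contributes.
        sample = current_load / job_slots
        # Exponential moving average so one noisy sample
        # doesn't swing the estimate too far.
        load_per_job = alpha * sample + (1 - alpha) * load_per_job
    if load_per_job > 0:
        headroom = desired_load - current_load
        job_slots += int(headroom / load_per_job)
    return job_slots, load_per_job
```

With the numbers from the example above (load 12, target 16, 6 jobs
already estimated at 2 units each), this adds exactly 2 jobs.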

-John


On Thu, Mar 15, 2012 at 8:32 PM, Ole Tange <tange@gnu.org> wrote:
> Thomas got me thinking.
>
> One of the problems with --load is that it only limits how many jobs
> are started. So you may start way too many. This will give you a load
> of 100:
>
>  seq 100 | nice parallel -j0 --load 2.00 burnP6
>
> and that is most likely not what you want.
>
> While some programs run multiple threads (and thus can give a load > 1
> each), that is the exception. So in general I think we can assume one
> job will at most give a load of 1.
>
> Currently load is only computed every 10 seconds. So we could
> recompute every 10 seconds:
>
>    number_of_concurrent_jobs = max_load - current_load +
>                                number_of_concurrent_jobs
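
As a minimal sketch of that recompute rule (hypothetical names, not
GNU parallel's actual code; the only addition is a clamp so the target
never goes negative):

```python
def recompute_slots(jobs, max_load, current_load):
    # Ole's rule, applied every 10 seconds:
    #   jobs = max_load - current_load + jobs
    # Because the load average lags, this can overshoot; as jobs
    # finish, the load drifts back down toward max_load.
    new_jobs = jobs + max_load - current_load
    # Never target a negative number of jobs.
    return max(0, int(new_jobs))
```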
>
> If the job immediately takes 100% CPU time (like burnP6) then the
> number of processes will grow every 10 seconds with the difference
> between current load and max load. As the load lags behind it may
> cause us to spawn too many processes that will cause a load > max
> load. But when the jobs finish, the load will over time drop to the
> max load.
>
> If the job never takes 100% CPU time (like host) then the number of
> processes will grow every 10 seconds with the difference between
> current load and max load.
>
> If the job takes 100% CPU time after some initialization (like blast)
> then the number of processes will grow every 10 seconds with the
> difference between current load and max load. The current load will
> start out small; this may cause us to spawn too many processes that
> will cause a load > max load.
>
> If the job takes >100% CPU time after some initialization (like
> multithreaded blast) then the number of processes will grow every 10
> seconds with the difference between current load and max load. The
> current load will start out small; this may cause us to spawn too many
> processes that will cause a load > max load.
>
> I believe this would be better than the current behavior, but I am
> very open to even better ideas.
>
>
> /Ole
>


