Re: controlling memory use beyond --noswap


From: B. Franz Lang
Subject: Re: controlling memory use beyond --noswap
Date: Wed, 07 May 2014 10:08:53 -0400
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.4.0

Hi Sebastian

we do need swap to allow jobs to finish in some exceptional cases, and then we
need lots of it. On the other hand, using swap should really be avoided, as it
slows jobs down to a crawl and eats up disks. So I am not convinced we should
go this route, i.e. letting the kernel kill already misbehaving jobs.

Ciao Franz

On 14-05-07 04:33 AM, Sebastian Eiser wrote:

Just a thought, which may be a simple solution that still suits most people.

The kernel is pretty good at killing misbehaving jobs. @Ole: can you detect when a job is killed with SIGKILL? Can you record its memory usage from shortly before the kill?

If yes, you have an estimate of the minimum memory requirement of the job, and it could be relaunched when, say, twice that amount is available. To make sure other new jobs do not interfere too much, put it at the end of the queue. If other relaunched jobs are already running, count their reserved memory against the total (or their currently used memory, if that is already higher).

Give up only if a job is killed while it is the only one running, or after reaching a maximum number of iterations. How would that work out in the scenarios the others have described?
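
A minimal sketch of the bookkeeping this would need, assuming each job is started through a small wrapper under GNU time (whose %M format reports peak resident set size in KB). The wrapper name, the log files and the relaunch-with-twice-the-memory policy are illustrative assumptions; the actual requeueing would still live in the scheduler:

  #!/bin/bash
  # job-wrapper.sh (hypothetical): run a job, record its peak RSS, and log
  # whether it failed or was killed so it can be relaunched later when, say,
  # twice that much memory is free.
  tmp=$(mktemp)
  /usr/bin/time -f '%M' -o "$tmp" "$@"
  status=$?
  peak_kb=$(tail -n 1 "$tmp")   # %M: maximum resident set size, in KB
  rm -f "$tmp"
  if [ "$status" -ne 0 ]; then  # non-zero covers signal-killed jobs as well
      echo "FAILED $peak_kb $*" >> killed-jobs.log
  else
      echo "OK $peak_kb $*" >> finished-jobs.log
  fi
  exit "$status"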

Some people disable swap deliberately, so using swap as a metric might not be
general enough.

-Sebastian




On Wed, May 7, 2014 at 1:16 AM, B. Franz Lang <Franz.Lang@umontreal.ca> wrote:

    Hi Ole

    you are definitely right for a server with more than one user, and for
    sets of jobs whose memory use varies a lot within the set. Maybe reduce
    the problem to:

    - sets of jobs have similar memory usage
    - there is only one major user
    - this user reserves enough memory for all the small jobs combined, so
      that one does not need to worry about them, including occasional other
      users. This would even allow jobs of very different memory sizes within
      a set, as long as they all fit into the allocated space. In fact, the
      one(s) that do not fit could simply be left out, and one would still
      have the information from the successful runs to plan ahead, or might
      even find that enough information is already at hand.


        The best I have come up with is the ulimit approach, where you accept
        that jobs will be killed and restarted later if they prove to take up
        too much RAM. But that is far from ideal: you could have a situation
        where one 25 GB job would finish fine because it happened to run in
        parallel with tiny jobs.

        I do not know of a bulletproof way to figure out how much memory a
        job plus its children take up. But maybe we could monitor swap:

          If swapout > 0: don't care. There is no problem in a machine swapping out.
          If swapin > 0: don't care. There is no problem in a machine swapping in.
          If (swapin*swapout > limit) 2 seconds in a row:
            The machine is swapping in and out: this is a problem.
            Kill the newest started job, put it back in the queue, and wait
            until at least one job is finished before starting another.
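
        A minimal sketch of that test, assuming Linux and vmstat (whose si/so
        columns report KB/s swapped in and out); the limit value is arbitrary
        and the kill/requeue step is only indicated by a comment:

          #!/bin/bash
          # watch for simultaneous swap-in and swap-out two intervals in a row
          limit=1000        # si*so threshold; an arbitrary example value
          in_a_row=0
          while true; do
              # the last line of 'vmstat 1 2' covers the most recent interval
              read -r si so < <(vmstat 1 2 | awk 'END { print $7, $8 }')
              if [ $((si * so)) -gt "$limit" ]; then
                  in_a_row=$((in_a_row + 1))
              else
                  in_a_row=0
              fi
              if [ "$in_a_row" -ge 2 ]; then
                  echo "swapping in AND out: kill the newest job and requeue it"
                  # (hook for killing/requeueing the newest started job goes here)
                  in_a_row=0
              fi
          done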

    As indicated above, what about leaving a comfortable portion of memory
    for small jobs (a user-defined value or percentage), so that any swapping
    can safely be attributed to the biggies? It would then be only those that
    you would kill and restart.

    By the way, a good example might be velvetg, which you probably know. It
    may spend lots of time on single-threaded calculations (depending on the
    dataset), and if you wish to scan across twenty kmers it can be rather
    tiring. I usually run it in parallel after estimating the time from a
    single-kmer run, or just based on an educated guess. It is still difficult
    to optimize for full use of a server, and things may end without grace.
    While many runs are perfect, others take even longer as the machine starts
    to work within swap space (and grinds up the hard disk :-( ). It really
    needs dropping and later restarting of the big jobs. I once managed to get
    18 out of 20 calculations finished over a weekend, with the other two
    dropped by the Linux system when swap space was exceeded - not nice, not
    to be repeated.
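
    For what it is worth, a hedged example of such a kmer scan with GNU
    parallel, capping each velvetg run with ulimit so a runaway assembly is
    killed instead of being pushed into swap. The directory names, the 3-job
    slot count, the roughly 40 GB cap (ulimit -v takes KB) and the kmer range
    are illustrative assumptions:

      # one velvetg per pre-built assembly directory, 3 at a time
      parallel -j3 --joblog kmers.log \
          'ulimit -v 40000000; velvetg assembly_k{}' ::: $(seq 21 2 59)
      # killed or failed kmers can be retried later, one at a time, uncapped
      parallel -j1 --joblog kmers.log --resume-failed \
          'velvetg assembly_k{}' ::: $(seq 21 2 59)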

    Cheers Franz



        The above will limit the number of jobs started, but it will not start
        more small jobs once the big jobs are finished.

        With multiple servers of different sizes this becomes even harder.


        /Ole

            On 14-05-03 04:58 PM, Ole Tange wrote:

                On Wed, Apr 30, 2014 at 11:35 PM, B. Franz Lang
                <Franz.Lang@umontreal.ca> wrote:

                    I have been trying to find a way to use 'parallel'
                    without completely freezing machines --- which in my case
                    is due to the parallel execution of very memory-hungry
                    applications (like a server that has 64 GB of memory, and
                    one instance of an application - unforeseeably - using
                    between 10 and 60 GB).

                I have spent quite some time trying to think of a good way to
                deal with that. But what is the correct thing to do?

                Let us assume that we have 64 GB of RAM, that most jobs take
                10 GB, but that 20% of the jobs take between 10 and 60 GB, and
                that we can predict neither which jobs those are nor how long
                they will run.

                In theory, 80% of the time we can run 6 jobs at once (namely
                the 10 GB jobs).

                How can we avoid starting 2 jobs that will both hit 60 GB at
                the same time (or 3 jobs of 25 GB each)?

                If we can predict the memory usage, then the user can probably
                do that even better.

                niceload (part of the package) has --start-mem, which will
                only start a new job if a certain amount of memory is free.
                That may help in some situations.
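
                A possible invocation, assuming --start-mem behaves as
                described above; the 12G value, the size suffix and the
                placeholder command ./bigjob are illustrative only:

                  # hold each queued job until at least 12 GB is free
                  # before it actually starts
                  parallel -j6 'niceload --start-mem 12G ./bigjob {}' :::: jobs.txt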

                But it does not solve the situation where the next 3 jobs are
                25 GB jobs that start out looking like 10 GB jobs, so you only
                discover that they are 25 GB jobs long after they have started.

                So right now the problem is to find an algorithm that would do
                the right thing in most cases.

                If your program reaches its maximum memory usage quickly, then
                I would suggest you use 'ulimit' to kill off the oversized
                jobs: that way you can run 6 of the 10 GB jobs at a time
                (killing jobs bigger than 10 GB). Using --joblog you can keep
                track of the jobs that got killed. When all the 10 GB jobs are
                complete, you can raise the ulimit and run 3 jobs of 20 GB
                with --resume-failed, then 2 jobs of 30 GB, and finally the
                rest one job at a time.
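
                A minimal sketch of that staged scheme, assuming a
                hypothetical command ./bigjob and an argument file args.txt;
                ulimit -v takes KB, so the caps below are only roughly 10,
                20 and 30 GB:

                  # pass 1: 6 slots, each job capped near 10 GB;
                  # bigger jobs die and are logged
                  parallel -j6 --joblog run.log \
                      'ulimit -v 10000000; ./bigjob {}' :::: args.txt
                  # pass 2: rerun only the failed jobs, 3 at a time,
                  # capped near 20 GB
                  parallel -j3 --joblog run.log --resume-failed \
                      'ulimit -v 20000000; ./bigjob {}' :::: args.txt
                  # pass 3: 2 at a time, capped near 30 GB
                  parallel -j2 --joblog run.log --resume-failed \
                      'ulimit -v 30000000; ./bigjob {}' :::: args.txt
                  # final pass: whatever still fails runs one at a time, uncapped
                  parallel -j1 --joblog run.log --resume-failed \
                      './bigjob {}' :::: args.txt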


                /Ole








