Re: parallel + blast + LSF


From: Giuseppe Aprea
Subject: Re: parallel + blast + LSF
Date: Wed, 6 May 2015 13:50:34 +0200

Hi and thank you for your reply.
Forgive me if I prefer non-interleaved replies.

-j option: Yes, the suggested man page replacement is clearer to me. I would stress the concept even more:

       -P N     Number of jobslots on each machine. When used with multiple machines, up to N*Nmachines jobs run in parallel in total.  0 means as many as possible (on each machine). Default is one job per CPU (on each machine). It is overridden by ncpu (see --sshlogin)

################################################################################################
################################################################################################

limiting max ncpu per host: Exactly! I sort of lie about the number of cores. Usually you don't know before job submission how many slots per host you are going to be given, so after submission you have to parse a special file provided by the queue system and prepare the server file from it. I can't use the -j option since I may receive 100% of the cores on some hosts, 75% on others, etc. I guess only a single -j value can be given for all hosts, right?
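A minimal sketch of that parsing step, assuming the queue system provides a hostfile with one hostname per granted slot (the function name make_slf and the host names are made up for illustration). It counts the slots per host and writes them in the ncpu/sshlogin form that a server file accepts:

```shell
#!/bin/bash
# Hypothetical helper: turn a per-slot hostfile into "ncpu/sshlogin" lines.
# Assumption: the hostfile lists one hostname per granted slot, one per line.
make_slf() {
    # $1: path to the per-slot hostfile
    # sort groups identical hosts, uniq -c counts them, awk reorders to N/host
    sort "$1" | uniq -c | awk '{ print $1 "/" $2 }'
}

# Example input: 2 slots on one host, 1 slot on another (made-up names).
hostfile=$(mktemp)
printf '%s\n' hostA.example.com hostB.example.com hostA.example.com > "$hostfile"
make_slf "$hostfile"
# prints:
# 2/hostA.example.com
# 1/hostB.example.com
rm -f "$hostfile"
```

The output could be redirected to the file passed to --slf, so that each host is limited to the number of slots actually granted there.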

################################################################################################
################################################################################################

--wait/--semaphore/--pipe options: the --semaphore man page says:


--semaphore

Work as a counting semaphore. --semaphore will cause GNU parallel to start command in the background. When the number of simultaneous jobs is reached, GNU parallel will wait for one of these to complete before starting another command.

--semaphore implies --bg unless --fg is specified.

--semaphore implies --semaphorename `tty` unless --semaphorename is specified.

Used with --fg, --wait, and --semaphorename.

The command sem is an alias for parallel --semaphore.

See also man sem.


It may be my poor English, but when I read that I understood that only with this option would GNU parallel, once the maximum number of jobs (set by -j, for example) was reached, wait for one job to complete before starting another one. Now I have also read the Tutorial and I am confused (though it could also be a problem I have with background jobs and the "&&" operator). According to what you say:

- if I am running on a single host with "-j 10" and without --semaphore  when 10 simultaneous jobs are reached, GNU parallel will wait for one to complete before starting another one because "that is what GNU Parallel does normally"

- if I am running on a single host with "-j 10" and with --semaphore  when 10 simultaneous jobs are reached, GNU parallel will wait for one to complete before starting another one because that is what is written in the manpage

Most likely, then, the only difference is that --semaphore causes GNU parallel to start the command in the background. If that is the case, I find it a bit strange that --semaphore can be used with --fg.

Anyhow, I see that --semaphore is useless in my situation: I used it only to limit the maximum number of simultaneous jobs, but "that is what GNU Parallel does normally"; moreover, I don't need to send my jobs to the background. I am not sure that is (all) you meant when you said it doesn't make sense.

################################################################################################
################################################################################################
################################################################################################
################################################################################################

All that said, I tried to simplify my problem. Now I'd like to run this script:

------------------------------------------------------------
#!/bin/bash
module load 4.8.3/parallel/20150422
cat goodProteins.fasta | parallel -j 24 --no-notice --tmpdir tmp --block 200k --recstart '>' --pipe awk \'{print \$0}\' - \> result_{#}
------------------------------------------------------------

which works nicely locally, giving 338 result files.
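For readers unfamiliar with the --pipe --block 200k --recstart '>' combination, here is a rough awk-only approximation of the chunking idea: the input is cut into pieces of roughly the block size, but only at lines starting with '>', so no FASTA record is split. The function name split_fasta and its arguments are hypothetical, and GNU parallel's real splitting differs in detail:

```shell
#!/bin/bash
# Hypothetical stand-in illustrating record-aligned chunking.
# This is NOT GNU parallel's algorithm, only a sketch of the idea.
split_fasta() {
    # $1: input FASTA, $2: approximate chunk size in bytes, $3: output prefix
    awk -v max="$2" -v prefix="$3" '
        BEGIN { n = 1; size = 0 }
        # Switch to the next chunk only at a record start (">") and only
        # once the current chunk has reached the requested size.
        /^>/ && size >= max { close(prefix n); n++; size = 0 }
        # Append the line to the current chunk and track its size.
        { print > (prefix n); size += length($0) + 1 }
    ' "$1"
}

# Usage (file names are examples): split_fasta goodProteins.fasta 200000 chunk_
```

Each resulting chunk_N file plays the role of the data one awk job receives on stdin in the script above.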

Unfortunately, when I try to run it on remote hosts I do not get any output. Here is the script:

------------------------------------------------------------
#!/bin/bash
module load 4.8.3/parallel/20150422
cat goodProteins.fasta | parallel -vv -j 24 --no-notice -vv --tmpdir tmp --slf servers --block 200k --recstart '>' --pipe awk \'{print $0}\' - \> result_{#}
------------------------------------------------------------
 
this is the server file:

------------------------------------------------------------
cresco3x032.portici.enea.it
cresco3x046.portici.enea.it
cresco3x057.portici.enea.it
------------------------------------------------------------

and here is stderr:

------------------------------------------------------------
parallel: Warning: ssh to cresco3x046.portici.enea.it only allows for 0 simultaneous logins.
You may raise this by changing /etc/ssh/sshd_config:MaxStartups and MaxSessions on cresco3x046.portici.enea.it.
Using only -1 connections to avoid race conditions.
parallel: Warning: ssh to cresco3x057.portici.enea.it only allows for 0 simultaneous logins.
You may raise this by changing /etc/ssh/sshd_config:MaxStartups and MaxSessions on cresco3x057.portici.enea.it.
Using only -1 connections to avoid race conditions.
parallel: Warning: ssh to cresco3x032.portici.enea.it only allows for 0 simultaneous logins.
You may raise this by changing /etc/ssh/sshd_config:MaxStartups and MaxSessions on cresco3x032.portici.enea.it.
Using only -1 connections to avoid race conditions.
------------------------------------------------------------


I also tried using --controlmaster and then --sshdelay 0.5 but, again, no output. This time I am working with version 20150422.

Any idea?


g



On Tue, May 5, 2015 at 5:14 PM, Ole Tange <ole@tange.dk> wrote:
On Tue, May 5, 2015 at 10:32 AM, Giuseppe Aprea
<giuseppe.aprea@gmail.com> wrote:
> Hi and thank you for your reply.
>
> -j issue: I used "-j 192" since 192 is the sum of all the slots the queue
> system allocates on the different hosts. Reading again the manual I see why,
> given my options and my server file, GNU parallel could run 192 jobs on the
> same host. Anyway, in my opinion, this point isn't really clear. A user
> could also get the idea that -j is for the total cores which get divided
> among the different hosts as specified by the ncpus in the server file. At
> least, I expected that.

Please help by rephrasing the man page, so you would have understood
the concept the first time. Would this help:

       -P N     Number of jobslots on each machine. Run up to N jobs in
                parallel.  0 means as many as possible. Default is 100%
                which will run one job per CPU core on the machine.

> --wait: I am running on a shared cluster; that means the queue system may
> give me 8 slots on a 16-core host. The other 8 slots could be used by a
> different user at the same time. Resources fair share implies that I don't
> run more than 8 simultaneous blastp instances on that host.

Which means that for that particular host you have to lie about the
number of cores (e.g. by using the ncpu/host syntax).

If you in general only want to use 50% of the cores, then -j 50% will
do the trick.

> That is why,
> when 8 simultaneous blastp are reached, I want GNU parallel to wait for one
> of these to complete before starting another one.

And that is what GNU Parallel does normally. No extra options needed.

> That is what I expect from
> "--semaphore"; I used --wait for that and to be sure the queue system waited
> for all background gnu parallel jobs to be completed before considering the
> whole job finished. Does that make sense now?

I assume that you have understood, that --pipe makes GNU Parallel
behave very differently from not having --pipe. You could even argue
that 'parallel --pipe' would justify being a command on its own.

Have you understood that --semaphore also makes GNU Parallel behave
very differently from both --pipe and no --pipe? (Which is why it does
have its own alias, namely 'sem').

And have you understood why it does not make sense to use --semaphore
in your situation?

If yes: Please help by rephrasing the man page, so you would have
understood it in the first read.

If no: Please walk through the tutorial section on Semaphore.


/Ole

