Re: parallel + blast + LSF

On Wed, Apr 15, 2015 at 11:27 PM, George Marselis <george@marsel.is> wrote:

Giuseppe, I was referring to both of you. My apologies for not being clear; I had my head stuck in Perl while writing the first email.

My suggestion to both of you is that you should not use parallel for your respective topics.

Giuseppe,

You should use an extra script. Your problem is that you are timing out while trying to submit all those jobs. The timeout happens because of the number of jobs you are submitting: LSF cannot write the job descriptions to disk fast enough, so the submission times out before the action completes and then stays in that state.

----------------
Martin,

You could use parallel to submit jobs, but it's a very bad idea, due to the limitations of the software. Use batch scripts and job arrays when possible.
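For the throttling use case (2000 jobs, 100 at a time), a job array with a slot limit may be enough. This is only a sketch: the array name "sweep" and the helper array_spec are made up for illustration, and the "%100" suffix is the LSF job array slot limit as I understand it from the LSF documentation. The snippet only builds and prints the command line, it does not call bsub.

```shell
#!/bin/bash
# Build a job array specification of the form NAME[1-TOTAL]%LIMIT.
# NAME, TOTAL and LIMIT are illustrative values, not taken from a real setup.
array_spec() {
    local name=$1 total=$2 limit=$3
    # %% prints a literal percent sign; the number after it is the
    # maximum number of array elements LSF should run at the same time.
    printf '%s[1-%d]%%%d' "$name" "$total" "$limit"
}

spec=$(array_spec sweep 2000 100)

# Print the command instead of running it, so this can be tried off-cluster.
echo "bsub -J \"$spec\" [bsub options] command"
```

The quotes around the array specification matter on a real command line, since the shell would otherwise try to treat the square brackets as a glob pattern.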

----------------

So, as per my suggestion, I think our discussion is offtopic for this list. We could continue here, if Ole and the list put up with us, but I think we should take this to personal email or move it to the Debian Med mailing list https://en.wikipedia.org/wiki/Debian-Med .

Let me know which option is better for you.

As with regard to Martin, he should not use parallel for

Ciao,

George

On Wed, Apr 15, 2015 at 10:50 PM, Giuseppe Aprea <giuseppe.aprea@gmail.com> wrote:
Hi George!

I am not sure who you are talking to: Martin or me? Let me remind you that the original topic is about using blast under parallel with LSF.
Martin's problem sounds like something offtopic.

You have both sysadmin and bioinformatics experience so I would really appreciate your help!

I am working on a cluster, so I must use LSF to get slots, and I would also prefer to use parallel since it splits the input automatically with --recstart (which is quite nice :D otherwise I have to use another script for that). I see I could do better with the chunk size (I have 1 record at a time in my example), but that's a secondary problem for now. First I have the "lsb_launch(): Failed while waiting for tasks to finish." issue to solve.
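On the chunk size point, one rough way to pick a value for -N is to count the FASTA records and spread them over the available slots. This is a minimal sketch under that assumption (one chunk per slot); the helper name records_per_chunk and the five-record demo file are made up for illustration, and smaller chunks may balance better when blast run times vary a lot.

```shell
#!/bin/bash
# Estimate how many FASTA records to hand to each blast process.
# Assumption: every record starts with a '>' header line.
records_per_chunk() {
    local fasta=$1 slots=$2
    local records n
    # grep -c exits non-zero when nothing matches, hence the "|| true".
    records=$(grep -c '^>' "$fasta" || true)
    # Ceiling division, so that no record is left without a chunk.
    n=$(( (records + slots - 1) / slots ))
    if (( n < 1 )); then
        n=1
    fi
    echo "$n"
}

# Tiny demo input: five records, to be spread over two slots.
demo=$(mktemp)
printf '>seq1\nMKV\n>seq2\nMAA\n>seq3\nMLL\n>seq4\nMGG\n>seq5\nMTT\n' > "$demo"

n=$(records_per_chunk "$demo" 2)

# Print where the value would go; the other options are elided on purpose.
echo "parallel ... --recstart '>' -N $n --pipe blastp ..."
rm -f "$demo"
```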

cheers,

g

On Wed, Apr 15, 2015 at 7:44 PM, George Marselis <george@marsel.is> wrote:
By the way, LSF and GNU parallel do almost the same thing, so using one of the two defeats the purpose of using the other.

In the same way, you could have used LSF to submit your jobs to LSF:

bsub < script.sh

where script.sh was

bsub -J amoeba -q smalljobs qfasta file1
bsub -J amoeba -q smalljobs qfasta file2
...
bsub -J amoeba -q smalljobs qfasta file2000

On Wed, Apr 15, 2015 at 8:39 PM, George Marselis <george@marsel.is> wrote:
Hi. LSF/Openlava sysadmin in bioinformatics and parallel user here.

I have seen this a couple of times before: you are trying to use GNU parallel to submit the jobs to all nodes.

That's not the way to do things: you should not submit jobs on *all* your nodes. Please don't do that, as bsub was not designed to read large batches of jobs. bsub writes the jobs to your home directory, so if your storage is not designed for a lot of writes, you are going to blow the cluster out of the water.

What you want to do is look up either:

1. bsub scripts https://rc.fas.harvard.edu/resources/documentation/legacy-lsf/lsf-submit-an-lsf-job/

or

2. job arrays https://rc.fas.harvard.edu/resources/documentation/legacy-lsf/lsf-submitting-lots-of-short-jobs-job-arrays/

Both bsub scripts and job arrays are useful to you. bsub scripts can be submitted as part of a pipeline: your pipeline can generate the bsub script as its output and then submit it to bsub. So, instead of submitting your job 2000 times as in

bsub job0
bsub job1

....

bsub job1999

you just submit "bsub < scriptname", where scriptname contains 2000 lines describing your jobs, and you are done. The rest is done by bsub/LSF.
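A pipeline step that generates such a script could look roughly like this. It is a sketch only: the file name submit_all.sh and the helper make_submit_script are made up, while the job name, queue, command and the file1 ... file2000 naming are borrowed from the script.sh example elsewhere in this thread. Nothing is submitted here; the script is only written to disk.

```shell
#!/bin/bash
# Write one job line per input file into a single submission script.
make_submit_script() {
    local count=$1 out=$2
    local i
    # A single redirection for the whole loop, which also truncates
    # any older copy of the file.
    for ((i = 1; i <= count; i++)); do
        printf 'bsub -J amoeba -q smalljobs qfasta file%d\n' "$i"
    done > "$out"
}

make_submit_script 2000 submit_all.sh

# Quick sanity check before handing the file over: number of job lines.
grep -c '' submit_all.sh

# On the cluster the next step would be:
#   bsub < submit_all.sh
```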

Now, if your jobs are similar in a way that you just increment a counter (as in most bioinformatics jobs), use arrays.

bsub -J JOBNAME[0-1999], where JOBNAME is a string you would like to name your job as, eg "fasta files alignment"

These techniques are useful because you can submit all 2000 jobs in less than a second, you can do it from a single node and you will not have to deal with a grumpy sysadmin or grumpy colleagues who cannot use the cluster. Just make sure you use the appropriate queue.
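As a sketch of what an array job script might look like: each array element reads its own index from the LSB_JOBINDEX environment variable and derives its input from it. The array name "align", the output file pattern and the file1 ... file2000 naming are assumptions for illustration; the queue and the qfasta command are borrowed from the other example in this thread. I use 1-2000 here rather than 0-1999 because, as far as I know, the LSF documentation describes array indices as positive integers.

```shell
#!/bin/bash
#BSUB -J "align[1-2000]"     # job array with elements 1..2000 (hypothetical name)
#BSUB -o align.%J.%I.out     # %J is the job ID, %I the array index
#BSUB -q smalljobs           # queue name from the other example

# LSF sets LSB_JOBINDEX for every array element. Default to 1 so the
# script can also be tried outside LSF.
index=${LSB_JOBINDEX:-1}

# Assumed naming scheme: element N works on fileN.
input="file${index}"

echo "element ${index} would process ${input}"
# qfasta "$input"            # the real work would go here
```

Submitted with "bsub < scriptname", this is one submission for all 2000 elements.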

Let me know if you have any questions.

Best Regards,

George Marselis

On Wed, Apr 15, 2015 at 6:48 PM, Martin d'Anjou <martin.danjou14@gmail.com> wrote:

Hi,

Thanks for clarifying. I want to use GNU Parallel to bsub jobs. This way I can use GNU Parallel to throttle the number of jobs that are submitted to LSF, and it is easier than writing a loop.

parallel -j 100 my_script [bsub options] ::: {1..2000}

my_script (pseudo-code):
#!/bin/bash
...
bsub [bsub options] command ...
post-process data

This way I can submit jobs, say 100 at a time. When I submit all 2000 jobs, it gets problematic and I start hitting limits with file descriptors, etc.

Thanks for sharing,
Martin

On 15-04-15 11:35 AM, Giuseppe Aprea wrote:

Hi Martin,

I am not sure I understand. As far as I can see, things work exactly the opposite way: you have an LSF script which launches GNU Parallel on some hosts provided by LSF. Something like:

-------------------------------------------------------------------------------
#!/bin/bash
#BSUB -J gnuParallel_blast_test # Name of the job.
#BSUB -o %J.out # Appends std output to file %J.out. (%J is the Job ID)
#BSUB -e %J.err # Appends std error to file %J.err.
#BSUB -q large # Queue name.
#BSUB -n 30 # Number of CPUs.

module load 4.8.3/ncbi/12.0.0
module load 4.8.3/parallel/20150122

SLOTS=`cat ${LSB_DJOB_HOSTFILE} |wc -l`
SERVER=""
for i in `cat ${LSB_DJOB_HOSTFILE}| sort`
do
echo "/afs/enea.it/software/bin/blaunch.sh ${i}" >> servers
done

cat absolute_path_to_sequences.fasta | parallel --no-notice -vv -j ${SLOTS} --slf servers --plain --recstart '>' -N 1 --pipe blastp -evalue 1e-05 -outfmt 6 -db absolute_path_to_db_file -query - -out absolute_path_to_result_file_{%}
-------------------------------------------------------------------------------

LSF is the one which gives you the execution hosts, so if you are launching bsub from GNU parallel, how do you know how to set the --slf option?
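One detail in the script above: the servers file is written with ">>", so a second run in the same directory would append to the entries left by the first one. A sketch of the same step that rebuilds the file from scratch each time follows. The helper name write_servers and the demo host names are made up; the launcher path is the one from the script, and outside LSF a temporary file stands in for ${LSB_DJOB_HOSTFILE}.

```shell
#!/bin/bash
# Turn an LSF host file (one line per slot) into a server list for --slf.
write_servers() {
    local hostfile=$1 out=$2
    local launcher=/afs/enea.it/software/bin/blaunch.sh   # path from the script above
    local host
    # The single redirection after "done" truncates any stale copy.
    sort "$hostfile" | while read -r host; do
        if [ -n "$host" ]; then
            printf '%s %s\n' "$launcher" "$host"
        fi
    done > "$out"
}

# Demo with a stand-in host file: two slots on node01, one on node02.
hostfile=$(mktemp)
servers=$(mktemp)
printf 'node02\nnode01\nnode01\n' > "$hostfile"

write_servers "$hostfile" "$servers"

slots=$(grep -c '' "$servers")      # one line per slot
first_line=$(head -n 1 "$servers")
echo "slots: $slots"
cat "$servers"
rm -f "$hostfile" "$servers"
```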

g

On Wed, Apr 15, 2015 at 4:24 PM, Martin d'Anjou <martin.danjou14@gmail.com> wrote:

On 15-04-15 09:34 AM, Giuseppe Aprea wrote:

Hi all,

I would like to ask for your help with using parallel with the blast alignment software.

I am trying to use GNU parallel v. 20150122 with blast for a very large sequence alignment. I am using parallel on a cluster which uses LSF as its queue system.

Hello Giuseppe,

I am an avid LSF user, and I want to use GNU Parallel to dispatch jobs to LSF. Could you please explain a little bit to me how GNU Parallel works with LSF? I do not see it in the on-line tutorials. For example, I would like to understand how to pass "bsub" options like -oo, -q queue_name, etc. to LSF from GNU Parallel.

Thanks,
Martin

From:	George Marselis
Subject:	Re: parallel + blast + LSF
Date:	Wed, 15 Apr 2015 23:32:48 +0300