
Re: parallel + blast + LSF


From: Ole Tange
Subject: Re: parallel + blast + LSF
Date: Wed, 6 May 2015 23:59:39 +0200

On Wed, May 6, 2015 at 1:50 PM, Giuseppe Aprea <giuseppe.aprea@gmail.com> wrote:

>  --wait/--semaphore/--pipe options: the --semaphore man page says:
>
> --semaphore
>
> Work as a counting semaphore. --semaphore will cause GNU parallel to start
> command in the background. When the number of simultaneous jobs is reached,
> GNU parallel will wait for one of these to complete before starting another
> command.
>
> --semaphore implies --bg unless --fg is specified.
>
> --semaphore implies --semaphorename `tty` unless --semaphorename is
> specified.
>
> Used with --fg, --wait, and --semaphorename.
>
> The command sem is an alias for parallel --semaphore.
>
> See also man sem.
>
>
> It may be my poor English but when I read that I understood that only using
> this option GNU parallel, once the maximum number of jobs (allowed by -j,
> for example) was reached, would have waited for one job to complete before
> starting another one. Now I have also read the Tutorial and I am confused
> (but it could also be a problem I have with background jobs and the symbol
> "&&"). According to what you say,
>
> - if I am running on a single host with "-j 10" and without --semaphore
> when 10 simultaneous jobs are reached, GNU parallel will wait for one to
> complete before starting another one because "that is what GNU Parallel does
> normally"
>
> - if I am running on a single host with "-j 10" and with --semaphore  when
> 10 simultaneous jobs are reached, GNU parallel will wait for one to complete
> before starting another one because that is what is written in the manpage

Yeah, I can see this is tricky if you have not been taught what a
semaphore is in computer science. Maybe this explanation helps.

A counting semaphore (which 'sem' implements) is like a bunch of
toilets: People needing a toilet can use any toilet, but if there are
more people than toilets, they will have to wait for one of the
toilets to be available.

-j sets the number of toilets. Calling 'sem' is putting one person in
the queue for the toilets: If there is a toilet available, it starts
the job in the background and exits immediately. So 'sem' follows the
person to the toilet, but it does not go into the toilet with the
person.

If all toilets are taken, it waits until a toilet is free.

'sem --fg' starts the job in the foreground and only exits when the
job is done, so it stays with the person until the person leaves the
toilet.

So where 'parallel' usually will start more than one job, 'sem' only
starts a single job, and will often sit waiting before starting the
job.

A special type of semaphore is a mutex. That is just a semaphore with a
single toilet. This is useful for a single shared resource, so that
two programs do not use this single shared resource at the same time.
'sem' defaults to '-j1'.
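
As a rough sketch of the difference (the temp files, the semaphore
name and the sleep/echo jobs are made up for illustration, and this
assumes GNU Parallel with its 'sem' alias is installed), something
like this shows both behaviours side by side:

```shell
# Illustrative only: the temp files and the semaphore name are made up.
log_par=$(mktemp)
log_sem=$(mktemp)

# One 'parallel' call owns all 4 jobs and runs at most 2 at a time.
# This is the normal -j behaviour; no --semaphore is needed.
parallel -j2 "sleep 0.2; echo job {} >> $log_par" ::: 1 2 3 4

# Each 'sem' call queues exactly one job on a 2-slot semaphore.
# It blocks while both slots are taken, then starts its job in the
# background and returns.
sem_id="demo-$$"
for i in 1 2 3 4; do
  sem -j2 --id "$sem_id" "sleep 0.2; echo job $i >> $log_sem"
done

# Wait for every job queued on this semaphore to finish.
sem --wait --id "$sem_id"

par_count=$(grep -c job "$log_par")
sem_count=$(grep -c job "$log_sem")
echo "parallel ran $par_count jobs, sem ran $sem_count jobs"
rm -f "$log_par" "$log_sem"
```

Both variants end up running the same four jobs two at a time. The
only difference is who does the queueing: a single 'parallel' process,
or one 'sem' call per job.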

Is it now clear that 'sem' is not what you are looking for?

> and here is stderr:
>
> ------------------------------------------------------------
> parallel: Warning: ssh to cresco3x046.portici.enea.it only allows for 0
> simultaneous logins.
> You may raise this by changing /etc/ssh/sshd_config:MaxStartups and
> MaxSessions on cresco3x046.portici.enea.it.
> Using only -1 connections to avoid race conditions.

So GNU Parallel fails to run anything on the remote machine. Try a
simpler example and then debug that.

I would try:

  ssh cresco3x046.portici.enea.it echo 1

If that works:

  parallel -S cresco3x046.portici.enea.it echo ::: 1

If that fails:

  parallel -vvS cresco3x046.portici.enea.it echo ::: 1


/Ole


