Re: [gwl-devel] Next steps for the GWL


From: Ricardo Wurmus
Subject: Re: [gwl-devel] Next steps for the GWL
Date: Thu, 6 Jun 2019 14:19:04 +0200
User-agent: mu4e 1.2.0; emacs 26.2

Hi simon,

> (+ Pjotr because I am sure he has an interesting opinion but not sure
> he closely reads this list ;-)
>
> On Mon, 3 Jun 2019 at 18:18, Ricardo Wurmus
> <address@hidden> wrote:
>
>> >  - what about a bridge with CWL?
>>
>> I’m open to this idea, but it would need to be well-defined.  What does
>> it really mean?  Generating CWL files from GWL workflows?  That really
>> shouldn’t be too hard.  Anything else, however, is hard for me to
>> imagine.
>
> Well, I point out previous threads about this topic:
>
> https://lists.gnu.org/archive/html/guix-devel/2018-01/msg00428.html
> https://lists.gnu.org/archive/html/gwl-devel/2019-02/msg00019.html
>
> 1-
> Generating CWL from GWL would be nice. It should ease the use of
> platforms and tools that are already in place (AWS, etc.)

Generating CWL from GWL should be easy, but it’s also not all that
useful.  The GWL takes care of software deployment, so not only should
we generate CWL files but also generate (and upload?) Docker images and
make the CWL file reference them.

The tooling for CWL… seems a little less substantial and focused than it
first appears.  The cwltool can only run CWL workflows locally — no
DRMAA, no AWS.  All the other runners that are listed on the CWL website
are either very limited or very large environments where CWL execution
is not necessarily the primary purpose (cf. Galaxy or Arvados).

Still, I think it’s the most meaningful connection the GWL can have with
the CWL: using the GWL as a high-level representation which “compiles”
down to a lower-level representation of CWL + Docker images when needed.
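
To make the “compiles down” idea a bit more concrete, here is a rough
sketch in Python of what such a lowering step might emit.  The `Process`
record and the `to_cwl` function are hypothetical and not part of the
GWL; the tool name and image name are placeholders.  Only the CWL field
names (`cwlVersion`, `class`, `baseCommand`, `hints`,
`DockerRequirement`, `dockerPull`, `inputBinding`, `outputBinding`) are
taken from the CWL specification.  Building and uploading the image is
assumed to have happened elsewhere.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Process:
    """Hypothetical stand-in for a high-level process description."""
    name: str
    command: list                                  # command plus fixed arguments
    inputs: dict = field(default_factory=dict)     # input name -> CWL type
    outputs: dict = field(default_factory=dict)    # output name -> glob pattern
    image: str = "example/placeholder:latest"      # assumed pre-built Docker image


def to_cwl(process):
    """Lower a Process to a CWL CommandLineTool document (as a dict)."""
    return {
        "cwlVersion": "v1.0",
        "class": "CommandLineTool",
        "id": process.name,
        "baseCommand": process.command,
        # The software environment is referenced as a Docker image.
        "hints": {"DockerRequirement": {"dockerPull": process.image}},
        # Inputs become positional command-line arguments, in order.
        "inputs": {
            name: {"type": cwl_type, "inputBinding": {"position": position}}
            for position, (name, cwl_type) in enumerate(process.inputs.items(), start=1)
        },
        # Outputs are collected by matching file names after the run.
        "outputs": {
            name: {"type": "File", "outputBinding": {"glob": pattern}}
            for name, pattern in process.outputs.items()
        },
    }


# "example-tool" is a placeholder command, not a real program.
step = Process(
    name="example-step",
    command=["example-tool"],
    inputs={"reference": "File", "reads": "File"},
    outputs={"result": "*.out"},
)

# CWL documents may be written as JSON, so json.dumps is sufficient here.
print(json.dumps(to_cwl(step), indent=2))
```

The hard part is not this translation but producing the image that
`dockerPull` points at, which is where the software deployment side of
the GWL would have to come in.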

> 2-
> Use CWL as a process. A lot of work has been done by Pjotr and
> reported here [1]
>
>
> [1] 
> https://guix-hpc.bordeaux.inria.fr/blog/2019/01/creating-a-reproducible-workflow-with-cwl/

Yes, this works, of course, but that’s a level of integration that’s
extremely limited, in my opinion.  Using Guix with the CWL is fine as
the blog post demonstrates, but there is very little to be gained and
much to be lost when embedding CWL in a GWL workflow.  The only thing
this enables is reusing existing CWL workflows as a GWL “process”.
There is no meaningful integration – the embedded CWL workflow is a
second-class citizen that cannot benefit from any of the GWL features.

If the CWL workflow is connected to the GWL via cwltool then the only
way to run the workflow on a DRMAA-supported cluster, a bunch of
SSH-connected servers, or AWS EC2 instances is to wrap it up in a GWL
context.  The GWL treats the process as its smallest unit of
organisation, so a CWL workflow that’s run as a GWL process cannot
really be scaled.  If the user has a different CWL execution environment
(such as an Arvados installation), the CWL workflow embedded in the GWL
will not be able to make use of it.  It would forever be tied to the
particular version of cwltool in Guix.

I’d rather not advocate this use of the CWL in the GWL.  It might sound
good (“The GWL is compatible with the CWL!”), but ultimately it’s a
really awkward connection that is bound to lead to a great deal of
frustration.

Does this make sense?

I don’t want to be dismissive.  It would be great if we could come up
with something that’s mutually beneficial for CWL users and GWL users
alike, but I feel that our options are very limited.  I’m still open to
ideas and use-case scenarios.

--
Ricardo


