Re: Preparing for a new release
From: Ricardo Wurmus
Subject: Re: Preparing for a new release
Date: Sun, 09 Feb 2020 14:00:53 +0100
User-agent: mu4e 1.2.0; emacs 26.3
While playing with a real-world workflow I found a few problems:
* Inputs and outputs are not validated
When a process declares an output but then doesn’t produce it, the next
process fails with a nasty error message. This is especially confusing
when using containerization, because the error is about failing to map
the input into the container, not about the missing output.
Processes should automatically validate their inputs and outputs.
Since inputs and outputs could technically be something other than
files I’m not sure exactly how to do this.
@Roel: can you give an example of inputs / outputs that are not files?
I remember that you suggested that inputs might be database queries,
for example. I wonder if we should mark inputs and outputs with
types, so that the GWL can know if something is supposed to be a file
or something else. …just how would “something else” be used in a
process?
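To make the idea of typed inputs and outputs concrete, here is a rough sketch in Python (not GWL code). The `Item` class, its `kind` field, and both function names are hypothetical. It assumes that only items marked as files can be checked on disk, and it skips anything else.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """A declared input or output. The `kind` marker is a hypothetical type tag."""
    value: str
    kind: str = "file"  # assumption: "file" by default, e.g. "query" otherwise


def missing_files(items):
    """Return the values of file-typed items that do not exist on disk."""
    return [
        item.value
        for item in items
        if item.kind == "file" and not os.path.exists(item.value)
    ]


def check_outputs(process_name, outputs):
    """Raise a readable error if a process did not produce a declared file."""
    missing = missing_files(outputs)
    if missing:
        raise RuntimeError(
            f"process {process_name!r} declared outputs it did not produce: "
            + ", ".join(missing)
        )
```

Running such a check right after a process finishes would report the missing file by name, instead of letting the next process fail while mapping its inputs.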
* The --output option has no effect
I think the “--output” option should cause all generated files to end
up somewhere in the given directory. I wonder if this should affect
*all* generated files or just the final output. If all outputs should
be affected then all *inputs* must be adjusted as well. Maybe
“--output” is the wrong name. Should it be “--prefix” instead?
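The point about inputs having to follow the outputs can be illustrated with a small Python sketch (not GWL code). The dictionary layout and the names `with_prefix` and `relocate` are made up for illustration. It assumes relative paths, and that an input is only moved when some process in the workflow generates it.

```python
from pathlib import PurePosixPath


def with_prefix(prefix, path):
    """Place a relative path below `prefix`; leave absolute paths alone."""
    p = PurePosixPath(path)
    return str(p) if p.is_absolute() else str(PurePosixPath(prefix) / p)


def relocate(processes, prefix):
    """Move every generated file below `prefix`, adjusting matching inputs."""
    # Files that some process in the workflow produces.
    generated = {out for proc in processes for out in proc["outputs"]}
    return [
        {
            "name": proc["name"],
            # Only inputs generated by the workflow move; external ones stay.
            "inputs": [
                with_prefix(prefix, i) if i in generated else i
                for i in proc["inputs"]
            ],
            "outputs": [with_prefix(prefix, o) for o in proc["outputs"]],
        }
        for proc in processes
    ]
```

Restricting the option to the final outputs would only need the last step, but then intermediate files would still land in the working directory.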
* It’s not possible to select more than one tagged item
In my test workflow I’m generating a bunch of inputs by mapping over
an argument list. Now the problem is that I can’t select all of these
inputs easily in a code snippet. With the syntax we have I can only
select the first item following a tag.
To address this I’ve extended the accessor syntax, so this works now:
--8<---------------cut here---------------start------------->8---
process frobnicate
  packages "frobnicator"
  inputs
    . genome: "hg19.fa"
    . samples: "a" "b" "c"
  outputs
    . "result"
  # {
    frobnicate -g {{inputs:genome}} --files {{inputs::samples}} > {{outputs}}
  }
--8<---------------cut here---------------end--------------->8---
Note how {{inputs::samples}} is substituted with “a b c”. With just a
single colon it would be just “a”. Single colon = single item; double
colon = more than one item.
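The substitution rule can be modelled in a few lines of Python. This is only a sketch of the semantics described above, not how the GWL implements it. Representing tags as strings ending in a colon, the names `group_by_tag` and `expand`, and the handling of an untagged reference such as {{outputs}} (all items joined by spaces) are assumptions.

```python
import re


def group_by_tag(items):
    """Group a flat list like ["genome:", "hg19.fa", "samples:", "a"] by tag."""
    groups, current = {}, None
    for item in items:
        if item.endswith(":"):
            current = item[:-1]
            groups[current] = []
        elif current is not None:
            groups[current].append(item)
    return groups


# Matches {{field}}, {{field:tag}} and {{field::tag}}.
PLACEHOLDER = re.compile(r"\{\{(\w+)(?:(::?)(\w+))?\}\}")


def expand(template, fields):
    """Substitute placeholders; ':' picks one item, '::' picks all of them."""
    def substitute(match):
        field, colons, tag = match.groups()
        items = fields[field]
        if tag is None:
            # Assumption: an untagged reference yields every non-tag item.
            return " ".join(i for i in items if not i.endswith(":"))
        values = group_by_tag(items)[tag]
        return " ".join(values) if colons == "::" else values[0]

    return PLACEHOLDER.sub(substitute, template)
```

With the inputs of the example process written as `["genome:", "hg19.fa", "samples:", "a", "b", "c"]`, the double-colon form expands to all three samples and the single-colon form to the first one only.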
* Containerization and directories
A container cannot be created for a process that writes its output
files to directories that don’t exist yet. That’s because the
“containerize” procedure tries to map the directories of input and
output files into the container, and the output directory doesn’t
exist yet.
How should this be handled? We could ignore non-existing output
directories when creating containers, I suppose. I think that’s the
best option, because we can’t just create them lest we break
procedures that don’t deal well with existing directories.
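As a rough illustration of the "ignore non-existing output directories" option, here is a Python sketch (not the actual “containerize” procedure). The function name `directories_to_map` and its `exists` parameter are hypothetical. It assumes that a missing input directory is still an error, while a missing output directory is simply left out of the mapping.

```python
import os


def directories_to_map(inputs, outputs, exists=os.path.isdir):
    """Return the directories to map into a container.

    Input directories must exist. Output directories that do not exist
    yet are skipped rather than created.
    """
    input_dirs = {os.path.dirname(p) or "." for p in inputs}
    output_dirs = {os.path.dirname(p) or "." for p in outputs}

    missing = sorted(d for d in input_dirs if not exists(d))
    if missing:
        raise FileNotFoundError(
            "input directories do not exist: " + ", ".join(missing)
        )

    # Nothing is created here, so procedures that expect to create the
    # output directory themselves are not affected.
    return sorted(input_dirs | {d for d in output_dirs if exists(d)})
```

The `exists` parameter is only there so the behaviour can be tried without touching the file system.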
--
Ricardo