guix-devel

Re: Google Summer of Code 2023 Inquiry


From: Spencer Skylar Chan
Subject: Re: Google Summer of Code 2023 Inquiry
Date: Mon, 3 Apr 2023 20:41:53 -0400

Hi Kyle,

On 3/31/23 11:15, Kyle wrote:
I would expect most software versions to not be in Guix. Simon had mentioned 
that this is mostly what the guix-past repository is for. However, some 
packages might be buried on some branch or some commit in some Guix related git 
repository. It may be helpful to facilitate their discovery and extraction for 
conda import.

Git has a newish binary file format for caching searches across commits. Maybe 
it would be helpful to figure out how to parse this format (it's documented) and 
index the data further using Xapian or a graph data structure (or tree sitter?) 
with the relevant metadata needed to find and efficiently extract scheme code 
and its dependencies?

If the format is documented then this is possible, although I'm not super familiar with these kinds of data structures.

You make an interesting point about compilation errors. It may be more productive to help 
researchers test for working satisfiable configurations as a more relaxed approach to 
having to specify the exact software version. Maybe some "nearby" or newer 
version is packaged and that is enough to successfully run a test suite? I'm imagining 
something between git bisect and Guix's own package solver.

Yes, we could have a variant of the solver that's more relaxed. It could output multiple solutions so the user can inspect them and pick the best one.
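
A minimal sketch of the "nearby version" idea in Python, assuming versions are plain dotted numbers and that the list of available versions is already known. The function names and the ranking policy (exact match first, then the closest newer versions, then the closest older ones) are assumptions for illustration, not how Guix resolves packages:

```python
from typing import List, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '1.21.3' into (1, 21, 3); non-numeric components are skipped."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def nearby_versions(requested: str, available: List[str], limit: int = 3) -> List[str]:
    """Return up to `limit` candidates: exact match, then newer, then older."""
    want = parse_version(requested)
    exact = [v for v in available if parse_version(v) == want]
    # Newer versions, closest to the requested one first.
    newer = sorted((v for v in available if parse_version(v) > want),
                   key=parse_version)
    # Older versions, closest to the requested one first.
    older = sorted((v for v in available if parse_version(v) < want),
                   key=parse_version, reverse=True)
    return (exact + newer + older)[:limit]


if __name__ == "__main__":
    print(nearby_versions("1.21.3", ["1.19.0", "1.24.2"]))
```

The user would then inspect the returned candidates and pick one, for example by running the project's test suite against each.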

It might also be productive to add infrastructure to help scientists more 
conveniently track and study their recent packaging experiments. Guix will only 
become more useful as more packages become available. Work which 
makes packaging more approachable by more people benefits everyone. Perhaps you 
can think of other ideas in this direction?

I'm not sure how "packaging experiments" are different from packaging software the usual way. I think making the importers easier to use and debug would help, although that sounds outside the scope of the projects.

Finally, would these projects be considered large or medium for the purposes of GSOC?

Thanks,
Skylar

On March 30, 2023 7:22:14 PM EDT, Spencer Skylar Chan 
<schan12@terpmail.umd.edu> wrote:
Hi Kyle,

On 3/24/23 14:59, Kyle wrote:
I am a bit worried that your proposed project is too focused on replacing 
python with guile. I think the project would benefit more from making python 
users more comfortable productively using Guix tools in concert with the tools 
they are already comfortable with.

Yes, I agree with you. Replacing Python with Guile is a much more ambitious 
task and is not the highest priority here.

I'm wondering if you might consider modifying your project goals toward 
exploring how GWL might be enhanced so that it could better complement more 
expressive language specific workflow tools like snakemake. I am also 
personally interested in exploring such facilities from the targets workflow 
system in R as well. Alternatively, perhaps you could focus on extending the 
GWL with more features?

I would also be interested in extending GWL with more features; I will follow 
up with this on the GWL mailing list.

I agree that establishing an achievable scope within a short timeline is 
crucial. The conda env importer idea would be quite an ambitious undertaking by 
itself and would lead you towards thinking about some pretty interesting and 
impactful problems.

While it's a challenging project, it could be broken into smaller steps:

1. import packages by exact matching names only, without versioning.
2. extend `guix import` to have `guix import conda` to help with package names 
that do not match exactly, and to accelerate adoption of Conda packages not in 
Guix
3. match software version numbers when translating Conda packages to Guix
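
Step 1 could look roughly like the following Python sketch. It assumes the Conda dependencies are already available as a list of spec strings (such as the `dependencies` entries of an environment file) and that the set of Guix package names has been obtained some other way; `conda_name` and `match_exact` are hypothetical helpers, not part of Guix or Conda:

```python
import re
from typing import Dict, Iterable, List, Set


def conda_name(spec: str) -> str:
    """Strip any version constraint: 'numpy=1.21.3' -> 'numpy'."""
    return re.split(r"[=<>!~ ]", spec.strip(), maxsplit=1)[0].lower()


def match_exact(specs: Iterable[str], guix_names: Set[str]) -> Dict[str, List[str]]:
    """Split Conda specs into names found verbatim in Guix and names that are not."""
    result: Dict[str, List[str]] = {"matched": [], "missing": []}
    for spec in specs:
        name = conda_name(spec)
        result["matched" if name in guix_names else "missing"].append(name)
    return result


if __name__ == "__main__":
    # Example data only; the Guix name set here is made up for the demo.
    specs = ["samtools=1.16", "numpy>=1.21", "r-base"]
    print(match_exact(specs, {"samtools", "python-numpy"}))
```

In this example "numpy" lands in the "missing" list because Guix names Python libraries with a `python-` prefix, which is the kind of mismatch step 2 would have to handle.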

What's currently undefined is the error handling:
- if a Conda package does not exist in Guix
- if the dependency graph is not solvable
- if compiling the environment fails (due to mismatching dependency versions)
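
One way to make these three cases explicit is to give each its own error type, so the importer can report them differently. This is only a sketch in Python; the class names and messages are hypothetical and nothing here corresponds to existing Guix code:

```python
class CondaImportError(Exception):
    """Base class for the failure cases listed above."""


class PackageNotInGuix(CondaImportError):
    """A Conda package has no counterpart in Guix."""


class UnsolvableDependencies(CondaImportError):
    """No consistent set of package versions could be found."""


class EnvironmentBuildFailed(CondaImportError):
    """The translated environment failed to build."""


def describe(error: CondaImportError) -> str:
    """Turn a failure into a short hint for the user."""
    if isinstance(error, PackageNotInGuix):
        return f"not packaged in Guix: {error}"
    if isinstance(error, UnsolvableDependencies):
        return f"dependency graph not solvable: {error}"
    if isinstance(error, EnvironmentBuildFailed):
        return f"environment failed to build: {error}"
    return f"import failed: {error}"


if __name__ == "__main__":
    print(describe(PackageNotInGuix("some-conda-package")))
```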

I believe there are many satisfactory stopping points for successful completion 
within the timeline of the summer, which I hope to present with my proposal 
soon.

Thanks,
Skylar


On March 22, 2023 5:44:52 PM EDT, Spencer Skylar Chan 
<schan12@terpmail.umd.edu> wrote:

     Hi Ricardo,

     On 3/22/23 14:19, Ricardo Wurmus wrote:


                 - Translating Snakemake to Guix Workflow Language (GWL)


             Ricardo, maybe you would have some suggestions. :-)


         Oh, this looks interesting. Could you please elaborate on the idea?

     My idea is to take as input a Snakemake workflow file and eventually 
output an equivalent GWL workflow file.

     Currently, Snakemake workflows can be exported to CWL (Common Workflow 
Language):

     https://snakemake.readthedocs.io/en/stable/executing/interoperability.html

     One approach could be to add CWL import/export capabilities to GWL. Then 
Snakemake/GWL conversion would be a 2 step process, using CWL as an 
intermediate step:

     1. Snakemake -> CWL
     2. CWL -> GWL

     However, CWL is not as expressive as Snakemake. There may be some details 
that are lost from Snakemake workflows.

     So a 1-step Snakemake-to-GWL transpiler could be interesting, as Snakemake and GWL each 
use a domain-specific language inside a general-purpose language (Python and Guile, respectively). 
It may be possible to achieve more "accurate" translations between 
workflows this way.

     Is this topic something that could fit into a summer project?






