guix-science

Re: Conda environments and reproducibility


From: Konrad Hinsen
Subject: Re: Conda environments and reproducibility
Date: Tue, 29 Nov 2022 14:39:46 +0100

Hi Hugo,

Hugo Buddelmeijer <hugo@buddelmeijer.nl> writes:

> Hi Konrad, Thibault and others,
>
> Konrad, is it perhaps possible for you to dig up this broken conda
> environment file?

Yes:

   https://gist.github.com/brospars/4671d9013f0d99e1c961482dab533c57

That environment was set up in 2018 on a Linux machine, and then tested
under macOS and Windows as well. It broke in early 2019.

> First, just like you all, my conclusion is that guix is the answer. The
> last two paragraphs by Simon capture it succinctly. However, conda seems
> to work fine for most people. It would therefore be instructive to have
> concrete 'failure stories' in order to show people that conda is not enough.

I have heard many stories of conda failing long-term, i.e. environments
not being reproducible after a year or two. Most use cases are probably
more short-term.

> It doesn't seem common to overwrite conda binaries. Conda takes some (not
> enough?) measures to prevent the scenario Konrad describes. In particular,
> the filenames include a 'hash' since conda 3 (~2014) [1]:

Weird. We worked with official Miniconda downloads from early 2018, and
our environment files contain no hashes.
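To make the difference concrete, here is a minimal sketch of how one might check how tightly an environment file pins its packages. It assumes the dependency entries have already been read into a list of strings in the `name=version=build` form that `conda env export` writes; the function names and the example build strings are hypothetical, and only the build string (the third field) would carry a hash.

```python
def pin_level(spec):
    """Classify one dependency entry as 'name', 'version' or 'build'.

    Assumes the 'name=version=build' form written by `conda env export`.
    Entries with range operators or wildcards are treated as unpinned.
    """
    spec = spec.strip()
    if any(ch in spec for ch in "<>!*,|"):
        return "name"
    # Treat 'name==version' like 'name=version' before splitting.
    fields = spec.replace("==", "=").split("=")
    if len(fields) == 3 and all(fields):
        return "build"
    if len(fields) == 2 and all(fields):
        return "version"
    return "name"


def summarize(specs):
    """Count how many entries fall into each pinning level."""
    counts = {"name": 0, "version": 0, "build": 0}
    for spec in specs:
        counts[pin_level(spec)] += 1
    return counts


# Hypothetical entries, not taken from the environment file linked above.
example_specs = [
    "python=3.6.8=h0371630_0",  # version and build string
    "numpy=1.15.4",             # version only
    "scipy",                    # name only
]
print(summarize(example_specs))
```

An environment file whose entries all land in the "name" or "version" buckets leaves the choice of the actual binary to whatever the channel serves at install time.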

> My realization was that improving these hashes is a goose chase and will
> ultimately lead to horrific things like "turing-complete yaml files". And
> at that point it is clear, at least to me, that guix is the answer.

Indeed. Turing-complete Scheme files :-)

My conclusion so far is that conda can never attain long-term
reproducibility, because it wants to be multi-platform. And that means
that it doesn't control the foundations on which it has to build.

From a user's point of view, a big problem with conda is the opacity of
the machinery, which in addition changes all the time as you say. With
Guix, I can understand how everything is built, and thus understand the
potential obstacles to a rebuild many years later. With conda, I don't
really know and my understanding is that the build machinery is not
even completely public (for Anaconda at least).

> One thing that conda (or actually conda-forge) does well is their bots.
> I'm a maintainer of some conda packages and once a month or so I get a
> fully automated pull request to update my package [4], e.g. when the
> upstream package is updated, or when a dependency is updated. They even

That's nice!

> packages, such as compilers. This makes maintaining conda-forge packages a
> breeze. Having such bots also within the guix-ecosystem would probably help
> attract developers.

Indeed. More generally, I think package managers should do a better job
of reaching out to upstream maintainers. They are our allies in
providing a better UX.

Cheers,
  Konrad
-- 
---------------------------------------------------------------------
Konrad Hinsen
Centre de Biophysique Moléculaire, CNRS Orléans
Synchrotron Soleil - Division Expériences
Saint Aubin - BP 48
91192 Gif sur Yvette Cedex, France
Tel. +33-1 69 35 97 15
E-Mail: konrad DOT hinsen AT cnrs DOT fr
http://dirac.cnrs-orleans.fr/~hinsen/
ORCID: https://orcid.org/0000-0003-0330-9428
Twitter: @khinsen
---------------------------------------------------------------------


