guix-science

Re: Conda environments and reproducibility


From: Simon Tournier
Subject: Re: Conda environments and reproducibility
Date: Mon, 28 Nov 2022 21:46:05 +0100

Hi,

On Mon, 28 Nov 2022 at 17:28, Thibault Lestang <t.lestang@imperial.ac.uk> wrote:
> -----
> @luispedrocoelho
> Me, 6 months ago: I am going to save this conda
> environment with all the versions of all the packages so it can be
> recreated later; this is Reproducible Science!
>
> conda, today: these versions don't work together, lol.
> -----
>
> I simply can't explain how such a behavior can happen.

One thing is link rot.  I do not know whether anyone currently measures
it, but for sure we always underestimate it.

> I understand that conda ships pre-compiled binaries. I see how that's
> bad for reproducibility and provenance tracking since it's not
> straightforward to know how these binaries and dependencies were
> compiled. I'm assuming that, when conda saves an environment, it records
> version tags and "everything else required" to pull the same binaries
> later. Okay - I see how binaries could /technically/ be modified at a
> later stage whilst maintaining the same version tag (provenance tracking
> issue).

As an aside, you are assuming that such binaries remain available. :-)

Another thing, from the old days when I used Conda (and I may be wrong),
is, I guess, the SAT solver [1].  Say that, 6 months ago, you described
your environment with constraints such as:

    1.0 <= foo
    2.0 <= bar <= 3.0
    baz <= 4.0

Then foo@1.1, foo@1.2 and foo@2.0 have been released in the past 6
months.  But baz <= 4.0 only works with 0.9 <= foo <= 1.2, and the
constraint on bar implies other constraints on foo and/or baz.
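
To make the example concrete, here is a minimal Python sketch of a
brute-force resolver that prefers the newest versions.  It is not how
Conda's solver works internally.  Only the constraints come from the
example above; the contents of the two package indexes (which versions of
bar and baz exist) are made up, and the unspecified constraints implied by
bar are left out.

```python
from itertools import product

# Versions are tuples so they compare numerically: (1, 2) means 1.2.
SPEC = {
    "foo": lambda v: (1, 0) <= v,
    "bar": lambda v: (2, 0) <= v <= (3, 0),
    "baz": lambda v: v <= (4, 0),
}


def compatible(env):
    """Cross-package rule from the example:
    baz <= 4.0 only works with 0.9 <= foo <= 1.2."""
    if env["baz"] <= (4, 0) and not ((0, 9) <= env["foo"] <= (1, 2)):
        return False
    return True


def resolve(available):
    """Return the first compatible environment, trying newest versions first.

    `available` maps a package name to the versions present in the index.
    Returns None when no combination satisfies every constraint.
    """
    names = sorted(available)
    candidates = [
        sorted((v for v in available[n] if SPEC[n](v)), reverse=True)
        for n in names
    ]
    for combo in product(*candidates):
        env = dict(zip(names, combo))
        if compatible(env):
            return env
    return None


# Hypothetical index contents, 6 months ago and today.
SIX_MONTHS_AGO = {"foo": [(1, 0)], "bar": [(2, 5)], "baz": [(4, 0)]}
TODAY = {
    "foo": [(1, 0), (1, 1), (1, 2), (2, 0)],
    "bar": [(2, 5)],
    "baz": [(4, 0)],
}

old_env = resolve(SIX_MONTHS_AGO)  # foo resolves to (1, 0)
new_env = resolve(TODAY)           # foo resolves to (1, 2); (2, 0) is rejected
```

The same description yields two different environments, only because the
index changed in between.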

SAT solving is NP-complete, so IIRC the worst-case cost is exponential
in the size of the problem.  I do not know the state of the art, but I
guess the problem to solve gets worse and worse as time goes by.
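
As a rough illustration, assume a naive solver that enumerates every
combination of candidate versions.  Real solvers prune heavily, so the
count below is only an upper bound on that naive search, and the version
lists (including bar 2.6) are hypothetical.

```python
from math import prod


def naive_search_space(candidates):
    """Number of version combinations a brute-force search may visit."""
    return prod(len(versions) for versions in candidates.values())


# Hypothetical candidate versions, 6 months ago and today.
before = {"foo": ["1.0"], "bar": ["2.5"], "baz": ["4.0"]}
after = {
    "foo": ["1.0", "1.1", "1.2", "2.0"],
    "bar": ["2.5", "2.6"],
    "baz": ["4.0"],
}

size_before = naive_search_space(before)  # 1 * 1 * 1
size_after = naive_search_space(after)    # 4 * 2 * 1
```

The count is a product over packages, so every new release of every
package multiplies it.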

From my experience, there is only one way to fight against time:
freeze.  The question is then how or what to freeze. :-)

One way of freezing is the binary container.  Another way of freezing
is to have a “summary” capturing the whole (fixed) graph of
dependencies.  With Guix, this is usually a file named channels.scm, as
printed by guix describe.  Then, the assumptions become:

 1. solve the link rot; tackled by Software Heritage,
 2. Linux kernel API backward compatibility,
 3. hardware compatibility,

to be able to rebuild.  If I may, here is some related material: :-)

https://www.nature.com/articles/s41597-022-01720-9
https://simon.tournier.info/posts/2022-11-08-bluehats.html
https://simon.tournier.info/posts/2022-04-15-cafe-guix-long-term.html


Cheers,
simon

1: https://en.wikipedia.org/wiki/SAT_solver


