guix-devel

Re: Where should we put machine learning model parameters ?


From: Kyle
Subject: Re: Where should we put machine learning model parameters ?
Date: Mon, 03 Apr 2023 19:12:01 +0000

My view as a statistician and Guix user is that trained machine learning models should at best be provided as substitutes. They are opaque binary artifacts of purely digital compilation processes and should not be treated any differently from other build artifacts.

It seems to me most consistent with the goals of the project to insist that machine learning models be fully reproducible builds before they are considered for inclusion in the main Guix distribution.

Full reproducibility would make the space requirements for including them even larger than for the parameters alone, but it would ensure that the four freedoms are preserved.



On April 3, 2023 12:48:12 PM EDT, "Nicolas Graves via Development of GNU Guix and the GNU System distribution." <guix-devel@gnu.org> wrote:

Hi Guix!

I've recently contributed a few tools that make a few OSS machine
learning programs usable for Guix, namely nerd-dictation for dictation
and llama-cpp as a conversational bot.

In the first case, I would also like to contribute parameters of some
localized models so that they can be used more easily through Guix. I've
already discussed this subject when submitting these patches, without a
clear answer.

In the case of nerd-dictation, the model parameters that can be used
are listed here: https://alphacephei.com/vosk/models

One caveat is that hosting all these models can take a lot of space on
the servers, a burden which is not useful because no build step is
really needed (except an unzip step). In this case, we can use the
#:substitutable? #f flag. You can find an example of some of these
packages here:
https://git.sr.ht/~ngraves/dotfiles/tree/main/item/packages.scm

So my question is: Should we add this kind of model as packages in
Guix? If yes, where should we put them? In machine-learning.scm? In a
new file machine-learning-models.scm (such a file would never need new
modules, and it might avoid some confusion between the tools and the
parameters needed to use the tools)?


