bug-guix

bug#65720: Guile-Git-managed checkouts grow way too much


From: Simon Tournier
Subject: bug#65720: Guile-Git-managed checkouts grow way too much
Date: Sat, 09 Sep 2023 12:31:48 +0200

Hi,

On Fri, 08 Sep 2023 at 19:09, Ludovic Courtès <ludo@gnu.org> wrote:

>>> It would also be pretty bad for closure size:
>>>
>>> --8<---------------cut here---------------start------------->8---
>>> $ guix size guile-git | tail -1
>>> total: 106.6 MiB
>>> $ guix size guile-git git-minimal | tail -1
>>> total: 169.8 MiB
>>> --8<---------------cut here---------------end--------------->8---
>>>
>>> It’s also not clear concretely how we’d add that dependency.  Try
>>> invoking ‘git’ from $PATH and print a warning if it doesn’t work?
>>> But then, what about applications like Cuirass and hpcguix-web?
>>
>> I think we can rely on something like,
>>
>>     guix shell -C git-minimal -- git gc
>
> We’re talking about the implementation of a cache (meant to speed up
> operations), that would actually fill said cache plus do a whole bunch
> of expensive operations?  Nah.  :-)

I do not think so.  If I understand correctly, we need to run “git gc” at
some point, therefore git-minimal needs to be around.  The question is
how and when.

Well, maybe I am missing what the bug is about.  For me, it is about
running ‘git gc’ for cleaning the Git checkout cache, no?


Solution #1.  Add git-minimal as an input.  It increases the closure, and
the extra load (on average) is about the ratio between the rate of “guix
pull” and the rate of git-minimal changes.

Assume that people run “guix pull” once per week and that “git gc” is
run after 50 pulls.  (Both numbers are totally arbitrary and based on my
personal estimate.)

Data Service [1] tells:

        2023-07-07 15:45:22 2023-09-08 21:22:08
        2023-05-11 16:10:48 2023-07-07 14:21:45
        2023-05-01 16:40:08 2023-05-11 14:36:16
        2023-04-25 13:34:54 2023-05-01 15:19:55
        2023-04-25 13:34:54 2023-09-08 21:22:08        
        2023-03-06 17:22:28 2023-04-25 12:27:33
        2023-01-17 23:49:19 2023-03-06 16:48:43
        2022-11-08 13:06:42 2023-01-17 15:11:47
        2022-10-08 05:14:46 2022-11-08 09:56:31
        2022-09-06 15:00:08 2022-10-08 04:15:43
        2022-08-13 22:02:31 2022-09-06 12:58:52
        …

It means that a user will download git-minimal ~10 times for nothing.
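
As a rough check of that figure, here is a minimal Python sketch that
counts how many git-minimal outputs from the listing above fall inside
one “git gc” period.  It is only an estimate under stated assumptions:
the 50 weekly pulls are the arbitrary numbers from above, the times of
day are dropped, the second row starting on 2023-04-25 is left out
because it shares its start date with the row above it, and the function
name is made up for illustration.

```python
from datetime import datetime, timedelta

# (first seen, last seen) dates copied from the listing above.
# Assumption: times of day dropped, second 2023-04-25 row left out.
OUTPUT_HISTORY = [
    ("2023-07-07", "2023-09-08"),
    ("2023-05-11", "2023-07-07"),
    ("2023-05-01", "2023-05-11"),
    ("2023-04-25", "2023-05-01"),
    ("2023-03-06", "2023-04-25"),
    ("2023-01-17", "2023-03-06"),
    ("2022-11-08", "2023-01-17"),
    ("2022-10-08", "2022-11-08"),
    ("2022-09-06", "2022-10-08"),
    ("2022-08-13", "2022-09-06"),
]


def versions_in_window(history, window_end, pulls=50, days_between_pulls=7):
    """Count outputs whose lifetime overlaps the period between two gc runs.

    This is an upper bound on what a weekly puller would fetch: an output
    that lives for less than a week could be skipped entirely.
    """
    end = datetime.strptime(window_end, "%Y-%m-%d")
    start = end - timedelta(days=pulls * days_between_pulls)
    count = 0
    for first, last in history:
        first_seen = datetime.strptime(first, "%Y-%m-%d")
        last_seen = datetime.strptime(last, "%Y-%m-%d")
        # The output was current at some point inside the window.
        if last_seen >= start and first_seen <= end:
            count += 1
    return count


if __name__ == "__main__":
    print(versions_in_window(OUTPUT_HISTORY, "2023-09-08"))
```

With the rows above and a window ending on 2023-09-08, this counts 9
outputs, which is in line with the ~10 downloads mentioned.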


Solution #2.  The one I am proposing. :-)  Download git-minimal only
when Guix needs it for running “git gc”.  Yeah, there is probably a
small overhead with some operations.  But I bet this overhead is much
smaller than the one of solution #1.

Well, it depends on the number of times people are updating the cache vs
the rate of change of git-minimal.

For sure, if one updates the cache 100 times per week, having
git-minimal as an input is far better.  But I do not think that is the
regular usage on average. :-)

That’s why I am proposing to have an option for turning off this “git
gc” operation.

Well, we have lived for years without running ‘git gc’, so running it
once per year on average is probably enough to keep the cache size
reasonable.  And git-minimal changes every month.


Maybe, there is some solution #3. ;-)

Cheers,
simon


1: https://data.guix.gnu.org/repository/1/branch/master/package/git-minimal/output-history




