bug-guix

bug#65456: [PATCH 0/2] Split guix build into more steps for 32bit hosts.


From: Janneke Nieuwenhuizen
Subject: bug#65456: [PATCH 0/2] Split guix build into more steps for 32bit hosts.
Date: Fri, 01 Sep 2023 14:48:38 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/28.2 (gnu/linux)

Ludovic Courtès writes:

Hello!

> Janneke Nieuwenhuizen <janneke@gnu.org> skribis:
>
>>>From ad94f06620e53fcc1495a2e2479dfc627177047c Mon Sep 17 00:00:00 2001
>> Message-ID: <ad94f06620e53fcc1495a2e2479dfc627177047c.1692783678.git.janneke@gnu.org>
>> From: Janneke Nieuwenhuizen <janneke@gnu.org>
>> Date: Thu, 22 Jun 2023 08:30:25 +0200
>> Subject: [PATCH v4] self: Build directories in chunks of max 25 files at a
>>  time.
>>
>> Similar to split build of make-go in Makefile.am, this breaks-up building
>> directories into chunks of max 25 files.  Also force garbage collection.
>
> The big difference with ‘make-go’ is that ‘make-go’ spawns a new process
> for each chunk of files: each process starts with an empty heap, which
> is not the case here as we reuse the same process.

Right.

> However, (guix self) is already splitting gnu/packages/*.scm in two
> pieces: ‘guix-packages-base’ and ‘guix-packages’.  The former is the
> closure of (gnu packages base), and the latter contains the remaining
> files.  Unfortunately this is uneven:

Okay...

> $ readlink -f $(type -P guix)
> /gnu/store/12p5axbr4gjrghlrqa4ikmhsxwq2wgw3-guix-command
> $ guix gc -R /gnu/store/12p5axbr4gjrghlrqa4ikmhsxwq2wgw3-guix-command|grep packages-base
> /gnu/store/ivprgy9b2lv8wmkm10wkypf7k24cdifb-guix-packages-base
> /gnu/store/05pjlcfcfa0k9y833nnxxxjcn5mqr8zj-guix-packages-base-source
> /gnu/store/gnxjbyfwfmb216krz2x0cf1z5k1lla9x-guix-packages-base-modules
> $ find /gnu/store/ivprgy9b2lv8wmkm10wkypf7k24cdifb-guix-packages-base  -type f |wc -l
> 361
> $ guix gc -R /gnu/store/12p5axbr4gjrghlrqa4ikmhsxwq2wgw3-guix-command|grep packages$
> /gnu/store/8cda50hsayydrlw0qrhcy8q4dr9f1avx-guix-locale-guix-packages
> ludo@ribbon ~/src/guix [env]$ find /gnu/store/8cda50hsayydrlw0qrhcy8q4dr9f1avx-guix-locale-guix-packages | wc -l
> 64
> $ guix describe
> Generation 271  Aug 20 2023 23:48:59    (current)
>   guix a0f5885
>     repository URL: https://git.savannah.gnu.org/git/guix.git
>     branch: master
>     commit: a0f5885fefd93a3859b6e4b82b18a6db9faeee05
>
> Maxime Devos looked into this a while back:
>
>   https://issues.guix.gnu.org/54539

Oh my....

>> * guix/self.scm (compiled-modules)[process-directory]: Split building of
>> directories into chunks of max 25 files.
>> +              (for-each
>> +               (lambda (chunck)
>
> s/chunck/chunk/

Oops, fixed.
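
For reference, the chunking idea can be sketched roughly as follows.
This is not the code from the patch: `chunks' and `process-chunk' are
hypothetical names, and the real `process-directory' in guix/self.scm
compiles the files where this sketch only counts them.

```scheme
(use-modules (srfi srfi-1))   ; take, drop, iota

;; Hypothetical helper (not from the patch): split LST into consecutive
;; sublists of at most SIZE elements, preserving order.
(define (chunks lst size)
  (let loop ((rest lst) (result '()))
    (if (null? rest)
        (reverse result)
        (let ((n (min size (length rest))))
          (loop (drop rest n)
                (cons (take rest n) result))))))

;; Stand-in for the real work on one chunk; here we just count the files.
(define (process-chunk chunk)
  (length chunk))

;; 60 made-up file names, processed in chunks of at most 25.
(define files
  (map (lambda (i) (string-append "file-" (number->string i) ".scm"))
       (iota 60)))

(define chunk-sizes
  (map process-chunk (chunks files 25)))
;; chunk-sizes is (25 25 10)
```

With 60 files and a maximum of 25 per chunk, the last chunk simply holds
the remaining 10 files.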

> Can you confirm that this reduces memory usage observably?  One way to
> check that would be to print (gc-stats) from ‘process-directory’, with
> and without the change.  Could you give it a try?

What a good and seemingly simple question.  After a week of
instrumentation and testing, my answer can only be: I tried, and maybe
(see below).
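
A minimal sketch of the kind of instrumentation meant here, assuming
only Guile's built-in `gc-stats'.  The names `note-heap-size!' and
`peak-heap-size' are made up for illustration; the instrumentation
actually used in `process-directory' may well look different.

```scheme
;; Largest heap-size seen so far, in bytes.
(define peak-heap-size 0)

;; Print the current heap-size from (gc-stats), remember the peak,
;; and return the current size.  LABEL is any string or symbol.
(define (note-heap-size! label)
  (let ((size (assq-ref (gc-stats) 'heap-size)))
    (when (> size peak-heap-size)
      (set! peak-heap-size size))
    (format (current-error-port) "~a: heap-size ~a (peak ~a)~%"
            label size peak-heap-size)
    size))

;; Example: sample before and after allocating something sizeable.
(note-heap-size! "before")
(define junk (make-vector 1000000 0))
(note-heap-size! "after")
```

Calling this after each chunk (or each directory) gives the peak
heap-size for a run, which is the figure compared further down.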

> Intuitively, I don’t see why it would eat less memory; maybe peak memory
> usage is lower because we do less at once?

Okay...

> Also, I think we should remove the explicit (gc) call: it should not be
> necessary, and if we depend on that, something’s wrong.

> Anyhow, thanks for tackling this issue!

Hehe.  You've probably seen Josselin's recent GraphML backend effort
that might really help to address this?  I'm afraid this patch can maybe
only postpone what really needs to be done...

There is no gc-stats output from a successful `guix pull' or `make
as-derivation' on Guix/Hurd that I can show you: I've tried more than
20 times and it always fails (OOM, hang, spontaneous reset, ...).

Below is a typical output of gc-stats on the Hurd for building self.scm,
when heap-size peaks (using the max 25 files patch):

--8<---------------cut here---------------start------------->8---
((gc-time-taken . 1530)
 (heap-size . 2,625,474,560)
 (heap-free-size . 1127989248)
 (heap-total-allocated . 1337029496)
 (heap-allocated-since-gc . 28728)
 (protected-objects . 28)
 (gc-times . 324))
--8<---------------cut here---------------end--------------->8---

Notice that it's *much* bigger (more than twice) than my findings on
linux-64 below.  I have no idea why this is or what it might mean...

So I turned to Guix GNU/Linux to get some gc-stats measurements.  What
you see below is the maximum heap-size at any point.  (I also have
heap-total-allocated, but I think that's irrelevant; initially I didn't
use a script that measured the time, so some runs lack timings.)

--8<---------------cut here---------------start------------->8---
* guix/self.scm: Vanilla, not chunked; print gc-stats.
((gc-time-taken . 27319485051)
 (heap-size . 1,360,330,752)
 (heap-free-size . 285,696,000)
 (heap-total-allocated . 74,067,590,944)
 (heap-allocated-since-gc . 186,250,144)
 (protected-objects . 28)
 (gc-times . 464))
real    24m36.643s

* guix/self.scm: Split building of directories into 26 chunks; print gc-stats.
 (heap-size . 1,131,298,816)

* guix/self.scm: Split building of directories into 26 chunks; no gc; print gc-stats.
 (heap-size . 1,121,116,160)

* guix/self.scm: Chunks of 25 files; run gc; print gc-stats.
 (heap-size . 1,066,725,376)

* guix/self.scm: Chunks of 50 files; no gc; print gc-stats.
 (heap-size . 1,299,230,720)
real    26m40.708s

* guix/self.scm: Chunks of 25 files; no gc; print gc-stats.
 (heap-size . 1,024,045,056)  ; 1st run
real    28m4.451s

* guix/self.scm: Chunks of 10 files; no gc; print gc-stats.
 (heap-size . 1,077,895,168)
real    30m14.049s
--8<---------------cut here---------------end--------------->8---

...strangely enough, if we assume that these statistics translate to the
Hurd, using chunks of max 25 files seems to be a sort of sweet spot?
25% less peak memory (~300MB), "only" about 14% (3'28") slower than the
vanilla run...  though not great for GNU/Linux users...
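
For what it's worth, the percentages can be recomputed from the figures
above, taking the vanilla run as the baseline and comparing it with the
"chunks of 25 files; no gc" run.  `percent-change' is just a helper name
made up for this sketch.

```scheme
;; Peak heap-size in bytes, copied from the measurements above.
(define vanilla-heap 1360330752)
(define chunked-heap 1024045056)

;; Wall-clock times in seconds: 24m36.643s and 28m4.451s.
(define vanilla-seconds (+ (* 24 60) 36.643))
(define chunked-seconds (+ (* 28 60) 4.451))

;; Relative change from OLD to NEW, in percent (negative means smaller).
(define (percent-change old new)
  (* 100. (/ (- new old) old)))

(define heap-change (percent-change vanilla-heap chunked-heap))
;; about -24.7, i.e. roughly 25% less peak heap

(define time-change (percent-change vanilla-seconds chunked-seconds))
;; about 14.1, i.e. roughly 14% slower

(define extra-seconds (- chunked-seconds vanilla-seconds))
;; about 207.8 seconds, i.e. close to 3m28s
```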

I have produced a handful of successful `guix pull's (from a local
checked-out worktree) using the 26-way split and chunks of max-25 files
patches, but sadly many more attempts failed.  Initially, when
creating this patch series, I was convinced this fixed building on the
Hurd, but I'm much less enthusiastic now.

So I still have a slight preference for using the latest max-25-files
patch, but I'm sorry to say that I cannot back it up with tangible data.
All in all a rather discouraging week with much effort spent for little
gain.  Hopefully Josselin can do some of his magic here :)

Greetings,
Janneke

-- 
Janneke Nieuwenhuizen <janneke@gnu.org>  | GNU LilyPond https://LilyPond.org
Freelance IT https://www.JoyOfSource.com | Avatar® https://AvatarAcademy.com




