bug-guix
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#40525: inferior process on core-updates crashes: mmap(PROT_NONE) fai


From: Christopher Baines
Subject: bug#40525: inferior process on core-updates crashes: mmap(PROT_NONE) failed
Date: Thu, 16 Apr 2020 20:24:15 +0100
User-agent: mu4e 1.2.0; emacs 26.3

Christopher Baines <address@hidden> writes:

> Ludovic Courtès <address@hidden> writes:
>
>> Hi Christopher,
>>
>> Christopher Baines <address@hidden> skribis:
>>
>>> I've attached a script that when run should reproduce the issue. I
>>> extracted the code relating to lint warnings from the Guix Data
>>> Service. The script attached runs this code twice against the inferior,
>>> once will often be enough to cause it to crash, but twice should
>>> reproduce it more reliably.
>>
>> Thanks a lot.
>>
>> Here’s a backtrace from the core dumped by the inferior:
>
> ...
>
>> It could be an unbounded growth of libgc’s finalizer table or our weak
>> tables as we experienced in <https://bugs.gnu.org/28590>.
>>
>> We should be able to reproduce it with something like:
>>
>>   guix time-machine --commit=d523eb5c9c2659cbbaf4eeef3691234ae527ee6a -- \
>>     lint -c 
>> inputs-should-be-native,license,mirror-url,source-file-name,source-unstable-tarball,derivation,patch-file-names,formatting,synopsis
>>
>> In top one can see that heap usage keeps growing, which may well be a
>> bug in Guix proper rather than in Guile… but it doesn’t crash.
>>
>> I would propose three actions here:
>>
>>   1. Run linters un ‘gcprof’ to see what’s eating memory and hopefully
>>      find and address the leak.  As a start, maybe just start reducing
>>      the list of checkers to see if there’s one of them that’s causing
>>      it.
>>
>>      The ‘derivation’ checker is definitely responsible for a lot of the
>>      heap consumption because of the various caches in (guix packages) &
>>      co.  Perhaps add calls to ‘invalidate-derivation-caches!’ as in
>>      (gnu ci).
>>
>>   2. Work around the problem in Guix Data Service by running, say, one
>>      inferior per checker instead of one inferior for all checkers for
>>      all packages.
>>
>>   3. If #1 didn’t help, let’s see if we can isolate a Guile weak-table
>>      bug or something like that.
>>
>> Thoughts?
>
> Thanks, that's useful to know.
>
> I think I've now managed to find a way of reproducing this without the
> inferior getting in the way. I was testing if triggering garbage
> collection in Guile would help avoid the problem, but actually it seems
> to cause it. I guess given the mentions of GC in the above stacktrace,
> and the major version change of libgc, some GC related bug seems quite
> likely here.
>
> I've been testing with a checkout of Guix built with Guix from the
> core-updates branch. I think that provides the same broken Guile that
> the guix repl is using.
>
> When trying to just use a checkout of the core-updates branch, and guile
> built from that branch I get the following odd error:
>
> → ./pre-inst-env 
> /gnu/store/18hp7flyb3yid3yp49i6qcdq0sbi5l1n-guile-3.0.2/bin/guile 
> ./reproduce-core-updates-mmap-PROT_NONE-failed.scm
> guile: warning: failed to install locale
> warning: failed to load '(gnu packages abiword)': Function not implemented
> error: git-fetch: unbound variable
> hint: Did you forget `(use-modules (guix git-download))'?
>
> error: git-version: unbound variable
>
>
>
> No idea what's happening there, but when I ./configure and make with
> packages from core-updates, I seem to end up with a setup that works:
>
> This is the guile I'm using: 
> /gnu/store/18hp7flyb3yid3yp49i6qcdq0sbi5l1n-guile-3.0.2/bin/guile
>
> If you just run the script, you should see:
>
> → ./pre-inst-env guile ./reproduce-core-updates-mmap-PROT_NONE-failed.scm
>
> ;;; ("%package-table-setup" #<hash-table 7f5f329278a0 13275/28099>)
> mmap(PROT_NONE) failed
> Aborted
>
>
> For more information, you can pipe the script to the REPL. What you
> should see is that it's slow to compute the lint warnings the first
> time, but the subsequent times are quick, and it crashes in one of the
> (gc) calls.
>
> I'm going to try and continue looking in to this, at least it'll be
> easier to delve in to guile now that I can directly control what guile
> is used.

Following up on this, I've built Guile on core-updates with libgc@7
rather than libgc@8 (which is what's used above), and I can't reproduce
the issue. So, I'm getting more certain that this is a regression which
the libgc upgrade has led to.

Would it be feasible to keep guile, or at least the guile Guix uses with
libgc@7 for now?

Attachment: signature.asc
Description: PGP signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]