bug-guix

bug#54447: cuirass: missing derivation error


From: Maxim Cournoyer
Subject: bug#54447: cuirass: missing derivation error
Date: Tue, 10 Oct 2023 23:08:12 -0400
User-agent: Gnus/5.13 (Gnus v5.13)

Hi Ludovic,

Ludovic Courtès <ludo@gnu.org> writes:

> Hello!
>
> Mathieu Othacehe <othacehe@gnu.org> writes:
>
>> A lot of builds, among them ~20 system tests[1], are failing with:
>> "cannot build missing derivation
>> ‘/gnu/store/hs6kp1lqgymhyp3jndc0dsp0pn4psgv0-gui-installed-desktop-os-encrypted.drv’
>> errors.
>
> I have a disappointingly simple hypothesis for this.  Remember that
> “missing derivation” errors happen primarily for system tests.
>
> Turns out that ‘cleanup-cuirass-roots’ in maintenance.git, used as an
> mcron job, explicitly removes GC roots for things like *-os-encrypted
> once they’re more than two days old, as well as GC roots for the
> corresponding .drv.
>
> I think this was increasing the likelihood that a .drv would be GC’d by
> the time we run the test: under high load¹, it’s plausible that a system
> test wouldn’t be built within two days after it’s been queued.
>
> I’m proposing the change below to address this; I don’t think we need
> ‘--gc-keep-outputs --gc-keep-derivations’ anymore now that we keep
> things in ‘guix publish’ cache first and foremost.
>
> Thoughts?

Ah, so that mcron job is kind of a hack to garbage collect only *some*
items sooner than the default policy of 30 days?  And we'd now avoid
deleting the selected .drv files while still deleting their outputs, so
if something that needs them took more than 2 days to build, the worst
case would be having to rebuild the garbage-collected outputs?

I'm not sure if we need such a fancy hack with the 100 TiB of data we
now have, but your fix seems reasonable (LGTM!)
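
To make that reading of the change concrete, here is a minimal Python
sketch.  It is not the actual ‘cleanup-cuirass-roots’ code (that lives
in maintenance.git and is not shown in this thread); the function name,
the pattern list and the (name, mtime) representation of a root are all
hypothetical.  Only the two-day threshold and the *-os-encrypted example
come from the discussion above.

```python
from datetime import datetime, timedelta
from fnmatch import fnmatch

# Assumption: two days, the age mentioned in the thread for the mcron job.
MAX_AGE = timedelta(days=2)
# Assumption: illustrative pattern only; the real job's list is not shown here.
PATTERNS = ["*-os-encrypted"]


def roots_to_remove(roots, now):
    """Return the names of GC roots the cleanup job would delete.

    `roots` is a list of (name, mtime) pairs.  Roots for .drv files are
    always kept, so that a build still sitting in the queue can find
    its derivation.
    """
    selected = []
    for name, mtime in roots:
        if name.endswith(".drv"):
            continue  # never drop the root of a derivation
        if now - mtime <= MAX_AGE:
            continue  # too recent to clean up
        if any(fnmatch(name, pattern) for pattern in PATTERNS):
            selected.append(name)
    return selected
```

With this policy an old *-os-encrypted output loses its root, while the
matching .drv root survives regardless of age.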

> In addition to the mcron job, Cuirass’s own ‘register-gc-roots’
> procedure periodically deletes GC roots older than ‘%gc-roots-ttl’ (30
> days in practice).  That’s okay, except that it would be safer to delete
> GC roots for a .drv if and only if it’s been built already.

Hm.  I wonder if this could explain the other cases we've seen.  It
could be that building a derivation was interrupted or canceled for some
reason, then 30 days elapsed and the .drv was garbage collected, after
which it never gets recreated and we get the missing .drv error?
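
The "if and only if it's been built already" condition could look
roughly like the Python sketch below.  This is not Cuirass's
‘register-gc-roots’ procedure; `GcRoot`, `may_delete_root` and the
`built` flag are hypothetical names, and the 30-day default simply
mirrors the ‘%gc-roots-ttl’ figure quoted above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Assumption: 30 days, the value quoted in the thread for %gc-roots-ttl.
GC_ROOTS_TTL = timedelta(days=30)


@dataclass
class GcRoot:
    path: str          # store item the root points to
    created: datetime  # when the root was registered
    built: bool        # hypothetical flag: has the build completed?


def may_delete_root(root, now, ttl=GC_ROOTS_TTL):
    """Decide whether an expired GC root can be removed.

    A root for a .drv is only removed once the derivation has been
    built; other roots are removed as soon as they are older than `ttl`.
    """
    expired = now - root.created > ttl
    if root.path.endswith(".drv"):
        return expired and root.built
    return expired
```

Under this rule, a derivation whose build was interrupted or canceled
keeps its root past the 30 days, so the .drv cannot disappear before a
later build attempt.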

-- 
Thanks,
Maxim




