bug-guix

From: Dr. Arne Babenhauserheide
Subject: bug#65391: Acknowledgement (People need to report failing builds even though we have ci.guix.gnu.org for that)
Date: Mon, 11 Sep 2023 10:30:43 +0200
User-agent: mu4e 1.10.5; emacs 29.0.92

Simon Tournier <zimon.toutoune@gmail.com> writes:
> On Wed, 30 Aug 2023 at 12:39, "Dr. Arne Babenhauserheide" <arne_bab@web.de> 
> wrote:
>> Please don’t remove packages that are broken on the CI. I often had a
>> case where no substitute was available but the package built just fine
>> locally. This is not a perfect situation (nicer would be to track why it
>> doesn’t come from CI — sometimes it’s just a resource problem on the
>> CI), but if you removed a package I use that would break all updates for
>> me.
>
> Well, I do not think that any policy will mark a package for removal on
> the first build failure.  However, if the same package is still failing
> after several X <duration> or attempts, it means something is wrong.
> Marking it as a candidate for removal implies:
>
>  1. check if the failure is from CI when it builds locally,
>  2. keep a set of packages that we know they are installable.
>
> For instance, ocaml4.07-* packages are failing since more or less April.
>
> https://data.guix.gnu.org/repository/1/branch/master/package/ocaml4.07-ppxlib/output-history
>
> Does it make sense to keep them?  For another example, some perl6-*
> packages are failing since… 2021.
>
> https://data.guix.gnu.org/repository/1/branch/master/package/perl6-xml-writer/output-history
>
> Does it make sense to keep them?

This is a good example, but not an argument for removing broken
packages: for perl6-xml-writer, removing the package would leave the
underlying breakage in Guix.

I just checked the build, and this looks like a Guix packaging error
that breaks the tests due to a change to some unrelated package:

  /gnu/store/ap404x14l604wm0gvaj439ga2vjzwnl7-perl6-tap-harness-0.0.7/bin/prove6:
  /gnu/store/ap404x14l604wm0gvaj439ga2vjzwnl7-perl6-tap-harness-0.0.7/bin/.prove6-real:
  perl6: bad interpreter: No such file or directory

Disabling the tests makes the package build and work.

So here, removing the package would start at the wrong place: some
change between 2021-02-01 and 2021-04-30 broke perl6-tap-harness, and
we did not detect that.

This is a problem that would get hidden by removing broken packages.

The problem is that we (a large, inclusive “we” that stands for all
users of Guix) did not track down the change that causes the build to
fail.
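
Tracking down such a change is mechanical once a known-good and a
known-bad commit exist. A minimal sketch of the bisection idea, written
in Python as a toy model and not as Guix code: the commit names and the
`builds` predicate are hypothetical stand-ins for checking out a Guix
commit and building the package, and the sketch assumes a single
transition from building to failing.

```python
def first_bad_commit(commits, builds):
    """Return the first commit for which builds(commit) is False.

    Assumes `commits` is ordered oldest to newest, that commits[0]
    builds, that commits[-1] fails, and that there is exactly one
    transition from building to failing.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] builds, commits[hi] fails
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if builds(commits[mid]):
            lo = mid
        else:
            hi = mid
    return commits[hi]


# Hypothetical history: eight commits, the package fails from "c5" onwards.
COMMITS = [f"c{i}" for i in range(8)]


def example_builds(commit):
    """Stand-in for a real build attempt at the given commit."""
    return int(commit[1:]) < 5
```

For the hypothetical history above, `first_bad_commit(COMMITS,
example_builds)` needs three build attempts instead of eight.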

From this I see two distinct cases:

- packages broken upstream
- packages broken by changes in Guix

If a package is broken upstream, is not going to get fixed there, and
keeping it requires regular patching in Guix, I agree that we have to
remove it at some point.

If however a change in Guix breaks packages, that change should get
rolled back / reverted and fixed, so it does not break the packages.

The ocaml4.07-ppxlib build fails with:

8 |   ocaml-migrate-parsetree
      ^^^^^^^^^^^^^^^^^^^^^^^
Error: Library "ocaml-migrate-parsetree" not found.

This likely means that a change in the inherited package removed the
input, but the breakage wasn’t detected.

And that’s actually what happened in commit
386ad7d8d14dee2103927d3f3609acc63373156a
(Fri Jan 13 10:54:36 2023 +0000).

This commit broke ocaml4.07-ppxlib by cleaning up the inputs of
ocaml-ppxlib (not naming names, this is not about shaming but about
detecting the deeper problem).

It should have been rejected (somehow) by CI. The change it would have
required is this:

diff --git a/gnu/packages/ocaml.scm b/gnu/packages/ocaml.scm
index 8ff755aea9..042432be9a 100644
--- a/gnu/packages/ocaml.scm
+++ b/gnu/packages/ocaml.scm
@@ -6845,6 +6845,9 @@ (define-public ocaml4.07-ppxlib
          (base32
           "0my9x7sxb329h0lzshppdaawiyfbaw6g5f41yiy7bhl071rnlvbv"))))
      (build-system dune-build-system)
+     (propagated-inputs
+      (modify-inputs (package-propagated-inputs ocaml-ppxlib)
+        (prepend ocaml-migrate-parsetree)))
      (arguments
       `(#:phases
         (modify-phases %standard-phases

So for both of the cases you named, removing the package would have
caused us to miss actual problems in our process.

This does not mean that a package never has to be removed. But both
cases you showed are likely self-induced breakage, caused by changes
that should have been rejected because they break seemingly unrelated
packages. So it rather looks like removal being the right way forward
is the exceptional case.

The norm is that our CI should have detected a problem in the commit
causing the breakage.
(This is reasoning from only two datapoints, so take it with a grain of
salt …)

Can we automatically rebuild all inheriting packages when a package gets
changed?
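
As a rough illustration of what that would need, here is a toy model in
Python, not Guix code. It assumes we had a table recording which
package inherits from which. The `INHERITS_FROM` table, the
`inheriting_packages` helper, and the `example-variant` entry are
hypothetical; only the ocaml-ppxlib / ocaml4.07-ppxlib relation comes
from this thread. The helper lists everything that would have to be
rebuilt when a parent package changes.

```python
from collections import defaultdict, deque


def inheriting_packages(inherits_from, changed):
    """Return all packages inheriting, directly or transitively, from `changed`.

    `inherits_from` maps a package name to the name of the package it
    inherits from. The result is in breadth-first order.
    """
    children = defaultdict(list)
    for package, parent in inherits_from.items():
        children[parent].append(package)

    found = []
    seen = {changed}
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for child in sorted(children[current]):
            if child not in seen:
                seen.add(child)
                found.append(child)
                queue.append(child)
    return found


# Toy data; "example-variant" is made up to show the transitive case.
INHERITS_FROM = {
    "ocaml4.07-ppxlib": "ocaml-ppxlib",
    "example-variant": "ocaml4.07-ppxlib",
}
```

With such a list, a CI job for a commit touching ocaml-ppxlib could
also build ocaml4.07-ppxlib before the commit is accepted.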

> The usual situation is that CI is able to build the packages.  The set
> of packages that CI is not able to build is very limited and it is the
> exception.
>
> Having a rule to deal with the regular broken packages appears to me a
> good thing and very helpful to keep Guix reliable.  And that rule cannot
> be based on rare exceptional cases.

A rule has to work for the known cases; otherwise it causes known
breakage.

Also see above: in the two cases you selected, removing the package
would be the wrong path forward.

> On Wed, 30 Aug 2023 at 12:39, "Dr. Arne Babenhauserheide" <arne_bab@web.de> 
> wrote:
>> If a change in packages breaks my manifest, that is extremely painful.
>
> Yeah, and such rule for dealing with broken packages will be helpful for
> detecting such change and so avoid such situation.

Since a manifest strictly depends on every package listed in it,
removing a single referenced package breaks the whole manifest: no
update works anymore. No security updates come in anymore, even if the
package in question worked locally. This is a situation we should not
cause.

If we had a way to have placeholder packages (similar to the renamings)
that emit warnings for missing packages but do not break the build, that
would reduce the damage done by removing a package. But I think such a
mechanism must be in place and tested before adding a rule to remove
packages.
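
To make the idea concrete, here is a toy model in Python, not Guix
code. The `resolve_manifest` function and its `available` and `removed`
arguments are hypothetical; Guix has no such interface that I know of.
A removed package with a placeholder entry produces a warning and is
skipped, while a name that was never known still fails.

```python
import warnings


def resolve_manifest(names, available, removed):
    """Resolve manifest entries, tolerating packages that were removed.

    `available` is a set of installable package names. `removed` maps
    the name of a removed package to the reason for its removal; this
    plays the role of the placeholder. Unknown names still raise.
    """
    resolved = []
    for name in names:
        if name in available:
            resolved.append(name)
        elif name in removed:
            warnings.warn(
                f"package '{name}' was removed ({removed[name]}); skipping it"
            )
        else:
            raise LookupError(f"unknown package: {name}")
    return resolved
```

In this model, a manifest naming one removed package still resolves the
remaining packages, so their updates keep coming in.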

And as we’ve seen from the two packages you selected, removal wouldn’t
have been the right decision.

The more important question is (serious question and *not* for assigning
blame, but to see whether we can improve processes): with the time we
already spent in this discussion, we could have fixed a lot of packages.
Why did we not do that?

Best wishes,
Arne
-- 
To be apolitical
means to be political
without noticing it.
draketo.de
