03/03: website: Add post about lzipped substitutes.

From: Ludovic Courtès
Subject: 03/03: website: Add post about lzipped substitutes.
Date: Mon, 17 Jun 2019 08:51:55 -0400 (EDT)

civodul pushed a commit to branch master
in repository guix-artwork.

commit fafd36a62ef00aadc7ba1b3988215286f689d893
Author: Ludovic Courtès <address@hidden>
Date:   Mon Jun 17 14:24:21 2019 +0200

    website: Add post about lzipped substitutes.
    * website/posts/ New file.
 website/posts/ | 245 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 245 insertions(+)

diff --git a/website/posts/ b/website/posts/
new file mode 100644
index 0000000..ebd4c44
--- /dev/null
+++ b/website/posts/
@@ -0,0 +1,245 @@
+title: Substitutes are now available as lzip
+date: 2019-06-17 14:30
+author: Ludovic Courtès
+tags: Scheme API
+For a long time, our build farm has been delivering substitutes
+(pre-built binaries) compressed with gzip.  Gzip was never the best
+choice in terms of compression ratio, but it was a reasonable and
+convenient choice: it’s rock-solid, and zlib made it easy for us to
+perform in-process compression in our multi-threaded `guix publish`
+server.
+Apart from building software from source, downloads are the most
+time-consuming part of Guix package upgrades.  If users can download
+less, upgrades become faster, and happiness ensues.  The time has come
+to improve on this: since early June, Guix can publish and fetch lzip
+substitutes, in addition to gzip.
+# Lzip
+Lzip is a relatively little-known
+compression format, initially developed by Antonio Diaz Diaz ca. 2008.
+It has several C and C++ implementations with surprisingly few lines of
+code, which is always reassuring.  One of its distinguishing features is
+a very good compression ratio with reasonable CPU and memory
+requirements, according to published benchmarks.
+Lzlib provides a well-documented C interface, and Pierre Neidhardt set
+out to write bindings for that library, which eventually landed as the
+`(guix lzlib)` module.
+With this in place we were ready to start migrating our tools, and then
+our build farm, to lzip compression, so we can all enjoy smaller
+downloads.  Well, easier said than done!
+# Migrating
+The compression format used for substitutes is not a core component like
+it can be in “traditional” binary package formats such as
+`.deb`, since Guix is conceptually a
+“source-based” distro.  However, deployed Guix installations did not
+support lzip, so we couldn’t just switch our build farm to lzip
+overnight; we needed to devise a transition strategy.
+Guix queries the availability of substitutes over HTTP.  For example, a
+question such as:
+
+> “Dear server, do you happen to have a binary of
+> `/gnu/store/6yc4ngrsig781bpayax2cg6pncyhkjpq-emacs-26.2` that I could
+> download?”
+
+translates into an HTTP GET request to the server, which returns
+something like:
+StorePath: /gnu/store/6yc4ngrsig781bpayax2cg6pncyhkjpq-emacs-26.2
+URL: nar/gzip/6yc4ngrsig781bpayax2cg6pncyhkjpq-emacs-26.2
+Compression: gzip
+NarHash: sha256:0h2ibqpqyi3z0h16pf7ii6l4v7i2wmvbrxj4ilig0v9m469f6pm9
+NarSize: 134407424
+References: 2dk55i5wdhcbh2z8hhn3r55x4873iyp1-libxext-1.3.3 …
+FileSize: 48501141
+System: x86_64-linux
+Deriver: 6xqibvc4v8cfppa28pgxh0acw9j8xzhz-emacs-26.2.drv
+Signature: 1;;KHNpZ25hdHV…
+(This narinfo format is inherited from Nix.)
+This tells us we can download the actual binary from
+`/nar/gzip/…-emacs-26.2`, and that it will be about 46 MiB (the
+`FileSize` field).  This is what `guix publish` serves.
+The trick we came up with was to allow `guix publish` to advertise
+several URLs, one per compression format.  Thus, for recently-built
+substitutes, we get something like:
+StorePath: /gnu/store/mvhaar2iflscidl0a66x5009r44fss15-gimp-2.10.12
+URL: nar/gzip/mvhaar2iflscidl0a66x5009r44fss15-gimp-2.10.12
+Compression: gzip
+FileSize: 30872887
+URL: nar/lzip/mvhaar2iflscidl0a66x5009r44fss15-gimp-2.10.12
+Compression: lzip
+FileSize: 18829088
+NarHash: sha256:10n3nv3clxr00c9cnpv6x7y2c66034y45c788syjl8m6ga0hbkwy
+NarSize: 94372664
+References: 05zlxc7ckwflz56i6hmlngr86pmccam2-pcre-8.42 …
+System: x86_64-linux
+Deriver: vi2jkpm9fd043hm0839ibbb42qrv5xyr-gimp-2.10.12.drv
+Signature: 1;;KHNpZ25hdHV…
+Notice that there are two occurrences of the `URL`, `Compression`, and
+`FileSize` fields: one for gzip, and one for lzip.  Old Guix instances
+will just pick the first one, gzip; newer Guix will pick whichever
+supported method provides the smallest `FileSize`, usually lzip.  This
+will make migration trivial in the future, should we add support for
+other compression methods.
+Users need to upgrade their Guix daemon to benefit from lzip.  On a
+“foreign distro”, simply run `guix pull` as root.  On standalone Guix
+systems, run `guix pull && sudo guix system reconfigure
+/etc/config.scm`.  In both cases, the daemon has to be restarted, be it
+with `systemctl restart guix-daemon.service` or with `herd restart
+guix-daemon`.
+# First impressions
+This new gzip+lzip scheme has been deployed on our build farm for a
+week.  Specifically, we run `guix publish -C gzip:9 -C lzip:9`, meaning
+that we use the highest compression level for both compression methods.
+Currently, only a small subset of the package substitutes are available
+as both lzip and gzip; those that were already available as gzip have
+not been recompressed.  The following Guile program, which taps into the
+API of `guix weather`, allows us to get some insight:
+(use-modules (gnu) (guix)
+             (guix monads)
+             (guix scripts substitute)
+             (srfi srfi-1)
+             (ice-9 match))
+(define all-packages
+  (@@ (guix scripts weather) all-packages))
+(define package-outputs
+  (@@ (guix scripts weather) package-outputs))
+(define (fetch-lzip-narinfos)
+  (mlet %store-monad ((items (package-outputs (all-packages))))
+    (return
+     (filter (lambda (narinfo)
+               (member "lzip" (narinfo-compressions narinfo)))
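+             ;; The server URL is an empty string in this copy; put the
+             ;; substitute server's URL here before running.
+             (lookup-narinfos "" items)))))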
+(define (lzip/gzip-ratio narinfo)
+  (match (narinfo-file-sizes narinfo)
+    ((gzip lzip)
+     (/ lzip gzip))))
+(define (average lst)
+  (/ (reduce + 0 lst)
+     (length lst) 1.))
+Let’s explore this at the REPL:
+scheme@(guile-user)> (define lst
+                       (with-store s
+                         (run-with-store s (fetch-lzip-narinfos))))
+computing 9,897 package derivations for x86_64-linux...
+updating substitutes from ''... 100.0%
+scheme@(guile-user)> (length lst)
+$4 = 2275
+scheme@(guile-user)> (average (map lzip/gzip-ratio lst))
+$5 = 0.7398994395478715
+As of this writing, around 20% of the package substitutes are
+available as lzip, so take the following stats with a grain of salt.
+Among those, the lzip-compressed substitute is on average 26% smaller
+than the gzip-compressed one.  What if we consider only packages bigger
+than 5 MiB uncompressed?
+scheme@(guile-user)> (define biggest
+                       (filter (lambda (narinfo)
+                                 (> (narinfo-size narinfo)
+                                    (* 5 (expt 2 20))))
+                               lst))
+scheme@(guile-user)> (average (map lzip/gzip-ratio biggest))
+$6 = 0.5974238562384483
+scheme@(guile-user)> (length biggest)
+$7 = 440
+For those packages, lzip yields substitutes that are 40% smaller on
+average.  Pretty nice!  Lzip decompression is slightly more
+CPU-intensive than gzip decompression, but downloads are
+bandwidth-bound, so the benefits clearly outweigh the costs.
+# Going forward
+The switch from gzip to lzip has the potential to make upgrades “feel”
+faster, and that is great in itself.
+Fundamentally though, this project has always looked at peer-to-peer
+solutions with envy.  Of course, the main motivation is to have a
+community-supported and resilient infrastructure, rather than a
+centralized one, and that vision goes hand-in-hand with reproducible
+builds.
+We started working on an extension to publish and fetch
+substitutes over IPFS.  Thanks to its content-addressed nature, IPFS
+has the potential to further reduce the amount of data that needs to be
+downloaded on an upgrade.
+The good news is that IPFS developers are also interested in working
+with package manager developers, and I bet there’ll be interesting
+discussions at IPFS Camp in just a few days.  We’re eager to pursue
+our IPFS integration work, and if you’d like to join us and hack the
+good hack, let’s get in touch!
+#### About GNU Guix
+GNU Guix is a transactional package
+manager and an advanced distribution of the GNU system that respects
+user freedom.
+Guix can be used on top of any system running the kernel Linux, or it
+can be used as a standalone operating system distribution for i686,
+x86_64, ARMv7, and AArch64 machines.
+In addition to standard package management features, Guix supports
+transactional upgrades and roll-backs, unprivileged package management,
+per-user profiles, and garbage collection.  When used as a standalone
+GNU/Linux distribution, Guix offers a declarative, stateless approach to
+operating system configuration management.  Guix is highly customizable
+and hackable through Guile
+programming interfaces and extensions to the
+Scheme language.
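
The selection rule described under “Migrating” in the post above (old clients take the first `URL`, newer ones take the supported method with the smallest `FileSize`) can be sketched in plain Guile.  This is only an illustration under stated assumptions, not the code Guix itself uses: `narinfo-fields`, `field-values`, `narinfo-choices`, and `best-choice` are hypothetical helpers written for this note, and the sample is abridged from the gimp narinfo shown in the post.

```scheme
(use-modules (srfi srfi-1)
             (ice-9 match))

;; Hypothetical helper: split a narinfo body into (FIELD . VALUE) pairs,
;; keeping the original order and any repeated fields.
(define (narinfo-fields text)
  (filter-map (lambda (line)
                (let ((colon (string-index line #\:)))
                  (and colon
                       (cons (string-take line colon)
                             (string-trim-both
                              (string-drop line (+ colon 1)))))))
              (string-split text #\newline)))

;; All the values of field NAME, in order of appearance.
(define (field-values fields name)
  (filter-map (match-lambda
                ((key . value) (and (string=? key name) value)))
              fields))

;; Group the Nth URL, Compression, and FileSize into one
;; (URL COMPRESSION SIZE) list per advertised compression method.
(define (narinfo-choices fields)
  (map list
       (field-values fields "URL")
       (field-values fields "Compression")
       (map string->number (field-values fields "FileSize"))))

;; Return the choice with the smallest FileSize among the SUPPORTED
;; compression methods, or #f if none of them is supported.
(define (best-choice choices supported)
  (let ((usable (filter (match-lambda
                          ((_ compression _)
                           (member compression supported)))
                        choices)))
    (and (pair? usable)
         (reduce (lambda (a b)
                   (if (< (third a) (third b)) a b))
                 #f
                 usable))))

;; Abridged from the gimp narinfo in the post.
(define sample
  "StorePath: /gnu/store/mvhaar2iflscidl0a66x5009r44fss15-gimp-2.10.12
URL: nar/gzip/mvhaar2iflscidl0a66x5009r44fss15-gimp-2.10.12
Compression: gzip
FileSize: 30872887
URL: nar/lzip/mvhaar2iflscidl0a66x5009r44fss15-gimp-2.10.12
Compression: lzip
FileSize: 18829088
NarSize: 94372664")

(define choices (narinfo-choices (narinfo-fields sample)))

;; A newer client that supports both methods ends up with the lzip entry.
(display (best-choice choices '("gzip" "lzip")))
(newline)

;; An old client simply takes the first advertised entry, gzip.
(display (first choices))
(newline)
```

Calling `best-choice` with `'("gzip")` as the supported list falls back to the gzip entry, which is the property that lets a server advertise new compression methods without breaking clients that do not know them.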
