guix-patches

[bug#62153] [PATCH 1/2] guix: docker: Build layered image.


From: Oleg Pykhalov
Subject: [bug#62153] [PATCH 1/2] guix: docker: Build layered image.
Date: Tue, 14 Mar 2023 00:10:56 +0300
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/28.2 (gnu/linux)

Hi Simon,

Thank you for the review.

Simon Tournier <zimon.toutoune@gmail.com> writes:

> On lun., 13 mars 2023 at 03:33, Oleg Pykhalov <go.wigust@gmail.com> wrote:
>
>> diff --git a/gnu/packages/aux-files/python/stream-layered-image.py b/gnu/packages/aux-files/python/stream-layered-image.py
>> new file mode 100644
>> index 0000000000..9ad2168c2d
>> --- /dev/null
>> +++ b/gnu/packages/aux-files/python/stream-layered-image.py
>> @@ -0,0 +1,391 @@
>> +"""
>> +This script generates a Docker image from a set of store paths. Uses
>> +Docker Image Specification v1.2 as reference [1].
>
> Instead of Python, would it possible to implement in Guile?  I mean,
> does Python have something that is missing in Guile?
>
> The facility for manipulating Tar?  Something else?

I think nothing else.  As I understand it, Python implements Tar inside
the language itself, in 2500 lines of code that manipulate binary data.

    /gnu/store/...-python-3.9.9/lib/python3.9/tarfile.py

Technically it's probably possible to use the tar utility with the
--append flag instead of opening a new file and streaming to it as the
Python script does.  To be honest, I would prefer not to rewrite it that
way, as long as the Python script does not block the current patch from
being merged.
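
As a minimal sketch of the streaming approach mentioned above (not the
actual stream-layered-image.py code), Python's tarfile module can write
an archive to a non-seekable stream.  The function name, the file path,
and the fixed metadata values below are assumptions made only for
illustration.

```python
import io
import tarfile


def stream_layer(files, out):
    """Write FILES (a dict of archive name -> bytes) to OUT as a tar stream."""
    # Mode "w|" produces a stream of tar blocks, so OUT does not need to be
    # seekable: it could be a pipe or standard output.
    with tarfile.open(fileobj=out, mode="w|") as tar:
        for name in sorted(files):
            data = files[name]
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            # Fixed metadata (hypothetical values) keeps the output reproducible.
            info.mtime = 1
            info.uid = info.gid = 0
            tar.addfile(info, io.BytesIO(data))


# Hypothetical example: an in-memory buffer stands in for the output pipe.
buffer = io.BytesIO()
stream_layer({"gnu/store/example/hello.txt": b"hello\n"}, buffer)
```

Here the in-memory buffer only replaces the pipe that a caller would
read from.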

Also I don't see myself writing a Tar implementation in Guile, yet.  ;-)

The Nix project uses this script heavily to build layered images, so it
should be robust and up to date with current Tar and Python
implementations.

> Because then, if I understand correctly…
>
>> diff --git a/guix/docker.scm b/guix/docker.scm
>> index 5e6460f43f..f1adad26dc 100644
>> --- a/guix/docker.scm
>> +++ b/guix/docker.scm
>
> [...]
>
>> +      (if stream-layered-image
>> +          (let ((input (open-pipe* OPEN_READ "python3"
>> +                                   stream-layered-image
>> +                                   "config.json")))
>
> …it requires to drag Python for building/packing layered Docker.

Correct.

> Well, I have not really look yet to the Python script which does most of
> the job.  Do you use a similar strategy as [1]?
>
> And I remember something in that direction by Chris but I am unable to
> find back the patch. )-:
>
> 1: https://grahamc.com/blog/nix-and-layered-docker-images/

Not similar.  My patch implements a very simple sorting by size, with no
complex sorting by reference popularity as in [1], which is probably
implemented in the following file:

    github.com/NixOS/nixpkgs/pkgs/build-support/references-by-popularity/closure-graph.py
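
To illustrate what sorting by size could look like, here is a rough
sketch under stated assumptions; it is not the code from the patch.  The
function name, the sizes, and the choice to merge all remaining items
into one final layer once the layer limit is reached are hypothetical.

```python
def group_into_layers(sizes, max_layers):
    """Group store items into at most MAX_LAYERS layers, biggest first.

    SIZES maps a store item name to its size in bytes.
    """
    if max_layers < 1:
        raise ValueError("max_layers must be at least 1")
    # Biggest items first; ties are broken by name so the result is stable.
    ordered = sorted(sizes, key=lambda item: (-sizes[item], item))
    # Each of the biggest items gets a layer of its own...
    layers = [[item] for item in ordered[:max_layers - 1]]
    # ...and whatever is left shares the last layer (an assumption).
    rest = ordered[max_layers - 1:]
    if rest:
        layers.append(rest)
    return layers


# Hypothetical sizes, for illustration only.
example = {"glibc": 40_000_000, "bash": 6_000_000, "hello": 200_000}
layers = group_into_layers(example, max_layers=2)
```

With these made-up numbers, glibc ends up alone in the first layer and
the two smaller items share the second one.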

From the https://grahamc.com/blog/nix-and-layered-docker-images/ article:

> How Docker really represents an Image
>
> Docker’s layers are content addressable and aren’t required to
> explicitly reference a parent layer. This means a layer for
> readline-7.0p5 doesn’t have to mention that it has any relationship to
> ncurses-6.1 or glibc-2.27 at all.
>
> Instead each image has a manifest which defines the order:
>
> {
>   "Layers": [
>     "bash-interactive-4.4-p23",
>     "bash-4.4p23",
>     "readline-7.0p5",
>      ...
>   ]
> }
>
> If you have only built Docker images using a Dockerfile, then you
> would expect the way we flatten our graph to be critically
> important. If we sometimes picked readline-7.0p5 to come first and
> other times picked bash-4.4p23 then we may never make cache hits.
>
> However since the Image defines the order, we don’t have to solve this
> impossible problem: we can order the layers in any way we want and the
> layer cache will always hit.

In the case of sorting by size, the biggest layers will be on top of a
container image, which will produce a cache hit for the biggest
directories in the GNU store when transferring images that share the
same layers.

I would like to say this sorting could benefit transfers more than
sorting by popularity, but let's assume I didn't write it.  ;-)

The following example shows common layers between images, which will
not be transferred when you load an image into Docker, or when you pull
or push it:

    ./pre-inst-env guix pack -f docker-layered --entry-point=bin/bash -S /bin=bin bash hello

and

    ./pre-inst-env guix pack -f docker-layered --entry-point=bin/bash -S /bin=bin bash hello emacs

share 6 layers in total:

--8<---------------cut here---------------start------------->8---
$ f() { docker image inspect "$1" | jq --raw-output '.[0].RootFS.Layers[] | .' | sort ; }
$ comm -1 -2 --total <(f sha256:fb43b32380a5e6a867410721f4ce2917db14d4ae943c433983afbaf84416c421) <(f sha256:0ce4a11973d1071aeec5441db228d6148dfd09fea3ae77b731c750ebfcc2fe1d)
sha256:3b3daa2a00f1acd12eeb16698bf1caeb6ba6c436e3dbca6259c3a9c622664e00
sha256:5c2be7469293854257221cb6aa8aa4af1e10e2c550935390dbcfeede3d3fbacd
sha256:60317981d94928659389f299e4b86703e5ded420a53537d67627952187fbd3f9
sha256:6d7c8ce5441d4c4c74e0ecff6c203a7b265b37137cca3b0a0ccf10526cfaa6e2
sha256:c2ded2ffe3f46fa7a64a62e0fc6b9d28cb7d4f8d9c64d5a52d137a508cba11fc
sha256:fbcad85d7d3c25bd2aa6d95bb3bf3d02c499ee3b3e443ddd3e5b679c2b33c139
5       94      6       total
--8<---------------cut here---------------end--------------->8---
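
The same comparison can be sketched in Python, assuming the JSON printed
by "docker image inspect" has already been captured as a string.  The
function names and the shortened digests below are hypothetical and only
serve the illustration.

```python
import json


def layers_from_inspect(inspect_json):
    """Return the layer digests listed under .[0].RootFS.Layers."""
    return json.loads(inspect_json)[0]["RootFS"]["Layers"]


def compare_layers(first, second):
    """Return (only in FIRST, only in SECOND, shared digests), like comm."""
    a, b = set(first), set(second)
    return len(a - b), len(b - a), sorted(a & b)


# Hypothetical inspect output with made-up, shortened digests.
small = json.dumps([{"RootFS": {"Layers": ["sha256:aa", "sha256:bb"]}}])
big = json.dumps(
    [{"RootFS": {"Layers": ["sha256:aa", "sha256:bb", "sha256:cc"]}}])

only_small, only_big, shared = compare_layers(
    layers_from_inspect(small), layers_from_inspect(big))
```

The shared digests are the layers that would not need to be transferred
again.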

Regards,
Oleg.
