guix-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Identical files across subsequent package revisions


From: Ludovic Courtès
Subject: Re: Identical files across subsequent package revisions
Date: Wed, 06 Jan 2021 10:39:56 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/27.1 (gnu/linux)

Hi,

Ludovic Courtès <ludo@gnu.org> skribis:

> It could go along these lines:
>
>   1. GET /digest/xyz-emacs; the digest contains a list of file/hash
>      pairs essentially;
>
>   2. traverse digest and hardlink the files already in /gnu/store/.links
>      to the target directory;
>
>   3. pipeline-GET the remaining files and install them.
>
> Like you write, since we already have deduplication, part of it comes
> for free.

I prototyped that during the holidays and pushed the result as
‘wip-digests’ (rough on the edges!).

It works essentially as written above.  ‘guix substitute’ fetches
digests at the same time as it fetches narinfos so it can keep them in
cache, which in turn means that when you install things it already has
info locally to determine which files it already has and which ones it
needs to fetch (in the best case, the substitute phase simply hard-links
files locally without connecting to the server!).

One of the open issues is compression.  ‘guix publish’ now serves
individual files, but obviously they should be compressed somehow.
For nars, we have /lzip, /gzip, etc. URLs, which gives clients more
freedom on the choice of compression but is also quite heavyweight.  For
individual files, I’m thinking we could use ‘Accept-Encoding’ to let the
client declare which compression formats it supports and then let ‘guix
publish’ choose the one it prefers.

In the worst case, when none of the files are available locally, we’d be
downloading all the individual files, which is probably more data than a
single compressed nar.  I guess ‘guix substitute’ could fall back to nar
download when too little is available locally.

The downside compared to using ERIS or IPFS is that it’s really a hack.
The upside is that it’s little work and it’s efficient since we already
have that content-addressed file store in /gnu/store/.links.

Thoughts?

Ludo’.



reply via email to

[Prev in Thread] Current Thread [Next in Thread]