guix-patches
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[bug#39258] [PATCH v4 0/3] Faster cache generation (similar as v3)


From: zimoun
Subject: [bug#39258] [PATCH v4 0/3] Faster cache generation (similar as v3)
Date: Sun, 3 May 2020 20:10:10 +0200

Hi Ludo,

On Sun, 3 May 2020 at 18:43, Ludovic Courtès <address@hidden> wrote:

> > Therefore the cache '/lib/guix/package.cache' contains more
> > information.
>
> This breaks the binary interface, so we’ll have to analyze the impact of
> such a change and devise a strategy.

Interface between what and what?

Because from my understanding, this file is only used by only one
guix.  What do I miss?


Note that I have read your comment in v3 2/3 but I did not understand it. Sorry.

--8<---------------cut here---------------start------------->8---
I realize the other cache also has that problem, but it would be nice to
add a version tag to the cache.  Basically emit something like:

  (package-metadata-cache (version 0) VECTOR …)

instead of just:

  (VECTOR …)
--8<---------------cut here---------------end--------------->8---



> > (The v4 structure of 'package.cache' is a quick draft, so details
> > should be discussed and an interesting move should to have a
> > structured (binary and all strings) S-exp; because it should become an
> > entry point to export the packages list to JSON.  WDYT?)
>
> It’s on purpose that this cache is an object file: it just needs to be
> mmap’d, and that’s it.  It’s the cheapest possible way to do it.
> Parsing sexps would be more costly, and since we’re talking about
> startup time, this is sensitive.

I agree and I have badly worded or I misunderstand something.
For example, 'supported-systems' is saved as a list of strings,
whereas 'license' is expanded as 3 strings without be packed in a list
of strings.  From my point of view, it is inconsistent and I do not
know what is the best (readibility, startup time, etc.).


> > To be clear about BM25 and caching, what I have in mind is:
> >   1. "guix search --build-index" optionally done by the user if they wants 
> > for example the BM25 ranking.
>
> Something that must be done explicitly doesn’t seem great to me.  As a
> user, I’d rather not think about search indexes and all.  But I don’t
> know, maybe if it happened automatically on the first ‘guix search’
> invocation that’d be fine.

I do not think it is an option to build the BM25 the first time "guix
search" is called.  Back-to-envelop estimation, it needs ~25 seconds
to Xapian* to do so.

>From my point of view, two options:
 a) "guix pull" does this extra ~25 seconds (compared to 10 seconds to
build the v4 cache)
 b) the user manually build the index (I agree it is awkward!)

Well, the first question is to evaluate if it is worth -- I am using
the v2 version based on Xapian to have an idea.  Please if you have
suggestions about query (terms an user could type) and results
(packages an user could expect), there are welcome.


*Xapian: I do not think we could do better but I have not checked yet
if there is a bottleneck Guix, Guile-Xapian and Xapian.


> >  1. The name of 'fold-packages*' should be misleading since it does not 
> > return "true" packages.
>
> Did you see ‘fold-available-packages’?  It seems you could extend it
> instead of introducing ‘fold-packages*’, no?

Yes and no.

 a) 'fold-available-packages' requires to modify the 'lambda' in
'find-package-by-description',
 b) 'fold-package*' returning a 'package' is less tweaks, IMHO.

Well, I agree that on the long term, what 'fold-package*' does could
be done by 'fold-available-packages' with the adequate 'proc'.

Thank you for the suggestion; even if once re-read correctly v3 2/3
you already mentioned it. :-)


> >  2. The function 'package->recutils' in 'guix/ui.scm' is modified but it is 
> > not the better.
> >
> >           (match (package-supported-systems p)
> >             (('cache supported-systems)
> >              (string-join supported-systems))
> >             (_
> >              (string-join (package-transitive-supported-systems p)))))
> >
> >     However it avoids to duplicate code; as it is done in version v3.
>
> I made suggestions to Arun’s v3 about the API here.  Essentially, I
> think I proposed having a procedure that takes the list of fields as
> keyword parameters, and ‘package->recutils’ would just delegate to that.

Yes, it was already your suggestion in v3 3/3.  Do you suggest to
refactor 'package->recutils'? For example,

--8<---------------cut here---------------start------------->8---
(define* (package->recutils name version
                            ... all-the-other-fields ...
                            port #:optional (width (%text-width))
                            #:key
                            (hyperlinks? (supports-hyperlinks? port))
                            (extra-fields '()))
--8<---------------cut here---------------end--------------->8---



> >  4. Impolite '@@' is used to access the private license construction.
>
> (guix licenses) could provide a ‘string->license’ procedure.

Well, do you suggest:

    (define (string->license name) (license name #f #f))

? Skipping 'uri' and 'comment'?  Naive question: what is the purpose
of these 2 fields?  Because there are not exposed at the CLI level,
AFAIK, and I do not think an user evaluate '(license-uri pkg)' in a
script.

Well, I think that the hyperlink feature could be used to display the
license URI too.  WDYT?



> Stopping here for now because I’m sorta drowning in patch review.  :-)

Thank you for all the comments.


> Thanks for exploring this design space, we’re making progress!

My pleasure. Scheme is designed to explore. ;-)


Cheers,
simon





reply via email to

[Prev in Thread] Current Thread [Next in Thread]