guix-devel

Re: File search progress: database review and question on triggers


From: Ludovic Courtès
Subject: Re: File search progress: database review and question on triggers
Date: Mon, 12 Oct 2020 12:20:43 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/27.1 (gnu/linux)

Hi!

Pierre Neidhardt <mail@ambrevar.xyz> writes:

>> Could you post a summary of what you have done, what’s left to do, and
>> how you’d like to integrate it?  (If you’ve already done it, my
>> apologies, but you can resend a link.  :-))
>
> What I've done: mostly a database benchmark.
>
> - Textual database: slow and not lighter than SQLite.  Not worth it I believe.
>
> - SQLite without full-text search: fast, supports classic patterns
>   (e.g. "foo*bar") but does not support word permutations.
>
> - SQLite with full-text search: fast, supports word permutations but
>   does not support suffix-matching (e.g. "bar" won't match "foobar").
>   Size is about the same as without full-text search.
>
> - Include synopsis and descriptions.  Maybe we should include all fields
>   that are searched by `guix search`.  This incurs a cost on the
>   database size but it would fix the `guix search` speed issue.  Size
>   increases by some 10 MiB.

Oh so this is going beyond file search, right?

Perhaps it would make sense to focus on file search only as a first
step, and see what can be done with synopses/descriptions (like Arun and
zimoun did before) later, separately?
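To make the trade-off between the two SQLite variants quoted above concrete, here is a minimal sketch. It is an illustration under assumptions, not the benchmark code: the table names `files` and `files_fts`, the sample paths, the use of GLOB for the "classic patterns" case, and the choice of FTS5 (rather than FTS3/4) are all hypothetical. FTS5 is an optional SQLite extension, so the sketch falls back gracefully if the local build lacks it.

```python
import sqlite3

# Hypothetical sample data; real entries would be store file names.
PATHS = [
    "/bin/foobar",
    "/share/doc/foo/bar.txt",
    "/share/doc/bar/foo.txt",
]


def build(paths):
    """Return (connection, have_fts) for an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE files (path TEXT)")
    conn.executemany("INSERT INTO files VALUES (?)", [(p,) for p in paths])
    try:
        # FTS5 is only present if SQLite was compiled with it.
        conn.execute("CREATE VIRTUAL TABLE files_fts USING fts5(path)")
        conn.executemany("INSERT INTO files_fts VALUES (?)",
                         [(p,) for p in paths])
        have_fts = True
    except sqlite3.OperationalError:
        have_fts = False
    return conn, have_fts


def glob_search(conn, pattern):
    """Classic pattern search: word order in the pattern matters."""
    rows = conn.execute(
        "SELECT path FROM files WHERE path GLOB ? ORDER BY path", (pattern,))
    return [row[0] for row in rows]


def fts_search(conn, query):
    """Full-text search: matches whole tokens, in any order."""
    rows = conn.execute(
        "SELECT path FROM files_fts WHERE files_fts MATCH ? ORDER BY path",
        (query,))
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn, have_fts = build(PATHS)
    # Finds "foo" followed by "bar", but not the permutation "bar ... foo".
    print(glob_search(conn, "*foo*bar*"))
    if have_fts:
        # Finds both word orders, but misses "/bin/foobar", because
        # "foobar" is indexed as a single token.
        print(fts_search(conn, "foo bar"))
        print(fts_search(conn, "bar"))
```

With the default FTS5 tokenizer, "/" and "." act as separators, which is why "bar" matches "/share/doc/foo/bar.txt" but not "/bin/foobar".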

> What's left to do:
>
> - Populate the database on demand, either after a `guix build` or from a
>   `guix filesearch...`.  This is important so that `guix filesearch`
>   works on packages built locally.  If `guix build`, I need help to know
>   where to plug it in.
>
> - Adapt Cuirass so that it builds its file database.
>   I need pointers to get started here.
>
> - Sync the databases from the substitute server to the client when
>   running `guix filesearch`.  For this I suggest we send the compressed
>   database corresponding to a guix generation over the network (around
>   10 MiB).  Not sure sending just the delta is worth it.

It would be nice to see whether/how this could be integrated with
third-party channels.  Of course it’s not a priority, but while
designing this feature, we should keep in mind that we might want
third-party channel authors to be able to offer such a database for
their packages.

> - Find a way to garbage-collect the database(s).  My intuition is that
>   we should have 1 database per Guix checkout and when we `guix gc` a
>   Guix checkout we collect the corresponding database.

If we download a fresh database every time, we might as well simply
overwrite the one we have?
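As a rough sketch of what "simply overwrite" could look like on the client side, assuming the freshly downloaded database is already available as bytes: the function name `replace_database` and the file name used below are hypothetical, and nothing here is existing Guix code. Writing to a temporary file in the same directory and renaming it over the old one means a concurrent reader sees either the old database or the new one, never a partial file.

```python
import os
import tempfile


def replace_database(new_bytes, db_path):
    """Atomically overwrite DB_PATH with NEW_BYTES (hypothetical helper)."""
    directory = os.path.dirname(os.path.abspath(db_path))
    # The temporary file must live on the same file system as the
    # target, otherwise the final rename is not atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(new_bytes)
            tmp.flush()
            os.fsync(tmp.fileno())
        # os.replace overwrites an existing destination in one step.
        os.replace(tmp_path, db_path)
    except BaseException:
        # Do not leave a stray temporary file behind on failure.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

With this approach there is only ever one database file per location, so there would be nothing separate to garbage-collect.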

Thanks,
Ludo’.


