Re: File search progress: database review and question on triggers
From: zimoun
Subject: Re: File search progress: database review and question on triggers
Date: Mon, 12 Oct 2020 13:23:13 +0200
On Mon, 12 Oct 2020 at 12:20, Ludovic Courtès <ludo@gnu.org> wrote:
>> - Textual database: slow and not lighter than SQLite. Not worth it I
>> believe.
>>
>> - SQLite without full-text search: fast, supports classic patterns
>> (e.g. "foo*bar") but does not support word permutations.
>>
>> - SQLite with full-text search: fast, supports word permutations but
>> does not support suffix-matching (e.g. "bar" won't match "foobar").
>> Size is about the same as without full-text search.
>>
>> - Include synopsis and descriptions. Maybe we should include all fields
>> that are searched by `guix search`. This incurs a cost on the
>> database size but it would fix the `guix search` speed issue. Size
>> increases by some 10 MiB.
>
> Oh so this is going beyond file search, right?
>
> Perhaps it would make sense to focus on file search only as a first
> step, and see what can be done with synopses/descriptions (like Arun and
> zimoun did before) later, separately?
Well, the first patch set that Arun sent for improving “guix search” was
the introduction of a SQLite database, replacing the current
’package.cache’. And I quote your wise advice:
    I would rather keep the current package cache as-is instead of
    inserting sqlite in here. I don’t expect it to bring much
    compared performance-wise to the current simple cache
    (especially if we look at load time), and it does increase
    complexity quite a bit.
    However, using sqlite for keyword search as you initially
    proposed on guix-devel does sound like a great idea to me.
    Message-ID: <87sgjhx92g.fsf@gnu.org>
Therefore, if Pierre is going to introduce a SQL database where the
addition of the synopses/descriptions is cheap, it seems a good idea to
use it, doesn’t it? The ’package.cache’ would be kept as-is. And in
parallel, “we” can try to use this WIP branch to improve the speed of
“guix search” (by “we”, I mean that I plan to work on it).
BTW, it would be really easy to remove these 2 extra fields if they
turn out not to help search, since only the function ’add-files’ is
involved:
--8<---------------cut here---------------start------------->8---
      (with-statement
          db
          (string-append "insert into Info (name, synopsis, description, package)"
                         " values (:name, :synopsis, :description, :id)")
          stmt
        (sqlite-bind-arguments stmt
                               #:name name
                               #:synopsis synopsis
                               #:description description
                               #:id id)
--8<---------------cut here---------------end--------------->8---
and used only once by ’persist-package-files’.
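To illustrate how small that change would be, here is a minimal,
self-contained sketch that uses guile-sqlite3 directly. The ’Info’
schema is only an assumption inferred from the column names in the
insert statement, not the schema of the WIP branch, and
’make-test-db’, ’add-info’, ’add-info/minimal’ and ’info-rows’ are
hypothetical helpers written for this example:

```scheme
(use-modules (sqlite3))

;; Assumed schema, inferred from the column names of the insert
;; statement; the real table in the WIP branch may differ.
(define (make-test-db)
  (let ((db (sqlite-open ":memory:")))
    (sqlite-exec db "create table Info (name text, synopsis text,
description text, package integer)")
    db))

;; Hypothetical helper: insert one row, including the two extra fields.
(define (add-info db name synopsis description id)
  (let ((stmt (sqlite-prepare
               db
               (string-append
                "insert into Info (name, synopsis, description, package)"
                " values (:name, :synopsis, :description, :id)"))))
    (sqlite-bind-arguments stmt
                           #:name name
                           #:synopsis synopsis
                           #:description description
                           #:id id)
    (sqlite-step stmt)
    (sqlite-finalize stmt)))

;; Hypothetical variant without synopsis and description: only the
;; column list and the bound arguments change.
(define (add-info/minimal db name id)
  (let ((stmt (sqlite-prepare
               db
               "insert into Info (name, package) values (:name, :id)")))
    (sqlite-bind-arguments stmt #:name name #:id id)
    (sqlite-step stmt)
    (sqlite-finalize stmt)))

;; Return the rows as lists; SQL NULL is returned as #f.
(define (info-rows db)
  (let* ((stmt (sqlite-prepare
                db
                "select name, synopsis, package from Info order by package"))
         (rows (sqlite-map vector->list stmt)))
    (sqlite-finalize stmt)
    rows))
```

With this sketch, rows inserted through ’add-info/minimal’ simply
leave the synopsis and description columns NULL, so dropping the two
fields would not require touching anything else that reads the table.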
> It would be nice to see whether/how this could be integrated with
> third-party channels. Of course it’s not a priority, but while
> designing this feature, we should keep in mind that we might want
> third-party channel authors to be able to offer such a database for
> their packages.
If the third-party channels also provide substitutes, then the database
would be part of the substitutes, or easy to build from the substitute
metadata.
>> - Find a way to garbage-collect the database(s). My intuition is that
>> we should have 1 database per Guix checkout and when we `guix gc` a
>> Guix checkout we collect the corresponding database.
>
> If we download a fresh database every time, we might as well simply
> overwrite the one we have?
But you do not want to download it again if you roll back, for example.
From my point of view, it should be the same mechanism as
’package.cache’.
Cheers,
simon