bug-guix

From: Hugo Buddelmeijer
Subject: bug#62422: IRC channel log search results are not chronological for recent logs
Date: Fri, 24 Mar 2023 16:38:44 +0100

Hi all, Ricardo,

Searching the IRC channel logs on https://logs.guix.gnu.org/ shows a list of matches sorted by date in descending order, except for matches from this February or March: those are at the bottom, often beyond the 100-match limit.

For example, 'vdirsyncer' results in 31 matches (at the time of writing): https://logs.guix.gnu.org/guix/search?query=vdirsyncer

> 2023-01-10 [15:09:09] <elb> this machine has installed emacs, emacs-guix, ...
> 2023-01-10 [15:12:26] <nckx> For context, ‘guix size emacs emacs-guix ...
> 2022-01-18 [04:43:24] <lfam> At least, vdirsyncer builds when you simply ...
> 2022-01-17 [16:29:41] <johnhamelink> Hey there :) I'm currently an Arch ...
> 2020-11-30 [23:29:57] <lfam> jonsger: No, I'm not using radicale. It was ...
> 2020-11-30 [23:31:08] <lfam> sneek: later tell jonsger: No, I'm not using ...
> 2020-04-29 [09:34:26] <efraim> it also came up in vdirsyncer on ...
> ...
> 2016-01-24 [22:45:28] <lfam> I don't even think you can run vdirsyncer ...
> 2015-12-10 [00:10:51] <lfam> All that and vdirsyncer doesn't even build ...
> 2015-12-09 [22:39:51] <lfam> https://github.com/untitaker/vdirsyncer/ ...
> 2023-02-25 [03:03:54] <fruit-loops> "#61557 - vdirsyncer fails to verify ...
> 2023-02-25 [03:08:01] <fruit-loops> "vdirsyncer fails to verify ...
> 2023-02-25 [03:09:41] <elb> nckx: hmmm when I searched mobile ...
> 2023-02-25 [03:10:49] <elb> ok yeah, it's just not tagged or ...
> 2023-02-25 [03:36:53] <fruit-loops> "vdirsyncer fails to verify ...
> 2023-02-25 [03:38:16] <elb> lechner: no, against vdirsyncer
> 2023-02-25 [03:46:06] <fruit-loops> "vdirsyncer fails to verify certificates"

All hits from February and March of this year are at the bottom of the list, while the rest are in reverse chronological order. (The 'vdirsyncer' example was chosen because it occurs regularly, but not too often.) The list cuts off after about 100 matches, so it is impossible to find recent matches for more popular terms.

The most recent chats are usually the more interesting ones, for example when debugging an issue that occurred recently. E.g. a search for Python shows nothing beyond 2023-01-31:
https://logs.guix.gnu.org/guix/search?query=python

So my question is, can we improve the sort order of the IRC logs?


I did a bit of investigating myself and discovered the maintenance repository with the hydra directory. There is so much to learn from that directory.

However, I could not really figure out what could be the problem. My hypothesis, which is more like a wild guess:
- It seems the sorting is done implicitly by xapian, which just returns the matching lines in whatever order they were inserted.
- Something went wrong at the transition between January 31st and February 1st that required manual cleanup. Evidence: there are logs with a tilde in the filename, 2023-01-31.log~ and 2023-02-01.log~.
- The database was emptied and repopulated to prevent entries from early in the morning of 2023-02-01 from being counted as beyond-midnight on 2023-01-31. This put all the lines in the correct order, hence the correct sorting up until then.
- Subsequent lines are added by the mcron job and are therefore at the end of the database, and thus at the end of the result set (beyond the limit of 100).
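
To make the guess above concrete, here is a toy model in plain Guile. It is only a sketch, not the goggles or xapian code: the entries, the names database, search-in-stored-order and search-by-date, and the limit of 3 are all made up for illustration. It shows how entries appended after a bulk import fall past the cutoff when matches come back in stored order, and how an explicit sort on the date stamp before the cutoff would avoid that.

```scheme
(use-modules (srfi srfi-1))   ; for `take'

;; Hypothetical data: (date-stamp . text) pairs.  The first three stand
;; for a bulk import done newest-first; the last one was appended later.
(define bulk-import
  '(("2023-01-10" . "older match 1")
    ("2022-01-18" . "older match 2")
    ("2015-12-09" . "older match 3")))

(define appended-later
  '(("2023-02-25" . "recent match")))

;; Assumption: the store hands back matches in insertion order.
(define database (append bulk-import appended-later))

;; Return at most N leading elements of LST.
(define (first-n lst n)
  (take lst (min n (length lst))))

;; Cut off the matches without sorting them first.
(define (search-in-stored-order db limit)
  (first-n db limit))

;; Sort newest-first on the date stamp, then cut off.
(define (search-by-date db limit)
  (first-n (sort db (lambda (a b) (string>? (car a) (car b))))
           limit))

(map car (search-in-stored-order database 3))
;; => ("2023-01-10" "2022-01-18" "2015-12-09"), the 2023-02-25 entry is cut off

(map car (search-by-date database 3))
;; => ("2023-02-25" "2023-01-10" "2022-01-18")
```

Comparing the stamps with string>? works here only because ISO dates sort the same way as strings and as dates.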

Side note: the ~ files cause some lines to show up three times, e.g.
https://logs.guix.gnu.org/guix/search?query=557816d497d3e9d25901370903d512d6f6991aa3
> 2023-01-31 [04:52:19] <jgart[m]> dcunit3d: here's another great config: https://github.com/jsoo1/dotfiles/blob/557816d497d3e9d25901370903d512d6f6991aa3/emacs/init.el
> 2023-02-01.log~ [04:52:19] <jgart[m]> dcunit3d: here's another great config: https://github.com/jsoo1/dotfiles/blob/557816d497d3e9d25901370903d512d6f6991aa3/emacs/init.el
> 2023-01-31.log~ [04:52:19] <jgart[m]> dcunit3d: here's another great config: https://github.com/jsoo1/dotfiles/blob/557816d497d3e9d25901370903d512d6f6991aa3/emacs/init.el
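
One way to keep such backup files out of the index would be to accept only names that end in exactly ".log". The following is a minimal sketch in plain Guile; log-file? and the list of file names are hypothetical and not taken from the maintenance repository.

```scheme
(use-modules (srfi srfi-1))   ; for `filter'

;; Hypothetical predicate: true only for names ending in exactly ".log",
;; so editor backups such as "2023-01-31.log~" are rejected.
(define (log-file? file-name)
  (string-suffix? ".log" file-name))

;; Made-up directory listing for illustration.
(define files
  '("2023-01-31.log" "2023-01-31.log~" "2023-02-01.log" "2023-02-01.log~"))

(filter log-file? files)
;; => ("2023-01-31.log" "2023-02-01.log")
```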

Side side note: those ~ entries cannot be clicked on, because (define stamp (basename file-name ".log")) does not strip anything from a name ending in ".log~", so goggles treats the ".log~" as part of the date.
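
The behaviour of basename is easy to check at a Guile REPL: the suffix is removed only when the name actually ends with it. In the sketch below the directory is made up, and file-name->stamp is a hypothetical helper (not part of goggles) that returns #f for anything that is not a plain ".log" file instead of producing a malformed stamp.

```scheme
;; `basename' strips the suffix only when the file name ends with it.
(basename "/tmp/logs/2023-01-31.log" ".log")    ; => "2023-01-31"
(basename "/tmp/logs/2023-01-31.log~" ".log")   ; => "2023-01-31.log~"

;; Hypothetical helper: the date stamp for a ".log" file, #f otherwise.
(define (file-name->stamp file-name)
  (and (string-suffix? ".log" file-name)
       (basename file-name ".log")))

(file-name->stamp "/tmp/logs/2023-02-01.log")   ; => "2023-02-01"
(file-name->stamp "/tmp/logs/2023-02-01.log~")  ; => #f
```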

What I don't understand is why the matches are not sorted correctly. It seems to me that (Enquire-set-sort-by-value enq 0 #f) would sort by the value of slot 0, which seems to be the date-stamp. But I don't really have a good mental model of how xapian works or what value slots actually are. (Maybe value slots start at 1 and selecting 0 means do not use any of them?)

I tried to compare the results of #guix with those of other channels, but it seems that the logs of most other channels are either not indexed at all, or inconsistently. For example, searching for ACTION (which is a "/me" command it seems) in #spritely shows only 11 matches spread over 5 days, while it is a very common occurrence:
https://logs.guix.gnu.org/spritely/search?query=ACTION

Cheers,
Hugo







