bug-gnu-emacs
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#33653: 27.0.50; Change Gnus obarrays-as-hash-tables into real hash t


From: Eric Abrahamsen
Subject: bug#33653: 27.0.50; Change Gnus obarrays-as-hash-tables into real hash tables
Date: Tue, 26 Mar 2019 12:58:32 -0700
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (gnu/linux)

Andy Moreton <andrewjmoreton@gmail.com> writes:

> On Mon 25 Mar 2019, Eric Abrahamsen wrote:
>
>> Andy Moreton <andrewjmoreton@gmail.com> writes:
>>> Other notes from reading the code:
>>>
>>> 1) In `gnus-gnus-to-quick-newsrc-format' you ignore the contents of
>>>    `gnus-newsrc-alist' when saving "newsrc.eld", and replace it with the
>>>    details from `gnus-newsrc-hashtb'. Why ? The rest of the gnus code
>>>    appears to treat `gnus-newsrc-alist' as the single source of truth,
>>>    with the hash tables being used only for faster access to it.
>>
>> Eventually I would like to reduce the number of data structures so that
>> groups are held in `gnus-newsrc-hashtb', and ordering is kept in
>> `gnus-group-list', and that's it. `gnus-newsrc-alist' would only be used
>> when persisting to disk. My next proposed change (once I've recovered my
>> confidence) is to turn groups into actual objects, in which case the
>> alist would really just be a kind of serialization format.
>
> ok, but if the hash table and the alist reference the same lisp objects
> then the existing setup is not so bad.

There's nothing wrong with the existing setup here! I'm just laying the
groundwork for the next round of changes.

>> The hash table ought to be in sync with the rest of the data structure
>> -- if it isn't, that's another bug.
>>
>>> 2) In `gnus-gnus-to-quick-newsrc-format' you dropped the code to remove
>>>    the dummy group from `gnus-newsrc-alist'. Why ? This internal dummy
>>>    group is now saved in "newsrc.eld", which is not needed.
>>
>> This was an error. (Though in my case, I've had the dummy group in my
>> newsrc.eld for months, and it hasn't done any harm. I don't know why
>> it's necessary.)
>
> I agree that it's harmless, it just seemed to be an unnecessary change.
> I expect it is there to ensure that the alist is never empty to avoid
> the need to check that everywhere.

Yes, for sure, and I didn't mean to leave it in there, that's just a bug.

>>> 3) The format of the entries in `gnus-newsrc-hashtb' has changed,
>>> removing the second element. Why?
>>
>> Because the old `gnus-gethash' call returned a slice of
>> `gnus-newsrc-alist', where the second element was actually the group
>> *before* the group you wanted, and the third element was the cdr of
>> `gnus-newsrc-alist', starting with the group you wanted. This was
>> undocumented, and took a bit to figure out. Now, the gethash call just
>> gives you the group. Ideally, in the next set of changes, it will give
>> you an object.
>
> ok, so it sounds like the old code was trying hard to use the same lisp
> objects in both the hash table and the alist. That avoids alloacting
> twice the storage, and ensures that the hash table and the alist stay in
> sync.

Yes, I'm looking forward to when the groups are actual objects, . But in
the meantime, the lisp objects are still the same, I haven't doubled
storage (at least `eq' returns t, so I assume that's what that means).

>>> 4) You changed several hash tale sizesfrom 4096 to 4000, and 1024 to
>>> 1000. Why?
>>
>> My understanding is that using a prime number is significant when it
>> comes to vector access, but that the hash table implementation is
>> higher-level, where a prime number is no longer significant. If that's
>> incorrect I would like to know!
>
> None of these numbers are prime. The old numbers are powers of two,
> which are reasonable for allocation sizes on a binary machine. Again,
> not a terribly important change, but not really needed either.

Yes, sorry, powers of two. It's true it's a do-nothing change, I guess
I've been erring on the side of trying to make the code more
self-explanatory, less "odd". Probably this change didn't need to be
made, but it seems about as un-risky as a code change could be.

>>> Your patch contains several logical changes that would be easier to
>>> understand (and bisect) as a series of patches with one logical change
>>> in each patch:
>>>  - code layout changes
>>>  - add missing doc strings and code comments
>>>  - change hash table implementation
>>>  - change format of `gnus-newsrc-hashtb' entries
>>>  - change usage of `gnus-group-change-level'
>>>  - change coding of group names
>>> While it can take extra work to split things up, the end result is much
>>> easier to understand.
>>
>> In principle I agree with this completely. In practice I found it
>> extraordinarily difficult to touch one part of Gnus without running into
>> knock-on repercussions.
>
> Agreed. I find that this sort of work usually goes in two phases: some
> exploratory programming to decide on the path forward and the eventual
> goal, followed by rearranging the changes into a logical set of
> (bisectable) patches that get to that goal in small, self-contained
> steps. The second half is extra work, but worth it to make it easier for
> other people to review the patches (and to simplify any archaeology
> needed by a later maintainer).
>
> If changing gnus is hard, then perhaps writing new tests to document
> what you have discovered about the code will help to ensure that later
> changes do not change the observed behaviour.

The previous commit did add some new Gnus tests -- perhaps the first!
I'm planning more, once Gnus group are actual objects. Everything's so
interconnected now (and data-reliant) that it's very hard to write
effective tests.

>> The ultimate goal of the changes I have in mind for Gnus is to address
>> exactly this: to make it more modular, to improve isolation of code
>> paths, and to reduce the number of semi-redundant data structures. But
>> the process is evidently even messier than I thought. I held back
>> another commit to group name encoding in an attempt to keep things
>> simple, but that seems to have made things even worse.
>
> A worthwhile goal - please do not get discouraged.

Thank you,

Eric






reply via email to

[Prev in Thread] Current Thread [Next in Thread]