Re: A couple of lisp questions
12 Nov 2003 19:00:34 +0000
Gnus/5.09 (Gnus v5.9.0) Emacs/21.2.93
>>>>> "Stefan" == Stefan Monnier <address@hidden> writes:
Stefan> Take a look at how flyspell does it. Or maybe auto-fill.
>> I will. I think auto-fill cheats though, as it's tied directly in
>> to the command loop. I seem to remember reading that somewhere.
Stefan> Not the command loop, just the `self-insert-command' command
Stefan> (which is implemented in C).
Yes, you are right. I was a little confused by
"In general, this is the only way to do that, since the facilities for
customizing `self-insert-command' are limited to special cases
(designed for abbrevs and Auto Fill mode). Do not try substituting
your own definition of `self-insert-command' for the standard one.
The editor command loop handles this function specially."
So auto-fill is tied in slightly indirectly.
Stefan> You can hijack the auto-fill-function for your own
Stefan> non-auto-fill use.
I would not want it to interfere with auto-fill though. I think I have
it working reasonably well now.
>> usage-hash: "the" --> ("the" . 4) "and" --> ("and" . 6)
Stefan> Why not just
Stefan> "the" --> 4 "and" --> 6
It makes no difference. The suffix hash must contain cons cells, and I
share them with this. For the usage hash, you are correct, the car of
the cons cell is not used.
>> Then a suffix hash
>> suffix-hash: "t" --> (("the" . 4) ("then" . 3) ("talk" . 2) etc)
>> "th" --> (("the" . 4) etc ) "the" --> (("the" . 4) etc )
Stefan> Is `try-completion' too slow (because the usage-hash is too
Stefan> large?) to build the suffixes on the fly ?
I'm not convinced it does what I want. Perhaps I am wrong.
When the letter "t" is pressed I get an alist back. The alist is
actually ordered, with the most commonly occurring words first. So I
pick the preferred usage straight off the front. So I have constant
time access to the hash, and constant time access to the list.
Updating takes a bit longer....
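
A minimal sketch of the lookup just described, assuming two `equal'
hash tables and one shared cons cell per word. All of the `my-' names
are hypothetical, and it assumes each word is added only once; it is
an illustration, not the actual code.

```elisp
;; Hypothetical names throughout; a sketch of the idea only.
(defvar my-usage-hash (make-hash-table :test 'equal)
  "Maps a word to its (WORD . COUNT) cons cell.")

(defvar my-suffix-hash (make-hash-table :test 'equal)
  "Maps each leading substring of a word to a list of (WORD . COUNT) cells.")

(defun my-add-word (word count)
  "Register WORD with COUNT, sharing one cons cell between both hashes.
Assumes WORD has not been added before."
  (let ((cell (cons word count))
        (len 1))
    (puthash word cell my-usage-hash)
    (while (<= len (length word))
      (let ((key (substring word 0 len)))
        ;; Keep each list ordered, most used word first.
        (puthash key
                 (sort (cons cell (gethash key my-suffix-hash))
                       (lambda (a b) (> (cdr a) (cdr b))))
                 my-suffix-hash))
      (setq len (1+ len)))))

(defun my-best-completion (typed)
  "Return the most used word starting with TYPED, or nil if none is known."
  (car (car (gethash typed my-suffix-hash))))

(my-add-word "the" 4)
(my-add-word "then" 3)
(my-add-word "talk" 2)
(my-best-completion "t")    ; "the"
```

The lookup is one `gethash' plus one `car'; all the sorting cost is paid
in `my-add-word'.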
>> In this case the cons cells for each word are shared between the
>> hashes, so this is not a massive memory waste as the written
>> version appears.
Stefan> Each word of N letters has:
Stefan> - one string (i.e. N + 16 bytes)
Stefan> - one cons-cell (8 bytes)
Stefan> - one hash-table entry (16 bytes)
Stefan> in usage-hash, plus:
Stefan> - N cons-cells (N*8 bytes)
Stefan> - N hash entries shared with other words (at least 16 bytes).
Stefan> For a total of 9*N + 56 bytes per word. Probably not a big
Well, there are other reasons as well. When I update the cons in the
usage hash, it's automatically updated in the suffix hash as well. That
was the main reason.
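
To illustrate the sharing, here is a small sketch. The names
`my-bump-count', `my-demo-usage' and `my-demo-suffix' are made up for
the example. Bumping the count through the usage hash changes the cell
that the suffix list also holds, but the list order still has to be
fixed by a re-sort.

```elisp
;; Hypothetical names; shows one cons cell visible from both hashes.
(defun my-bump-count (word usage)
  "Increment the count stored for WORD in the hash table USAGE.
USAGE maps a word to its (WORD . COUNT) cell.  Returns the new count,
or nil if WORD is unknown."
  (let ((cell (gethash word usage)))
    (when cell
      (setcdr cell (1+ (cdr cell))))))

(defvar my-demo-usage (make-hash-table :test 'equal))
(defvar my-demo-suffix (make-hash-table :test 'equal))

(let ((cell-a (cons "the" 4))
      (cell-b (cons "then" 4)))
  (puthash "the" cell-a my-demo-usage)
  (puthash "then" cell-b my-demo-usage)
  ;; The very same cells go into the suffix list.
  (puthash "t" (list cell-a cell-b) my-demo-suffix))

;; Update through the usage hash only.
(my-bump-count "then" my-demo-usage)

;; The suffix list already sees the new count; only its order is stale.
(puthash "t"
         (sort (gethash "t" my-demo-suffix)
               (lambda (a b) (> (cdr a) (cdr b))))
         my-demo-suffix)
```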
>> Ideally I would want to build up these word usage statistics as
>> they are typed, but as you say it's hard to do this. I think a
>> flyspell-like approach combined with text properties should work
Stefan> How do you avoid counting the same instance of a word
Stefan> several times? Oh, you mark them with a text-property, I
Stefan> see. More like font-lock than flyspell.
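
A rough sketch of the text-property idea, assuming a property called
`my-counted' (a made-up name) and a plain word-to-count hash table. It
ignores the case where a word that has already been counted is edited
afterwards.

```elisp
;; Hypothetical function and property names; a sketch only.
(defun my-count-new-words (usage)
  "Count words in the current buffer that are not yet marked.
Counts go into the hash table USAGE, which maps a word to a number.
Each counted word gets a `my-counted' text property so that a later
pass skips it."
  (save-excursion
    (goto-char (point-min))
    (while (re-search-forward "\\w+" nil t)
      (let ((beg (match-beginning 0))
            (end (match-end 0)))
        (unless (get-text-property beg 'my-counted)
          (let ((word (match-string-no-properties 0)))
            (puthash word (1+ (gethash word usage 0)) usage))
          ;; Mark the word so it is not counted a second time.
          (put-text-property beg end 'my-counted t))))))
```

Calling it twice on unchanged text leaves the counts alone; only words
typed since the last pass are added. Note that `put-text-property'
marks the buffer as modified, so real code would need to restore the
modification flag.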
>> The serialization would be to enable saving across sessions. Most
>> of the packages I know that do this depend on their objects
>> having a read syntax, which doesn't work with hashes. I think the
>> solution here is to convert the thing into a big alist to save
>> it, and then reconstruct the hashes on loading.
Stefan> Why not reconstruct the suffix upon loading? This way you
Stefan> have no sharing to worry about and you can just dump the
Stefan> hash via maphash & pp.
Yes, I think that's going to be my plan. Normally I sort the alist in
the suffix hash after every update, but if I disable this, and then do
them all at once, it should be quicker....
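
Roughly what I have in mind, as a sketch: dump the usage hash as an
alist with `maphash' and `pp-to-string', then rebuild both hashes from
it and sort each suffix list once at the end. The function names are
hypothetical, and it dumps to a string rather than a file to keep the
example short.

```elisp
;; Hypothetical names; USAGE maps a word to its (WORD . COUNT) cell.
(defun my-dump-usage (usage)
  "Return a string holding the hash table USAGE as an alist in read syntax."
  (let (alist)
    (maphash (lambda (_word cell)
               (push (cons (car cell) (cdr cell)) alist))
             usage)
    (pp-to-string alist)))

(defun my-load-usage (string)
  "Rebuild the hashes from STRING as produced by `my-dump-usage'.
Returns (USAGE . SUFFIX), with each cell shared between the two tables."
  (let ((usage (make-hash-table :test 'equal))
        (suffix (make-hash-table :test 'equal)))
    (dolist (cell (car (read-from-string string)))
      (puthash (car cell) cell usage)
      (let ((len 1))
        (while (<= len (length (car cell)))
          (let ((key (substring (car cell) 0 len)))
            ;; No sorting yet, just collect the cells.
            (puthash key (cons cell (gethash key suffix)) suffix))
          (setq len (1+ len)))))
    ;; Sort every list once, after everything is loaded.
    (maphash (lambda (key cells)
               (puthash key
                        (sort cells (lambda (a b) (> (cdr a) (cdr b))))
                        suffix))
             suffix)
    (cons usage suffix)))
```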
>> Anyway the idea for all of this was to do a nifty version of
>> abbreviation expansion, something like dabbrev-expand, but
>> instead of searching local buffers, it would grab word stats as
>> it's going, and use these to offer appropriate suggestions. I was
>> thinking of a user interface a little bit like the buffer/file
>> switching of ido.el, of which I have become a committed user.
Stefan> Sounds neat.
>> the way, building a decent UI around this will probably take 10
>> times as much code!
Stefan> And even more time,
I've almost got a nasty version (where you build the dictionary
explicitly rather than automatically) working.