Re: A couple of lisp questions
12 Nov 2003 16:29:15 +0000
Gnus/5.09 (Gnus v5.9.0) Emacs/21.2.93
>>>>> "Stefan" == Stefan Monnier <address@hidden> writes:
>> First, I want to call a function everytime a new word has been
>> typed into a buffer. The only way that I can think of doing this
>> at the
Stefan> As you can imagine there's no perfect answer here, since
Stefan> words can be typed piecemeal, or backwards, or split into
Stefan> two, or joined, or modified in some other way. So it's not
Stefan> even clear what you mean by "everytime a new word has been
That is indeed the case. I've been thinking about this a bit more, and
I think I have a somewhat better solution: move back some distance
behind point, gathering words, and then mark them with text properties
to record that they have been found.
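The marking step might look something like this; a minimal sketch, not
the actual implementation, using a hypothetical `processed' property:

```elisp
(defun my-gather-recent-words ()
  "Collect words before point that lack the `processed' property.
Returns the words in buffer order, and marks each one so it is
not gathered again on the next call."
  (save-excursion
    (let (words)
      ;; Walk backwards word by word until we hit the buffer start
      ;; or a word that has already been marked.
      (while (and (forward-word -1)
                  (not (get-text-property (point) 'processed)))
        (let ((end (save-excursion (forward-word 1) (point))))
          (push (buffer-substring-no-properties (point) end) words)
          (put-text-property (point) end 'processed t)))
      words)))
```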
>> This does not work in all cases, so better ideas would be
Stefan> Take a look at how flyspell does it. Or maybe auto-fill.
I will. I think auto-fill cheats, though, as it's tied directly into
the command loop; I seem to remember reading that somewhere.
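The flyspell-style alternative is to run from a hook rather than the
command loop. A rough sketch, assuming a hypothetical handler
`my-word-hook-function' that does the actual word gathering:

```elisp
(defun my-after-change (beg end len)
  "Call the word handler when a word-ending character is inserted.
BEG, END and LEN are the standard `after-change-functions' arguments;
LEN is 0 for a pure insertion."
  (when (and (= len 0)                 ; an insertion, not a deletion
             (= (- end beg) 1)         ; of a single character
             (memq (char-after beg) '(?\s ?\n ?. ?, ?! ??)))
    (my-word-hook-function)))

;; Install buffer-locally.
(add-hook 'after-change-functions #'my-after-change nil t)
```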
>> Second, my data structures are currently using a hashtable, and a
>> set of lists. The hashtable has a nice feature which is key/value
>> weakness. I would really like to use this feature, but over an
>> ordered list structure rather than a hash. As far as I can tell
>> the only way I can use a weak reference is through the
>> hashtable. There are no other weak data structures?
>> Third, is there a good way of serializing hashtables, so that I
>> can load them again next time from a file? To get my system to
>> work I need multiple hashtables sharing the same objects not just
>> objects with the same values, so it's fairly complicated.
Stefan> As you probably know, the answer to both is "no can do".
Stefan> But if you provide more info about what you're trying to do
Stefan> (rather than how you're trying to do it), maybe there's a
Stefan> good answer that does not involve the usual "patches
I'm building up a dictionary of words used as the user types, along
with word usage statistics.
I have two hashes, like so:
usage-hash: "the" --> ("the" . 4)
"and" --> ("and" . 6)
which records the usages of a specific word.
Then a suffix hash
suffix-hash: "t" --> (("the" . 4) ("then" . 3) ("talk" . 2) etc)
"th" --> (("the" . 4) etc )
"the" --> (("the" . 4) etc )
which maps each prefix to the words that start with it.
In this case the cons cells for each word are shared between the two
hashes, so this is not the massive memory waste that the written
version above suggests.
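The shared-cons layout could be built up along these lines; a sketch
under the assumption of a modern Emacs (the `push'-on-`gethash'
generalized variable), with names `usage-hash', `suffix-hash' and
`record-word' invented for illustration:

```elisp
(defvar usage-hash (make-hash-table :test 'equal)
  "Map from word to its shared (WORD . COUNT) cell.")
(defvar suffix-hash (make-hash-table :test 'equal)
  "Map from prefix to a list of shared (WORD . COUNT) cells.")

(defun record-word (word)
  "Record one use of WORD in both hashes, sharing the count cell."
  (let ((cell (or (gethash word usage-hash)
                  (let ((new (cons word 0)))
                    (puthash word new usage-hash)
                    ;; Register the same cell under every prefix of
                    ;; WORD, so both hashes see count updates.
                    (dotimes (i (length word))
                      (push new (gethash (substring word 0 (1+ i))
                                         suffix-hash)))
                    new))))
    ;; Bumping the count here is visible through both hashes.
    (setcdr cell (1+ (cdr cell)))))
```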
Ideally I would want to build up these word usage statistics as they
are typed, but as you say it's hard to do this. I think a
flyspell-like approach combined with text properties should work okay.
Anyway, the idea with the weakness is that I want to garbage-collect
the dictionary periodically, throwing away old or rarely used words.
But currently I have to keep the two hashes in sync by hand. I was
wondering whether it would be possible to use weakness to do this
automatically. But the second of the two hashes has values which are
in an alist, which would defeat this.
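For reference, hash tables are indeed the only weak structure Emacs
Lisp offers; a sketch of what would be available:

```elisp
;; An entry whose value is otherwise unreferenced becomes eligible
;; for removal at garbage collection.
(setq usage-hash (make-hash-table :test 'equal :weakness 'value))
```

But exactly as described above, the cons cells sitting in the second
hash's alists stay reachable, so value weakness alone cannot prune
both tables in step.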
I'm not sure that this is too much of a problem. The performance that
I am getting from these data structures is fairly good.
The serialization would be to enable saving across sessions. Most of
the packages I know that do this depend on their objects having a read
syntax, which doesn't work with hashes. I think the solution here is
to convert the thing into a big alist to save it, and then reconstruct
the hashes on loading.
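The alist round-trip could be sketched like this (the function names
are hypothetical, and `suffix-hash' is rebuilt from the saved cells so
the cons sharing is restored on load):

```elisp
(defun dict-save (file)
  "Write the usage cells to FILE as a plain alist."
  (with-temp-file file
    (let (alist)
      (maphash (lambda (_key cell) (push cell alist)) usage-hash)
      (prin1 alist (current-buffer)))))

(defun dict-load (file)
  "Rebuild both hashes from FILE, re-sharing the cons cells."
  (setq usage-hash (make-hash-table :test 'equal)
        suffix-hash (make-hash-table :test 'equal))
  (with-temp-buffer
    (insert-file-contents file)
    (dolist (cell (read (current-buffer)))
      (let ((word (car cell)))
        (puthash word cell usage-hash)
        ;; The very same CELL goes under every prefix, so the two
        ;; hashes share storage again after loading.
        (dotimes (i (length word))
          (push cell (gethash (substring word 0 (1+ i))
                              suffix-hash)))))))
```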
Anyway, the idea for all of this was to do a nifty version of
abbreviation expansion, something like dabbrev-expand, but instead of
searching local buffers, it would gather word statistics as it goes, and
use these to offer appropriate suggestions. I was thinking of a user
interface a little bit like the buffer/file switching of ido.el, of
which I have become a committed user.
It's just an idea at the moment, with only the basic data structures.
As is the way, building a decent UI around this will probably take 10
times as much code! I think the chances are it will be too intrusive
to be of any use to most users. But you never know till you try.