Re: [Nmh-workers] Improving reading mime email.

From: Jon Steinhart
Subject: Re: [Nmh-workers] Improving reading mime email.
Date: Thu, 19 May 2005 20:50:08 -0700

> >invoked when messages are inc'd, rmm'd, and refiled.  Part of my project,
> >grokmail, builds a real database from your mail messages.  So you can do
> By 'real database' do you mean a Berkeley/MySQL-type DB?  Is this sort
> of like supporting IMAP?

Actually, it's in Sleepycat, i.e., Berkeley DB.  It's a complicated setup.
Don't know anything about IMAP.

> >Of course, the real magic of grokmail is that you can train it by
> >ranking messages on a scale of 1-10 and then scan for interesting
> >messages.
> Like GNUS for Emacs?  That would be really cool.  Bayesian filtering of
> messages would be a funky feature, and not that hard to implement, from
> my PoV, e.g.:
>       * Nathan currently has 20 messages in his inbox.
>       * He reads the one from "My Boss" first.
>       * He then reads the one from "Brother in USA"
>       * He then deletes (without reading) the three with "Rx" in the
>         title (that somehow escaped the spam filter).
>       * &tc.
> => Emails from "My Boss" or "Brother in USA" should be highlighted /
> upgraded vs. emails with Rx in title should be downgraded.  All we'd
> need to do is build in some extra smarts in scan/show/rmm that
> monitored how we manage new mail against the mail corpus that is
> there.  Messages could then be tagged (annotated?) according to the
> learning, for use/manipulation by mh/Unix tools.
> re,
> N

Don't know anything about GNUS for Emacs.  But yes, it's a filtering
mechanism.  It's actually much harder to implement than you might
think.  Most Bayesian systems rely on occasional training when things
start going bad.  grokmail trains all of the time.  My client has close
to a half million messages in his mail folders, so it is non-trivial
to do this with reasonable performance.
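
To make the "trains all of the time" idea concrete, here is a minimal
sketch of an online naive Bayes scorer in Python.  It is not grokmail's
algorithm, which this message does not describe.  The class name, the
rank threshold of 6, and the tokenizer are all assumptions made for the
illustration.  Every ranked message updates the counts immediately, so
there is no separate retraining pass.

```python
import math
import re
from collections import Counter


class IncrementalRanker:
    """Toy online naive Bayes; hypothetical, not grokmail's implementation."""

    def __init__(self, threshold=6):
        # Assumption: ranks >= threshold (on the 1-10 scale) mean "interesting".
        self.threshold = threshold
        self.token_counts = {True: Counter(), False: Counter()}
        self.message_counts = {True: 0, False: 0}

    @staticmethod
    def tokens(text):
        # Crude tokenizer: the set of lowercase alphanumeric words.
        return set(re.findall(r"[a-z0-9]+", text.lower()))

    def train(self, text, rank):
        # Update counts for one ranked message; cost depends only on its length.
        label = rank >= self.threshold
        self.message_counts[label] += 1
        self.token_counts[label].update(self.tokens(text))

    def log_odds(self, text):
        # Positive result: more like the interesting messages seen so far.
        # Simplification: only tokens present in the message contribute.
        good, bad = self.message_counts[True], self.message_counts[False]
        score = math.log((good + 1) / (bad + 1))
        for tok in self.tokens(text):
            # Laplace-smoothed estimate of P(token present | class).
            p_good = (self.token_counts[True][tok] + 1) / (good + 2)
            p_bad = (self.token_counts[False][tok] + 1) / (bad + 2)
            score += math.log(p_good / p_bad)
        return score


ranker = IncrementalRanker()
ranker.train("From: My Boss Subject: status report", 9)
ranker.train("From: Brother in USA Subject: hello", 8)
for _ in range(3):
    ranker.train("Subject: cheap Rx meds", 1)

print(ranker.log_odds("From: My Boss Subject: meeting") > 0)
print(ranker.log_odds("Subject: Rx refill") < 0)
```

Each update here is cheap, but the counters live in memory.  With
hundreds of thousands of messages they would have to sit in an on-disk
store, which is part of what makes reasonable performance hard.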

I don't particularly agree with your usage model where reading a message
indicates that it's interesting.  Not a valid model from what I've looked at.

One of the reasons that I did the first implementation of this for nmh is
that since it isn't a monolithic mail system it is easy to add new commands
to do new things.  The main commands are grank for ranking, and then gpick
and gscan which are analogous to pick and scan.

