nmh-workers

[Nmh-workers] indexing


From: Paul Vixie
Subject: [Nmh-workers] indexing
Date: Sat, 05 Feb 2011 02:39:10 +0000

hi all.  i haven't worked on mh in quite a while.  i sent kenh some patches
to use locking on the 'context' files back in 2003 and i see that those made
it in.  previously i guess it had been a REALLY long time since i hacked mh:

        To: address@hidden
        Subject: Getting MH 6.8.3 to run on Digital UNIX 3.2 (DEC OSF/1 3.2)
        Date: Thu, 14 Sep 1995 23:34:19 -0700
        From: Paul A Vixie <address@hidden>

so, today i sent in an updated "port" for freebsd to boost nmh from 1.2->1.3,
and then i started thinking about my real problem, which is the impedance
mismatch between MH and IMAP.  i've had a few discussions with mark crispin
about this and i've sent him a number of uw-imap/src/osdep/unix/mh.c patches
over the years and i've finally got an inkling about what needs to be done.

IMAP thinks in terms of message sequence numbers and also unique identifiers.
both are similar in appearance to MH message numbers but neither is an exact
fit.  an MH message number is not persistent, whereas IMAP UIDs are -- even
if they are "nonsticky".  sorting a folder or refiling things out of a folder
in mh-e/nmh while imapd also has that folder open means desynchronization.
(i have cron jobs that do this, so it's not just a matter of self control.)
similarly, the MH message numbers in a folder can have holes, whereas IMAP
message sequence numbers never do.  imapd's MH driver jumps through some
hoops to try to even this out, but it's slow and unreliable.
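
to make the mismatch concrete, here is a minimal sketch in python (not nmh or imapd code; the function name msn_map is made up for illustration).  it assumes only what is described above: MH numbers may have holes, while IMAP message sequence numbers always run 1..n with no gaps.

```python
def msn_map(mh_numbers):
    """Map each MH message number to a dense, 1-based sequence number."""
    return {mh: seq for seq, mh in enumerate(sorted(mh_numbers), start=1)}

folder = [1, 2, 5, 9]            # MH numbers with holes at 3-4 and 6-8
print(msn_map(folder))           # {1: 1, 2: 2, 5: 3, 9: 4}

folder.remove(2)                 # refile message 2 out of the folder
print(msn_map(folder))           # {1: 1, 5: 2, 9: 3}
```

removing one message leaves the surviving MH numbers alone but shifts every later sequence number, which is why a client holding the old mapping goes out of sync.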

noting that slocal already links to either db, ndbm, or gdbm (according to
portability logic in the Makefile) i've thought of adding a per-folder index
(so, ~/Mail/inbox/.mh.db or similar) that maps these number sequences to
each other.  every nmh "sbr" routine or "uip" program that can alter a
folder would update this index as a side effect.  every one that depends on
this index (right now none of them do, but see below) would be capable of
regenerating the whole index, on the assumption that somebody used "mv" or
"cp" to move messages around.  the idea would be that with no changes needed
in the use of MH commands or of shells like imapd or mh-e that read things
directly without going through the MH commands, an index could be maintained
that would give imapd (or eventually dovecot when i get that far) what it
needs.
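
a rough sketch of the "regenerate the whole index" step, in python rather than C so it fits in a few lines.  the file name .mh.db comes from the paragraph above; the record layout ("ino:", "uid:", "next_uid") and the function names are hypothetical.  the big assumption, which is mine and not part of the proposal, is that a message can be recognized after renumbering by its inode, which survives a rename but not a copy.

```python
import dbm
import os

INDEX_NAME = ".mh.db"  # name floated in the post; the record layout is hypothetical

def message_numbers(folder):
    """Sorted MH message numbers, i.e. the all-digit file names in the folder."""
    return sorted(int(n) for n in os.listdir(folder) if n.isascii() and n.isdigit())

def rebuild_index(folder):
    """Regenerate the index from the directory, as after a manual mv or cp.

    Assumption: a message keeps its identity through rename() via its inode.
    Returns {uid: mh_number} for the messages now in the folder.
    """
    with dbm.open(os.path.join(folder, INDEX_NAME), "c") as db:
        next_uid = int(db.get(b"next_uid", b"1"))
        # Remember which inode had which UID before wiping the old records.
        old = {k: db[k] for k in db.keys() if k.startswith(b"ino:")}
        for k in list(db.keys()):
            del db[k]
        uids = {}
        for num in message_numbers(folder):
            st = os.stat(os.path.join(folder, str(num)))
            ino_key = b"ino:%d" % st.st_ino
            if ino_key in old:
                uid = int(old[ino_key])                  # known message, keep its UID
            else:
                uid, next_uid = next_uid, next_uid + 1   # new message, new UID
            db[ino_key] = b"%d" % uid
            db[b"uid:%d" % uid] = b"%d" % num
            uids[uid] = num
        db[b"next_uid"] = b"%d" % next_uid
    return uids
```

calling rebuild_index() on a folder holding messages 1, 2 and 5 would hand out UIDs 1..3; after renaming 5 to 3 (the kind of renumbering a pack or sort does) a second call keeps UID 3 attached to the same message.  the sketch ignores locking, inode reuse after a delete, and IMAP's rule that UIDs ascend in mailbox order, all of which a real version would have to face.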

i can also imagine storing the full rfc822 header object in this index so
that "scan" and many forms of "pick" can operate at the speed of modern
hardware.  (stat()'ing ten thousand files in a directory has not gotten
faster over the years, whereas dbm_fetch()'ing 10000 elements has gotten
really quite fast compared to the vax 750 i first used MH on.)
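
a sketch of the header-cache idea, again in python and again with made-up names (cache_headers, scan_from_index, the "hdr:" key prefix).  it assumes messages are stored with plain LF line endings, so the header block ends at the first empty line, and it makes no attempt to notice when a cached entry has gone stale.

```python
import dbm
import os
from email import message_from_bytes

INDEX_NAME = ".mh.db"  # name floated in the post; the "hdr:" keys are hypothetical

def cache_headers(folder):
    """Store each message's raw header block in the folder's index."""
    with dbm.open(os.path.join(folder, INDEX_NAME), "c") as db:
        for name in os.listdir(folder):
            if name.isascii() and name.isdigit():
                with open(os.path.join(folder, name), "rb") as f:
                    raw = f.read()
                # Everything before the first empty line is the header block.
                header, _, _ = raw.partition(b"\n\n")
                db[b"hdr:" + name.encode("ascii")] = header

def scan_from_index(folder):
    """Build a scan-like listing from the index alone, without opening messages."""
    lines = []
    with dbm.open(os.path.join(folder, INDEX_NAME), "r") as db:
        keys = [k for k in db.keys() if k.startswith(b"hdr:")]
        for key in sorted(keys, key=lambda k: int(k[4:])):
            msg = message_from_bytes(db[key])
            lines.append("%4d  %-20s %s" % (int(key[4:]),
                                            msg.get("From", ""),
                                            msg.get("Subject", "")))
    return lines
```

once cache_headers() has run, scan_from_index() touches one database file instead of every message file, which is the whole point of the exercise.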

before i continue, i'll pause here and give folks a chance to say "this has
been done before and it never worked out well and the whole idea is now
poisoned" or similar.

cheers.

paul


