Re: [Nmh-workers] indexing

From: Paul Vixie
Subject: Re: [Nmh-workers] indexing
Date: Sun, 06 Feb 2011 03:30:01 +0000

> From: address@hidden
> Date: Sat, 05 Feb 2011 20:22:35 -0500
> I've seen this idea several times, and I always have the same question
> - how would we deal with index/cache synchronization?  One of the
> reasons I'm still using MH/exmh is because the one message per file
> paradigm means that you can do interesting things with regular Unix
> commands - except if you screw up and use /bin/mv or /bin/rm rather
> than refile and rmm, you end up with the index no longer matching
> reality.

right now i have an index that can be kept up to date with existing hooks
and some small outboard programs, and i have a couple of scripts that can
rebuild the index for a single folder or for all folders.  i have not yet
built the incremental rebuild/checker that only cleans up after "rm" and
"mv" but i agree that one is needed and it must be Really Fast for folders
that are only slightly out of synch.

anything that looks at the index will have to check for index freshness
and be ready to call the incremental rebuild/checker as nec'y.
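
a minimal sketch of what such an incremental checker could look like, assuming (hypothetically) that the index is a dict keyed by message number and that every all-digit file name in the folder is a message. the name `reconcile` and the metadata fields are illustrative only, not part of nmh or of the index described above.

```python
import os

def reconcile(folder, index):
    """Bring `index` (dict: message number -> metadata) in line with `folder`.

    Only looks at which message numbers exist, so it cleans up after a
    stray rm or mv without opening any message file.
    Returns (added, removed) as sorted lists of message numbers.
    """
    on_disk = {
        int(name)
        for name in os.listdir(folder)
        # MH message files are named with plain decimal numbers
        if name.isascii() and name.isdigit() and not name.startswith("0")
    }
    removed = sorted(set(index) - on_disk)
    added = sorted(on_disk - set(index))
    for num in removed:
        del index[num]
    for num in added:
        st = os.stat(os.path.join(folder, str(num)))
        # hypothetical metadata; a real index would hold parsed headers
        index[num] = {"size": st.st_size, "mtime": st.st_mtime}
    return added, removed
```

the cost is one directory listing plus one stat() per new message, so a folder that is only slightly out of sync is cheap to repair. a message overwritten in place under the same number would not be noticed by this sketch; catching that would need a size or mtime comparison per entry.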

> This may be easier to deal with on recent Linux kernels, where you can use
> stuff like the inotify facility and leave a process running to catch such
> activity and clean up the cache.  But that's hell on portability....

not only hell on portability, but also a lot of moving parts and likely to
be unreliable.  and, "not the MH way."


> Date: Sat, 5 Feb 2011 18:04:14 -0800 (PST)
> From: Lyndon Nerenberg <address@hidden>
> If you're willing to live with "almost 100% accurate" you can go a
> long way just by comparing the index/cache file mod times with the
> directory mod time.

i think we can get all the way to 100% accuracy by looking at the ctime
of the directory and the mtime of the index and doing incremental fixage
before accessing the index.  this should be pretty rare.
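
a rough sketch of that freshness test, assuming a Unix-like system and assuming the index file lives outside the folder (or is rewritten in place), so that writing the index does not itself change the directory's ctime. the function name `index_is_stale` is hypothetical.

```python
import os

def index_is_stale(folder, index_path):
    """Return True if `folder` may have changed since the index was written."""
    try:
        index_mtime = os.stat(index_path).st_mtime_ns
    except FileNotFoundError:
        return True  # no index yet, so it has to be built
    # creating, removing, or renaming an entry updates the directory's ctime.
    # ">=" treats equal timestamps as stale, which is the conservative choice
    # on filesystems with coarse timestamp granularity.
    return os.stat(folder).st_ctime_ns >= index_mtime
```

a caller would run this before every index read and invoke the incremental fixup only when it returns True, which keeps the common case down to two stat() calls.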

> If you consider the messages to be immutable (which they are, with the
> exception of anno mucking about adding headers) the only thing that's
> really going to put you out of sync is if something renumbers the
> files in the folder.

on the topic of 'anno', the IMAP protocol thinks that headers are immutable,
so much so that if they are changed then a new UID must be assigned.  i
think this means that a correct IMAP server must elide the 'anno' headers
but i haven't got that far yet.

> And since the only way that's likely to happen is with pack, the
> index/cache would get updated with the new file names.

well, also sortm, but your point is valid.

> And as with the existing sequences implementation, the only way you get
> 100% consistency is by making the message store a black box, at which
> point it's no longer MH.

i surely do like MH.  one day in... 1986? i was visiting jordan hubbard
when he lived in the oakland hills and his girlfriend at that time (kim
manton) noted that i was using ucbmail as my primary mailer and she
said, and i'll never forget this, "why aren't you using MH? everybody
who's anybody uses MH."  i'd never heard of MH, but i tried it and never
stopped.  most people i knew who used MH have stopped, and are now happy
gmail web clients or outlook or Mail.app IMAP clients, but i can't do it,
i need 'pick' and 'refile'.

i just need MH to be faster for 10GByte mail stores than it is, and i
need it to be reliable through an IMAP interface.  i see no reason why
i should ever delete e-mail, 10GByte is small by today's standards.  but
i know i won't be able to keep open()'ing every file to see what's going
on unless i'm willing to run two MH stores and put everything older than
five years into the one i never look at.  which "would no longer be MH."
