Re: [Nmh-workers] nmh internals: full MIME integration


From: Ralph Corderoy
Subject: Re: [Nmh-workers] nmh internals: full MIME integration
Date: Sat, 26 Jul 2014 11:12:07 +0100

Hi Ken,

> Right now a call to the MIME parsing routines ends up slurping in the
> whole message, but that's not desirable for a lot of programs (scan,
> pick).  It seems like parsing all of the message's headers is generally
> worthwhile; that (usually) fits within a single stdio buffer, so doing
> extra work there shouldn't be a huge problem.

If we're having lazy evaluation of MIME parts, which is good, can it
also cover the headers?  `pick --list-id <address@hidden>' isn't concerned
with decoding Subject and all those Received headers.  It may not sound
like much, but we have folders with tens of thousands of emails.
get_header() could note minimal details of each header it comes across
whilst searching for the List-ID but not bother too much about their
contents.
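
As a rough sketch of that idea, and not nmh's actual code: the scan
below only records where each header field sits in the raw buffer and
decodes nothing.  The names `struct hdr_span', `index_headers()' and
`find_header()' are made up for illustration, and it assumes the message
is already in memory with LF line endings.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical: where one header field sits in the raw buffer. */
struct hdr_span {
    size_t name_off, name_len;  /* field name, without the colon */
    size_t body_off, body_len;  /* raw body incl. folded lines, undecoded */
};

/* Note the position of each field in buf[0..len) up to the blank line
 * that ends the header block.  Returns the number of fields recorded
 * (at most max).  Assumes LF line endings; stops at a malformed line. */
size_t index_headers(const char *buf, size_t len,
                     struct hdr_span *out, size_t max)
{
    size_t n = 0, pos = 0;

    assert(buf != NULL && out != NULL);
    while (pos < len && n < max) {
        const char *eol = memchr(buf + pos, '\n', len - pos);
        size_t line_end = eol ? (size_t)(eol - buf) : len;
        if (line_end == pos)
            break;                      /* blank line: end of headers */
        const char *colon = memchr(buf + pos, ':', line_end - pos);
        if (colon == NULL)
            break;                      /* not a header line */
        size_t end = line_end;
        /* Absorb folded continuation lines (leading space or tab). */
        while (end + 1 < len && (buf[end + 1] == ' ' || buf[end + 1] == '\t')) {
            const char *e = memchr(buf + end + 1, '\n', len - (end + 1));
            end = e ? (size_t)(e - buf) : len;
        }
        out[n].name_off = pos;
        out[n].name_len = (size_t)(colon - buf) - pos;
        out[n].body_off = (size_t)(colon - buf) + 1;
        out[n].body_len = end - out[n].body_off;
        n++;
        pos = end < len ? end + 1 : len;
    }
    return n;
}

/* Return the first field called `name' (case-insensitive), or NULL. */
const struct hdr_span *find_header(const char *buf,
                                   const struct hdr_span *idx, size_t n,
                                   const char *name)
{
    size_t want = strlen(name);

    for (size_t i = 0; i < n; i++) {
        size_t j = 0;
        if (idx[i].name_len != want)
            continue;
        while (j < want &&
               tolower((unsigned char)buf[idx[i].name_off + j]) ==
               tolower((unsigned char)name[j]))
            j++;
        if (j == want)
            return &idx[i];
    }
    return NULL;
}
```

A caller after List-ID would call index_headers() once, then
find_header(buf, idx, n, "list-id"), and only decode the one body it
cares about; the Subject and Received bodies are never touched.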

Also, http://www.ietf.org/rfc/rfc2919.txt says only one List-ID per
email;  does nmh have knowledge of one-off headers so it can stop
reading headers on the first match?  The pick below uses the `--' form
because pick doesn't know of List-ID, unlike, say, Subject;  perhaps it
needs to know of more official headers so it can make use of
one-off-ness.
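
To make the one-off idea concrete, here is a minimal sketch.  It is an
illustration only, not how pick works today: the table contents, the
function names and the plain substring match (pick really uses regular
expressions) are all assumptions, and folded continuation lines are
counted but not searched.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Illustrative, incomplete table of fields allowed at most once per
 * message (RFC 5322 for the first four, RFC 2919 for List-ID). */
static const char *const single_instance[] = {
    "Date", "From", "Subject", "Message-ID", "List-ID"
};

/* Case-insensitive comparison of the first n bytes of a and b. */
static int name_eq(const char *a, const char *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (tolower((unsigned char)a[i]) != tolower((unsigned char)b[i]))
            return 0;
    return 1;
}

int is_single_instance(const char *name)
{
    size_t n = strlen(name);
    size_t count = sizeof single_instance / sizeof single_instance[0];

    for (size_t i = 0; i < count; i++)
        if (strlen(single_instance[i]) == n &&
            name_eq(single_instance[i], name, n))
            return 1;
    return 0;
}

/* Look through the header block of msg (NUL-terminated, LF line
 * endings) for a `name' field whose body contains `needle'.
 * Returns 1 on a match, else 0.  *lines_read is set to the number of
 * header lines examined, to show when the scan stopped early. */
int header_matches(const char *msg, const char *name,
                   const char *needle, size_t *lines_read)
{
    assert(msg != NULL && name != NULL && needle != NULL && lines_read != NULL);
    int once = is_single_instance(name);
    size_t nlen = strlen(name), klen = strlen(needle);

    *lines_read = 0;
    for (const char *p = msg; *p != '\0' && *p != '\n'; ) {
        const char *eol = strchr(p, '\n');
        size_t len = eol ? (size_t)(eol - p) : strlen(p);

        ++*lines_read;
        if (len > nlen && p[nlen] == ':' && name_eq(p, name, nlen)) {
            const char *body = p + nlen + 1;
            size_t blen = len - nlen - 1;

            for (size_t i = 0; i + klen <= blen; i++)
                if (memcmp(body + i, needle, klen) == 0)
                    return 1;
            if (once)
                return 0;   /* the field cannot occur again: stop now */
        }
        p = eol ? eol + 1 : p + len;
    }
    return 0;
}
```

With a table like this, a failed List-ID match ends the scan at the
List-ID line itself, whereas a multi-instance field such as Received
still has to be followed to the end of the header block.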

Whilst looking at pick's source, I found MHPDEBUG;  I don't think it's
documented but could be useful for those learning pick?  Perhaps it
should be -debug instead?

    $ MHPDEBUG=x pick -from tom -and -lbr --list-id foo -o -sub Foo -rbr .
    AND
    | PATTERN(header) ^from[        ]*:.*tom
    | OR
    | | PATTERN(header) ^list-id[   ]*:.*foo
    | | PATTERN(header) ^subject[   ]*:.*Foo
    pick: no messages match specification
    $

Also-also, access to raw and decoded headers would be nice, e.g. I
sometimes want to find Subjects that have `=?utf-8?' in them.

> The Content struct would be extended to indicate whether or not the
> complete message had been parsed; programs that just needed to examine
> the header would simply parse out those headers in the message.
> Because address parsing is common, we could parse out all of the
> addresses as well during header reading.  We could also maintain a
> list of headers that contain addresses (right now each program has to
> keep that list locally) and make a function/macro to query that.

That's the kind of overhead that would be nice to see done only on
demand.
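
For illustration, "on demand" could look something like the sketch
below.  Everything in it is hypothetical: `struct lazy_hdr',
`hdr_naddrs()' and the comma-counting stand-in for a real RFC 5322
address parser (which would have to cope with quoted strings, groups
and comments) are not taken from nmh.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical header record: the raw text is always kept, the
 * parsed result is filled in only when somebody first asks for it. */
struct lazy_hdr {
    const char *raw;    /* undecoded field body */
    int parsed;         /* nonzero once naddrs is valid */
    size_t naddrs;      /* cached result of the parse */
};

static unsigned parse_calls;    /* instrumentation for the example */

/* Stand-in parser: counts non-empty comma-separated entries. */
static size_t count_addrs(const char *raw)
{
    size_t n = 0;
    int in_entry = 0;

    parse_calls++;
    for (; *raw != '\0'; raw++) {
        if (*raw == ',') {
            in_entry = 0;
        } else if (*raw != ' ' && *raw != '\t' && !in_entry) {
            in_entry = 1;
            n++;
        }
    }
    return n;
}

/* Parse on first use, then answer from the cache. */
size_t hdr_naddrs(struct lazy_hdr *h)
{
    assert(h != NULL && h->raw != NULL);
    if (!h->parsed) {
        h->naddrs = count_addrs(h->raw);
        h->parsed = 1;
    }
    return h->naddrs;
}
```

A program like scan that never calls hdr_naddrs() pays nothing for the
parse, and one that calls it repeatedly pays only once.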

Cheers, Ralph.


