[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Nmh-workers] mime-aware filtering?

From: Paul Vixie
Subject: Re: [Nmh-workers] mime-aware filtering?
Date: Tue, 26 Jun 2012 03:08:23 +0000
User-agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:13.0) Gecko/20120614 Thunderbird/13.0.1

On 2012-06-26 2:50 AM, Jon Steinhart wrote:
> Paul Vixie writes:
>> ...
>> int
>> m_getfld (int state, unsigned char *name, unsigned char *buf,
>>           int bufsz, FILE *iob)
>> your move.
> OK, well, I understand your point of view here but I really don't think
> that my point of view is really different.  As far as I can tell (once
> I get past the dire warnings), the m_getfld looks for stuff in a mail
> message and stops once it gets what it needs.  It was designed in the
> age before MIME, so its notion of what constituted headers was limited.

not just that. its idea of what content is, is limited. its callers
generally expect fully decoded text, but it's perfectly capable of
returning quoted-printable or base64. the caller has currently got the
responsibility to know that it's encoded and to know how to decode it.
unsurprisingly, most parts of MH don't do this. so, to decode something
you use a special version of the command (mhshow vs. show, for example).
this is wrong, and it isn't working.

to revise the API we have to figure out what the callers need, yes, but
also what the callers should be forced to do differently. i think we're
going to have to start from an idealized environment and work backward
to the practical.

> Now, MIME did many things that maybe should have been kept separate in
> hindsight, but one of them was to extend the definition of headers.  So,
> I'm proposing that m_getfld be extended so that it finds these "extended"
> headers.  I'm not presently suggesting that it be extended to be able to
> decode the multiple body parts that MIME squeezes into the old definition
> of body.

i don't think you can have A and !A at the same time. either callers of
m_getfld() will continue to believe that there is just one set of
headers and that iteration through a message consists of repeated calls
to m_getfld(), or else (and this is what i think has to happen) these
callers are going to have to become MIME tolerant (note: this is not the
same as MIME aware) and that iteration consists of repeated calls to...
something... that gives it a header/body object, which might require
recursion back through itself if the object in question contains other
header/body objects rather than just a body.

at a high level, how do people feel about callbacks vs. state blobs?
that is, would we like the replacement for m_getfld() to continue to
return each time it finds something, maintaining its state in a
caller-supplied opaque state blob, or would we like it to call the
caller's "work function" every time it discovers a new object?

that's the level we have to plan at, if we're going to get MH out of the
1980's. (where it totally ruled, btw.)

> As I said in an email years ago, I'd be happy to be able to have scan
> optionally do something like this:
> 1695+ 06/26 Paul Vixie         Re: [Nmh-workers] mime-aware filtering?<<On
> 1695.1                         image/png name="foo"
> 1695.2                         application/pdf

i agree with this vision.

> It would be nice to be able to decode the body parts to flesh out the part
> subject lines but even without that it would be a huge improvement.

i can't think of a non-fundamental but still on-the-right-track rewrite
that would give *only* the above.

> I realize that this could all be done by hacking a script around mhlist
> but that is really ugly.


> Biggest internal structural change that I can think of is that we might
> want some array of fields indexed by part number, or a tree of fields.

nothing in MH currently requires the entire message, or a map of it, to
fit in memory. i think we should preserve that.

let's talk iteration and access of messages and parts and subparts.
we'll assume for now that folders are still just what they are and that
there's no need to change how we access or iterate through them. (though
there is such a need, i think we can disconnect it from this discussion
and proceed independently for now.)


reply via email to

[Prev in Thread] Current Thread [Next in Thread]