Re: [Nmh-workers] mime-aware filtering?

From: Paul Vixie
Subject: Re: [Nmh-workers] mime-aware filtering?
Date: Tue, 26 Jun 2012 23:11:34 +0000
User-agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:13.0) Gecko/20120614 Thunderbird/13.0.1

On 2012-06-26 3:02 AM, Ken Hornstein wrote:
>> int m_getfld (int state, unsigned char *name, unsigned char *buf, int
>> bufsz, FILE *iob) 
> Okay ... just shooting from the hip, and based on our discussion back
> in January ... here's something (I'm ignoring how this would be
> implemented for now, and I'm not defining any of the structures). I
> hope these functions would be obvious in operation.

this is a good start, assuming that the places which currently use
m_getfld() could be mollified by it.

> int nmh_openmsg(struct message, messagehandle *, char **error);
> int nmh_getheader(messagehandle, const char *, char **header,
>                   int *numheaders, char **error);
> int nmh_getmime(messagehandle, mimehandle_ret *, char **error);
> int nmh_openmime(mimehandle, char **type, char **subtype, int *nested,
>                  mimehandle_ret *, char **error);
> int nmh_nextmime(mimehandle, char **type, char **subtype,
>                  int *iterator, char **error);
> int nmh_closemime(mimehandle);
> int nmh_closemsg(message);
>
> I'm sure there are problems with this, just wanted to get the ball
> rolling.

i'm ignoring stylistic quirks, for example, i'd return a "struct
message *" from the open function, and it would contain function
pointers to the "methods" of the "object".

i'm ignoring correctness concerns, like how do the objects inside "char
**x; int *y" get freed.

i'm ignoring naming concerns, whereby i think that "nmh_" is the wrong
prefix for these, since they could be used for any message that's in a
disk file, even if its repository was Maildir.

focusing just on the problem statement and solution shape:

a message has a header, zero or more child parts, and may have a body.

a part has a header, zero or more child parts, and may have a body.

therefore a message is really just a special case of a part, having no
parent object.
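
as a minimal sketch of that relationship, assuming a hypothetical
"struct part" with a parent pointer (the layout and the names
part_is_message() and part_depth() are illustrative only, nothing above
defines them), a message is just the part whose parent is NULL:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical layout, for illustration only. */
struct part {
    struct part *parent;   /* NULL for the top-level message */
    const char  *type;     /* e.g. "multipart/mixed"; illustrative */
};

/* A message is simply a part that has no parent. */
static bool part_is_message(const struct part *p) {
    assert(p != NULL);
    return p->parent == NULL;
}

/* Depth is the number of ancestors: 0 for the message itself. */
static size_t part_depth(const struct part *p) {
    size_t depth = 0;
    assert(p != NULL);
    for (p = p->parent; p != NULL; p = p->parent)
        depth++;
    return depth;
}
```

with this shape, any code written against a part works unchanged on a
whole message, and the parent chain is the only structure that has to
be kept around.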

a header may specify a mime type, mime version, and/or encoding, as well
as subject:, et al.

if we want object recursion to be done by the caller and not by some
function that uses callbacks, we're in hell since most interesting mime
messages are deep.

we'd like to be able to parse in one pass, put all content (decoded) in
the file system not on the heap, and never have to remember more than
where we are in terms of object depth. that is, our stack or heap would
only know what object we were looking at, and who its ancestors are. we
would not try to represent the full message in RAM or even the full
message structure in RAM.


typedef struct mime_part *mime_part_t;

mime_part_t mime_fopen(const char *filename, const char *filemode);
mime_part_t mime_fdopen(int fileno, int mode);
void        mime_rewind(mime_part_t);
bool        mime_hasbody(mime_part_t);
size_t      mime_bodyread(mime_part_t, u_char *, size_t); /* 0 means EOF */
char *      mime_bodygets(mime_part_t, char *, size_t);   /* NULL means EOF */
bool        mime_hasparts(mime_part_t);
mime_part_t mime_nextpart(mime_part_t);
void        mime_dispose(mime_part_t);


this assumes that every iterator will keep a linked list of ancestors
while tree walking -- something that used callbacks would be just as
difficult but in a different way. it makes no provision for writing MIME
objects, and does not show how one retrieves the content type or mime
version or any other header.
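
to make the "linked list of ancestors" concrete, here is a rough sketch
of a caller-driven, pre-order walk. everything in it is an assumption
made for illustration: "struct node" is a toy in-memory tree standing in
for the message file (a real parser would read parts from disk instead
of holding them in RAM), and walk_next() and walk_depth() are
hypothetical names, not part of the API above. the point is only that
the walker's own state is one frame per ancestor and nothing more.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for parts in a file: first child and next sibling. */
struct node {
    const char        *type;
    const struct node *child;
    const struct node *next;
};

/* One frame per part on the path from the root to the current part. */
struct frame {
    const struct node *node;
    struct frame      *up;     /* parent frame, NULL at the root */
};

struct walker {
    const struct node *root;
    struct frame      *top;     /* current part; NULL before/after walk */
    int                started;
};

static const struct node *push(struct walker *w, const struct node *n) {
    struct frame *f = malloc(sizeof *f);
    if (f == NULL)
        abort();
    f->node = n;
    f->up = w->top;
    w->top = f;
    return n;
}

static const struct node *pop(struct walker *w) {
    struct frame *f;
    const struct node *n;
    assert(w->top != NULL);
    f = w->top;
    n = f->node;
    w->top = f->up;
    free(f);
    return n;
}

/* Return the next part in pre-order, or NULL when the walk is done. */
static const struct node *walk_next(struct walker *w) {
    if (!w->started) {
        w->started = 1;
        return w->root != NULL ? push(w, w->root) : NULL;
    }
    if (w->top == NULL)
        return NULL;                       /* already exhausted */
    if (w->top->node->child != NULL)       /* descend first */
        return push(w, w->top->node->child);
    while (w->top != NULL) {               /* else climb until a sibling */
        const struct node *done = pop(w);
        if (done->next != NULL)
            return push(w, done->next);
    }
    return NULL;
}

/* Number of frames held: the current part plus all of its ancestors. */
static size_t walk_depth(const struct walker *w) {
    size_t d = 0;
    const struct frame *f;
    for (f = w->top; f != NULL; f = f->up)
        d++;
    return d;
}
```

the caller just loops on walk_next() until it returns NULL. memory use
is proportional to nesting depth, not to the number of parts, which is
the property wanted from the real thing.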

it's otherwise patterned after "MIME::Parser(3) -- User Contributed Perl

