Re: [Nmh-workers] Thoughts: header/address parsing

From: Ralph Corderoy
Subject: Re: [Nmh-workers] Thoughts: header/address parsing
Date: Sun, 03 Aug 2014 11:37:30 +0100

Hi Ken,

> If the message-id header doesn't match the RFC 5322 syntax, should we
> care?  I say no.

Nurse!  My pills!

Seriously, I was thinking of writing something to verify the minutiae
separately from nmh, so if nmh ducks the issue that's fine by me.
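
As a rough illustration of what such a standalone check might look like,
here is a minimal Python sketch that tests a Message-ID value against the
non-obsolete msg-id grammar of RFC 5322 (section 3.6.4).  The function
name is_strict_msg_id is hypothetical, and the sketch makes simplifying
assumptions: it only trims surrounding whitespace rather than handling
full CFWS (comments), and it rejects the obs-id-left/obs-id-right forms.

```python
import re

# atext, RFC 5322 section 3.2.3: letters, digits and a fixed set of symbols.
_ATEXT = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]"
# dot-atom-text: runs of atext separated by single dots.
_DOT_ATOM_TEXT = rf"{_ATEXT}+(?:\.{_ATEXT}+)*"
# no-fold-literal: "[" *dtext "]", where dtext is printable ASCII
# excluding "[", "]" and "\".
_NO_FOLD_LITERAL = r"\[[\x21-\x5a\x5e-\x7e]*\]"
# msg-id without the obsolete forms: "<" id-left "@" id-right ">".
_MSG_ID = re.compile(
    rf"<{_DOT_ATOM_TEXT}@(?:{_DOT_ATOM_TEXT}|{_NO_FOLD_LITERAL})>"
)


def is_strict_msg_id(value: str) -> bool:
    """Return True if value is a single strict RFC 5322 msg-id.

    Assumption: surrounding whitespace is trimmed, but comments (CFWS)
    and the obsolete id-left/id-right syntax are not accepted.
    """
    return _MSG_ID.fullmatch(value.strip()) is not None


if __name__ == "__main__":
    for candidate in ("<1234.abc@example.org>", "1234.abc@example.org"):
        print(candidate, is_strict_msg_id(candidate))
```

A checker like this could be run over a mail folder to list the messages
whose Message-ID a strict parser would reject, without nmh itself
needing to care.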

> I remember people saying that they had a list of messages that nmh
> dealt poorly with; it would be nice to try those out against a
> hypothetically-new nmh parser.

I've been wondering about a corpus of emails for testing programs.  The
only public one I've found is https://en.wikipedia.org/wiki/Enron_Corpus
which is going to be recent-era emails.  Perhaps if folk on the list
have old emails that were from public mailing lists at the time, we
could build a bit of a collection for testing purposes?  I guess
anonymising private ones would be a can of worms and weaken their worth
anyway?  Perhaps a semi-private collection of those could be collated
though, passed by hand to a known interested party?

Cheers, Ralph.

