Re: [Nmh-workers] Weird behavior with non-ascii code in headers

From: Ken Hornstein
Subject: Re: [Nmh-workers] Weird behavior with non-ascii code in headers
Date: Wed, 26 Jun 2013 22:06:57 -0400

>Well, my thought is to present errors to the user for manual
>intervention.  After all, if a person is smart enough to use nmh,
>they're smart enough to figure out how to fix a header line, right?  :D

I know you were just kidding, but I am just trying to figure out if
that makes sense.  For one thing, what does that do for people who deal
with some of the nmh front-ends?

>Off the cuff, I'd guess that there would be three cases, 1)
>envelope-related headers such as From, To, Return-Path, Subject,
>Message-ID and Date, 2) content-related headers like Content-Type and
>Content-Transfer-Encoding, and 3) delivery/other information headers
>like Received and X-*.  (I'm sure there are more appropriate terms for
>these categories, but I can't think of them now.)  We'd need to validate
>that From, To and Return-Path are addresses and that Subject and
>Message-ID are appropriate strings.  I think that Date would probably be
>the hardest envelope-related header to automatically correct.

So, it occurs to me there are already some smarts in the format engine
regarding these things.  E.g., an invalid Date: header generally gets
handled ok.  And in fact if there's an address that can't be parsed,
that's handled fine as well.  The problem here is really more of a bug in
the I18N handling _at the output stage_, and is really kinda obscure;
the issue in my mind is: how do we deal with that?  This also comes up
with other headers, like Subject; I expect the same thing would happen
there.  Part of me thinks that if we run into a non-ASCII character in
a header in the format engine we should simply replace it with a "?".
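
A minimal sketch of the "replace it with a ?" idea, just to make it
concrete.  The function name sanitize_header is hypothetical and is not
part of nmh; it assumes the header value is a NUL-terminated byte string
that may be modified in place.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not nmh code): overwrite every byte outside the
 * 7-bit ASCII range with '?', in place.  Returns the number of bytes
 * that were replaced.
 */
size_t
sanitize_header(char *s)
{
    size_t replaced = 0;

    for (; *s != '\0'; s++) {
        /* Cast so that bytes >= 0x80 compare correctly where char is signed. */
        if ((unsigned char) *s > 0x7f) {
            *s = '?';
            replaced++;
        }
    }

    return replaced;
}
```

Note that this works byte by byte, so a single multibyte character (for
example a two-byte UTF-8 sequence) turns into more than one "?".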

>Validating content-related headers would be interesting.  If we validate
>the header lines themselves, shouldn't we make sure that they represent
>the actual content?  (Cue MIME discussion.)  As for the other headers,
>wouldn't we just have to ensure that they're appropriately encoded
>strings?  Maybe prepend "X-Malformed-Header:" to a line we couldn't
>automatically fix?  And of course, we should validate continuations for
>all lines.  I'll look into it when I get a chance, but I find it
>fascinating that scan couldn't figure out the proper date or subject
>when it ran into invalid continuations.

Sigh.  You can read the discussion in the archives about auto-fixing of
malformed messages, but it seemed to me the rough consensus was that
it should NOT happen automatically.  Also, if you can figure out how to
do that on a reliable basis, you're smarter than I.

>> I'm looking at inc.c and I'm not seeing the code you mention; can you point
>> me to it?
>Part of a commented out function called cpymsg near line 980.  Not much
>there, actually...  (Ah, wait, I was using the 1.5 release tarball
>source.  I just cloned the git repository and that part is gone.)

Ok, I went and looked at that ... that was NOT about fixing a "From:"
header, it was about fixing a "From " header ... in other words, the
mailbox message separator.
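
To illustrate the difference, here is a rough sketch that tells the two
kinds of line apart.  The enum and the function classify_from are made
up for this example and do not come from inc.c; the sketch only matches
the exact spelling "From:", even though header field names are
case-insensitive.

```c
#include <string.h>

/* Hypothetical classification of a line that starts with "From". */
enum from_kind { FROM_NONE, FROM_SEPARATOR, FROM_HEADER };

enum from_kind
classify_from(const char *line)
{
    /* "From " (with a space) is the mbox message separator. */
    if (strncmp(line, "From ", 5) == 0)
        return FROM_SEPARATOR;

    /* "From:" (with a colon) is the message header field. */
    if (strncmp(line, "From:", 5) == 0)
        return FROM_HEADER;

    return FROM_NONE;
}
```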

>> I have not yet seen a message/global MIME type in the wild.  When we start
>> seeing them I think we should care.  Are people seeing these messages?
>I'm not actually seeing anything like this yet, but I'm always
>interested in trying to take the future into account when planning

Well, I think we're still working on bringing nmh into the early 2000s :-)

