Re: [Nmh-workers] nmh internals: full MIME integration

From: Ken Hornstein
Subject: Re: [Nmh-workers] nmh internals: full MIME integration
Date: Wed, 30 Jul 2014 11:16:24 -0400

>Mostly, but there's more duff emails coming along from the
>base64-everything brigade, even though the RFCs say to use it minimally

Well, technically that's not what they say, assuming we're talking about
headers (RFC 2047).  §5 says:

   Use of 'encoded-word's to represent strings of purely ASCII
   characters is allowed, but discouraged.  In rare cases it may be
   necessary to encode ordinary text that looks like an 'encoded-word'.

>Would be much nicer if it travelled internally as UTF-8 and then popped
>into the US-ASCII draft as
>    Subject: Re: =?utf-8?B?wqE=?=Hola world!

Other people have said that to me after I wrote the RFC 2047 encoder.  My
reply is: "write an RFC 2047 encoder and get back to me".

It's actually technically challenging.  For example, spaces between
encoded-words are collapsed and not output, but spaces between
encoded-words and regular atoms are NOT collapsed.  So to encode a
header "minimally" would mean we'd need to parse the header on a
per-atom basis, understand when we're dealing with a 'text' vs. a
'phrase' (there are different rules) and keep track of whether the last
atom was encoded so we know when to encode spaces into an encoded-word.
And should we encode spaces in the previous atom, or the next?  What if
there are multiple spaces?
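
To make the bookkeeping concrete, here is a minimal sketch in Python.  It
is not the nmh encoder; the names encode_word, needs_encoding,
encode_minimal and decode_simple are made up for illustration.  It assumes
an unstructured 'text' header, always uses UTF-8 with the "B" encoding,
and ignores the 75-character limit, 'phrase' rules, and ASCII text that
merely looks like an encoded-word.  It settles the "previous or next"
question by folding the whitespace into the next encoded-word.

```python
import base64
import re


def encode_word(text):
    """Wrap text in a single UTF-8, base64 ("B") encoded-word."""
    payload = base64.b64encode(text.encode("utf-8")).decode("ascii")
    return "=?UTF-8?B?" + payload + "?="


def needs_encoding(word):
    """True if the word contains anything outside US-ASCII."""
    return any(ord(ch) > 127 for ch in word)


def encode_minimal(value):
    """Encode only the non-ASCII words of an unstructured header value."""
    out = []
    prev_encoded = False
    pending_ws = ""
    # The capturing group makes re.split keep the whitespace runs.
    for part in re.split(r"(\s+)", value):
        if part == "":
            continue
        if part.isspace():
            pending_ws = part
            continue
        encode = needs_encoding(part)
        if encode and prev_encoded:
            # A decoder drops whitespace between two encoded-words, so the
            # original whitespace travels inside this encoded-word and a
            # single plain space separates the two on the wire.
            out.append(" " + encode_word(pending_ws + part))
        elif encode:
            out.append(pending_ws + encode_word(part))
        else:
            out.append(pending_ws + part)
        prev_encoded = encode
        pending_ws = ""
    return "".join(out) + pending_ws


def decode_simple(value):
    """Toy decoder for the output above (UTF-8 "B" encoded-words only)."""
    # Whitespace between two adjacent encoded-words is not displayed.
    value = re.sub(r"(\?=)\s+(?==\?UTF-8\?B\?)", r"\1", value)
    return re.sub(
        r"=\?UTF-8\?B\?([A-Za-z0-9+/=]*)\?=",
        lambda m: base64.b64decode(m.group(1)).decode("utf-8"),
        value,
    )
```

With this, decode_simple(encode_minimal("Re: ¡Hola  señor world")) gives
back the original string, double space included, whereas encoding the two
non-ASCII words separately and joining them with a space decodes to
"¡Holaseñor".  Even this toy version needs the prev_encoded and
pending_ws state, and it handles none of the hard cases.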

Okay, none of these are impossible problems; they are solvable.  It's
just ... well, the RFC 2047 encoder is complicated.  More complicated
than I would prefer; I believe it's at the minimum for what we need to
implement.  This would make it MUCH more complicated.  And for what,
exactly?  Pretty much every other MUA silently handles these headers;
users don't see them.  You see them in nmh if you haven't updated your
format files, or you're working on the raw message.  Otherwise they
should mostly be invisible.  Yes, they'll show up for replies; you can
choose to decode them in the reply draft if you want.  But what you're
asking for is a huge amount of work that would only benefit people who
are running in a US-ASCII locale ... and honestly, I'm not really
sympathetic to those people at this point.

>Ah, OK, I was thinking we were after an interface that allowed more
>optimal behaviour in the future rather than one that was only satisfying
>today's users of the existing interface.

In theory, it might be more optimal.  Like I said, I'm not convinced.
The previous authors of pick(1) didn't seem to think that doing extra
system calls to get back previously read data was a huge loss.  I have
to believe that reading in the whole header to make it searchable would
not hurt performance noticeably.  I could see structuring the API to maybe
make the lazy header reading possible as well ... but that would be a
future optimization.

