Re: [Nmh-workers] More fun with charset functions

From: Ken Hornstein
Subject: Re: [Nmh-workers] More fun with charset functions
Date: Mon, 12 May 2014 10:54:33 -0400

>> The question I put to all of you: who gets to decide what is "right"?
>The RFCs.  They define interoperability.  :-)

Sure, but that's not really the problem we're talking about here.

>I agree with kre;  from looking at FLOSS MUAs and mail-producing
>libraries, one often gets the impression that they didn't open an RFC,
>or if they did, they backed off at the formal(!) language and decided
>instead to "copy the kid sitting next to them" by poking about their
>inbox for example emails.

Well, it's not that simple.

Let's take the most recent example of the bug in PHPMailer.  Those guys
knew the RFCs, and they responded to that bug very quickly and fixed the
issue.  That was an honest bug.  It's hard to really be critical of that
especially since it's easy to violate the RFCs with nmh (1.6 should be
better in that regard).  Also, I can't really complain about people not
reading the RFCs ... there are a bunch of them and it's easy to miss
the ones you might need to care about (see below).

Eric Gillespie said last month:
>Is nmh primarily trying to help users file bugs in other mail
>programs, or trying to help users deal with their email?  I say
>it's primarily the latter.

Which in my mind is hard to argue with.  In the same email Eric pointed out
that if it was completely hopeless, yes, nmh should give up ... he was just
arguing that we shouldn't be so picky.

>Inspecting a corpus is fine for seeing what's out there, and what other
>MUAs have done when there are options, but not for deciding what's valid

But that wasn't what was going on here.  Here's the original flow:

- I noticed that it's possible for write_charset_8bit() to return US-ASCII.
  That's obviously wrong.
- I looked at other MUAs to see what they did in that case (well, okay, I
  looked at one).  The answer was to generate the value "unknown-8bit" and

Is this right?  Well, maybe ... the RFCs are sort of silent on this issue.

Actually, some careful searching leads me to believe that this might
be an RFC-sanctioned response.  RFC 1428 suggests using unknown-8bit
when the information about the character set is unknown (this was
intended to be used by mail gateways, but it seems that this is used in
these weird corner cases by MUAs when the proper character set cannot
be determined).  It's also defined in the IANA charset registry.  I
wouldn't have known about this had I not looked at mutt; it doesn't
sound like anyone else here knew about it either.  It's not clear what
happens if we get unknown-8bit, though (we should probably put something
in there to deal with that).
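
To make the idea concrete, here is a rough sketch in C of the labelling
rule described above.  This is not nmh's code: outgoing_charset_label(),
contains_8bit() and is_unknown_8bit() are hypothetical names invented for
illustration, and the rule itself (fall back to "unknown-8bit" when the
text has 8-bit bytes but the only charset we know of is US-ASCII, or none
at all) is my reading of the RFC 1428 suggestion, not settled behaviour.

```c
#include <stddef.h>
#include <strings.h>    /* strcasecmp() (POSIX) */

/* Return 1 if any byte in buf has its high bit set, 0 otherwise. */
static int
contains_8bit(const unsigned char *buf, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++) {
        if (buf[i] & 0x80)
            return 1;
    }
    return 0;
}

/*
 * Hypothetical helper: pick a charset label for outgoing text.
 * locale_charset is whatever the caller believes the local charset
 * is; it may be NULL if that could not be determined.
 */
static const char *
outgoing_charset_label(const unsigned char *buf, size_t len,
                       const char *locale_charset)
{
    /* Pure 7-bit text can always be labelled US-ASCII. */
    if (!contains_8bit(buf, len))
        return "US-ASCII";

    /*
     * 8-bit data, but no usable charset: labelling it US-ASCII would
     * be wrong, so use the "unknown-8bit" value instead (assumption:
     * this is the behaviour we want).
     */
    if (locale_charset == NULL ||
        strcasecmp(locale_charset, "US-ASCII") == 0)
        return "unknown-8bit";

    return locale_charset;
}

/*
 * Hypothetical helper for the receiving side: recognise the label so
 * the caller can skip charset conversion.  Charset names are compared
 * case-insensitively.
 */
static int
is_unknown_8bit(const char *charset)
{
    return charset != NULL && strcasecmp(charset, "unknown-8bit") == 0;
}
```

With this sketch, the bytes "caf\xe9" and a locale charset of US-ASCII
would be labelled unknown-8bit, while the same bytes with a locale
charset of ISO-8859-1 would keep that label.  What the receiving side
should do once is_unknown_8bit() returns true is still an open question.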

