emacs-devel

missing charset for non-ASCII text/x-patch MIME parts in Thunderbird


From: Stephen J. Turnbull
Subject: missing charset for non-ASCII text/x-patch MIME parts in Thunderbird
Date: Thu, 14 May 2015 17:28:43 +0900

Ivan Shmakov writes:

 >      As I’ve pointed earlier [1], Thunderbird (on the /sending/ side)
 >      for some reason chooses /not/ to file the ‘charset’

File a bug on Thunderbird, then.  Absence of a charset parameter means
charset=US-ASCII, and Thunderbird should not be emitting US-ASCII MIME
parts with non-ASCII characters present, not even if the MTAs agree
to use 8-bit transport (the 8BITMIME SMTP extension).

 >      In the absence of the explicitly-stated encoding, the
 >      receiving side may resort to guessing,

A conformant receiver SHOULD NOT guess, unless the user has given it
explicit permission to do that (of course, then anything is OK).  From
RFC 2046:

   4.1.2.  Charset Parameter

   A critical parameter that may be specified in the Content-Type
   field for "text/plain" data is the character set.  This is
   specified with a "charset" parameter, as in:

     Content-type: text/plain; charset=iso-8859-1

   Unlike some other parameter values, the values of the charset
   parameter are NOT case sensitive.  The default character set, which
   must be assumed in the absence of a charset parameter, is US-ASCII.

Note that technically speaking the MUST in this section only applies
to text/plain, and not to any other text content-type.  However, given
that the section says

   The specification for any future subtypes of "text" must specify
   whether or not they will also utilize a "charset" parameter, and
   may possibly restrict its values as well.  For other subtypes of
   "text" than "text/plain", the semantics of the "charset" parameter
   should be defined to be identical to those specified here for
   "text/plain", i.e., the body consists entirely of characters in the
   given charset.

the intent is pretty clearly that the behavior of text/plain is the
default for other text content-types, unless the content-type spec
*explicitly* says otherwise.  See also section

   4.1.4.  Unrecognized Subtypes

   Unrecognized subtypes of "text" should be treated as subtype
   "plain" as long as the MIME implementation knows how to handle the
   charset.

When charset is unspecified, this only makes sense if the charset is
assumed to be US-ASCII.
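
To make the reading of 4.1.2 and 4.1.4 above concrete, here is a
rough sketch using Python's standard email package.  The names
KNOWN_TEXT_SUBTYPES and effective_text_type are made up for
illustration, and using codecs.lookup as the test for "knows how to
handle the charset" is my assumption, not something the RFC says.

```python
import codecs
from email import message_from_string

# Hypothetical: the text subtypes this receiver has special handling for.
KNOWN_TEXT_SUBTYPES = {"plain", "html"}


def effective_text_type(part):
    """Return (subtype, charset) the receiver should act on for a text/* part."""
    if part.get_content_maintype() != "text":
        raise ValueError("not a text/* part")
    subtype = part.get_content_subtype()
    # The charset value is case-insensitive; get_content_charset() lowercases
    # it.  A missing parameter falls back to the RFC 2046 default.
    charset = part.get_content_charset(failobj="us-ascii")
    if subtype not in KNOWN_TEXT_SUBTYPES:
        # Assumption: "knows how to handle the charset" means Python has a
        # codec for it.  codecs.lookup() raises LookupError otherwise.
        codecs.lookup(charset)
        subtype = "plain"
    return subtype, charset


patch = message_from_string("Content-Type: text/x-patch\n\nbody\n")
html = message_from_string("Content-Type: text/html; charset=ISO-8859-1\n\nbody\n")
print(effective_text_type(patch))
print(effective_text_type(html))
```

The text/x-patch part without a charset parameter comes out as
subtype "plain" with charset "us-ascii", which is the only charset a
receiver can assume without guessing.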

 >      I presume this issue (the one of /not/ including the ‘charset’)
 >      is specific to Thunderbird.  As an example, please look at a
 >      fragment of the original patch thus MIMEd from Gnus.

File a bug on Gnus, too. :-)

Of course Emacs should do what its user asks, but the default should
be to assume US-ASCII if there is no charset parameter, and to bitch
(not guess) if non-ASCII octets are seen.
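
As an illustration of that default (this is a sketch in Python's
standard email package, not Emacs or Gnus code), a receiver could
decode strictly and complain instead of guessing.  CharsetMismatchError
and decode_text_part are hypothetical names.

```python
from email import message_from_bytes


class CharsetMismatchError(ValueError):
    """The body contains octets that are invalid in the effective charset."""


def decode_text_part(part):
    """Decode a text/* part strictly; never guess the charset."""
    # No charset parameter means US-ASCII.
    charset = part.get_content_charset(failobj="us-ascii")
    # decode=True undoes the Content-Transfer-Encoding and returns raw octets.
    octets = part.get_payload(decode=True)
    try:
        return octets.decode(charset, errors="strict")
    except UnicodeDecodeError as exc:
        raise CharsetMismatchError(
            f"octet 0x{octets[exc.start]:02x} at offset {exc.start} "
            f"is not valid {charset}"
        ) from exc


# A patch part with UTF-8 octets but no charset parameter.
bad = message_from_bytes(b"Content-Type: text/x-patch\n\n+ caf\xc3\xa9\n")
try:
    decode_text_part(bad)
except CharsetMismatchError as err:
    print("refusing to guess:", err)

# The same octets with the charset declared decode without complaint.
good = message_from_bytes(
    b"Content-Type: text/x-patch; charset=UTF-8\n\n+ caf\xc3\xa9\n"
)
print(decode_text_part(good))
```

A user-controlled override (say, an explicit fallback charset) would
go in the except branch; the point is that it is opt-in rather than
the default.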
