bug-gnu-emacs

bug#35507: Gnus mojibakifies UTF-8 text/x-patch attachments from Thunderbird


From: Paul Eggert
Subject: bug#35507: Gnus mojibakifies UTF-8 text/x-patch attachments from Thunderbird
Date: Wed, 1 May 2019 11:26:35 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.6.1

On 5/1/19 10:32 AM, Eli Zaretskii wrote:
> Is text/x-patch a "new media type" or not? 

It's not a registered media type, so strictly speaking the RFCs' SHOULD
statements do not apply (and they are SHOULDs, not MUSTs, so they could be
disregarded for good reason). That being said, the ordinary and usual
intent is for the x- media types to follow these recommendations, and my
bug report was filed under that assumption.

> my reading of the RFC is that we should not define
> or expect any defaults, which means this bug is squarely in
> Thunderbird's yard

Ah, sorry, I see that my bug report misstated a point. This particular
patch clearly identifies its own encoding because its header says
"Content-Type: text/plain; charset=UTF-8". (I think Git-generated
patches always specify an encoding unless it's ASCII.) So in this
particular case the RFC's recommendation seems to be respected by the
sender.

Gnus could look for a Content-Type: header in text bodies that do not
specify charsets; this would follow the Internet's robustness principle
better.
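
As a rough illustration of that idea (in Python rather than Emacs Lisp, and
not Gnus's actual code), the sketch below scans the first few kilobytes of an
undeclared text body for an embedded "Content-Type:" line and returns the
charset it names. The helper name sniff_embedded_charset, the 4096-byte limit
and the sample patch are all assumptions made for the example.

```python
import re
from typing import Optional

# Hypothetical pattern: a Content-Type header line embedded in the body,
# such as the one found near the top of a Git-generated patch.
_EMBEDDED_CT = re.compile(
    rb'^Content-Type:[^\r\n]*?charset\s*=\s*"?([A-Za-z0-9._-]+)"?',
    re.IGNORECASE | re.MULTILINE,
)


def sniff_embedded_charset(body: bytes, limit: int = 4096) -> Optional[str]:
    """Return the charset named by a Content-Type line near the top of body.

    Returns None when no such line is found within the first `limit` bytes.
    """
    match = _EMBEDDED_CT.search(body[:limit])
    return match.group(1).decode("ascii").lower() if match else None


# Made-up attachment body in the style of a Git-generated patch.
patch = (
    b"From: A Hacker <hacker@example.org>\n"
    b"Subject: [PATCH] Fix typo\n"
    b"MIME-Version: 1.0\n"
    b"Content-Type: text/plain; charset=UTF-8\n"
    b"Content-Transfer-Encoding: 8bit\n"
    b"\n"
    b"Don\xe2\x80\x99t crash on empty input.\n"
)

charset = sniff_embedded_charset(patch)
text = patch.decode(charset) if charset else None
```

The sniffed value is only a hint: a real implementation would presumably use
it only when the enclosing MIME part declares no charset of its own.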

> I don't see why we should
> change Gnus in this regard, certainly not unconditionally assuming
> UTF-8.

Gnus is mishandling emails sent from Thunderbird right now, so it would
be a practical benefit for Gnus users if it did a better job of decoding
these admittedly-iffy messages.

These days, UTF-8 is by far the most common encoding specified for
non-ASCII text in email and its popularity is growing, so it's the best
choice for a default if Gnus is to have one - certainly better than the
confusing behavior that Robert Pluim observed in his Gnus session.
Gnus's current behavior may have been a good idea in 1996 when RFC 2046
said US-ASCII was the default, but it stopped being a good idea in 2012
when RFC 6657 came out and said that UTF-8 should be the default if
there is a default.
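
A minimal sketch of what "UTF-8 as the default" could look like, again in
Python and not taken from Gnus: honor a declared charset when there is one,
otherwise try strict UTF-8 (which also covers pure ASCII). The function name
decode_text_part and the Latin-1 last resort are assumptions of this example,
not something the RFCs or Gnus prescribe.

```python
from typing import Optional


def decode_text_part(payload: bytes,
                     declared_charset: Optional[str] = None,
                     last_resort: str = "latin-1") -> str:
    """Decode a text part, defaulting to UTF-8 when no charset is declared."""
    if declared_charset:
        # The sender said what it is; trust that, replacing undecodable bytes.
        return payload.decode(declared_charset, errors="replace")
    try:
        # No declaration: assume UTF-8. Valid ASCII is also valid UTF-8.
        return payload.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption: fall back to a single-byte charset that never fails.
        return payload.decode(last_resort)


undeclared_utf8 = decode_text_part(b"caf\xc3\xa9")
undeclared_latin1 = decode_text_part(b"caf\xe9")
declared_latin1 = decode_text_part(b"caf\xe9", "iso-8859-1")
```

Trying strict UTF-8 first is fairly safe because text in legacy single-byte
encodings rarely happens to form valid UTF-8 sequences.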

Another possibility is that Gnus could ask the user which encoding to
use when the email headers don't specify one and when the text is not
ASCII; even that would be better than Gnus's current behavior of forcing
US-ASCII and displaying something like "\xe2\x80\x99" when it encounters
a non-ASCII character.
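
For illustration only, the Python sketch below shows where a display like
"\xe2\x80\x99" comes from and when a prompt would be warranted. The helper
needs_charset_prompt is hypothetical, and the forced-ASCII decoding merely
imitates the symptom; it is not a description of how Gnus works internally.

```python
from typing import Optional


def needs_charset_prompt(payload: bytes,
                         declared_charset: Optional[str] = None) -> bool:
    """True when no charset is declared and the text is not pure ASCII."""
    return declared_charset is None and not payload.isascii()


sample = b"Don\xe2\x80\x99t"

# Forcing US-ASCII leaves the three UTF-8 bytes visible as escapes.
forced_ascii = sample.decode("ascii", errors="backslashreplace")

# Decoding the same bytes as UTF-8 yields U+2019 RIGHT SINGLE QUOTATION MARK.
as_utf8 = sample.decode("utf-8")
```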





