Re: [Nmh-workers] Superfluous Â's

From: Ken Hornstein
Subject: Re: [Nmh-workers] Superfluous Â's
Date: Mon, 27 Oct 2014 13:08:51 -0400

>I've given up trying to understand MIME well enough to participate in
>discussions of the future of nmh, but I haven't given up using nmh, and
>understanding MIME well enough to be an nmh user.

Alright, fair enough.  I alluded to getting it wrong in my first reply,
but here's the complete issue.

The message contains (the offending line, as part of the HTML part):

=C2=A0 =C2=A0 Norman Shapiro<br>

This part is tagged as:

Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

So those characters, decoded from quoted-printable, are the bytes 0xC2
0xA0, which is the UTF-8 encoding of U+00A0, "NO-BREAK SPACE".  More
information on that is here:


So why does Chrome get this wrong?  Well, the HTML in question does
NOT contain a definition of the character set to be used for this text.
That's allowed by the (MIME) standards, and a lot of text/html content
does not; they just use the charset parameter in the Content-Type
header.  But Chrome doesn't know about the "charset" parameter in the
Content-Type header, because it never sees it; all it sees is the
HTML content.  The relevant web standards specify either Windows-1252
or ISO-8859-1 as the default character set (I forget which one is
"correct", but it doesn't really matter here).

What Chrome sees is the bytes 0xC2 and 0xA0.  0xC2 gets interpreted as
Â, and 0xA0 gets interpreted as the non-breaking space.
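
As a rough illustration of the two interpretations (a Python sketch, not
anything nmh or Chrome actually runs; the variable names are made up for
this example), the same decoded bytes can be read either way:

```python
import quopri

# The offending line exactly as it appears in the quoted-printable HTML part.
raw = b"=C2=A0 =C2=A0 Norman Shapiro<br>"

# Undo the Content-Transfer-Encoding; this yields raw bytes, not text.
decoded = quopri.decodestring(raw)

# Interpreted with the charset from the Content-Type header (UTF-8),
# each 0xC2 0xA0 pair is a single U+00A0 NO-BREAK SPACE.
as_utf8 = decoded.decode("utf-8")

# Interpreted with a single-byte default such as Windows-1252,
# 0xC2 becomes U+00C2 (A with circumflex) and 0xA0 becomes U+00A0.
as_cp1252 = decoded.decode("cp1252")

print(repr(as_utf8))
print(repr(as_cp1252))
```

The second string is the one with the superfluous Â in front of each
non-breaking space.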

What's the solution?  Well, if you look at what we do for text web
browsers like w3m, we change the default character set to the value
specified by the charset parameter.  It does not seem like Chrome
supports that.  You could prepend a default character set to the
HTML, but that seems like it would be fraught with problems.  So I'm
unsure of how to deal with this problem.
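
For what it's worth, the "prepend a default character set" idea could be
sketched roughly like this in Python.  This is only an assumption about
how such a wrapper might look, not something nmh does; prepend_charset
is a hypothetical helper, and it ignores the problem cases (for example,
HTML that already declares a different charset further down):

```python
def prepend_charset(html_bytes: bytes, charset: str) -> bytes:
    """Return the HTML with a <meta charset> declaration placed first.

    Hypothetical helper: 'charset' would be the value of the charset
    parameter from the part's Content-Type header.
    """
    # Browsers only scan the start of the document for a charset
    # declaration, so it has to go at the very beginning.
    meta = b'<meta charset="' + charset.encode("ascii") + b'">\n'
    return meta + html_bytes


# Example using the decoded bytes of the offending line.
body = b"\xc2\xa0 \xc2\xa0 Norman Shapiro<br>"
wrapped = prepend_charset(body, "UTF-8")
print(wrapped)
```

The content bytes are left untouched; only the declaration is added, so
the result would be written to the temporary file handed to the browser.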

We actually covered this same question back in August:


