Re: [Nmh-workers] mojibake in UTF-8 encoded quoted-printable messages

From:	Ken Hornstein
Subject:	Re: [Nmh-workers] mojibake in UTF-8 encoded quoted-printable messages
Date:	Thu, 24 Oct 2013 09:14:01 -0400

>The munged character in your first example looks like it's
>supposed to be c3 bc c3, but instead is 83 c2 bc, if I did
>that right.  It takes more than one step to get from here to
>there, such as losing bits and wrong endian?

Actually, I think Joel was trying to say "für", which has the middle
letter as a lowercase "u" with umlaut.  That would be U+00FC, which has
a UTF-8 encoding of C3 BC.  The characters he sees are Ã, uppercase A
with tilde, U+00C3, and ¼, vulgar fraction one quarter, U+00BC.
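
For anyone who wants to check those code points and bytes, here is a minimal sketch in Python 3.8 or later.  It is only an illustration, not anything nmh itself does, and the variable names are made up for the example.

```python
import unicodedata

# "für" written with an explicit escape so the source file's own
# encoding cannot affect the result.
word = "f\u00fcr"

# UTF-8 encodes U+00FC as the two bytes C3 BC.
utf8 = word.encode("utf-8")
print(utf8.hex(" "))

# Each of those two bytes, read as a code point on its own, is a
# different character with its own Unicode name.
names = [unicodedata.name(chr(b)) for b in (0xC3, 0xBC)]
for b, name in zip((0xC3, 0xBC), names):
    print(f"U+{b:04X} {name}")
```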

C3 is Ã in ISO-8859-1, and BC is ¼ in ISO-8859-1; something is clearly
interpreting the UTF-8 bytes as ISO-8859-1.  But since your locale and
the message are both UTF-8, this doesn't feel like an nmh problem to
me.  If you were seeing the raw, undecoded quoted-printable, yeah, that would
probably be us.  But you're seeing the correct bytes; something in your
display path isn't doing the right thing.
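
The effect is easy to reproduce outside of nmh.  The sketch below, in Python, assumes the word appears in the message body as the quoted-printable text "f=C3=BCr"; that is my guess at the encoded form, not a copy of Joel's message, and decode_qp is a hypothetical helper name.

```python
import quopri

def decode_qp(raw: bytes, charset: str) -> str:
    """Undo quoted-printable, then interpret the bytes in `charset`."""
    return quopri.decodestring(raw).decode(charset)

# Assumed quoted-printable form of "für" in a UTF-8 message body.
raw = b"f=C3=BCr"

# Decoding the bytes as UTF-8 gives the intended word.
correct = decode_qp(raw, "utf-8")

# Decoding the same bytes as ISO-8859-1 gives the mojibake.
mojibake = decode_qp(raw, "iso-8859-1")

# The damage is reversible as long as the bytes were not altered.
repaired = mojibake.encode("iso-8859-1").decode("utf-8")

print(correct, mojibake, repaired)
```

If the quoted-printable step had been skipped entirely, the text would show up as the literal "f=C3=BCr" rather than as two wrong characters.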

--Ken
