Re: [Nmh-workers] mojibake in UTF-8 encoded quoted-printable messages
From: Ken Hornstein
Subject: Re: [Nmh-workers] mojibake in UTF-8 encoded quoted-printable messages
Date: Thu, 24 Oct 2013 09:14:01 -0400
>The munged character in your first example looks like it's
>supposed to be c3 bc c3, but instead is 83 c2 bc, if I did
>that right. It takes more than one step to get from here to
>there, such as losing bits and wrong endian?
Actually, I think Joel was trying to say "für", which has the middle
letter as a lowercase "u" with umlaut. That would be U+00FC, which has
a UTF-8 encoding of C3 BC. The characters he sees are Ã, uppercase A
with tilde, U+00C3, and ¼, vulgar fraction one quarter, U+00BC.
C3 is Ã in ISO-8859-1, and BC is ¼ in ISO-8859-1; something is clearly
interpreting the UTF-8 bytes as ISO-8859-1. But since your locale and
the message are both UTF-8, this doesn't feel like an nmh problem to
me. If you just saw the unencoded quoted-printable, yeah, that would
probably be us. But you're seeing the correct bytes; something in your
display path isn't doing the right thing.
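
For what it's worth, the effect is easy to reproduce outside of nmh. The
following is only a rough Python sketch, not anything nmh does; the helper
name misread_as_latin1 and the sample quoted-printable string "f=C3=BCr"
are my own assumptions for illustration, not taken from Joel's message.
It also suggests where the "83 c2 bc" in the quoted text might come from:
if the misread characters get encoded as UTF-8 a second time, you end up
with C3 83 C2 BC.

```python
import quopri


def misread_as_latin1(text: str) -> str:
    """Encode text as UTF-8, then (wrongly) decode the bytes as ISO-8859-1."""
    return text.encode("utf-8").decode("iso-8859-1")


word = "f\u00fcr"                      # "fur" with U+00FC in the middle
utf8_bytes = word.encode("utf-8")      # the correct bytes on the wire
garbled = misread_as_latin1(word)      # what a Latin-1 display path shows
reencoded = garbled.encode("utf-8")    # the garbled text encoded once more

# Hypothetical quoted-printable body; decoding it yields the raw UTF-8 bytes.
qp_decoded = quopri.decodestring(b"f=C3=BCr")

print(utf8_bytes.hex(" "))                       # 66 c3 bc 72
print([f"U+{ord(c):04X}" for c in garbled])      # U+0066, U+00C3, U+00BC, U+0072
print(reencoded.hex(" "))                        # 66 c3 83 c2 bc 72
print(qp_decoded == utf8_bytes)                  # True
```

So the quoted-printable decoding step hands back the right bytes, and the
damage only appears once something treats those bytes as ISO-8859-1.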
--Ken