Re: [Lynx-dev] lynx un-renders 
From: Thorsten Glaser
Subject: Re: [Lynx-dev] lynx un-renders
Date: Mon, 28 May 2018 08:57:41 +0000 (UTC)
address@hidden dixit:
> It shows up rarely. I can't make sense why. There are
I can: it’s often produced by Microsoft users who save
in a legacy codepage encoding; the result then gets
converted to Unicode as if it were latin1.
Now, codepage 1252 is a superset of printable latin1.
latin1 leaves 0x80‥0x9F for C1 control characters (and
latin1 is exactly the first 256 codepoints of Unicode),
while cp1252 assigns stuff like € and “” inside that block.
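
To make the difference concrete, here is a minimal Python sketch
(the sample bytes are my own choice, not taken from the page under
discussion) that decodes the same three bytes both ways:

```python
# Bytes a cp1252 editor would write for the text "€“”" (sample data).
raw = bytes([0x80, 0x93, 0x94])

as_cp1252 = raw.decode("cp1252")   # the intended reading
as_latin1 = raw.decode("latin-1")  # the misconversion: bytes map 1:1 to U+0080..U+009F

# Show the resulting codepoints for each interpretation.
print([f"U+{ord(c):04X}" for c in as_cp1252])
print([f"U+{ord(c):04X}" for c in as_latin1])
```

Decoded as latin1, every byte lands in the C1 control range, which
is what a browser then has to deal with.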
So, basically, a mild case of Mojibake. But since C1
control characters have no business existing inside
an HTML document, I’d resolve this by parsing them as
misconverted cp1252 instead.
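
A rough sketch of that idea in Python, not lynx code: the function
name fix_c1_as_cp1252 is hypothetical, and it assumes the text has
already been decoded to Unicode. Each codepoint in U+0080‥U+009F is
reinterpreted as the cp1252 byte of the same value:

```python
def fix_c1_as_cp1252(text: str) -> str:
    """Reinterpret C1 control characters as misconverted cp1252 bytes.

    Hypothetical helper for illustration; not part of lynx.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if 0x80 <= cp <= 0x9F:
            try:
                # Treat the codepoint value as a single cp1252 byte.
                ch = bytes([cp]).decode("cp1252")
            except UnicodeDecodeError:
                # A few bytes (e.g. 0x81, 0x8D) are unassigned in
                # Python's cp1252 codec; leave those untouched.
                pass
        out.append(ch)
    return "".join(out)


# Simulate the misconversion, then repair it.
broken = "\u20ac5 \u201cquoted\u201d".encode("cp1252").decode("latin-1")
print(fix_c1_as_cp1252(broken))
```

Characters outside the C1 range pass through unchanged, so
correctly encoded text is not affected.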
bye,
//mirabilos
--
> find both emacs and vi nauseating (joe rules) and consider pine the only
> usable text-mode mail client (and I've tried them all). ;)
Hellooooo, I'm Holger ("Hello Holger!"), and I am also
... a pine user, and a habitual one at that ("Oooooooohhh"). [from dasr]