Re: [Lynx-dev] lynx un-renders 


From: Thorsten Glaser
Subject: Re: [Lynx-dev] lynx un-renders 
Date: Mon, 28 May 2018 08:57:41 +0000 (UTC)

address@hidden dixit:

>       It shows up rarely.  I can't make sense why.  There are

I can: it’s often produced by Microsoft users who save
in a legacy codepage encoding (cp1252); the text is then
converted to Unicode as if it were latin1.

Now the codepage 1252 is a superset of latin1. latin1
leaves 0x80‥0x9F for C1 control characters (and latin1
is exactly the first 256 codepoints of Unicode), while
cp1252 assigns stuff like € and “” inside that block.
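
To make the mismatch concrete, here is a small Python sketch
(not anything Lynx does; the sample string is made up for
illustration). It encodes a few cp1252-only characters and
then decodes the same bytes both ways:

```python
# Bytes as a Windows editor might save them: euro sign and curly quotes in cp1252.
raw = "€ “hi”".encode("cp1252")

# Misconversion: latin1 maps every byte to the codepoint with the same number,
# so the bytes 0x80, 0x93, 0x94 turn into C1 controls U+0080, U+0093, U+0094.
wrong = raw.decode("latin-1")

# Decoding with the encoding that was actually used restores the text.
right = raw.decode("cp1252")

print([hex(b) for b in raw])
print([f"U+{ord(c):04X}" for c in wrong])
print(right)
```

The codepoints printed for `wrong` fall inside 0x80‥0x9F, which
is the C1 block described above.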

So, basically, a mild case of Mojibake. But since C1
control characters have no business existing inside
an HTML document, I’d undo that when parsing, i.e.
treat them as misconverted cp1252 instead.
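
A minimal sketch of that repair in Python, assuming the text
has already been decoded to a Unicode string. The function
name `fix_c1` is hypothetical, and this is only an
illustration of the idea, not code from Lynx:

```python
def fix_c1(text: str) -> str:
    """Reinterpret C1 control characters (U+0080..U+009F) as cp1252 bytes."""
    out = []
    for ch in text:
        cp = ord(ch)
        if 0x80 <= cp <= 0x9F:
            try:
                ch = bytes([cp]).decode("cp1252")
            except UnicodeDecodeError:
                # A few bytes (0x81, 0x8D, 0x8F, 0x90, 0x9D) are unassigned
                # in cp1252; leave those characters untouched.
                pass
        out.append(ch)
    return "".join(out)


print(fix_c1("it\u0092s"))
```

Characters outside the C1 range pass through unchanged, so
running this over correctly decoded text should be harmless.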

bye,
//mirabilos
-- 
> find emacs as well as vi sickening (joe rules) and consider pine the only
> usable text-mode mail client (and I have tried them all). ;)
Hellooooo, I am Holger ("Hello Holger!"), and I am likewise
... a pine user, and a habitual one at that ("Oooooooohhh").  [from dasr]


