lynx-dev

Re: [Lynx-dev] 2-9-0: -assume_charset= does not override in-document charset=


From: Thorsten Glaser
Subject: Re: [Lynx-dev] 2-9-0: -assume_charset= does not override in-document charset=
Date: Wed, 17 Jan 2024 22:13:18 +0000 (UTC)

Steffen Nurpmeso dixit:

>> But editing a document to get it -dump'ed out correctly, that is no
>> good.

You’d be surprised at the number of changes I have to make to some
pages for them to produce correct results…

Mouse dixit:

>Well...the real offender here is whatever led to serving 8859-* data
>but mislabeling it as UTF-8.  I have mixed feelings about making it

Indeed.

>For what it's worth, for me (Canada), fetching https://www.google.com/
>gets me a document with headers including
>
>content-type: text/html; charset=ISO-8859-1
>
>and a <head> including
>
><meta content="text/html; charset=UTF-8" http-equiv="Content-Type">

Ouch!

The specs say that the HTTP header takes precedence, unfortunately.
(I think this even means that a browser saving the document has to
fix up that meta tag; it *will* mean that accessing the document with
a browser behaves differently from downloading it with cURL, GNU wget
or BSD ftp/fetch and then browsing the naïvely downloaded file.)
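
To illustrate the precedence described above, here is a minimal Python
sketch. It is not how lynx or any browser actually implements this, and
the helper names (header_charset, meta_charset, effective_charset) are
made up for this example. It assumes the simplified rule "a charset in
the HTTP Content-Type header wins, otherwise fall back to the meta
tag", and it ignores byte order marks and the other sniffing steps a
real user agent performs.

```python
import re
from email.message import Message

# Crude pattern for a charset declaration inside a <meta> tag; real
# user agents use a more careful prescan, this is for illustration only.
META_RE = re.compile(
    rb"""<meta[^>]*charset\s*=\s*["']?\s*([A-Za-z0-9_.:-]+)""",
    re.IGNORECASE,
)


def header_charset(content_type):
    """Return the charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()  # lower-cased, None if absent


def meta_charset(html_bytes):
    """Return the charset named in the first matching <meta>, or None."""
    match = META_RE.search(html_bytes[:1024])
    return match.group(1).decode("ascii").lower() if match else None


def effective_charset(content_type, html_bytes):
    """Prefer the HTTP header; fall back to the in-document meta tag."""
    if content_type:
        charset = header_charset(content_type)
        if charset:
            return charset
    return meta_charset(html_bytes)


# The combination quoted above: header and meta tag disagree.
page = b'<head><meta content="text/html; charset=UTF-8" http-equiv="Content-Type">'

# Fetched over HTTP, the header decides.
print(effective_charset("text/html; charset=ISO-8859-1", page))  # iso-8859-1

# Opened from a saved file, there is no header, so the meta tag decides.
print(effective_charset(None, page))  # utf-8
```

The two calls at the end show the discrepancy: the same bytes are read
as ISO-8859-1 when served with that header and as UTF-8 when opened
from a naïvely saved copy.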

bye,
//mirabilos
-- 
<ch> you introduced a merge commit        │<mika> % g rebase -i HEAD^^
<mika> sorry, no idea and rebasing just fscked │<mika> Segmentation
<ch> should have cloned into a clean repo      │  fault (core dumped)
<ch> if I rebase that now, it's really ugh     │<mika:#grml> wuahhhhhh


