Re: [Lynx-dev] 2-9-0: -assume_charset= does not override in-document cha
From: Thorsten Glaser
Subject: Re: [Lynx-dev] 2-9-0: -assume_charset= does not override in-document charset=
Date: Wed, 17 Jan 2024 22:13:18 +0000 (UTC)
Steffen Nurpmeso dixit:
>> But editing a document to get it -dump'ed out correctly, that is no
>> good.
You’d be surprised at the number of changes I have to make to some
pages for them to lead to correct results…
Mouse dixit:
>Well...the real offender here is whatever led to serving 8859-* data
>but mislabeling it as UTF-8. I have mixed feelings about making it
Indeed.
>For what it's worth, for me (Canada), fetching https://www.google.com/
>gets me a document with headers including
>
>content-type: text/html; charset=ISO-8859-1
>
>and a <head> including
>
><meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
Ouch!
The specs say that the HTTP header has precedence, unfortunately.
(I think this even means saving the document from the browser will
mean the browser has to fixup/change that meta tag; it *will* mean
accessing that document with a browser will behave differently from
downloading it with cURL or GNU wget or BSD ftp/fetch then browsing
the naïvely downloaded file.)
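
To illustrate the precedence rule, here is a rough Python sketch (not
what Lynx or any browser actually does) of picking the charset for a
fetched page: the HTTP Content-Type header is consulted first and the
<meta> tag only if the header names no charset. The function names, the
1024-byte scan window, and the "windows-1252" fallback are assumptions
made for this example; byte-order marks and the rest of the real
sniffing rules are ignored.

```python
import re
from email.message import Message

# Hypothetical helper pattern: finds charset=... inside a <meta> tag.
META_CHARSET = re.compile(
    rb'<meta[^>]+charset\s*=\s*["\']?\s*([A-Za-z0-9_.:-]+)',
    re.IGNORECASE,
)


def header_charset(content_type):
    """Return the charset parameter of a Content-Type value, or None."""
    if not content_type:
        return None
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()  # lowercased, or None if absent


def meta_charset(body):
    """Return the charset declared in a <meta> tag near the top, or None."""
    match = META_CHARSET.search(body[:1024])  # scan window is an assumption
    return match.group(1).decode("ascii").lower() if match else None


def effective_charset(content_type, body, default="windows-1252"):
    """HTTP header first, then <meta>, then an assumed fallback."""
    return header_charset(content_type) or meta_charset(body) or default


if __name__ == "__main__":
    page = (b'<html><head><meta content="text/html; charset=UTF-8" '
            b'http-equiv="Content-Type"></head></html>')
    # Served over HTTP with the header Mouse quoted:
    print(effective_charset("text/html; charset=ISO-8859-1", page))
    # Same bytes read from a saved file, so no header is available:
    print(effective_charset(None, page))
```

With the header and <meta> tag quoted above, the first call yields
iso-8859-1 and the second yields utf-8, which is exactly the
browser-versus-downloaded-file discrepancy.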
bye,
//mirabilos
--
<ch> you introduced a merge commit │<mika> % g rebase -i HEAD^^
<mika> sorry, no idea and rebasing just fscked │<mika> Segmentation
<ch> should have cloned into a clean repo │ fault (core dumped)
<ch> if I rebase that now, it's really ugh │<mika:#grml> wuahhhhhh