Re: [Lynx-dev] List-of-links encoded improperly
From: Paul Gilmartin
Subject: Re: [Lynx-dev] List-of-links encoded improperly
Date: Wed, 22 Feb 2017 19:01:10 -0700
On 2017-02-22, at 17:38, Thomas Dickey wrote:
> On Wed, Feb 22, 2017 at 10:32:24PM +0200, Dimitrios Semitsoglou-Tsiapos wrote:
>> Greetings Lynx developers and users!
>>
>> I have noticed that in `-dump` mode lynx will percent-encode reserved
>> characters in the "list of links" if `-display_charset=UTF-8` is set (or
>> perhaps any value other than ISO-8859-1). This can cause some URLs to
>> effectively break.
>>
>> Would it perhaps be correct to simply ignore `display_charset` while
>> printing these URLs?
>
> not really - it's generating the file (not passing it on), and is
> using a known encoding.
>
From: https://tools.ietf.org/html/rfc1738
2.2. URL Character Encoding Issues
...
URLs are written only with the graphic printable characters of the
US-ASCII coded character set. The octets 80-FF hexadecimal are not
used in US-ASCII, and the octets 00-1F and 7F hexadecimal represent
control characters; these must be encoded.
So non-US-ASCII UTF-8 characters must be encoded.
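
As a rough illustration of the RFC passage quoted above (not Lynx's actual code), here is a minimal Python sketch that percent-encodes only the octets the passage rules out: 80-FF, 00-1F and 7F. The function name `encode_non_ascii` is made up for this example. Treating the space (20 hex) as needing encoding is an assumption based on "graphic printable characters". Reserved characters such as `%`, `?` and `&` are deliberately left alone.

```python
def encode_non_ascii(url: str) -> str:
    """Percent-encode octets outside graphic printable US-ASCII.

    Hypothetical helper for illustration only. The string is first
    converted to UTF-8 octets; anything in 21-7E hex is copied as-is,
    everything else (00-20, 7F, 80-FF) becomes %XX.
    """
    out = []
    for byte in url.encode("utf-8"):
        if 0x21 <= byte <= 0x7E:
            # Graphic printable US-ASCII, including reserved characters
            # like '%', '?', '&' and '/', passes through unchanged.
            out.append(chr(byte))
        else:
            # Control characters, space, DEL and all non-ASCII octets.
            out.append(f"%{byte:02X}")
    return "".join(out)


if __name__ == "__main__":
    print(encode_non_ascii("http://example.org/caf\u00e9?q=1&r=2"))
```

Because `%` itself is passed through, a URL that is already percent-encoded comes out unchanged, and the reserved characters in a query string are not touched.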
-- gil