Re: [Lynx-dev] non-ascii characters in URLs

From: Klaus-Peter Wegge
Subject: Re: [Lynx-dev] non-ascii characters in URLs
Date: Tue, 3 Jan 2012 12:00:02 +0100 (CET)

Lynx Team,

to demonstrate the problem and to avoid a long discussion of
character sets, I attach a small sample HTML file.
The W3C validator says this file is valid HTML.

Please open this file in any browser and try the 4 links.
They will work and the respective sites are browsable.

Please open the attached file in lynx and try the 4 links again.
The first and the second do not work, because lynx is not
able to create the correct request to the server.
You can observe this in the bottom line of the lynx screen,
where the currently selected URL is shown, and you will also see
the badly formatted requests in the trace file.

Kind regards, Klaus

On Mon, 2 Jan 2012, Thomas Dickey wrote:

On Mon, Jan 02, 2012 at 08:21:39PM +0100, Klaus-Peter Wegge wrote:
Dear Lynx Team,

URLs with non-ASCII characters are becoming more popular in Europe.
For example: http://www.hö (German site).

well... other users would be set up for using UTF-8 encoding.
But that isn't quite the problem.  After running some checks
to refresh my memory:

a) from the command line, lynx accepts only URLs that contain ISO-8859-1 characters.
b) in webpages, lynx accepts more flavors.
c) lynx displays those URLs using translation from the IDNA library
  (which is helpful in some ways, not in others).
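
To illustrate the kind of IDNA translation mentioned in (c), here is a minimal Python sketch. It uses Python's built-in "idna" codec, which is an assumption made for illustration and not the library lynx links against. The host name bücher.example is a made-up placeholder, not the test URL from this thread.

```python
# Sketch only: Python's built-in "idna" codec stands in for whatever
# IDNA library a browser actually uses. The host name is hypothetical.

def host_to_ascii(host: str) -> str:
    """Return the ASCII (Punycode) form of an internationalized host name."""
    return host.encode("idna").decode("ascii")


def host_to_unicode(host: str) -> str:
    """Return the Unicode form of a host name given in ASCII (Punycode)."""
    return host.encode("ascii").decode("idna")


ascii_host = host_to_ascii("bücher.example")
print(ascii_host)                   # the form sent in DNS lookups and Host headers
print(host_to_unicode(ascii_host))  # the form a browser may display to the user
```

The ASCII form is what has to reach the resolver and the server. The Unicode form is only for display.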

Actually (for your test url), in Firefox I get a page saying "no pages
found".  I can get the same result by following a suitable link from
lynx.  Attaching files that I used for experimenting with lynx, for
discussion purposes...

lynx releases (2.8.6 and 2.8.7) cannot deal with such URLs on my SUSE or
Debian machines, while various graphical browsers and even LINKS on the
same systems can handle such URLs. More precisely:
lynx cannot connect to the remote host for these URLs.
The effect is the same when specifying the URL on the command line, in
the goto command, or when the URL is a link in an HTML document.
I also tried to map the character "ö" by ö . No effect.
Is this a problem of setting the character set (I use ISO Latin 1,
iso-8859-1) or LOCALE?
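
As background to the character-set question, the following Python sketch shows how the same character "ö" becomes different bytes, and therefore different percent-escapes, depending on the encoding. It illustrates the encodings only and makes no claim about what lynx does internally.

```python
from urllib.parse import quote

char = "ö"

# The same character is one byte in ISO-8859-1 and two bytes in UTF-8.
latin1_bytes = char.encode("iso-8859-1")
utf8_bytes = char.encode("utf-8")

# Percent-escaping is applied to those bytes, so the escapes differ as well.
latin1_escaped = quote(char, encoding="iso-8859-1")
utf8_escaped = quote(char, encoding="utf-8")

print(latin1_bytes, latin1_escaped)
print(utf8_bytes, utf8_escaped)
```

A client and a server that assume different encodings will therefore disagree about which bytes a non-ASCII URL contains.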

Can someone from the developer team try to visit the example site,
please? Do you have the same problem with the current lynx-dev version?

Thanks for your help



Lynx-dev mailing list

Thomas E. Dickey <address@hidden>

Dipl.-Inform. Klaus-Peter Wegge
Siemens AG
Corporate Technology, CT T DE ACC
Accessibility Competence Center
Fürstenallee 11
33102 Paderborn, Germany
Tel:  +49 5251 60-6144
Fax:  +49 5251 60-6139
Mob:  +49 173 7019577
Mail: address@hidden, address@hidden

Siemens Aktiengesellschaft:
Chairman of the Supervisory Board: Gerhard Cromme;
Managing Board: Peter Löscher, Chairman; Roland Busch, Brigitte Ederer,
  Klaus Helmrich, Joe Kaeser, Barbara Kux, Hermann Requardt,
  Siegfried Russwurm, Peter Y. Solmssen, Michael Süß;
Registered offices: Berlin and Munich, Germany;
Commercial registries: Berlin Charlottenburg, HRB 12300,
  Munich, HRB 6684; WEEE-Reg.-No. DE 23691322

Links in the attached sample file:
  1. Hörkomm
  2. Hörkomm
  3. Google
  4. c't - Magazin für computertechnik
