[Lynx-dev] non-ASCII characters (ISO-8859 or UTF-8?)
From: David Rabson
Subject: [Lynx-dev] non-ASCII characters (ISO-8859 or UTF-8?)
Date: Fri, 14 Jul 2006 13:55:31 -0400 (EDT)
To the experts:
I have not been able to get Lynx (version 2.8.5rel.1 or 2.8.6dev.18)
to display certain characters on the New York Times Web site. I suspect
that these are ISO-8859 characters; however, nothing in the HTML
header indicates the character set. The characters have hex codes
0x92 (apostrophe), 0x93 (left quote), 0x94 (right quote), and 0x97
(em-dash).
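For anyone who wants to check what these four bytes map to, here is a minimal Python sketch that decodes each one under ISO-8859-1 and under Windows-1252 (Python codec name "cp1252"). Windows-1252 is only a candidate I am assuming for comparison; nothing on the page confirms it. The names REPORTED, decode_byte, and compare are made up for illustration.

```python
# Bytes reported in the message, labelled with the glyphs they appear to stand for.
REPORTED = {
    0x92: "apostrophe",
    0x93: "left quote",
    0x94: "right quote",
    0x97: "em-dash",
}


def decode_byte(value, encoding):
    """Decode one byte under `encoding` and return the resulting code point."""
    return ord(bytes([value]).decode(encoding))


def compare():
    """Return (byte, label, ISO-8859-1 code point, Windows-1252 code point) rows."""
    rows = []
    for value, label in REPORTED.items():
        latin1 = decode_byte(value, "iso-8859-1")
        # Assumption: Windows-1252 is tried here only as a plausible alternative.
        cp1252 = decode_byte(value, "cp1252")
        rows.append((value, label, latin1, cp1252))
    return rows


if __name__ == "__main__":
    for value, label, latin1, cp1252 in compare():
        print(f"0x{value:02X} {label:12} "
              f"ISO-8859-1: U+{latin1:04X}  Windows-1252: U+{cp1252:04X}")
```

Under ISO-8859-1 all four bytes fall in the C1 control range (U+0080 to U+009F), which has no printable glyphs, whereas under Windows-1252 they map to typographic punctuation. That would be consistent with the symptoms, though it is a guess rather than a diagnosis.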
Here is a URL to reproduce the problem:
http://www.nytimes.com/2006/07/14/washington/14nsa.html
Search for the word "theyre" [sic]: although there is a 0x92 between
the 'y' and the 'r', neither the search feature nor the display
feature sees it.
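To confirm independently of Lynx that such bytes are really present, something like the following sketch could be run over the raw page bytes. The function name find_c1_bytes and the sample string are hypothetical; the sample is a hand-made stand-in, not the actual page content.

```python
def find_c1_bytes(data: bytes, context: int = 4):
    """Return (offset, byte value, surrounding bytes) for each byte in 0x80-0x9F."""
    hits = []
    for offset, value in enumerate(data):
        if 0x80 <= value <= 0x9F:
            start = max(0, offset - context)
            hits.append((offset, value, data[start:offset + context + 1]))
    return hits


if __name__ == "__main__":
    # Hypothetical sample: a 0x92 between the 'y' and the 'r', as described above.
    sample = b"they\x92re not there"
    for offset, value, snippet in find_c1_bytes(sample):
        print(f"offset {offset}: 0x{value:02X} in {snippet!r}")
```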
I've tried the following steps, without success: explicitly setting
the font (-fn) to one with "ISO-8859" in its name; setting the option
('O' menu) "use locale-based character set" to ON and to OFF; setting
the option "display character set" to ISO-8859, to UTF-8, and to
"7-bit approximation"; setting the option "assumed document character
set" to ISO-8859 and to UTF-8; setting the option "raw 8-bit
display" to ON and to OFF; and setting the environment variable LC_ALL
to "C" or to "en_US.utf8", or clearing it.
Here is a segment from the above-cited URL with the embedded characters:
That to me is not what the FISA court is set up to do, she said. The judges
approve warrants theyre not there to rule on matters of constitutionality.
Thanks for any clues.
D. Rabson
Associate professor
University of South Florida