
[Lynx-dev] non-ASCII characters (ISO-8859 or UTF-8?)


From: David Rabson
Subject: [Lynx-dev] non-ASCII characters (ISO-8859 or UTF-8?)
Date: Fri, 14 Jul 2006 13:55:31 -0400 (EDT)

To the experts:

I have not been able to get Lynx (version 2.8.5rel.1 or 2.8.6dev.18)
to display certain non-ASCII characters from the New York Times Web
site.  I suspect that these are ISO-8859 characters; however, nothing
in the HTML header indicates the character set.  The characters have
hex codes 0x92 (apostrophe), 0x93 (left quote), 0x94 (right quote),
and 0x97 (em-dash).
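
One way to check which character set these byte values belong to is a
short Python sketch like the one below. It rests on an assumption that
nothing in the page confirms: that the bytes might be Windows-1252
(Python codec "cp1252") rather than ISO-8859-1 ("latin-1"). The helper
name describe() is made up for this illustration.

```python
import unicodedata

# The four byte values reported in the message.
REPORTED_BYTES = [0x92, 0x93, 0x94, 0x97]

def describe(byte_value):
    """Return (Unicode category under ISO-8859-1, character name under Windows-1252)."""
    raw = bytes([byte_value])
    latin1_char = raw.decode("latin-1")  # ISO-8859-1 reading; never fails
    cp1252_char = raw.decode("cp1252")   # Windows-1252 reading (an assumption)
    return unicodedata.category(latin1_char), unicodedata.name(cp1252_char)

for b in REPORTED_BYTES:
    category, name = describe(b)
    print(f"0x{b:02X}: ISO-8859-1 category {category}, Windows-1252 {name}")
```

Under ISO-8859-1 all four values fall in the C1 control range (category
"Cc"), which has no printable glyph; under Windows-1252 they are the
apostrophe, quotation marks, and dash described above.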

Here is a URL to reproduce the problem:
        http://www.nytimes.com/2006/07/14/washington/14nsa.html
Search for the word "theyre" [sic]: although there is a 0x92 between
the 'y' and the 'r', neither the search feature nor the display
feature sees it.

I've tried the following steps, without success: explicitly setting
the font (-fn) to one with "ISO-8859" in its name; setting the option
('O' menu) "use locale-based character set" to ON and to OFF; setting
the option "display character set" to ISO-8859, to UTF-8, and to
"7-bit approximation"; setting the option "assumed document character
set" to ISO-8859 and to UTF-8; setting the option "raw 8-bit
display" to ON and to OFF; and setting the environment variable LC_ALL
to "C" or to "en_US.utf8", or clearing it.

Here is a segment from the above-cited URL with the embedded characters:

“That to me is not what the FISA court is set up to do,” she said. “The judges 
approve warrants — they’re not there to rule on matters of constitutionality.” 
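
As a possible way to inspect such a page outside the browser, the raw
bytes could be transcoded to UTF-8 before viewing. The sketch below is
only an illustration: it assumes the source bytes are Windows-1252,
the function name cp1252_to_utf8() is hypothetical, and the sample
bytes are a hand-built fragment of the segment above, not data fetched
from the site.

```python
def cp1252_to_utf8(data: bytes) -> bytes:
    """Reinterpret bytes as Windows-1252 (assumed) and re-encode them as UTF-8."""
    # errors="replace" substitutes U+FFFD for the few byte values
    # that Windows-1252 leaves undefined, instead of raising.
    text = data.decode("cp1252", errors="replace")
    return text.encode("utf-8")

# Hand-built fragment: 0x97 for the dash, 0x92 for the apostrophe.
sample = b"approve warrants \x97 they\x92re not there"
print(cp1252_to_utf8(sample).decode("utf-8"))
```

The output of such a conversion is plain UTF-8, so a display character
set of UTF-8 would then match the data.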


Thanks for any clues.

D. Rabson
Associate professor
University of South Florida



