lynx-dev Re: msg00798.html (was: 0x2276 handling)

From: Leonid Pauzner
Subject: lynx-dev Re: msg00798.html (was: 0x2276 handling)
Date: Tue, 5 May 1998 04:24:55 +0400 (MSD)

OK, so besides native utf-8 + hex, HREF= may use any HTML coding scheme
"for internal use only": the client software will convert...
What should the user see on the bottom line in advanced mode?
- The converted result as it will be submitted?
> is actually sent in the request to a server.  The intent is to be able to
> use named or numeric character references in the HTML markup, and have the
> browser do the conversions, because few people could do the utf-8 and then
> hex conversions in their heads when writing HTML (certainly not me :).
HTML 4.0 recommends not only encoding href= as %hexhex (utf-8),
but even the associated text in the <a> attribute  8-)
This may be useful for bookmarking, though.
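
To make the conversion concrete, here is a minimal Python sketch of the two
steps discussed above: resolve the character references in the attribute
value, then send the non-ASCII characters as %hexhex of their utf-8 bytes.
This is not Lynx code; the function name and the example.com URL are made up
for illustration, and the set of characters left unescaped is my assumption.

```python
from html import unescape
from urllib.parse import quote

# Hypothetical helper, not part of Lynx: turn an HREF attribute value that
# uses named or numeric character references into the URL that would be sent.
def href_to_request_url(href_attr: str) -> str:
    # Step 1: resolve character references (&eacute; -> U+00E9, &#x2276; -> U+2276).
    text = unescape(href_attr)
    # Step 2: percent-encode everything non-ASCII as utf-8 bytes, leaving
    # URL delimiters and existing %hexhex escapes alone (assumed safe set).
    return quote(text, safe="/:?#[]@!$&'()*+,;=%", encoding="utf-8")

print(href_to_request_url("http://example.com/caf&eacute;?q=&#x2276;"))
# prints http://example.com/caf%C3%A9?q=%E2%89%B6
```

The string printed here is what could be shown on the bottom line if the
"converted result as it will be submitted" option is chosen.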

> One could, of course, use the utf-8 + hex converted URLs as the attribute
> values in the first place, and WYSIWYG HTML editors may do that (i.e., when
> the user of the editor indicates non-ASCII characters in URLs which will be

>         That bug report from Poon reflects a lack of understanding about
> the difference between the document charset and the Display Character Set.
> What he describes Lynx as doing appears to be what it should be doing.
> However, I tracked down a URL for the FAQ (would have been nice if he had
> included it in the message):
> and when I tried it with the W32 binary, it did what he thought it should
> do, and is wrong.  The server is returning "Content-Type: text/plain"
> without a charset parameter, so the assumed charset should apply, and
> both the 'o'ptions page and the ShowInfo Page ('=') confirm that I have
> it set as iso-8859-1.  Yet it looks as though CJK multibyte characters
> in the iso-8859-1 control character range are being handled as DosLatinUS
> characters for what is sent to the screen (does that binary use the
> "work with MicroSoft sins" assumption that some of those are Windows
Currently it displays &#nnn references in the x80-x9F range as windows-1252
codepoints (inflicted by FrontPage), but does not display them in ALT= :-(
If the document is, or is assumed to be, iso-8859-1,
control characters (x80-x9F) are silently ignored if they occur.
Do you mean they should be assumed to be windows-1252?
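
The two behaviours in question can be sketched in a few lines of Python.
This is only an illustration of the idea, not the Lynx implementation; the
function name and the `lenient` flag are hypothetical.

```python
# Hypothetical helper: map a numeric character reference &#n; to a character.
# lenient=True  -> treat 0x80-0x9F as windows-1252 codepoints (FrontPage style)
# lenient=False -> treat 0x80-0x9F as iso-8859-1 control characters and drop them
def numeric_ref_to_char(n: int, lenient: bool = True):
    if 0x80 <= n <= 0x9F:
        if not lenient:
            return None  # C1 control character: ignored silently
        try:
            return bytes([n]).decode("cp1252")
        except UnicodeDecodeError:
            return None  # position not assigned in windows-1252
    return chr(n)

print(repr(numeric_ref_to_char(147)))                 # windows-1252 reading
print(repr(numeric_ref_to_char(147, lenient=False)))  # strict iso-8859-1 reading
```

Whichever reading is chosen, applying the same function to ALT= text as to
body text would at least make the two cases consistent.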

> characters, as the v2.7.2 code did?).  Also, when I set the assumed
> charset to euc-jp or shift_jis (it's not clear which the FAQ is using),
> I get different, but still 8-bit characters.
>                                 Fote

The first paragraph of the above FAQ says the text is in shift_jis.
I have no idea what kanji should look like, but I got 8-bit characters
on my Cyrillic display (cp866), which is obviously wrong.
Then I chose the "7 bit approx" display and got a translation of
I don't know what. Perhaps someone from Japan can describe how it should look.
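
For anyone who wants to reproduce the effect outside Lynx, the following
Python sketch shows what happens when shift_jis bytes are passed straight
through to a cp866 display. The sample string and the helper name are my own
choices for illustration, not taken from the FAQ.

```python
# Hypothetical helper: interpret raw document bytes under an assumed charset.
def show_as(raw: bytes, assumed_charset: str) -> str:
    return raw.decode(assumed_charset, errors="replace")

# Sample text (an assumption for the demo), encoded as the server would send it.
sample = "日本語".encode("shift_jis")

right = show_as(sample, "shift_jis")  # bytes read with the correct charset
wrong = show_as(sample, "cp866")      # same bytes read as single-byte cp866

print(right)
print(wrong)  # one cp866 character per byte instead of three kanji
```

The second line is the kind of 8-bit output described above: each byte of a
two-byte kanji is shown as a separate cp866 character.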
