Re: lynx-dev lynx and other character sets


From: Leonid Pauzner
Subject: Re: lynx-dev lynx and other character sets
Date: Sat, 3 Jul 1999 00:21:33 +0400 (MSD)

30-Jun-99 08:53 Klaus Weide wrote:
>> I thought such an indication was too technical for the average lynx
>> user and not very useful in fact (say, I run into Japanese text
>> with any European display charset).  Instead, this can be indicated
>> on the Info Page: [7bit chars only] / [7bit approximation was used]
>> / [a few unrecognized characters filtered out]  or so.

> But completely dropping the characters without any indication is
> definitely not the right thing, in my opinion.  Whole sections of
> a document may go missing.  And sometimes omission of single important
> characters may be just as bad.  The user has no idea that he should
> go to the Info Page in order to find out that he missed something.

> The following is from the HTML 4.0 spec:

> 5.4 Undisplayable characters

>    A user agent may not be able to render all characters in a document
>    meaningfully, for instance, because the user agent lacks a suitable
>    font, a character has a value that may not be expressed in the user
>    agent's internal character encoding, etc.

>    Because there are many different things that may be done in such
>    cases, this document does not prescribe any specific behavior.
>    Depending on the implementation, undisplayable characters may also be
>    handled by the underlying display system and not the application
>    itself. In the absence of more sophisticated behavior, for example
>    tailored to the needs of a particular script or language, we recommend
>    the following behavior for user agents:
>     1. Adopt a clearly visible, but unobtrusive mechanism to alert the
>                ^^^^^^^^^^^^^^^
>        user of missing resources.
>     2. If missing characters are presented using their numeric
>        representation, use the hexadecimal (not decimal) form since this
>        is the form used in character set standards.

>> On the other hand, this hides a bug:
>> when we switch to source mode with "\" we get different output
>> for a few unrecognized 8-bit characters when we uncomment the code
>> you are asking for (I do not remember the details now).

> Hiding a bug is not a good enough reason.  (Even if I may be responsible
> for it...)
Sure.

> I think the current behavior, dropping those characters completely
> without notice, is the worst of possible choices.  For example, if
> you think 'Uxxx' is too bad (I don't, but I can understand disagreement),
> showing a '?' for each missing character would be better than nothing.
> (A Warning message, probably combined with that, would be better.
> But of course it should not appear for each character, maybe only once
> per loaded document.)

Perhaps an Alert message (with some text about the overloaded "Raw" feature)
would be a good solution. Both are very easy to implement in SGML.c and
HTPlain.c, where the code was commented out around version 2.8.
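
To make the proposal concrete, here is a minimal sketch of the behavior
discussed above: substitute '?' (or a hexadecimal 'Uxxxx' form) for each
character that cannot be displayed, and raise the Alert only once per
document. This is not the code from SGML.c or HTPlain.c. The names
doc_state, subst_mode, is_displayable and render_text are hypothetical, and
the sketch simply assumes a 7-bit display charset and prints the alert to
stderr.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical substitution policies for undisplayable characters. */
typedef enum { SUBST_QUESTION, SUBST_HEX } subst_mode;

/* Hypothetical per-document state (not a Lynx structure). */
typedef struct {
    int warned;        /* nonzero once the alert has been shown */
    unsigned missing;  /* number of characters that were substituted */
} doc_state;

/* Assumption for this sketch: the display charset is 7-bit ASCII only. */
static int is_displayable(unsigned long code)
{
    return code < 0x80;
}

/* Render n code points into out (capacity outsz, always NUL-terminated).
 * Returns the number of bytes written, not counting the terminator. */
static size_t render_text(doc_state *doc, const unsigned long *codes,
                          size_t n, subst_mode mode,
                          char *out, size_t outsz)
{
    size_t used = 0;
    size_t i;

    assert(doc != NULL && out != NULL && outsz > 0);
    out[0] = '\0';
    for (i = 0; i < n; i++) {
        char piece[24];

        if (is_displayable(codes[i])) {
            piece[0] = (char) codes[i];
            piece[1] = '\0';
        } else {
            doc->missing++;
            if (!doc->warned) {
                /* Alert once per document, not once per character. */
                fprintf(stderr, "Alert: some characters cannot be shown "
                                "in the current display character set.\n");
                doc->warned = 1;
            }
            if (mode == SUBST_HEX)
                /* Hexadecimal form, as the HTML 4.0 text recommends. */
                snprintf(piece, sizeof piece, "U%04lX", codes[i]);
            else
                strcpy(piece, "?");
        }
        if (used + strlen(piece) >= outsz)
            break;      /* stop rather than overflow the buffer */
        strcpy(out + used, piece);
        used += strlen(piece);
    }
    return used;
}
```

Either substitution keeps the user aware that something is missing at that
position, and the warned flag keeps the Alert from repeating for every
character.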