Re: LYNX-DEV Hebrew/Arabic Bidirectional issues with lynx


From: David Woolley
Subject: Re: LYNX-DEV Hebrew/Arabic Bidirectional issues with lynx
Date: Sun, 7 Sep 1997 15:40:09 +0100 (BST)

Foteos wrote:
> particularly because essentially all of RFC 2070 has been incorporated
> into the W3C's so-called HTML 4.0, and since Netscape and MSIE have de

> iso-8859-1 as of RFC 2070 and so-called HTML 4.0 :).  The page has
> 8-bit/multibyte characters only as link names, none as attribute values,
> and no fancy BIDI markup, so in theory with the Lynx chartrans support

My copy of HTML 4.0 is in the office, so I'm not sure what line you are
taking here.  Are you saying that Microsoft and Netscape are forcing a
physical markup solution to bidirectional texts?  If so, maybe it is
time to take the wheel full circle and invent a simple logical markup
language for distributed hypertext, one which lacks the complexity of
current proprietary word processing formats.

If, on the other hand, the only issue is the move to Unicode, my only
reservation would be that the existing default should be retained and
UTF-8 should always be explicitly selected.  I certainly don't like the
use of META, except if the server is known to copy it into the headers.
I gather that some versions of Netscape don't like it either, and refetch
ISO 8859-1 documents because FrontPage has added a redundant META
Content-Type line into the HTML HEAD section.
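
To make that preference concrete, here is a rough Python sketch of the
selection rule I have in mind.  It is not how Lynx or any other browser
actually does it; the names pick_charset, charset_from_content_type and
trust_meta are made up for illustration, and the regular expression is
only a crude approximation of Content-Type parsing.

```python
import re

# The existing default that should be kept unless something else is
# explicitly selected.
DEFAULT_CHARSET = "iso-8859-1"


def charset_from_content_type(value):
    """Return the charset parameter of a Content-Type value, or None."""
    if not value:
        return None
    match = re.search(r'charset\s*=\s*"?([A-Za-z0-9._:-]+)"?', value,
                      re.IGNORECASE)
    return match.group(1).lower() if match else None


def pick_charset(http_content_type, meta_content_type=None, trust_meta=False):
    """Hypothetical policy: HTTP header first, META only if trusted,
    otherwise fall back to the existing default."""
    header_charset = charset_from_content_type(http_content_type)
    if header_charset:
        return header_charset
    if trust_meta:
        meta_charset = charset_from_content_type(meta_content_type)
        if meta_charset:
            return meta_charset
    return DEFAULT_CHARSET
```

With this rule a document is only treated as UTF-8 when something says
so explicitly, and a stray META line is ignored unless the caller has
decided to trust it.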

It is possibly worth noting that, for most Indic languages, the Unicode
characters can't be treated as glyphs, so a genuinely multi-lingual
browser cannot be implemented without a lot of semantic knowledge
embedded in the code.  (I think consonant-vowel pairs are taught to
children as ligatures.  Although you can, in reality, compose most of
them by simple algorithms, there are a lot of (consonant)+ groupings
which would look uneducated if done trivially.  There is an implicit
vowel in all the consonant characters and a special mark is needed to
cancel it; however, a ligature with the next consonant is almost always
used instead of cancelling the vowel on each consonant.  If you are
algorithmically composing ligatures, you have to be aware that the glyph
for some following vowels actually precedes the glyph for the consonant.)
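
As a rough illustration of the reordering problem, here is a Python
sketch for Devanagari only.  It is deliberately naive: it knows one
pre-base vowel sign (U+093F, VOWEL SIGN I), treats U+0915..U+0939 as
the consonants, and ignores everything else a real renderer must
handle (conjunct shapes, reph, nukta and so on).  The function names
are my own invention.

```python
VIRAMA = "\u094D"                  # DEVANAGARI SIGN VIRAMA, cancels the implicit vowel
PREBASE_VOWEL_SIGNS = {"\u093F"}   # DEVANAGARI VOWEL SIGN I, drawn before the consonant


def is_consonant(ch):
    # Devanagari consonants KA..HA
    return "\u0915" <= ch <= "\u0939"


def is_dependent_sign(ch):
    # Vowel signs AA..AU plus the virama itself
    return "\u093E" <= ch <= "\u094D"


def clusters(text):
    """Split logical-order text into consonant (virama consonant)* [signs] groups."""
    out = []
    i, n = 0, len(text)
    while i < n:
        start = i
        if is_consonant(text[i]):
            i += 1
            # Absorb virama + consonant pairs: these are the ligature candidates.
            while i + 1 < n and text[i] == VIRAMA and is_consonant(text[i + 1]):
                i += 2
            # Absorb any trailing dependent signs (or a final explicit virama).
            while i < n and is_dependent_sign(text[i]):
                i += 1
        else:
            i += 1
        out.append(text[start:i])
    return out


def visual_order(text):
    """Move a pre-base vowel sign to the front of its cluster."""
    result = []
    for cluster in clusters(text):
        if len(cluster) > 1 and cluster[-1] in PREBASE_VOWEL_SIGNS:
            cluster = cluster[-1] + cluster[:-1]
        result.append(cluster)
    return result
```

Note that the vowel sign moves in front of the whole consonant group,
not just the last consonant, which is why the clusters have to be found
before any glyphs are laid out.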

HTML 4.0 does appear to have some good features, such as the relegation
of <B> etc. to deprecated status.  However, I can't really see the mass
market abandoning physical markup, whether by direct physical markup or
by "careful" choice of logical markup.  I don't know of anywhere that uses
MSWord styles to the full (I double space paragraphs because using a style
with extra spacing would probably confuse the next person to edit the
document) and a senior member of our management is reputed to synthesize
hard line breaks by tabbing around the right margin!