From: Sergei Pokrovsky
Subject: Re: lynx-dev Tweaking HTML.c to insert characters (was: UTF-8 display questions)
Date: 11 Jun 2000 16:02:08 +0700
User-agent: Gnus/5.0807 (Gnus v5.8.7) Emacs/20.6

>>>>> "Klaus" == Klaus Weide <address@hidden> writes:


  Klaus> On 9 Jun 2000, Sergei Pokrovsky wrote:

  >> Well, your piece works well if "display character set" is set to
  >> ASCII; it ASCIIzes as expected.  But it fails in the trivial
  >> case, when "display character set" is UTF-8; then the fallback
  >> branch is taken :)

  >> I normally have "UTF-8" for the display character set in my cfg;
  >> but it doesn't work.

  Klaus> I hope you have seen my followup message on that by now, with
  Klaus> the correction; and I hope that *that* will work as
  Klaus> expected. :) Sorry for the non-working code.

It seems that the increased Solar activity caused some mail lossage :)
anyway, I've taken the corrected version from the archive and it
really does work.  Thanks.


  >>  BTW, the usual ASCIIzation of the esperantic letters in Latin-3
  >> (and Unicode) is by adding x for the hat or breve;

  Klaus> Is this the only scheme in use?  I remember vaguely that I
  Klaus> read about some alternatives, something using 'h'.

That's right.  Using "h" is Official, or "Fundamental", because that
escape is offered in the Holy Scripture (Fundamento de Esperanto).
Now, there are different classes of people:

1) those having an engineering turn of mind naturally prefer the
   x-convention, which offers many advantages (it is biunique, and
   thus suitable for automatic conversion; it gives good sorting with
   usual ASCII-oriented functions);

2) those educated in the humanities are usually shocked by such a
   non-traditional use of x, and argue that it damages the image of
   Esperanto in the eyes of the unaware public.
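
To make the "biunique" and sorting claims in (1) concrete, here is a
minimal Python sketch.  The names to_x and from_x are hypothetical
helpers, not anything in Lynx, and the sketch assumes pure Esperanto
text in which "x" never occurs on its own (a foreign word such as
"Linux" would be mangled on the way back).

```python
import re

# The usual x-convention: append "x" for the hat or breve.
X_MAP = {
    "ĉ": "cx", "ĝ": "gx", "ĥ": "hx", "ĵ": "jx", "ŝ": "sx", "ŭ": "ux",
    "Ĉ": "Cx", "Ĝ": "Gx", "Ĥ": "Hx", "Ĵ": "Jx", "Ŝ": "Sx", "Ŭ": "Ux",
}
X_UNMAP = {ascii_form: letter for letter, ascii_form in X_MAP.items()}
X_PATTERN = re.compile("[cghjsuCGHJSU]x")


def to_x(text):
    """Replace each accented Esperanto letter by its x-digraph."""
    return "".join(X_MAP.get(ch, ch) for ch in text)


def from_x(text):
    """Inverse of to_x, assuming "x" appears only as the escape letter."""
    return X_PATTERN.sub(lambda m: X_UNMAP[m.group(0)], text)


words = ["ĉaro", "cent", "dato", "cirko"]
# Sorting on the ASCII form puts "ĉ" between "c" and "d", as in the
# Esperanto alphabet; plain code-point sorting would put it after "d".
in_alphabet_order = sorted(words, key=to_x)
```

Because "x" is not a letter of the Esperanto alphabet, from_x(to_x(s))
gives back s for such text, which is what makes automatic conversion
safe.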

About 7 years ago most of the Esperanto articles in
soc.culture.esperanto used the x-convention (about 80%, I guess);
since then the use of accented characters on computers has become
much easier and better standardized, which paradoxically has led
class (1) to migrate to Unicode and stop using any ASCII ersatz,
so the proportion seems to have reversed.  But in some cases where
precision is required, the x-convention remains unbeatable, e.g.
when you have to submit a search word to an HTML form.


  Klaus> Much of the existing def7_uni.tbl is from me, and it wasn't
  Klaus> meant to be the definite transliteration.  A lot of it is ad
  Klaus> hoc and can be improved; it's just that not many people have
  Klaus> shown interest.  If you make these changes, and think they
  Klaus> are of general use, please send patches.

Well, I attach a diff file.

  Klaus> There is a potential problem, in that those strings are not
  Klaus> language- or locale-specific.  So ŝ -> sx may be right for
  Klaus> Esperanto, but not for some other language that also uses
  Klaus> that character.

No, no other language uses the hat-accented consonants of Esperanto.
OTOH, "the short u" (ŭ) could be found in transcriptions of classical
Latin texts (though I've never seen such a thing on the WWW).

  Klaus> (Maybe ŝ -> sh or whatever is the alternative is better?)

Oh, that's a religious matter :)  Yes, that would produce "shipo" for
"ŝipo" (ship) and "sharko" for "ŝarko" (shark); but it would be
almost as imperfect as English spelling in words like "dishaki",
"flughaveno" etc. (compare the English "dishonor", "mishap" etc., where
"s" and "h" happen to be distinct letters, not a digraph).
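
The ambiguity can be shown with a small Python sketch of a naive
reverse conversion for the h-convention.  The name from_h is a
hypothetical helper (nothing of the kind exists in Lynx), and only
lowercase digraphs are handled.

```python
import re

# h-digraphs for the hat-accented consonants.
H_UNMAP = {"ch": "ĉ", "gh": "ĝ", "hh": "ĥ", "jh": "ĵ", "sh": "ŝ"}
H_PATTERN = re.compile("[cgjsh]h")


def from_h(text):
    """Naively turn every h-digraph into an accented letter."""
    return H_PATTERN.sub(lambda m: H_UNMAP[m.group(0)], text)


good = from_h("shipo")       # "ŝipo", as intended
bad = from_h("flughaveno")   # "fluĝaveno": flug+haveno is damaged
worse = from_h("dishaki")    # "diŝaki": dis+haki is damaged
```

Without a dictionary, the conversion cannot tell a digraph from two
letters that merely happen to be adjacent, so the h-convention is not
reversible the way the x-convention is.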


Attachment: x.diff
Description: x.diff
