Re: Chinese characters support

From: Lee Sau Dan
Subject: Re: Chinese characters support
Date: 14 May 2003 08:14:14 +0200
User-agent: Gnus/5.0808 (Gnus v5.8.8) Emacs/20.7

>>>>> "Charles" == Charles Muller <address@hidden> writes:

    Charles> One more time:

    Charles> Since the HELLO file is used for internal testing by
    Charles> Emacs coders it almost always works correctly in any
    Charles> recent Emacs "out of the box."

No.  If you have problems with the font installation (esp. when none
of your font servers offers the relevant fonts, or your sys. admin.
simply doesn't care about your non-English needs), HELLO won't display
the glyphs.  It only displays boxes there.

    Charles> The common misunderstanding occurs when people who are
    Charles> trying to get CJK working in utf-8 write to this, or
    Charles> another list for help, and list members, in the spirit of
    Charles> trying to be helpful, suggest that all is fine if the
    Charles> HELLO file displays right.

For utf-8 testing, I'd refer someone  to the test files in the MuleUCS

    Charles> Since the people who usually make the suggestion to test
    Charles> via the HELLO are those who do not regularly use CJK, it
    Charles> seems that they are not aware of this discrepancy, and I
    Charles> wanted to point this out.

No.  Those people often use CJK regularly.  They just don't use utf-8.
Like me (using Big5), they use a national encoding (e.g. GB2312, JIS,

    Charles> It seems strange to see people react so emotionally to
    Charles> the exposure of this simple point. No one is asking that
    Charles> the hallowed HELLO file be sent to oblivion--although a
    Charles> reincarnation as utf-8 would certainly not hurt! :-)

That WILL certainly HURT.  Look carefully at the section "Difference
among chinese characters in GB, JIS, KSC, BIG5:" in HELLO.  The same
thing cannot be reproduced in vanilla utf-8, because Unicode unifies
the corresponding characters in these encodings into one single code
point.  (Most efforts in the earlier versions of Unicode were devoted
to _unifying_ characters from different languages, employing different
national encodings.  The result is that you can no longer tell whether
a unified character comes from Korean, Japanese or Chinese text, even
though those languages write it in slightly different ways.)
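
To make the unification point concrete, here is a minimal Python
sketch.  The character and the codec list are my own illustrative
choices (they are not taken from HELLO), and the codec names are the
ones shipped with Python's standard library.  It only shows the
mechanism: the national encodings store the character as different
bytes, but decoding each of them yields the same code point, so the
information about where it came from is gone.

```python
# Sketch: one character, four national encodings, one Unicode code point.
# CHAR and NATIONAL_CODECS are illustrative assumptions, not from HELLO.
CHAR = "中"
NATIONAL_CODECS = ["gb2312", "big5", "euc_jp", "euc_kr"]


def national_bytes(char):
    """Map each codec name to the bytes that codec uses for `char`."""
    return {codec: char.encode(codec) for codec in NATIONAL_CODECS}


def code_points_after_decoding(encoded):
    """Decode every byte string with its own codec; collect the results."""
    return {data.decode(codec) for codec, data in encoded.items()}


encoded = national_bytes(CHAR)
for codec, data in encoded.items():
    print(f"{codec:8} {data.hex()}")

# A single element here means every encoding mapped to the same code point.
unified = code_points_after_decoding(encoded)
print("code points:", sorted(f"U+{ord(c):04X}" for c in unified))
```

Whether the glyph then looks Chinese, Japanese or Korean depends
entirely on the font chosen for display, not on the code point.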

If you want to test UTF-8, do suggest including a UTF-8 test file.
(Add a line in HELLO to instruct anyone how to open the UTF-8 test
file, preferably with hot-key bindings.)  And why stop there?  Also
have UTF-16 and UTF-7 test files.  Why not UTF-16, anyway?  People who
really use computers for Far East languages (CJK) need about 50% more
disk space if they store their text files in UTF-8, which takes 3
bytes per Chinese character where UTF-16 takes 2.  UTF-8 is simply NOT
the magic panacea.  It sucks when you have a file full of Chinese
characters, for instance.
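
The arithmetic behind the 50% figure can be checked with a short
Python sketch.  The sample string (my name from the signature) and the
function name are just illustrative; "utf-16-le" is used so that the
byte-order mark is not counted.

```python
# Sketch: compare the storage cost of the same Chinese text per encoding.
# TEXT and encoded_sizes() are illustrative names, not part of any tool.
TEXT = "李守敦"


def encoded_sizes(text):
    """Return the number of bytes `text` occupies in each encoding."""
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),  # no byte-order mark
        "big5": len(text.encode("big5")),
    }


sizes = encoded_sizes(TEXT)
for name, size in sizes.items():
    print(f"{name:7} {size} bytes")

# Extra space UTF-8 needs relative to UTF-16 for this text.
overhead = (sizes["utf-8"] - sizes["utf-16"]) / sizes["utf-16"]
print(f"UTF-8 overhead over UTF-16: {overhead:.0%}")
```

For these three characters UTF-8 needs 9 bytes while UTF-16 and Big5
need 6 each, which is where the 50% comes from.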

HELLO should remain a test file for the internal encoding "emacs-mule"
and for  displaying the true  multilingual capabilities of  Emacs.  It
has also been serving well to test font installation.  It should never
be  recoded in  utf-8, IMO.   If  all you  care about  is UTF-8,  have
another test file.  Assuming that all CJK users should use UTF-8 is
like assuming that everyone should pledge faith to the Vatican.

Lee Sau Dan                     李守敦(Big5)                    address@hidden(HZ) 

E-mail: address@hidden
Home page:
