discuss-gnustep

Traditional Chinese partially supported


From: Yen-Ju Chen
Subject: Traditional Chinese partially supported
Date: Sun, 10 Mar 2002 19:41:56 -0500

  I have improved the support for 2-byte characters.
  Here are two screenshots:
  http://www.people.virginia.edu/~yc2w/GNUstep/Chinese/NSMenu.gif
  http://www.people.virginia.edu/~yc2w/GNUstep/Chinese/Ink.gif

  This only works with Xft (GSAntiAlias = YES),
  because bitmap fonts are almost useless for Chinese characters
  in a graphical environment.
  I attach the diff files (against the latest CVS, Mar 10, 17:30 GMT)
  so anyone who is interested can try them.
  They should not break GNUstep even in a 1-byte environment,
  but I'm not sure.
  I have only tested them in English (iso8859-1) and in Chinese (iso10646-1).
  Feel free to modify them or add them to CVS if they are useful.
  Below are some details about what I did.
  If you are not interested, just ignore them.
  My system is FreeBSD 4.5 + XWindow 4.1.0.
  It only works if the internal unicode encoding is UCS-2-INTERNAL.

  Yen-Ju

  In GSFontInfo.m.diff, I just added NSBIG5StringEncoding support.
  Nothing special. It can be added to CVS without breaking anything.
  In NSString.m, -initWithData cuts off the first 2 bytes of a string.
  I don't know whether that is a bug or not; I just changed it.
  The internal unicode encoding on my system is UNICODELITTLE,
  but I can't find out what UNICODELITTLE is.
  It behaves like UCS-2-INTERNAL,
  so I treat all unicode (UNICODELITTLE) as UCS-2-INTERNAL.
  The fix in NSString.m only works for UCS-2-INTERNAL.

  In NSStringDrawing.m, the biggest problem is that encode_unitochar doesn't
  work for 2-byte characters.
  An NSGlyph is an unsigned int, which can be converted into a unichar
  without losing anything (as long as the glyph value fits in 16 bits).
  Then I convert each unichar into two chars in UCS-2-INTERNAL encoding
  if the font uses ASCII or Unicode encoding.
  So every character, English or Chinese, uses 2 chars in
  UCS-2-INTERNAL encoding.
  Now the char string can be safely passed to DPSshow without losing any
  information.
  This modification won't affect people who use iso8859-1.
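
  As a rough illustration of the "one unichar becomes two chars" step (this
  is a sketch, not the code in the attached diffs), the packing could look
  like the C below. The function name ucs2_internal_pack is hypothetical, and
  the sketch assumes that UCS-2-INTERNAL simply means 16-bit code units in
  the host's byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of GNUstep: copy `count` 16-bit code units
 * into `out` as raw bytes in host byte order (assumed here to be what
 * UCS-2-INTERNAL means). `out` must hold at least 2 * count bytes.
 * Returns the number of bytes written. */
size_t ucs2_internal_pack(const uint16_t *units, size_t count,
                          unsigned char *out)
{
    assert(units != NULL && out != NULL);
    for (size_t i = 0; i < count; i++) {
        /* memcpy keeps the in-memory (host) byte order of each unit. */
        memcpy(out + 2 * i, &units[i], sizeof units[i]);
    }
    return 2 * count;
}
```

  Note that every ASCII character produces one zero byte in the packed
  buffer, so the byte length has to be carried alongside the buffer rather
  than recovered with strlen().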

  Because I use XftDrawStringUtf8 in XftFontInfo,
  all strings have to be converted to UTF8 before drawing.
  That is very easy for most 1-byte characters, but not for 2-byte ones.
  In XGGState.m, once DPSshow receives a UCS-2-INTERNAL string,
  it is converted into a UTF8 string via NSString's methods.
  That's why I think there is a bug in NSString.m's initWithData
  method:
  it cuts off the first 2 bytes of a UCS-2-INTERNAL string.

  Once the string is in UTF8 encoding, XftFontInfo can draw it with the
  XftDrawStringUtf8 function.
  I also adjusted some code in XftFontInfo to handle UTF8-encoded strings.
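
  The patch does the UCS-2 to UTF8 conversion through NSString, but for
  readers who want to see what that conversion involves, here is a minimal
  standalone sketch in C. The function names are hypothetical, and the sketch
  assumes plain UCS-2 input: surrogate values (0xD800 to 0xDFFF) are not
  given any special treatment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode one 16-bit code unit as UTF-8.
 * `out` must have room for 3 bytes. Returns the bytes written (1 to 3). */
size_t ucs2_unit_to_utf8(uint16_t u, unsigned char *out)
{
    assert(out != NULL);
    if (u < 0x80) {                 /* 7-bit ASCII: one byte, unchanged */
        out[0] = (unsigned char)u;
        return 1;
    }
    if (u < 0x800) {                /* two-byte sequence: 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (u >> 6));
        out[1] = (unsigned char)(0x80 | (u & 0x3F));
        return 2;
    }
    /* three-byte sequence: 1110xxxx 10xxxxxx 10xxxxxx */
    out[0] = (unsigned char)(0xE0 | (u >> 12));
    out[1] = (unsigned char)(0x80 | ((u >> 6) & 0x3F));
    out[2] = (unsigned char)(0x80 | (u & 0x3F));
    return 3;
}

/* Convert `count` code units; `out` needs room for 3 * count bytes.
 * Returns the total number of UTF-8 bytes written. */
size_t ucs2_to_utf8(const uint16_t *units, size_t count, unsigned char *out)
{
    size_t n = 0;
    for (size_t i = 0; i < count; i++)
        n += ucs2_unit_to_utf8(units[i], out + n);
    return n;
}
```

  The resulting byte buffer and its length are the kind of arguments a UTF8
  drawing call such as XftDrawStringUtf8 expects.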

  Overall, the path of a string is:
  1. If it uses an iso10646-1 font, it is converted into 2-byte
UCS-2-INTERNAL chars first,
      no matter whether it is a 1-byte or a 2-byte character.
      The UCS-2-INTERNAL string is then passed to DPSshow and converted into
UTF8 for drawing.
  2. If it uses an iso8859-1 font (1-byte characters), it stays the same when
it is passed to DPSshow.
      Since the UTF8 code and the iso8859-1 code are the same for 1-byte
characters in the 7-bit ASCII range,
      passing this string to XftDrawStringUtf8() works.  No conversion is
needed.
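
  The pass-through in the second path relies on iso8859-1 and UTF8 sharing
  the same bytes, which holds for 7-bit ASCII but not for bytes 0x80 to
  0xFF. The sketch below is illustrative only, with hypothetical function
  names. It shows a check for the safe case and a conversion for the general
  case.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if every byte is 7-bit ASCII, meaning the
 * iso8859-1 bytes are already valid UTF-8 and can be passed unchanged. */
int latin1_is_plain_ascii(const unsigned char *in, size_t len)
{
    assert(in != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        if (in[i] >= 0x80)
            return 0;
    }
    return 1;
}

/* Hypothetical helper: convert iso8859-1 bytes to UTF-8.
 * `out` needs room for 2 * len bytes. Returns the bytes written. */
size_t latin1_to_utf8(const unsigned char *in, size_t len,
                      unsigned char *out)
{
    size_t n = 0;
    assert(out != NULL && (in != NULL || len == 0));
    for (size_t i = 0; i < len; i++) {
        if (in[i] < 0x80) {
            out[n++] = in[i];       /* ASCII: identical in both encodings */
        } else {
            /* 0x80..0xFF map to U+0080..U+00FF: a two-byte sequence */
            out[n++] = (unsigned char)(0xC0 | (in[i] >> 6));
            out[n++] = (unsigned char)(0x80 | (in[i] & 0x3F));
        }
    }
    return n;
}
```

  For example, the iso8859-1 byte 0xE9 (e with acute accent) becomes the two
  UTF8 bytes 0xC3 0xA9, while "abc" is left untouched.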

  Since GNUstep uses either UCS-2-INTERNAL or UCS-2 as its internal unicode
encoding,
  it would be good if encode_unitochar() and encode_chartouni() in Unicode.m
  could convert one unicode character into two chars, and two chars into one
  unicode character, even when it is only a 1-byte character.
  Then every character would use 2 bytes internally, no matter which
character it is.
  That's what I did in these files for characters that use an
NSUnicodeStringEncoding font.

  Another suggestion is that GNUstep could use the DrawStringUtf8 function
instead of DrawString8.
  Since most platforms have an iso10646-encoded font by default,
  it won't affect existing 1-byte users.
  Converting 1-byte characters into UTF8 is also very easy
  because, for ASCII, they are the same.

  These files work when the system uses UCS-2-INTERNAL,
  because that is what my system uses.
  But it is very easy to support UCS-2 encoding by swapping the order of the
2 bytes of each character.
  They also only work when kern == 0, which is the most common case.
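
  The byte-order change mentioned above amounts to swapping each pair of
  bytes in the buffer. A minimal sketch in C, with a hypothetical function
  name and assuming the buffer holds whole 2-byte characters:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: swap the two bytes of every 16-bit character in
 * place. `nbytes` must be even. Applying it twice restores the input. */
void ucs2_swap_bytes(unsigned char *buf, size_t nbytes)
{
    assert(buf != NULL && nbytes % 2 == 0);
    for (size_t i = 0; i + 1 < nbytes; i += 2) {
        unsigned char tmp = buf[i];
        buf[i] = buf[i + 1];
        buf[i + 1] = tmp;
    }
}
```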

  Any suggestions or comments are welcome.
  Feel free to use this code.
  No copyright. :)


Attachment: gnustep-i18n.tar.gz
Description: GNU Zip compressed data

