Re: severe problems with composite characters
From: Kenichi Handa
Subject: Re: severe problems with composite characters
Date: Wed, 17 Sep 2003 15:49:00 +0900 (JST)
User-agent: SEMI/1.14.3 (Ushinoya) FLIM/1.14.2 (Yagi-Nishiguchi) APEL/10.2 Emacs/21.2.92 (sparc-sun-solaris2.6) MULE/5.0 (SAKAKI)
In article <address@hidden>, Werner LEMBERG <address@hidden> writes:
> ======================================================================
> string-width() returns a wrong number if its argument string
> has composite characters.
> Consider two bytes strings 0xcd 0xeb, whose width is one since they
> are composed.
> On Emacs 20.7 string-width() returns 1.
> On Emacs 21.3.50 string-width() returns 2.
??? I've just tried this with 21.3.50, and I get 1:
(string-width (decode-coding-string "\xcd\xeb" 'thai-tis620)) => 1
Please note that Emacs 21 no longer has composite characters.
For instance, compose-region doesn't change the characters
in a region into a single composite character; instead, it
just puts a `composition' text property on them. The
display routine checks this text property and displays the
sequence correctly.
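As a rough illustration of the above (only a sketch; the function
name demo-compose-region-info is made up for this example and is
not part of Emacs), one can check that the characters stay
separate after compose-region and that only a text property is
added:

```elisp
;; Hypothetical helper, for illustration only.
(defun demo-compose-region-info ()
  "Compose two Thai characters in a temporary buffer.
Return a list (CHAR-COUNT HAS-COMPOSITION-PROPERTY)."
  (with-temp-buffer
    ;; The same two TIS-620 bytes as in the report.
    (insert (decode-coding-string "\xcd\xeb" 'thai-tis620))
    (compose-region (point-min) (point-max))
    (list
     ;; Number of characters in the buffer; composing does not merge them.
     (- (point-max) (point-min))
     ;; Non-nil if the first character carries a `composition' property.
     (and (get-text-property (point-min) 'composition) t))))
```

Evaluating (demo-compose-region-info) should give (2 t): two
characters, with the `composition' property on them.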
I suspect that you evaluated something like this:
(string-width "__some_composed_text__")
in the *scratch* buffer. As the Lisp reader ignores any text
properties when it reads a string expression in the *scratch*
buffer, the string given to string-width doesn't have the
`composition' property.
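The effect of the reader can be seen in isolation with a sketch
like the following. The helper name demo-read-composed-string is
hypothetical, and the positions 2 and 4 simply assume the buffer
holds a double quote, the two Thai characters, and a closing
double quote:

```elisp
;; Hypothetical helper, for illustration only.
(defun demo-read-composed-string ()
  "Read a composed string literal from a buffer with the Lisp reader.
Return a list (LENGTH PROPERTIES-AT-0) for the string that was read."
  (with-temp-buffer
    ;; Buffer contents: a string literal holding the two Thai characters.
    (insert "\"" (decode-coding-string "\xcd\xeb" 'thai-tis620) "\"")
    ;; Compose the two characters between the quotes.
    (compose-region 2 4)
    (goto-char (point-min))
    ;; `read' builds a fresh string from the buffer text; the text
    ;; properties in the buffer are not copied into it.
    (let ((s (read (current-buffer))))
      (list (length s) (text-properties-at 0 s)))))
```

Evaluating (demo-read-composed-string) should give (2 nil): the
string has two characters and no `composition' property, so
string-width sees two uncomposed characters.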
> ======================================================================
> Suppose that composite characters are stored to a file with a
> multi-lingual coding-system. An example is TIS-620 characters with
> UTF-8 (or ctext).
> When Emacs reads the file, the composite characters are not composed
> since there is no post-conv function associated to the multi-lingual
> coding-system.
> Is this a bug?
As such a post-conversion function is rather heavy, it is
turned off by default. If you customize the variable
utf-8-compose-scripts to t, Thai characters should be
composed on decoding.
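For a quick test, something like the sketch below might do. It
assumes that a plain setq has the same effect as customizing the
variable, and the helper name demo-utf8-thai-width is made up
here. The bytes are the UTF-8 encoding of the same two Thai
characters (U+0E2D, U+0E4B) as in the TIS-620 example:

```elisp
;; Assumption: setting the variable directly is enough; otherwise
;; use M-x customize-variable instead.
(setq utf-8-compose-scripts t)

;; Hypothetical helper, for illustration only.
(defun demo-utf8-thai-width ()
  "Decode two Thai characters from UTF-8 and return their string width."
  (string-width
   (decode-coding-string "\xe0\xb8\xad\xe0\xb9\x8b" 'utf-8)))
```

If composition on decoding works, (demo-utf8-thai-width) should
return the width of the composed sequence rather than the sum of
the widths of the two separate characters.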
However, I've just found a bug in this facility and installed
a fix. Please update your working directory and try again.
Don't forget to run "make autoloads" in the "lisp" subdirectory.
---
Ken'ichi HANDA
address@hidden