emacs-devel

Re: Several serious problems


From: Dave Love
Subject: Re: Several serious problems
Date: 30 Aug 2002 00:19:14 +0100
User-agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.2

Kenichi Handa <address@hidden> writes:

> I don't know if they are the same as what Dave currently
> has.

I tried to install all the relevant stuff I had, but what went into
the CVS head is a modified version of what I've actually been using,
and is basically untested.  I wanted someone who was actually using
that code base to install it and test it, but no-one could or would
(I can't remember which), and rms leant on me to install it.

> But, I have not checked if they surely works as
> expected.  I believe Dave has done it.

Only in more-or-less Emacs 21.2.

> And, I don't understand why those many functions/variables
> are designed as the current way.  For instance,
> 
> (1) Why does loadup.el has this code:
>       (ucs-unify-8859 'encode-only)
> instead of:
>       (unify-8859-on-encoding-mode 1)

Indeed.  I didn't do that.  The obvious thing to do is to change the
default in the defcustom, if ucs-tables is preloaded.

> (2) Why doesn't utf-8-subst.el provide mappings of
>     non-Chinese characters for ksc, gb, and jisx charsets?
>     The document of utf-8-translate-cjk says as below:
> ----------------------------------------------------------------------
> Whether the `mule-utf-8' coding system should encode many CJK characters.
> 
> Enabling this loads tables which enable the coding system to encode
> characters in the charsets `korean-ksc5601', `chinese-gb2312' and
> `japanese-jisx0208', and to decode the corresponding unicodes into
> ...
> ----------------------------------------------------------------------
> but, currently only Chinese characters in those charsets are
> handled.

I didn't realize that.  It may be coincidence.  What should be
translated is the set of characters

(japanese-jisx0208 ∪ chinese-gb2312 ∪ korean-ksc5601) \ mule-unicode-2500-33ff
                   ^                                  ^
                   union                              set difference

according to the Mule-UCS tables -- I just took the relevant codes
from there above U+33FF.  Perhaps that isn't how it actually is.
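
To make the set expression above concrete, here is a minimal sketch in
Python rather than Emacs Lisp, using tiny made-up stand-ins for the
charset repertoires.  The set contents and the helper name
cjk_translation_set are hypothetical; the real data would come from
the Mule-UCS tables, and the range bounds are simply read off the
charset name mule-unicode-2500-33ff.

```python
# Toy stand-ins for the real charset repertoires: each set holds a few
# Unicode code points the charset can represent. Illustrative only; the
# real repertoires are far larger.
JISX0208 = {0x3042, 0x30A2, 0x4E00, 0x2500}
GB2312 = {0x4E00, 0x4E8C, 0x2500}
KSC5601 = {0xAC00, 0x4E00, 0x3042}

# Code points covered by mule-unicode-2500-33ff (bounds taken from its name).
MULE_UNICODE_2500_33FF = range(0x2500, 0x33FF + 1)


def cjk_translation_set(*charsets):
    """Union of the given charsets minus the mule-unicode-2500-33ff range."""
    union = set().union(*charsets)
    return {cp for cp in union if cp not in MULE_UNICODE_2500_33FF}


to_translate = cjk_translation_set(JISX0208, GB2312, KSC5601)
print(sorted(hex(cp) for cp in to_translate))
```

With these toy sets, every code point at or below U+33FF drops out of
the result, which is the effect of the set difference in the
expression.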

It needs someone with an interest in the CJK range to redo that stuff
anyhow; it shouldn't hardwire japanese-jisx0208 as the preferred set,
the sets used should probably be configurable, and it should allow
translating the relevant characters below U+3400.  (I didn't think
much about how best to do that without keeping large tables on the
heap that aren't actually used to do the translation.)

> (3) Why is utf-8-translate-cjk a variable, not a minor-mode
>     like unify-8859-on-(de/en)coding-mode?

I think because it can't be turned off.

>     Or, why the
>     latter is not a simple variable?   By the way, it seems
>     that once we customize utf-8-translate-cjk to t,
>     customize it back to nil doesn't cancel the translation.
> 
> (4) It seems that the variable name
>     utf-8-fragment-on-decoding is not appropriate because it
>     is used also in utf-16.el.  Perhaps,
>     ucs-fragment-on-decoding is better.

Probably.  It was defined before I wrote utf-16.el.  Much of that
stuff would have been written differently for installation in 21.1,
but it was done during the campaign against anything Unicode-based, so
that users could have it in Emacs 21.2 as conveniently as possible.

> (5) It seems that mule-utf-16 can handle the same range of
>     characters as mule-utf-8, but `safe-charsets' property
>     doesn't contain, for instance, `latin-iso8859-2'.
>     Perhaps, this is simply a bug to be fixed easily.

Yes.  The coding system needs to register the relevant translation
table(s) for safe-chars, which would have to be updated in sync with
any changes.  I don't know why that didn't get done.



