Re: [PATCH] Unicode Lisp reader escapes

emacs-devel

From:	Kenichi Handa
Subject:	Re: [PATCH] Unicode Lisp reader escapes
Date:	Wed, 10 May 2006 14:37:44 +0900
User-agent:	SEMI/1.14.3 (Ushinoya) FLIM/1.14.2 (Yagi-Nishiguchi) APEL/10.2 Emacs/22.0.50 (i686-pc-linux-gnu) MULE/5.0 (SAKAKI)

In article <address@hidden>, Richard Stallman <address@hidden> writes:

>       In addition, the default value of
>     utf-translate-cjk-mode is t, and which CJK charsets the Han
>     characters of Unicode are decoded into depends on these:

>     (1) current-language-environment

> What effect does this have?  (Aside from the choice of coding system,
> that is.)

Some Han characters in Unicode can be decoded into several
CJK charsets (e.g. chinese-gb2312, chinese-big5-1,
japanese-jisx0208).  current-language-environment decides
which of them to use.

>     (4) the contents of the hash table ucs-unicode-to-mule-cjk
>     (a user can freely reflect one's preference on how to decode
>     Unicode character by modifying this hash table).

> Could you tell me some examples for how users are really expected
> to use this?

I don't know of a concrete example, but I can imagine one.
U+9AD9 is a variant of U+9AD8, but japanese-jisx0208
contains only the latter.  In fact, none of the legacy CJK
charsets contains U+9AD9.  But since it is just a variant of
U+9AD8, one may want to decode it into japanese-jisx0208
just for reading.  In such a case, one can simply do this:

(puthash #x9AD9 ?高 ucs-unicode-to-mule-cjk)

> Overall:

> With so many different variables that might affect the reading of
> these characters, it is just too inconvenient for every file to
> specify them all.  So I think we need a new feature to make that easy
> to do.

> Here's one idea.

> Add a new "variable" `buffer-coding' which is analogous to `coding'.
> Whereas `coding' specifies the encoding in the file, `buffer-coding'
> specifies the in-buffer encoding to produce in the buffer.  Its value
> could be a list or plist, which would specify the values of all these
> many variables.

> What do you think?  If you think this is a good idea, could
> you try designing the details?

No, it's an incredibly hard and heavy task.  When you read
utf-8.el and ucs-tables.el, you'll soon realize that.  I
believe it's just a waste of time to work on such a thing.

We have already done lots of workarounds for workarounds for
workarounds for not using Unicode internally, but there's a
limit.  I believe no one is pleased by producing the same
*.elc in such a situation.

Please accept this problem as a bad feature (not a bug), and
write something in etc/PROBLEMS.  Otherwise, please decide to
switch to emacs-unicode right now.  That is the right way to
solve this problem.

---
Kenichi Handa
address@hidden
