chinese word mode
From: Eric Abrahamsen
Subject: chinese word mode
Date: Tue, 05 Nov 2013 17:11:47 +0800
User-agent: Gnus/5.130008 (Ma Gnus v0.8) Emacs/24.3 (gnu/linux)

So the follow-up to my earlier message is that I'm trying to create a
chinese-word-mode, which will behave (almost exactly) like the existing
thai-word-mode defined in lisp/language/thai-util.el and friends.

The idea is that an entire dictionary of words is provided in a nested
char table, and then a minor mode both remaps most word-related commands
to use that dictionary and rewires fill-find-break-point-function to do
the same. The Thai version looks like this:

(define-minor-mode thai-word-mode
  "Minor mode to make word-oriented commands aware of Thai words."
  :global t :group 'mule
  (cond (thai-word-mode
         ;; This enables linebreak between Thai characters.
         (modify-category-entry (make-char 'thai-tis620) ?|)
         ;; This enables linebreak at a Thai word boundary.
         (put-charset-property 'thai-tis620 'fill-find-break-point-function
                               'thai-fill-find-break-point))
        (t
         (modify-category-entry (make-char 'thai-tis620) ?| nil t)
         (put-charset-property 'thai-tis620 'fill-find-break-point-function
                               nil))))
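
To make the dictionary idea concrete, here is a minimal sketch of a word
trie built from nested char tables, with a longest-match lookup of the
kind a word-motion or fill function would need. The my-cwm- names are
made up for illustration and are not taken from thai-util.el, and one
char table per node may well be too memory-hungry for a full dictionary.

```elisp
;; Hypothetical sketch: a trie whose nodes are (END-OF-WORD-P . CHAR-TABLE).
;; None of these names exist in Emacs; they are for illustration only.

(defun my-cwm-make-node ()
  "Return an empty trie node of the form (END-OF-WORD-P . CHAR-TABLE)."
  (cons nil (make-char-table 'my-cwm-trie)))

(defun my-cwm-add-word (root word)
  "Add the string WORD to the trie rooted at ROOT."
  (let ((node root))
    (dolist (ch (string-to-list word))
      (let ((next (aref (cdr node) ch)))
        ;; Create the child node the first time this character is seen.
        (unless next
          (setq next (my-cwm-make-node))
          (aset (cdr node) ch next))
        (setq node next)))
    ;; Mark the final node as the end of a dictionary word.
    (setcar node t)))

(defun my-cwm-longest-match (root string start)
  "Return the end index of the longest word in STRING beginning at START.
Words are looked up in the trie rooted at ROOT.  Return nil if no
dictionary word begins at START."
  (let ((node root)
        (i start)
        (len (length string))
        (end nil))
    (while (and node (< i len))
      (setq node (aref (cdr node) (aref string i)))
      (setq i (1+ i))
      ;; Remember the position each time a complete word has been consumed.
      (when (and node (car node))
        (setq end i)))
    end))

;; Usage with a three-word toy dictionary:
(let ((root (my-cwm-make-node)))
  (dolist (word '("中国" "中国人" "人"))
    (my-cwm-add-word root word))
  (my-cwm-longest-match root "中国人民" 0))   ; => 3
```

Taking the longest match is only one possible segmentation policy; it is
used here because it is the simplest to state.
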

I have shamelessly copied most of the code, and begun reworking it for
Chinese. But I'm confused about the charset specifications above.

Thai has only two charsets (one of which is thai-tis620), while Chinese
has more than a dozen (though I'm only messing with simplified Chinese
for now, so call it six or so).

My buffers are utf-8 encoded, and describe-char on a Chinese character
shows "preferred charset: unicode-bmp". So what do I put for the charset
in order to make these functions target the right characters? Chinese
characters all seem to have the "|" line-breakable category by default,
but (I think) I can only add the custom fill break point function one
charset at a time.

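
For what it's worth, the one-charset-at-a-time approach could be wrapped
in a loop along these lines. This is only a sketch: the my-cwm- names
are hypothetical, the charset list is a guess rather than a known-correct
answer, and I am assuming the charset that matters is whichever one
char-charset reports for the character in question.

```elisp
;; Hypothetical sketch; the charset list below is a guess, not a verified
;; answer to which charsets a utf-8 buffer of Chinese text actually uses.
(defvar my-cwm-charsets '(chinese-gb2312 chinese-gbk unicode-bmp)
  "Charsets on which to install the fill break point function.")

(defun my-cwm-set-break-function (function)
  "Set `fill-find-break-point-function' to FUNCTION on `my-cwm-charsets'.
Passing nil removes the property again.  Names that are not defined
charsets are skipped."
  (dolist (charset my-cwm-charsets)
    (when (charsetp charset)
      (put-charset-property charset 'fill-find-break-point-function
                            function))))

;; To see which charset Emacs reports for a given character (the result
;; depends on the current charset priority):
(char-charset ?中)
```

The minor mode body would then call (my-cwm-set-break-function
'some-break-function) when enabled and (my-cwm-set-break-function nil)
when disabled, where some-break-function stands in for the Chinese
counterpart of thai-fill-find-break-point.
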
Thanks!
Eric