Re: Word syntax question
From: Kenichi Handa
Subject: Re: Word syntax question
Date: Wed, 22 Oct 2008 21:23:14 +0900
User-agent: SEMI/1.14.3 (Ushinoya) FLIM/1.14.2 (Yagi-Nishiguchi) APEL/10.2 Emacs/23.0.60 (i686-pc-linux-gnu) MULE/6.0 (HANACHIRUSATO)
In article <address@hidden>, "Stephen J. Turnbull" <address@hidden> writes:
> AFAIK Unicode has solved this problem, but I forget where I saw it.
> If my memory is correct, that supports Miles's opinion.
It's "Unicode Standard Annex #29" (http://www.unicode.org/reports/tr29/).
It describes an algorithm that determines whether there is a word
boundary between two characters C1 and C2 by classifying characters
according to their "Word_Break" property and applying a set of rules
that check that property.
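To make the idea concrete, here is a minimal sketch in Python of that
kind of pairwise check. It is not the UAX #29 algorithm itself: the
class names are borrowed from the annex, but the classification below
is a rough approximation built on Python's unicodedata module, only a
handful of "no break" pairs are modeled, and the function names
(word_break_class, is_word_boundary, split_words) are hypothetical.

```python
import unicodedata

def word_break_class(ch):
    """Approximate a Word_Break-like class for one character.

    Assumption: this is a crude stand-in for the real property data,
    derived from character names and general categories only.
    """
    name = unicodedata.name(ch, "")
    cat = unicodedata.category(ch)
    if name.startswith("KATAKANA"):
        return "Katakana"
    if cat == "Nd":                      # decimal digits
        return "Numeric"
    if cat == "Pc":                      # connector punctuation, e.g. "_"
        return "ExtendNumLet"
    # Treat remaining letters as ALetter, except Hiragana and CJK
    # ideographs, which this sketch leaves in "Other".
    if ch.isalpha() and not name.startswith(("HIRAGANA", "CJK")):
        return "ALetter"
    return "Other"

# Pairs (class of C1, class of C2) between which no boundary is placed.
# Any pair not listed here is treated as a boundary.
NO_BREAK = {
    ("ALetter", "ALetter"),
    ("Numeric", "Numeric"),
    ("ALetter", "Numeric"),
    ("Numeric", "ALetter"),
    ("Katakana", "Katakana"),
    ("ALetter", "ExtendNumLet"),
    ("Numeric", "ExtendNumLet"),
    ("Katakana", "ExtendNumLet"),
    ("ExtendNumLet", "ExtendNumLet"),
    ("ExtendNumLet", "ALetter"),
    ("ExtendNumLet", "Numeric"),
    ("ExtendNumLet", "Katakana"),
}

def is_word_boundary(c1, c2):
    """Return True if this sketch places a boundary between C1 and C2."""
    return (word_break_class(c1), word_break_class(c2)) not in NO_BREAK

def split_words(text):
    """Split text at every position where is_word_boundary is True."""
    if not text:
        return []
    pieces = [text[0]]
    for prev, cur in zip(text, text[1:]):
        if is_word_boundary(prev, cur):
            pieces.append(cur)
        else:
            pieces[-1] += cur
    return pieces
```

For example, split_words("foo_bar 42x") keeps "foo_bar" and "42x"
together and isolates the space between them. The real annex has more
classes and rules that look at more context than one adjacent pair, so
treat this only as an illustration of the property-plus-rules approach.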
Emacs already has a similar mechanism, which uses two variables:
word-separating-categories and word-combining-categories.
Please read the docstring of the latter variable.
---
Kenichi Handa
address@hidden