Re: changing word boundaries

From: Kevin Rodgers
Subject: Re: changing word boundaries
Date: Wed, 11 Nov 2009 07:57:04 -0700
Ernest Adrogué wrote:
 1/11/09 @ 19:09 (+0000), thus spake Dave Love:
Ernest Adrogué <address@hidden> writes:

Hi there,

The Catalan language has a ligature consisting of one
"l" character, followed by a middle dot ("·"), followed
by another "l". See here for more details:·l#Catalan

Is there a way to make emacs aware of this, so that it
doesn't treat a word containing "l·l" as two separate words?

[You're probably not really interested in word boundaries, just word
constituents.  For an illustration of the difference, see variable
`word-combining-categories' and what capitalized-words-mode does in
Emacs 23.]
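
To make the word-constituent idea concrete, here is a minimal sketch that gives `·' word syntax in a scratch copy of the standard syntax table and reports its syntax class before and after. The sample word and the temporary buffer are assumptions for illustration only; whether `forward-word' then crosses the dot may additionally depend on the category settings mentioned above, so the sketch only inspects the syntax class.

```elisp
;; Sketch: inspect the syntax class of the middle dot before and after
;; marking it as a word constituent, without touching the global table.
(with-temp-buffer
  ;; Work on a copy so the standard syntax table is left unchanged.
  (set-syntax-table (copy-syntax-table (standard-syntax-table)))
  (insert "col·legi")                    ; hypothetical sample word
  (let ((before (char-to-string (char-syntax ?·))))
    ;; With no table argument this modifies the current buffer's table.
    (modify-syntax-entry ?· "w")
    (list before (char-to-string (char-syntax ?·)))))
```

Evaluated with M-: or in *scratch*, the second element of the result should be "w"; the first shows whatever class your Emacs assigns to `·' by default.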

You should define a Catalan language environment to be used in ca_ES
locales.  (I'm surprised I didn't do it, as there's a relevant input
method.)  It should set the base syntax of · to word, and set a suitable
default input method.  The existing one, `catalan-prefix', should
presumably bind `~.' to `·', as in latin-prefix; it doesn't currently,
and maybe needs other fixes.

The environment would be something like this (untested), which is
probably better than trying to use categories.  [The default Latin-1
character set is overridden in, say, ca_ES.UTF-8.]

(push '("ca" . "Catalan") locale-language-names)

(set-language-info-alist
 "Catalan" '((tutorial . "")   ; maybe...
             (charset iso-8859-1)
             (coding-system iso-latin-1 iso-latin-9)
             (coding-priority iso-latin-1)
             (input-method . "catalan-prefix")
             (nonascii-translation . iso-8859-1)
             (unibyte-display . iso-latin-1)
             ;; On entering the environment, make `·' a word constituent.
             (setup-function
              . (lambda ()
                  (modify-syntax-entry ?· "w" (standard-syntax-table))))
             ;; On leaving it, set `·' to symbol syntax.
             (exit-function
              . (lambda ()
                  (modify-syntax-entry ?· "_" (standard-syntax-table))))
             ;; Fixme:
             ;; (sample-text . "Spanish (Español)   ¡Hola!")
             (documentation . "\
  This language environment uses the Latin-1 character set, sets
  the default input method to \"catalan-prefix\", and sets the
  syntax of `·' to word.  It selects the Spanish tutorial, in the
  absence of a Catalan translation.")))

Thanks a lot. Have you got any idea of where this should be
put in order to be loaded automatically at start-up?

1. C-x C-f ~/.emacs

2. M-x find-library RET default.el

3. M-x find-library RET site-start.el
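
Whichever of the three files you pick, the environment still has to be selected once it is defined. A minimal sketch, assuming the "Catalan" definition quoted above has already been evaluated earlier in the same file; the guard is my addition, so that a missing definition does not abort startup with an error:

```elisp
;; Place this after the "Catalan" environment definition in the init file.
;; Select the environment only if it has actually been registered.
(when (assoc "Catalan" language-info-alist)
  (set-language-environment "Catalan"))
```

If the locale is ca_ES and the `locale-language-names' entry is in place early enough, Emacs may pick the environment on its own; the explicit call just removes that dependency.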

I tried in init.el, and in a file in the "language" directory
in /usr/share/emacs/23.1/lisp/ to no avail.
It says that there's "no match", when I try to set the language
environment to Catalan interactively.
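
A quick way to see why completion reports "no match" is to check whether the definition was registered at all. The sketch below only reads state and uses no names beyond those already in this thread; evaluate each form with M-: or in *scratch* after startup. One possible cause, which is an assumption on my part and not confirmed here, is that the files under lisp/language are preloaded into the dumped Emacs, so editing them has no effect without rebuilding.

```elisp
;; Names that `set-language-environment' offers for completion.
(mapcar #'car language-info-alist)

;; Returns "catalan-prefix" if the definition above was evaluated,
;; nil if no "Catalan" environment is registered.
(get-language-info "Catalan" 'input-method)
```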

You could make a bug report if you have more luck than me with reports
about stuff I worked on.

I will try, once I get it to work :)



Kevin Rodgers
Denver, Colorado, USA

