Re: [Aspell-user] Adding hyphenated words to private dictionaries


From: Gary Setter
Subject: Re: [Aspell-user] Adding hyphenated words to private dictionaries
Date: Sun, 5 Feb 2006 19:53:39 -0600

----- Original Message ----- 
From: "Kevin Atkinson" <address@hidden>
To: "Gary Setter" <address@hidden>
Cc: "aspell-user" <address@hidden>
Sent: Sunday, February 05, 2006 12:11 AM
Subject: Re: [Aspell-user] Adding hyphenated words to private dictionaries


> Gary Setter wrote:
>
> > Hi Kevin,
> > Can you give us an idea what you might accept as a solution?
> > Gary
>
> Something that correctly implements the idea I described a few
> emails ago. This involves at least:
>    1) Adding a new character class for hyphens
>    2) Reworking the code that checks a document
>    3) Intelligently handling the situation when a hyphenated
>       word is misspelled
>
> It isn't easy.  Especially 3)

---- Reply ---
Hi Kevin,

Just for the sake of discussion:
The LangImpl class has an enum, like this:
    enum CharType {Unknown, WhiteSpace, Hyphen, Digit,
                   NonLetter, Modifier, Letter};
The types come from the character set (.cset) file, e.g.
iso8859-1.cset. In the iso8859-1.cset file you distribute, the
hyphen is properly defined as a Hyphen type. But, as far as I can
tell, the Hyphen type is not being used.

Take a look at the LangImpl::setup(...) function. It reads the
.cset file and stores each character's type in the member data
LangImpl::char_type_, so we know which character is a hyphen. We
also have an existing data member that specifies where in a word
a character can be used, LangImpl::special_. All we need to do is
set special_ for the hyphen character so that it is valid in the
middle of a word, but not at the beginning or end.
I can think of two ways of doing that:
1. Change the en.dat file to use the 'special' configuration
keyword to set up the hyphen as special.
2. Change LangImpl::setup(...) to check for characters of type
Hyphen and set LangImpl::special_ for those characters to be
valid in the middle, but not at the beginning or end.
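
To make option 2 concrete, here is a rough, standalone sketch of the idea. It is not Aspell code: ToyLang, SpecialChar, make_toy_lang, mark_hyphens_special and is_word are all hypothetical names made up for illustration, and the begin/middle/end layout of the special_ entries is an assumption based on the description above. Only the CharType enum is copied from LangImpl.

```cpp
#include <array>
#include <cstddef>
#include <string>

// Copied from the LangImpl enum quoted above.
enum CharType {Unknown, WhiteSpace, Hyphen, Digit,
               NonLetter, Modifier, Letter};

// Hypothetical stand-in for one entry of LangImpl::special_:
// where in a word a non-letter character is allowed.
struct SpecialChar {
    bool begin  = false;
    bool middle = false;
    bool end    = false;
};

// Toy model of the two per-character tables (assumed 8-bit charset).
struct ToyLang {
    std::array<CharType, 256>    char_type{};  // all Unknown initially
    std::array<SpecialChar, 256> special{};    // nothing allowed initially
};

// Stand-in for what reading a .cset file would give us:
// ASCII letters are Letter, '-' is Hyphen.
ToyLang make_toy_lang() {
    ToyLang lang;
    for (int c = 'a'; c <= 'z'; ++c) lang.char_type[c] = Letter;
    for (int c = 'A'; c <= 'Z'; ++c) lang.char_type[c] = Letter;
    lang.char_type[static_cast<unsigned char>('-')] = Hyphen;
    return lang;
}

// Option 2: after the character types are known, mark every Hyphen
// character as valid in the middle of a word only.
void mark_hyphens_special(ToyLang& lang) {
    for (std::size_t c = 0; c < lang.char_type.size(); ++c) {
        if (lang.char_type[c] == Hyphen) {
            lang.special[c].begin  = false;
            lang.special[c].middle = true;
            lang.special[c].end    = false;
        }
    }
}

// Simplified word check: letters are always fine, any other character
// must be allowed by the special table at its position.
// Note: this sketch does not reject doubled hyphens such as "a--b".
bool is_word(const ToyLang& lang, const std::string& w) {
    if (w.empty()) return false;
    for (std::size_t i = 0; i < w.size(); ++i) {
        unsigned char c = static_cast<unsigned char>(w[i]);
        if (lang.char_type[c] == Letter) continue;
        const SpecialChar& s = lang.special[c];
        bool ok = (i == 0)            ? s.begin
                : (i + 1 == w.size()) ? s.end
                                      : s.middle;
        if (!ok) return false;
    }
    return true;
}
```

With this toy model, "well-known" is rejected before mark_hyphens_special is called and accepted afterwards, while "-known" and "well-" are rejected either way. Option 1 would reach the same table state from the language data file instead of from code.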

Worth pursuing?
Gary




