Re: neatroff for Russian. (Was: Questions concerning hyphenation pattern

From: Oliver Corff
Subject: Re: neatroff for Russian. (Was: Questions concerning hyphenation patterns for non-Latin languages, e.g. Russian)
Date: Wed, 26 Apr 2023 19:33:48 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.10.1

Hi Robin and Branden,

On 26/04/2023 15:16, G. Branden Robinson wrote:
> At 2023-04-26T15:16:55+0300, Robin Haberkorn wrote:
>> For future texts I therefore wanted to return to Groff (where we also
>> have the excellent MOM macros). Not being able to hyphenate UTF-8
>> Cyrillic text is a major limitation for me. I might get away with
>> converting it to KOI8 first, but could I still mix in Unicode
>> characters this way (as they are considered special characters by

My needs are similar to yours: I also process UTF-8 Cyrillic text
(mostly not Russian, though).

Mixing two different encodings in one document is generally not
practical. In a UTF-8 environment, single-byte values above 0x7F are
typically displayed as a generic placeholder. Open, for instance, any
KOI8-R encoded document in a UTF-8 terminal: you get either something
that looks like two-letter combinations or question marks all over the
KOI8-R part(s) of the document. While a machine could, in theory, deal
with such input, it is simply a nuisance for a human editor/author to
have to work with it.
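
The effect described above can be reproduced with a minimal sketch,
assuming Python 3 and its standard `koi8_r` codec. The sample word and
all variable names are arbitrary choices for illustration, and the
`errors="replace"` decoding only approximates what a terminal does.

```python
# Sample word is arbitrary; "koi8_r" is Python's standard KOI8-R codec.
word = "мир"
koi8_bytes = word.encode("koi8_r")   # one byte per Cyrillic letter
utf8_bytes = word.encode("utf-8")    # two bytes per Cyrillic letter

# Roughly what a UTF-8 terminal does with KOI8-R bytes: sequences that
# are not valid UTF-8 are replaced by U+FFFD, the generic placeholder.
as_seen_in_utf8 = koi8_bytes.decode("utf-8", errors="replace")

# The reverse mistake: UTF-8 bytes read as a single-byte encoding yield
# two unrelated characters for every Cyrillic letter.
as_seen_in_koi8 = utf8_bytes.decode("koi8_r")

print(len(koi8_bytes), len(utf8_bytes))
print(as_seen_in_utf8)
print(as_seen_in_koi8)
```

Either way, the text is unreadable for a human unless the whole document
is interpreted in a single encoding.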

> Be sure you review my earlier messages to Oliver in detail.  The
> hyphenation code isn't "broken", it's simply limited to the C/C++ char
> type for character code points and hyphenation codes (which are not "the
> same thing as" character code points, but do correspond to them).
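
A small sketch may make this limitation concrete. It assumes that a
`char` holds 8 bits (values 0 to 255), which is the common case but is
not stated in the message above; the helper `fits_in_8_bits` is
hypothetical and has nothing to do with groff's actual source code.

```python
def fits_in_8_bits(ch: str) -> bool:
    """Return True if the code point of ch fits into an 8-bit value.

    Assumption for illustration: a C/C++ char holds values 0..255.
    """
    return ord(ch) <= 0xFF


latin_a = "a"          # U+0061
cyrillic_a = "\u0430"  # U+0430, CYRILLIC SMALL LETTER A

for ch in (latin_a, cyrillic_a):
    print(
        f"U+{ord(ch):04X}",
        "fits in 8 bits:", fits_in_8_bits(ch),
        "| UTF-8:", ch.encode("utf-8").hex(" "),
        "| KOI8-R:", ch.encode("koi8_r").hex(" "),
    )
```

Under that assumption, the Cyrillic code point itself does not fit into
a single 8-bit value, while its KOI8-R byte does, which is why
converting the input to KOI8 first is an option at all.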

I am not familiar with modern incarnations of C/C++. Is there really no
char data type that is Unicode-compliant?

Best regards,



Dr. Oliver Corff
Wittelsbacherstr. 5A
10707 Berlin
Tel.: +49-30-85727260
