groff


From: Oliver Corff
Subject: Re: Questions concerning hyphenation patterns for non-Latin languages, e.g. Russian
Date: Tue, 25 Apr 2023 16:25:49 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.5.0

Hi Branden,

>> Now I am considering modifying an existing TeX hyphenation file for
>> groff use.
> Wait!  Before you do that, check out the post-1.23.0 branch in groff
> Git, where this has already been done!
>
> https://git.savannah.gnu.org/cgit/groff.git/?h=post-1.23.0

In the meantime, I had a look at that Russian hyphenation file, and to
my relief, the structure of the groff hyphenation pattern files is that
of TeX hyphenation pattern files, which I have worked on before.

But... the hyphenation file hyphen.ru in the aforementioned source is
not usable in the current set-up because the Russian syllable fragments
are encoded in KOI-8, an 8-bit encoding based on a GOST standard of the
USSR.

So the byte values in that file do not match Unicode code points, which
is what groff appears to use internally.

Since groff internally seems to work with Unicode code points, the
question is: in which format should the hyphenation patterns be
presented to groff? As is, that is, as UTF-8 text, or in \[u04xx] form?
That does not seem to work either, according to my last experiment.
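
For reference, here is a minimal Python sketch of how the two candidate
forms could be generated from KOI-8 input. It assumes the file uses the
KOI8-R variant (Python codec name "koi8_r"), and the sample pattern is
made up for illustration, not taken from hyphen.ru. It only produces the
two representations; whether groff accepts either of them is exactly the
open question above.

```python
def koi8_to_text(raw: bytes) -> str:
    """Decode KOI8-R bytes into a Unicode string (assumes the KOI8-R variant)."""
    return raw.decode("koi8_r")


def to_groff_escapes(text: str) -> str:
    """Replace every non-ASCII character with a \\[uXXXX] escape.

    ASCII characters (digits, dots, hyphens in the patterns) pass through.
    """
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)
        else:
            # Four uppercase hex digits, e.g. U+0430 -> \[u0430]
            out.append("\\[u%04X]" % ord(ch))
    return "".join(out)


# Hypothetical sample pattern, built here as KOI8-R bytes for the demo.
sample_koi8 = ".\u0430\u04311\u0432".encode("koi8_r")

decoded = koi8_to_text(sample_koi8)      # candidate 1: plain Unicode text
escaped = to_groff_escapes(decoded)      # candidate 2: \[u04xx] form

# Writing `decoded` with encoding="utf-8" would give the UTF-8 variant.
print(decoded.encode("utf-8"))
print(escaped)
```

Applied line by line to a whole pattern file, this would give one UTF-8
file and one \[u04xx] file to test against groff.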

Best regards,

Oliver.

--
Dr. Oliver Corff
Wittelsbacherstr. 5A
Mail: oliver.corff@email.de
