OK, now the biggest problem. What about CJK support?? Pretty please
with sugar on top! I'd need all three of them! Just left-to-right CJK
support would be quite enough. With CJK support I guess Unicode would
come in handy as well...
I asked these questions in my first mail to this list (actually only
about the "J" in "CJK"); it seems that nothing like this exists
currently. I also need it, but not too urgently (the project I need
it for won't have anything to print before 2006 I guess; until then
everything is XML and the jTeX output module works most of the time).
So I am currently considering implementing this myself.
First I thought about an implementation via Lout's filters, which
would generate a Lout symbol for every CJK character, which would in
turn generate PostScript code, character by character, probably with
some scaling like `1.0f @Wide @Scale' per character. That is *very*
inefficient, though, and might not work well in all situations (e.g.
getting this into the databases that are used for translating words
like "Figure", "Appendix" etc). In particular, I'm not sure whether
the default CJK PostScript fonts (Ryumin-Light and GothicBBB-Medium)
should be typeset non-proportionally, with all characters the same
width. I frequently read Japanese texts that seem to be typeset with
proportional fonts. Proportional typesetting would make the filter
script much more difficult, and getting this to work nicely with the
current Lout font size etc. is an even bigger hurdle.
I also considered doing this the CJK-TeXish way: splitting Japanese
fonts into subfonts, each with some 96 chars (like e.g. on JIS
Referencing a character (with the right widths from the font metrics)
could then be done via some font-switching code. Decoding the
Japanese input encoding would still be difficult. Two methods seem
possible: (1) again use some filter script; (2) define each Japanese
character by its, say, EUC-JP code via `def'. Lout's `def' allows
names with non-alphanumeric characters, which should do the job. The
problem is that Lout may classify some charcodes > 128 as letters
(Latin-1 accented characters etc, expert guide, p13), which would
interfere with the Japanese charcode definitions. I don't know
whether that can be disabled. This is also quite a hack and will
interfere with Lout's font handling code. It will also again lead to
problems with getting Japanese characters into those standard
language-dependent strings. Another problem is Lout's limit on the
total number of fonts (256 I think). Chinese, Japanese and Korean
won't fit into 256 subfonts, at least not when using multiple font
styles (mincho vs. gothic etc.)
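To make the subfont idea and the font-count problem a bit more concrete,
here is a minimal Python sketch. It assumes one subfont per JIS X 0208
row (94 cells), which is my own simplification; the subfont naming
scheme ("min04" etc.) and both function names are hypothetical and are
not something Lout or CJK-TeX defines.

```python
# Sketch only: map a two-byte EUC-JP character to (subfont name, slot).
# Assumption: one subfont per JIS X 0208 row, 94 slots each.
# The names "min"/"goth" and these functions are made up for illustration.

def euc_jp_to_subfont(two_bytes: bytes, family: str = "min"):
    """Return (subfont_name, slot) for one two-byte EUC-JP character."""
    if len(two_bytes) != 2:
        raise ValueError("expected exactly two bytes")
    b1, b2 = two_bytes
    if not (0xA1 <= b1 <= 0xFE and 0xA1 <= b2 <= 0xFE):
        raise ValueError("not a two-byte JIS X 0208 character in EUC-JP")
    row = b1 - 0xA0   # 1..94, selects the subfont
    cell = b2 - 0xA0  # 1..94, position inside the subfont
    return f"{family}{row:02d}", cell


def subfonts_needed(scripts: int, styles: int, rows_per_font: int = 94) -> int:
    """Upper bound on subfonts if every row of every font gets its own subfont."""
    return scripts * styles * rows_per_font


if __name__ == "__main__":
    print(euc_jp_to_subfont("あ".encode("euc_jp")))
    print(subfonts_needed(scripts=3, styles=2))
```

Not every row is actually populated, so the count is only an upper
bound, but even one style for three scripts comes out above 256 under
this assumption.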
Another problem is Japanese line breaking. A simple algorithm seems
to be applicable here (with modern, proportional Japanese typesetting;
the older typesetting style with full stops hanging outside the right
margin etc might be more difficult to achieve). Just define a list of
characters that are not allowed to remain as the last character of a
line, and a list of characters that must not start a line. This seems
to be sufficient, at least for Japanese.
That algorithm can be implemented both with a filter script and with
the `def'-style decoding: just define all characters that mustn't
start a line as operators that bind with the previous character into
one unbreakable compound.
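The two-list line-breaking rule described above can be sketched in a
few lines of Python. This is only an illustration: the two character
sets are tiny examples I picked, not complete lists, and the sketch
assumes pure CJK text where every other character boundary is a
possible break (Latin words would need separate handling).

```python
# Sketch only: find allowed break positions using two forbidden-character lists.
# Assumption: these sets are small illustrative samples, not complete lists.
NO_LINE_START = set("、。）」』")  # must not begin a line
NO_LINE_END = set("（「『")        # must not end a line


def break_points(text: str):
    """Return indices i where a break between text[i-1] and text[i] is allowed."""
    points = []
    for i in range(1, len(text)):
        if text[i] in NO_LINE_START:
            continue  # the next line would start with a forbidden character
        if text[i - 1] in NO_LINE_END:
            continue  # the current line would end with a forbidden character
        points.append(i)
    return points


if __name__ == "__main__":
    print(break_points("「日本語」です。"))
```

A character in NO_LINE_START simply sticks to its predecessor, which is
the same effect as binding it to the previous character as an
unbreakable compound.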
After all those considerations I'm almost at the point where I want to
badly hack the Lout source code: make everything Unicode (32 bits per
char) and allow UTF-8 as the only input encoding. Add Unicode->Whatever
transcoding tables for fonts, and maybe add some method for defining
fontsets consisting of multiple PostScript fonts (so that e.g. one can
typeset Latin, Japanese, Chinese and Korean with the default Roman
Also the hyphenation engine would need to be hacked to support those
primitive Japanese line breaking rules.
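As a rough illustration of the fontset idea, here is a minimal Python
sketch that decodes UTF-8 input and splits it into runs by font. The
table layout, the code point ranges and the function names are my
assumptions for the example; using Times-Roman for Latin is also just
a placeholder, and nothing here reflects how Lout handles fonts today.

```python
# Sketch only: a hypothetical "fontset" mapping Unicode ranges to PostScript fonts.
# Each entry is (first code point, last code point, font name); all assumed.
FONTSET = [
    (0x0000, 0x024F, "Times-Roman"),   # Latin (placeholder choice)
    (0x3000, 0x30FF, "Ryumin-Light"),  # CJK punctuation, hiragana, katakana
    (0x4E00, 0x9FFF, "Ryumin-Light"),  # CJK unified ideographs
]


def font_for(code_point: int):
    """Return the font covering code_point, or None if the fontset has none."""
    for first, last, font in FONTSET:
        if first <= code_point <= last:
            return font
    return None


def split_runs(utf8_bytes: bytes):
    """Decode UTF-8 and group consecutive characters that share a font."""
    runs = []
    for ch in utf8_bytes.decode("utf-8"):
        font = font_for(ord(ch))
        if runs and runs[-1][0] == font:
            runs[-1] = (font, runs[-1][1] + ch)
        else:
            runs.append((font, ch))
    return runs


if __name__ == "__main__":
    print(split_runs("Lout 日本語".encode("utf-8")))
```

Each run would then still need the per-font Unicode->Whatever
transcoding step before any PostScript is emitted.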
Not sure about whether vertical typesetting could be implemented
Well, one simple method would be rotating the font and rotating the
in opposite direction. Heck, that's simple :). Hacking Lout's galley
flushing algorithm is definitely one of the things I do *not* want to
Sorry for that lengthy vapourware description. It might help
motivate me to know that at least one other person requires CJK in
Lout, and to know whether my implementation ideas seem sensible or
nonsense to others. If you have some time, I would definitely need help
on the "C" and "K" sides of CJK (typesetting rules, PostScript font