Re: [Groff] Groff and Unicode code-point input.
From: Werner LEMBERG
Subject: Re: [Groff] Groff and Unicode code-point input.
Date: Sun, 15 May 2011 00:45:47 +0200 (CEST)
> I wonder if anybody knows the status of this:
>
> http://lists.gnu.org/archive/html/groff/2000-04/msg00036.html
>
> In short, using \U'N' to input a Unicode codepoint N.
From groff.info:
* A glyph for Unicode character U+XXXX[X[X]] which is not a
composite character is named `uXXXX[X[X]]'. X must be an
uppercase hexadecimal digit. Examples: `u1234', `u008E',
`u12DB8'. The largest Unicode value is 0x10FFFF. There must be at
least four `X' digits; if necessary, add leading zeroes (after the
`u'). No zero padding is allowed for character codes greater than
0xFFFF. Surrogates (i.e., Unicode values greater than 0xFFFF
represented with character codes from the surrogate area
U+D800-U+DFFF) are not allowed too.
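
As an illustration only, the naming rule quoted above can be sketched in Python. The function name `glyph_name' is made up for this example and is not part of groff; the sketch also assumes the code point is not a composite character, which the excerpt treats separately.

```python
def glyph_name(code_point: int) -> str:
    """Build a groff glyph name such as 'u008E' from a Unicode code point.

    Hypothetical helper for illustration; it does not handle composite
    characters.
    """
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("code point outside the Unicode range")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogate code points are not allowed")
    # '04X' gives uppercase hex digits, zero-padded to at least four digits;
    # values above 0xFFFF already have five or six digits and get no padding.
    return "u" + format(code_point, "04X")


print(glyph_name(0x1234))
print(glyph_name(0x8E))
print(glyph_name(0x12DB8))
```

The three calls correspond to the examples `u1234', `u008E', and `u12DB8' given in the excerpt.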
[...]
Note that this mechanism doesn't work for (printable) ASCII characters,
which you still have to enter as-is. If your input is encoded in
UTF-8, every character longer than a single byte can be converted to
the \[uXXXX] form.
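
To make the manual conversion concrete, here is a rough Python sketch. The name `to_groff' is hypothetical, and the sketch makes simplifying assumptions: the input is an already decoded string, every ASCII character is passed through unchanged, and composite characters get no special treatment.

```python
def to_groff(text: str) -> str:
    """Replace every non-ASCII character with a \\[uXXXX] escape.

    Hypothetical helper for illustration; ASCII is copied as-is and
    composite characters are not handled specially.
    """
    pieces = []
    for ch in text:
        code_point = ord(ch)
        if code_point < 0x80:
            # Single-byte (ASCII) characters stay as they are.
            pieces.append(ch)
        else:
            # At least four uppercase hex digits, no padding beyond that.
            pieces.append("\\[u%04X]" % code_point)
    return "".join(pieces)


print(to_groff("caf\u00e9"))
```

For the input "café" this yields the groff source text caf\[u00E9].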
On the other hand, there is no longer a need to do this manually:
groff comes with `preconv', a preprocessor which can convert virtually
any encoding (using the `iconv' function) to \[uXXXX].
Werner