Re: [Groff] Groff and Unicode code-point input.


From: Werner LEMBERG
Subject: Re: [Groff] Groff and Unicode code-point input.
Date: Sun, 15 May 2011 00:45:47 +0200 (CEST)

> I wonder if anybody knows the status of this:
> 
>  http://lists.gnu.org/archive/html/groff/2000-04/msg00036.html
> 
> In short, using \U'N' to input a Unicode codepoint N.

From groff.info:

   * A glyph for Unicode character U+XXXX[X[X]] which is not a
     composite character is named `uXXXX[X[X]]'.  X must be an
     uppercase hexadecimal digit.  Examples: `u1234', `u008E',
     `u12DB8'.  The largest Unicode value is 0x10FFFF.  There must be at
     least four `X' digits; if necessary, add leading zeroes (after the
     `u').  No zero padding is allowed for character codes greater than
     0xFFFF.  Surrogates (i.e., Unicode values greater than 0xFFFF
     represented with character codes from the surrogate area
     U+D800-U+DFFF) are not allowed too.

   [...]
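
As a rough illustration of the naming rules quoted above, here is a minimal Python sketch of a checker for non-composite `uXXXX' names.  The function name is hypothetical and not part of groff.  It also assumes that the "no zero padding" rule means any name with more than four digits must not start with `0'; that is one reading of the text, not a statement about groff's actual parser.

```python
import re

# Hypothetical helper, not part of groff: checks a glyph name against
# the rules quoted from groff.info (non-composite names only).
_GLYPH_NAME_RE = re.compile(r"u([0-9A-F]{4,6})")


def is_valid_unicode_glyph_name(name: str) -> bool:
    """Return True if `name` looks like a valid non-composite uXXXX name."""
    match = _GLYPH_NAME_RE.fullmatch(name)
    if match is None:
        # Wrong prefix, lowercase hex digits, or fewer than 4 / more than 6 digits.
        return False
    digits = match.group(1)
    # Assumption: names longer than four digits must not be zero-padded.
    if len(digits) > 4 and digits[0] == "0":
        return False
    value = int(digits, 16)
    # The largest Unicode value is 0x10FFFF.
    if value > 0x10FFFF:
        return False
    # Code points from the surrogate area U+D800-U+DFFF are not allowed.
    if 0xD800 <= value <= 0xDFFF:
        return False
    return True
```

With this sketch, the three examples from the manual (`u1234', `u008E', `u12DB8') are accepted, while names such as `u12db8' (lowercase), `uD800' (surrogate), and `u110000' (out of range) are rejected.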

Note that this mechanism doesn't work for (printable) ASCII characters,
which you still have to enter as-is.  If your input is UTF-8, every
character encoded in more than one byte (i.e., every code point above
U+007F) can be converted to the \[uXXXX] form.

On the other hand, there is no longer a need to do this manually:
groff comes with `preconv', a preprocessor which can convert virtually
any encoding (using the `iconv' function) to \[uXXXX].


    Werner


