groff
Re: [Groff] Bullets in manual pages and -K groff option


From: Alexander E. Patrakov
Subject: Re: [Groff] Bullets in manual pages and -K groff option
Date: Thu, 26 Jan 2006 14:19:56 +0500
User-agent: Debian Thunderbird 1.0.2 (X11/20051002)

Bruno Haible wrote:

> Alexander E. Patrakov wrote:
>> The answer "patch
>> glibc so that iconv transliterates the bullet to 'o'" is better (and in
>> fact this is doable), but I think that users of non-Glibc systems (or
>> old Glibc) will complain if this becomes the official answer.
>
> Why should they complain? They can use GNU libiconv. It transliterates the
> bullet to 'o', like you wish.

The "iconv" program from libiconv transliterates the bullet to ".", which is also acceptable. Also, it transliterates quotes nicely. Thanks!

As for the "iconv" program from glibc, the situation is worse. I have prepared a patch against Glibc-2.3.6 (attached) that transliterates the offending characters produced by Groff into their ASCII equivalents if there is no other suitable fallback. You can try it without rebuilding glibc by applying it to the installed copy of the "translit_neutral" file (in /usr/share/i18n/locales) and rebuilding all locales with localedef. The patch works in all locales except "C" (see below), but libiconv provides nicer quotes. Is this patch the right solution?
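
To make the "no other suitable fallback" idea concrete, here is a minimal Python sketch of how a semicolon-separated fallback chain like the ones in the attached patch could be resolved. It is only an illustration, not glibc's code: the helper names parse_entry and transliterate are made up, the "?" placeholder is an arbitrary choice, and the sketch assumes that candidates are tried left to right and that rules are not re-applied to a candidate.

```python
import re

def _decode(symbols):
    """Turn '<U00B4><U00B4>' (optionally quoted) into the actual characters."""
    return "".join(chr(int(h, 16))
                   for h in re.findall(r"<U([0-9A-Fa-f]{4,8})>", symbols))

def parse_entry(line):
    """Parse one table line such as '<U2219> <U2022>;<U00B7>;<U002E>'.

    Returns (source_character, [candidate, ...]) in the order written.
    """
    source, _, rest = line.strip().partition(" ")
    candidates = [_decode(part) for part in rest.strip().split(";")]
    return _decode(source), candidates

def transliterate(ch, table, target="ascii"):
    """Return ch itself if the target charset has it, else the first
    candidate that the target charset can represent, else '?'."""
    for candidate in [ch] + table.get(ch, []):
        try:
            candidate.encode(target)
            return candidate
        except UnicodeEncodeError:
            continue
    return "?"  # hypothetical placeholder for "no usable fallback"

# BULLET OPERATOR before and after the patch
old = dict([parse_entry("<U2219> <U2022>;<U00B7>")])
new = dict([parse_entry("<U2219> <U2022>;<U00B7>;<U002E>")])
print(transliterate("\u2219", old))             # no ASCII candidate in the chain
print(transliterate("\u2219", new))             # falls through to '.'
print(transliterate("\u2219", new, "latin-1"))  # stops at the middle dot
```

Under these assumptions, appending an ASCII candidate to the end of a chain changes nothing for charsets that already have one of the earlier candidates; it only matters when every earlier candidate is unrepresentable.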

As for the "C" locale, the problem is that "iconv" from Glibc uses transliteration data from the current locale (e.g., in order to substitute ä with ae in German locales), and such a locale-specific transliteration table is missing for the "C" locale (which IMHO is a Glibc bug). In contrast, libiconv bases its decisions only upon the source and destination character sets.

So, if you agree with all of the above, please help me formulate a well-stated bug report against Glibc. A (very rough) draft is below.

Bug 1.
Subject: Allow transliteration in the "C" locale.
Component: libc
Description:
The iconv function from libiconv performs some useful transliterations (e.g., replacing fancy quotes with their ASCII equivalents) in any locale. iconv from Glibc doesn't do that and relies solely upon the transliteration data from the current locale. Thus, there are no transliterations in the "C" locale, although they would be useful.

The iconv function from glibc should probably, instead, rely upon the union of locale-agnostic transliteration rules (like those from libiconv) and locale-specific overrides.
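
A rough Python sketch of what "union of locale-agnostic rules and locale-specific overrides" could mean in practice. Everything here is hypothetical illustration: the NEUTRAL and LOCALE_RULES tables, their contents, and the function names are invented for the example and do not reflect glibc's data or internals.

```python
from collections import ChainMap

# Hypothetical locale-agnostic rules (character -> ordered candidates).
NEUTRAL = {
    "\u201c": ['"'],   # LEFT DOUBLE QUOTATION MARK
    "\u201d": ['"'],   # RIGHT DOUBLE QUOTATION MARK
    "\u00e4": ["a"],   # a with diaeresis, generic fallback
}

# Hypothetical locale-specific overrides; the "C" locale has none.
LOCALE_RULES = {
    "de_DE": {"\u00e4": ["ae"]},
    "C": {},
}

def rules_for(locale):
    """Locale-specific rules take precedence; neutral rules fill the gaps."""
    return ChainMap(LOCALE_RULES.get(locale, {}), NEUTRAL)

def translit_to_ascii(text, locale):
    """Replace each non-ASCII character by its first candidate, or '?'."""
    rules = rules_for(locale)
    return "".join(ch if ch.isascii() else rules.get(ch, ["?"])[0]
                   for ch in text)

print(translit_to_ascii("\u201cK\u00e4se\u201d", "de_DE"))  # uses the German override
print(translit_to_ascii("\u201cK\u00e4se\u201d", "C"))      # falls back to neutral rules
```

With such a layering, a locale without any table of its own (like "C") would still get the neutral transliterations, while locales such as German keep their own preferences.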

Bug 2.
Subject: Transliterate quotes and bullets in all locales.
Component: localedata
Description:
The iconv function from libiconv performs some useful transliterations (e.g., replacing the quotes with their ASCII equivalents and the middle dot with an ASCII dot) in all locales. The iconv implementation from Glibc doesn't always do this. This deficiency is going to hurt future Groff users, as described in [link to this thread]. Attached is a patch that implements the needed transliteration rules. See also [Bug 1] for the related issue with the "C" locale.

--
Alexander E. Patrakov
Submitted By: Alexander E. Patrakov
Date: 2006-01-26
Initial Package Version: 2.3.6
Upstream Status: Discussing
Origin: Alexander E. Patrakov
Description: Transliterates some characters (e.g., ones created by groff -Tutf8)
into their ASCII approximations.

--- glibc-2.3.6/localedata/locales/translit_neutral     2006-01-26 13:52:16.000000000 +0500
+++ glibc-2.3.6/localedata/locales/translit_neutral     2006-01-26 11:15:17.000000000 +0500
@@ -26,6 +26,10 @@
 <U00AD> <U002D>
 % REGISTERED SIGN
 <U00AE> "<U0028><U0052><U0029>"
+% ACUTE ACCENT
+<U00B4> <U0027>
+% MIDDLE DOT
+<U00B7> <U002E>
 % CEDILLA
 <U00B8> <U002C>
 % RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
@@ -39,9 +43,9 @@
 % LATIN SMALL LETTER AE
 <U00E6> "<U0061><U0065>"
 % MODIFIER LETTER PRIME
-<U02B9> <U2032>;<U00B4>
+<U02B9> <U2032>;<U00B4>;<U0027>
 % MODIFIER LETTER DOUBLE PRIME
-<U02BA> <U2033>;"<U00B4><U00B4>"
+<U02BA> <U2033>;"<U00B4><U00B4>";"<U0027><U0027>"
 % MODIFIER LETTER TURNED COMMA
 <U02BB> <U2018>
 % MODIFIER LETTER APOSTROPHE
@@ -55,7 +59,7 @@
 % MODIFIER LETTER MACRON
 <U02C9> <U00AF>
 % MODIFIER LETTER ACUTE ACCENT
-<U02CA> <U00B4>
+<U02CA> <U00B4>;<U0027>
 % MODIFIER LETTER GRAVE ACCENT
 <U02CB> <U0060>
 % MODIFIER LETTER LOW MACRON
@@ -101,11 +105,11 @@
 % NARROW NO-BREAK SPACE
 <U202F> <U00A0>;<U0020>
 % PRIME
-<U2032> <U00B4>
+<U2032> <U00B4>;<U0027>
 % DOUBLE PRIME
-<U2033> "<U2032><U2032>";"<U00B4><U00B4>"
+<U2033> "<U2032><U2032>";"<U00B4><U00B4>";"<U0027><U0027>"
 % TRIPLE PRIME
-<U2034> "<U2032><U2032><U2032>";"<U00B4><U00B4><U00B4>"
+<U2034> "<U2032><U2032><U2032>";"<U00B4><U00B4><U00B4>";"<U0027><U0027><U0027>"
 % REVERSED PRIME
 <U2035> <U0060>
 % REVERSED DOUBLE PRIME
@@ -155,7 +159,7 @@
 % ASTERISK OPERATOR
 <U2217> <U002A>
 % BULLET OPERATOR
-<U2219> <U2022>;<U00B7>
+<U2219> <U2022>;<U00B7>;<U002E>
 % DIVIDES
 <U2223> <U007C>
 % RATIO
@@ -171,13 +175,13 @@
 % MUCH GREATER-THAN
 <U226B> "<U003E><U003E>"
 % DOT OPERATOR
-<U22C5> <U00B7>
+<U22C5> <U00B7>;<U002E>
 % VERY MUCH LESS-THAN
 <U22D8> "<U003C><U003C><U003C>"
 % VERY MUCH GREATER-THAN
 <U22D9> "<U003E><U003E><U003E>"
 % MIDLINE HORIZONTAL ELLIPSIS
-<U22EF> "<U00B7><U00B7><U00B7>"
+<U22EF> "<U00B7><U00B7><U00B7>";"<U002E><U002E><U002E>"
 % SYMBOL FOR NULL
 <U2400> "<U004E><U0055><U004C>"
 % SYMBOL FOR START OF HEADING
