bug-groff

Re: A fix for bad man pages display on UTF-8 locales (patch for groff)


From: Aleksander Adamowski
Subject: Re: A fix for bad man pages display on UTF-8 locales (patch for groff)
Date: Fri, 18 Jul 2003 17:03:14 +0200
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.5a) Gecko/20030709

I've found a bug in Groff 1.18's UTF-8 device definitions. If it's already known, please forgive me; I couldn't find any mention of it in the mailing list archive.

I'm using Mandrake Linux 9.1, whose installer offers the option
"Use Unicode by default".

This option causes UTF-8 versions of locales to be used, which
finally moves the distribution in the direction of unifying diverse
character set encoding standards into a single, well-known and well
understood encoding: UTF-8.
However, some UTF-8 locales have a problem with man pages: hyphens
(e.g. in option names) are displayed incorrectly and cannot be
searched for (the "minus" character typed on the keyboard doesn't
match them).

This is because the groff utility that formats the pages (when
called from the nroff shell script) renders the "\-" sequence in the
source input as Unicode character "0x2212" (minus sign), and the "-"
character as Unicode character "0x2010" (hyphen), instead of the
backward-compatible hyphen-minus (which has code "0x002D" for
compatibility with ASCII).

The minus sign "0x2212" isn't handled properly by either the less
viewer or the output terminal. As a result, it is displayed with a
leading garbage character and can't be typed on the keyboard when
searching the manual page (so, e.g., it isn't possible to search for
the "-h" option when reading the manual for ls).
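
To make the mismatch concrete, here is a small Python sketch (not part of
groff or of the patch) that prints the three code points mentioned above
and shows why a typed "-h" cannot match. The `rendered` string is a
made-up stand-in for a line of formatted man page output, and `describe`
is a helper invented for this illustration.

```python
import unicodedata

# The code points named in this message: what the utf8 device emits
# for "\-" and "-", and what the keyboard "minus" key produces.
CODE_POINTS = {
    "emitted for \\-": "\u2212",
    "emitted for -": "\u2010",
    "typed on keyboard": "\u002d",
}


def describe(ch: str) -> str:
    """Return the code point, Unicode name, and UTF-8 bytes of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)} utf-8={ch.encode('utf-8').hex()}"


for label, ch in CODE_POINTS.items():
    print(f"{label:20} {describe(ch)}")

# Hypothetical rendered line: the option was written as \-h in the
# source, so it comes out with U+2212 rather than ASCII 0x2D.
rendered = "       \u2212h     print sizes in human readable format"

print("-h" in rendered)                         # typed search string: no match
print("-h" in rendered.replace("\u2212", "-"))  # matches once mapped to 0x2D
```

The multi-byte UTF-8 encodings of the first two characters are also a
plausible source of the garbage seen on terminals or pagers that do not
decode UTF-8, though that part is my guess.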

Among others, the "en_US.UTF-8" locale is affected by this bug.
OTOH, some other locales (e.g. "pl") aren't affected, because the
nroff wrapper has a quick hack that switches from UTF-8 to legacy
encodings (like ISO-8859-2) for those locales, since their man pages
are still encoded in non-UTF-8 charsets. See the source of the
/usr/bin/nroff script for details.

The problem is solved by modifying groff's font descriptions for the
utf8 device so that the standard, ASCII-compatible "0x002D"
character code is used instead of "0x2212" for the "\-" sequence and
instead of "0x2010" for the plain "-" character.

The font settings for the utf8 device are in the
/usr/share/groff/1.18.1/font/devutf8/ directory, in the files R (for
regular text), B (for bold), I and BI (for italic and bold-italic
respectively).
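
For anyone who would rather script the change than apply a patch, the
following Python sketch shows the same remapping done on the text of a
font description file. It is only an illustration: `remap_hyphens`,
`TARGETS`, `FONT_DIR` and `FONT_FILES` are names made up here, the
directory is the one from my system and may differ elsewhere, and the
script only reports which files would change instead of writing to them.

```python
from pathlib import Path

# Assumed location (taken from this message); adjust for other installs.
FONT_DIR = Path("/usr/share/groff/1.18.1/font/devutf8")
FONT_FILES = ("R", "B", "I", "BI")

# Glyph name -> code currently assigned in the utf8 font descriptions.
TARGETS = {"-": "0x2010", "\\-": "0x2212"}


def remap_hyphens(text: str) -> str:
    """Point the "-" and "\\-" glyph entries at ASCII 0x002D."""
    out = []
    for line in text.splitlines(keepends=True):
        fields = line.split()
        # Entries look like: name, metrics, type, code (four fields).
        if len(fields) == 4 and TARGETS.get(fields[0]) == fields[3]:
            # Swap only the trailing code field; keep the original spacing.
            head, _, tail = line.rpartition(fields[3])
            line = head + "0x002D" + tail
        out.append(line)
    return "".join(out)


if __name__ == "__main__":
    # Dry run: report which files would be modified, write nothing.
    for name in FONT_FILES:
        path = FONT_DIR / name
        if path.exists():
            original = path.read_text()
            changed = remap_hyphens(original) != original
            print(f"{path}: {'would change' if changed else 'unchanged'}")
```

Alias lines such as `hy "` and `mi "` have only two fields, so they are
left alone and keep pointing at the preceding (now remapped) entry.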

I'm attaching a patch that fixes groff's font settings.

diff -urN /usr/share/groff/1.18.1/font/devutf8.orig/B /usr/share/groff/1.18.1/font/devutf8/B
--- /usr/share/groff/1.18.1/font/devutf8.orig/B 2003-04-30 14:06:52.000000000 +0200
+++ /usr/share/groff/1.18.1/font/devutf8/B      2003-04-30 17:43:49.000000000 +0200
@@ -285,7 +285,7 @@
 +h     24      0       0x03D1
 +f     24      0       0x03D5
 +p     24      0       0x03D6
--      24      0       0x2010
+-      24      0       0x002D
 hy     "
 en     24      0       0x2013
 em     24      0       0x2014
@@ -334,7 +334,7 @@
 st     24      0       0x220B
 product        24      0       0x220F
 sum    24      0       0x2211
-\-     24      0       0x2212
+\-     24      0       0x002D
 mi     "
 **     24      0       0x2217
 sr     24      0       0x221A
diff -urN /usr/share/groff/1.18.1/font/devutf8.orig/BI /usr/share/groff/1.18.1/font/devutf8/BI
--- /usr/share/groff/1.18.1/font/devutf8.orig/BI        2003-04-30 14:06:52.000000000 +0200
+++ /usr/share/groff/1.18.1/font/devutf8/BI     2003-04-30 17:44:28.000000000 +0200
@@ -285,7 +285,7 @@
 +h     24      0       0x03D1
 +f     24      0       0x03D5
 +p     24      0       0x03D6
--      24      0       0x2010
+-      24      0       0x002D
 hy     "
 en     24      0       0x2013
 em     24      0       0x2014
@@ -334,7 +334,7 @@
 st     24      0       0x220B
 product        24      0       0x220F
 sum    24      0       0x2211
-\-     24      0       0x2212
+\-     24      0       0x002D
 mi     "
 **     24      0       0x2217
 sr     24      0       0x221A
diff -urN /usr/share/groff/1.18.1/font/devutf8.orig/I /usr/share/groff/1.18.1/font/devutf8/I
--- /usr/share/groff/1.18.1/font/devutf8.orig/I 2003-04-30 14:06:52.000000000 +0200
+++ /usr/share/groff/1.18.1/font/devutf8/I      2003-04-30 17:44:48.000000000 +0200
@@ -285,7 +285,7 @@
 +h     24      0       0x03D1
 +f     24      0       0x03D5
 +p     24      0       0x03D6
--      24      0       0x2010
+-      24      0       0x002D
 hy     "
 en     24      0       0x2013
 em     24      0       0x2014
@@ -334,7 +334,7 @@
 st     24      0       0x220B
 product        24      0       0x220F
 sum    24      0       0x2211
-\-     24      0       0x2212
+\-     24      0       0x002D
 mi     "
 **     24      0       0x2217
 sr     24      0       0x221A
diff -urN /usr/share/groff/1.18.1/font/devutf8.orig/R /usr/share/groff/1.18.1/font/devutf8/R
--- /usr/share/groff/1.18.1/font/devutf8.orig/R 2003-04-30 14:06:52.000000000 +0200
+++ /usr/share/groff/1.18.1/font/devutf8/R      2003-04-30 17:45:02.000000000 +0200
@@ -284,7 +284,7 @@
 +h     24      0       0x03D1
 +f     24      0       0x03D5
 +p     24      0       0x03D6
--      24      0       0x2010
+-      24      0       0x002D
 hy     "
 en     24      0       0x2013
 em     24      0       0x2014
@@ -333,10 +333,11 @@
 st     24      0       0x220B
 product        24      0       0x220F
 sum    24      0       0x2211
-\-     24      0       0x2212
+\-     24      0       0x002D
 mi     "
 **     24      0       0x2217
 sr     24      0       0x221A


