bug-bash

Re: Tilde (~) in bash(1) is typeset incorrectly as Unicode character


From: G. Branden Robinson
Subject: Re: Tilde (~) in bash(1) is typeset incorrectly as Unicode character
Date: Thu, 27 Jul 2023 12:30:16 -0500

Hi Chet,

At 2023-07-27T11:54:19-0400, Chet Ramey wrote:
> On 7/26/23 11:35 AM, G. Branden Robinson wrote:
> > Many projects don't need to worry about such extreme portability in
> > their man pages, but GNU Bash arguably does.  (I'm open to
> > correction.)
> 
> It's an ongoing struggle. There are projects (e.g., a 4.3BSD
> preservation effort) that request such accommodations, but it's
> becoming more difficult to support them.

I hear you.  Developing groff involves a bit of legacy awareness itself!

> > Furthermore, in the *roff language itself, as originally implemented
> > by Joe Ossanna (and re-implemented by Brian Kernighan) there is no
> > good way to test for the existence of a special character.[...]
> > 
> > As a first stab at it, I'd divide the world into two camps: (a)
> > groff and mandoc(1), and (b) everything else, and not worry about
> > (b).
> 
> I'd consider that, but, as you point out, there are some legacy Unix
> systems that still ship old versions of troff and associated tools.

I was unclear.  By "not worry about (b)", I meant "just define the
string contents as the same old ASCII character".  In retrospect I'm not
sure how anyone was supposed to figure that out from what I typed.

But that is in fact what I did in my suggested patch.  So it should be
highly portable, and I'm happy to support it.

> > The bash(1) man page has an extensive preamble already that still
> > includes a workaround for 4.3BSD(!), so adding a little bit to it to
> > accommodate systems developed since 1990 might not be too
> > disruptive.
> > 
> > I'm attaching a straw man diff to the bash(1) page.  If Chet likes
> > it, I'm happy to prepare one against the bash devel branch.
> 
> Thanks. I'll probably apply some variant of this to the set of man
> pages that need it.

Cool.

> > bash(1) also attempts to select a font named "CW" in places, which
> > is another portability problem (it's a Unix System III [and later]
> > troff font name that was available on _some_ output devices).  But
> > I'd like to see how we get over this bridge before I try to cross
> > that one.  :)
> 
> (For others reading, it's the constant width font, usually Courier.)

The history of the "CW" font name in *roff is becoming clearer to me
and I know of no other narrative about it, so I'm recording this for
posterity--and to invoke Cunningham's Law.[0]

To be precise, it is, or resembles, Courier roman (that is: upright, not
slanted; and of medium stroke weight).  On some output devices,
Documenter's Workbench 3.3 troff (ca. 1990) supported the Courier family
using the names C, CI, CB, and CX (roman, italic, bold, bold-italic).[1]
It also made the roman style available as "CW"--I assume for backward
compatibility with Unix System III (1980), where the brand-new
device-independent troff supported a phototypesetter that featured a
Courier-ish font and offered tools supporting it.[2]  This history is
pretty murky, though; this was commercial Unix troff and licenses for it
were expensive.  That is, I conjecture, the reason that BSD Unix had no
device-independent troff until it adopted groff in the Net/2 release
(1991).[3]  Therefore, device-independent troff font names tended not to
be portable to BSD Unix--not that they were often portable across output
devices anyway, a problem largely tamed nowadays by (1) the dominance of
the PostScript and PDF specifications in this sector, which establish a
base set of workaday typefaces; and (2) groff's insistence on
portability for a base set of font names wherever possible.[4]

Per the copyright page of the first edition of _The C Programming
Language_ (1978), Kernighan & Ritchie must have gotten the monospaced
font into the book because they acquired photographic plates for the
Graphic Systems (by then, Wang) C/A/T in a Courier face (apparently, in
roman only, because no other style was used).  This preceded
device-independent troff.[5]  Perhaps by then, a tradition of calling
this face "CW" was entrenched--but I've yet to see any evidence of it in
Seventh Edition Unix (1979).  Possibly its only application was in the
troff sources for the book itself (and the related "C Reference
Manual"), which Prentice-Hall still treats like a trade secret.

> I haven't received any bug reports about that, and groff and mandoc
> both support it, so I'm inclined to leave it alone.

You might start to receive them; stock groff 1.23.0 now produces
font-related diagnostics in many situations where groff 1.22.4 and
earlier did not.[6]  Colin Watson has patched Debian's groff-base
package to suppress this one when man pages are being formatted for
terminals, as suggested by comments in the stock man.local file.[7]  As
far as I can tell, to date other distributors have not.

Regards,
Branden

[0] https://meta.wikimedia.org/wiki/Cunningham%27s_Law
[1] https://github.com/n-t-roff/DWB3.3/tree/master/postscript/devpost

    It also makes a "CO" font available, but as an alias of "C" (roman
    style), not "CI" (Courier-Oblique) as the latter file plainly says.
    I venture no defense of DWB's chaos.  James Clark ensured that
    groff's font naming scheme was much more orthogonal.

[2] https://minnie.tuhs.org/cgi-bin/utree.pl?file=SysIII/usr/src/man/man1/cw.1
[3] https://minnie.tuhs.org/cgi-bin/utree.pl?file=Net2/usr/src/usr.bin/groff/CHANGES
[4] https://www.gnu.org/software/groff/manual/groff.html.node/Using-Fonts.html#Using-Fonts

    Knuth's Computer Modern faces impose an irritating but surmountable
    lacuna here; see grodvi(1).

[5] https://web.archive.org/web/19980422063312/http://cm.bell-labs.com/cm/cs/cstr/97.ps.gz
[6] https://savannah.gnu.org/bugs/?62941#comment5
[7] https://salsa.debian.org/debian/groff/-/commit/3e83c059bff4bf20cfb978bf82bf0c28cf7bce93
