Re: [Groff] Getting properly rendered single quotes in groff


From: Clarke Echols
Subject: Re: [Groff] Getting properly rendered single quotes in groff
Date: Fri, 06 Jun 2008 14:35:51 -0500
User-agent: Thunderbird 2.0.0.14 (Windows/20080421)

Well, as the world-renowned "expert" around HP back in 1989-1992
when I had all of Sections 1, 1M, 2, 3, 4, 5, 7, and 9 (glossary),
it's a good question.

As a general practice, I tried to avoid the need for single quotes
in that context, because the single quote is so frequently used for
directing the parser or compiler in keyboard commands, shell scripts,
C programs, and the like.

I opted for font changes to isolate the character, thereby eliminating
the single quote entirely.

The problem is that these issues are relics from the 1970s when Unix
was at AT&T Bell Labs, and there were no industry standards of usage.

When I took over the HP-UX Reference in 1989, I immediately started
making, and enforcing, changes in all HP man pages, despite the
highly vocal objections and outcries from engineers who insisted that
I must follow lock-step with the original AT&T documents.  When the
issue reached management, the attempted overthrow failed, and I won.
The "fight" lasted only a very few minutes.

When a manager insisted there was no time in their engineers' schedules
to make the changes I was enforcing, I did it for them, as they watched,
using shell scripts running vi non-interactively and running sed from
the vi command scripts.  (That's a real "trip" -- watching them as
300 files are completely reformatted and recoded from raw troff to
properly constructed use of man macros in a few minutes live, on
screen, as they sit there.  Devastating. :-) )  [Sorry folks, sometimes
I get a bit "assertive".  Especially when customer usability is the
primary issue.]

I drove my changes on the requirement of consistency.  The original
AT&T man page for cp(1) had at the top of the page in the header,
"CP(1)".  In the SYNOPSIS, it was "cp" in bold.  If it was in the
DESCRIPTION, it was in italics, and if it was the first word in a
sentence, it was "Cp".  I was completely vindicated when we got a
customer comment card from a customer in Japan who complained that
when he typed "Cp", he got the error "Command not found", and the
same when he tried "CP".  One must always be willing to fight for
the interests of the customer (gee, what a concept!).

My rules were simple:

Any characters that must be typed on the keyboard or in code *exactly*
as specified (literals, command/utility names, function names, etc.)
were *always* typeset in a Courier font.  No exceptions.

All variable names, arguments to commands, functions, and such or
symbolic text (such as "filename", "arg", etc.) were *always* in
italic.

When any argument had to be enclosed within single or double quotes,
the quotes were in Courier, even if what's between them was in italics.
And if several arguments were enclosed in double quotes (as in the
previous paragraph), the comma is ALWAYS outside the quotes.
Some English teachers or professors will protest, but it is the ONLY
way to prevent confusion.

Bold was never used for anything except strong emphasis.

If a command or function name was the first word in a sentence,
it was typeset in Courier *****EXACTLY***** as it must be typed from
the keyboard.  The rule that the first letter in a sentence is always
a capital letter is bogus.  Good style manuals state that trademarks
and legal terms MUST be typeset exactly as legally defined, and that
means lowercase at the start of a sentence where required.
I took that one "to the mat", so to speak (note use of comma after
double quote), and won again.

The only time punctuation is placed inside the closing quote mark
(unless it must be typed that way to keep the system, interpreter,
or compiler happy) is in quoting conversation or what someone said,
as in:

   He told me, "That is a silly idea.  I never said such a thing."

Whereas in other instances, this is much more appropriate:

   Name callers may identify him as a "thug", "troublemaker",
   "nice guy", or "a general nobody", but I hold a higher opinion.

Compare that with:

   Name callers may identify him as a "thug," "troublemaker,"
   "nice guy," or "a general nobody," but I hold a higher opinion.

The rule I would use to resolve this is that if the word preceding
the first opening quote is followed by a comma, then the last word
before the closing quote would, in most cases, also be followed by
a comma before the quote.

My general rule of thumb is: When writing so people can understand
you clearly and correctly, punctuate as if you were coding a program
so it can be correctly parsed by the compiler.  Proper use of commas,
quotes, and other punctuation must lead, if not force, the reader to
come to the correct interpretation.

And if an English teacher says don't use commas unless they are
really necessary, I counter with: "If removing a comma introduces
the possibility that *anyone* might misinterpret the intent of the
writer, leave it in."  Proof?  Remove any commas from the last
sentence in the preceding paragraph and see what it does to the
clarity and preciseness.

My favorite definition is:  Communication is not saying something so
you can be understood.  Communication is saying what you want to say
in a manner that cannot possibly be _mis_understood.

As for the single-quote issue, if you cannot solve it with fonts
(which can be a problem when nroff is the formatter because it doesn't
put courier font specs in bold/inverse video), you can always use the
.if n and .if t conditionals, with \c at the end of the line where
needed to prevent insertion of unwanted white space.

It's not as clean and elegant, but it does solve the typography
question.
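
A minimal sketch of the conditional approach, assuming a constant-width
font is available under the name CW on the typesetter device (font names
vary by device and installation) and borrowing the \(aq fallback for
terminals from the quoted message below.  The sentence itself is only an
illustration:

```roff
.\" Typesetter (troff) output: set the literal character in a
.\" constant-width font, with no quotes at all.  The font name CW is
.\" an assumption; substitute whatever your device provides.
.\" Terminal (nroff) output: fall back to neutral apostrophes.
.\" The \c keeps the comma on the following line tight against the
.\" character instead of letting a word space creep in.
The separator is
.if t \f(CW/\fP\c
.if n \(aq/\(aq\c
, and trailing separators are ignored.
```

Without the \c, the formatter would put a space before the comma.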

Otherwise, take the easy road out.  Just use `x' and let ASCII do
its thing.  It's not a big issue because most readers know what's
going on and how to interpret it correctly.

Clarke


Michael Kerrisk wrote:
Hello all

I'm the maintainer of the Linux man-pages package (~800 pages in
sections 2, 3, 4, 5, 7 of the Linux man pages).

Stuart Brady pointed out to me some inconsistencies in the use of
single quotes in the man-pages package, and also some problems in the
groff rendering of single quotes in UTF-8. I'd like to clean things
up, but I'm not 100% sure of the right approach.

What I'm most concerned about is rendering of character constants ('c')
when written as part of normal text (i.e., not program code examples).
For example, in a sentence like:

    Trailing '/' characters are not counted as part of
    the pathname.

Here's how I currently understand things:

1) Using the form
'x'
is rendered by groff in an ugly way (as two closing quotes) in UTF-8,
but looks okay in ASCII.

2) Using the form
`x'
is rendered nicely by groff for UTF-8 (proper opening and closing
single quotes), but looks ugly in ASCII, since ` and ' are not
visually balanced.  (I see that this usage is common, and wonder
whether it rendered differently for ASCII in historical roff systems?)

3) Using the form
\'x\'
renders acceptably (in my opinion) in UTF-8, since \' is rendered as
a "vertical apostrophe quote"; and it renders in ASCII in the same way as
'x'
(which is acceptable).

So, I was starting to think that option 3 might be the way to go, but
Andries pointed out:

[[
The definition of \' is unambiguous: it is defined to be the
acute accent, to be equivalent to \(aa in groff(7).
But you do not want to get an acute accent. You want the ASCII single quote.

The definition of ' in groff(7) as I have it here is wrong:
it says "apostrophe, right quotation mark, single quote",
but "right quotation mark" is entirely wrong, as Markus Kuhn documented.
So, ' is *not* right quotation mark, and if groff today shows it as one,
then this bug will be fixed next week or next year.
]]

Stuart pointed out that \(aq probably gets around Andries's objection,
so now I'm wondering if the solution is

4) Using the form
\(aqx\(aq
renders acceptably in UTF-8, since  \(aq is rendered as a "vertical
apostrophe quote"; and it renders in ASCII in the same way as
'x'
(which is acceptable).

I'd appreciate some input on what the best solution is (which of
course might be something other than one of the solutions above).
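
One way to compare the four forms side by side is a small test page
along these lines.  This is only a sketch: the file name quotes.man and
the page title QUOTETEST are made up for illustration, and what you
actually see will depend on the groff version and output device.

```roff
.\" quotes.man: hypothetical test page for the four quoting forms.
.\" Format it once per device and compare the results, e.g.
.\"     groff -man -Tutf8 quotes.man
.\"     groff -man -Tascii quotes.man
.TH QUOTETEST 7
.SH DESCRIPTION
.nf
1: 'x'
2: `x'
3: \'x\'
4: \(aqx\(aq
.fi
```

Each line starts with a digit, so none of the quotes can be mistaken for
a control character at the start of an input line.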

Cheers,

Michael






