Re: [Groff] Mission statement, second draft


From: Ingo Schwarze
Subject: Re: [Groff] Mission statement, second draft
Date: Thu, 20 Mar 2014 05:06:04 +0100
User-agent: Mutt/1.5.21 (2010-09-15)

Hi Deri,

sorry, this got a bit long, but i didn't manage to explain why parts
of your argument seem slightly theoretical without showing a few
practical examples found in the wild.

Deri James wrote on Wed, Mar 19, 2014 at 11:10:56PM +0000:
> On Wed 19 Mar 2014 15:22:42 Eric S. Raymond wrote:
>> Ted Harding wrote:

>>> SO: Supposing that this proposed enterprise goes ahead, WILL WE
>>> STILL BE ABLE TO USE GROFF AS WE ALWAYS HAVE DONE?

>> Yes.

> Except if you are a man page author who wants to use all the troff syntax,
> in which case you will find that "some things" will no longer work,

I admit this is a valid use case:  Write a manual that will not
only be useable on an ASCII terminal with man(1), but *also* result
in above average typesetting when processed with troff, that is,
better typesetting than can be achieved with the standard man(7)
or mdoc(7) macros alone.  However, that may be harder than you
think.  Most people skilled in troff typesetting are *not* used to
making sure that their documents look good in -Tascii, too, but
that *is* important for a manual page.  If you tune your manual page
to look better with troff, but break the way it looks with nroff in
the process, that's a bad tradeoff, because chances are more people
will read it at the terminal.

So, your use case needs:

 (1) A manual page author willing to spend extra time to consider
     non-terminal use cases.
 (2) Skilled in troff typesetting.
 (3) Also skilled in the traps and quirks of nroff portability.

People matching (1) are rare enough.  Most manuals are written by
programmers, and these often want to write code and like writing
documentation less than that.  The combination (1)+(2) seems even
rarer.  Few people skilled in typography resort to writing manuals.
While the skill (3) is not exactly widespread among manual authors,
the combination (2)+(3) can probably only be found in a handful of
specialists.

Or let's look at it the other way round.  When you look at real-world
manuals that *do* resort to low-level roff formatting, what do they
typically look like?  My impression is these are the two most common
cases:

 (a) An inexperienced manual author knowing too little about the
     manual macro set in use resorts to some random low-level
     stuff picked up somewhere by mere chance.  The result is
     often pages looking bad *both* at the terminal and with troff.

 (b) Manuals autogenerated by some tools.  Tool authors (as
     opposed to manual authors) are used to tinkering.  On some
     platform, they run into some (mis-)rendering of some nice
     high-level macro, they go "oh, this doesn't appear to work
     everywhere, let's re-implement this ourselves".  The result
     of course being that what they did will break somewhere else.
     The actual manual authors are unlikely to even be aware of
     what happens with their text behind the scenes.

For case (a), here is a harmless example from slapd.conf(5):

  It has the empty DN, and can be read with e.g.:
  .ti +4
  ldapsearch \-x \-b "" \-s base "+"
  .br
  See RFC 4512 section 5.1 for details.

The same could be done with pure man(7) markup, for example .RS/.RE
or .EX/.EE, so resorting to low-level roff is merely useless, not
particularly harmful beyond being semantically unintelligible.
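
For illustration, a minimal sketch of the same passage in pure man(7),
assuming the only intent was to set the command on a line of its own,
indented by four character cells:

```roff
It has the empty DN, and can be read with e.g.:
.\" .RS breaks the line and indents; .RE breaks again and restores
.\" the margin, so neither .ti nor .br is needed.
.RS 4
ldapsearch \-x \-b "" \-s base "+"
.RE
See RFC 4512 section 5.1 for details.
```

Wrapping the command line in .EX/.EE inside the .RS/.RE block would
additionally mark it as a literal example, but only with formatters
that provide those extension macros.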

Actually, case (b) is *much* more common than case (a),
and DocBook is by far the most prolific offender.

Here is a very typical example from the cclive(1) manual:

  .\" Generator: DocBook XSL Stylesheets v1.76.1 <http://docbook.sf.net/>
  .\"      Date: 09/14/2013
  [...]
  .RS 4
  .ie n \{\
  \h'-04'\(bu\h'+03'\c
  .\}
  .el \{\
  .sp -1
  .IP \(bu 2.3
  .\}
  a regular expression pattern
  .RE

Apparently, standard .TP markup didn't seem good enough to these guys,
so they went on a rampage.  It didn't seem to occur to them that
using "\h'-04'\(bu\h'+03'\c" in nroff mode might lead straight
into a portability nightmare.  Right, let's use "\c".  What can
possibly go wrong?
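
For comparison, a sketch of what the standard markup might look like
for the same list item.  The 2n tag width is an assumption, not taken
from the cclive(1) page, and the vertical spacing is left to the macro
package:

```roff
.\" Sketch only: one bullet item in plain man(7), without any
.\" nroff/troff case distinction and without \h or \c.
.RS 4
.TP 2n
\(bu
a regular expression pattern
.RE
```

Writing ".IP \(bu 2n" instead of the .TP line and the tag line after
it should produce the same output.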

Here is another one, found in samba(7):

   \" Generator: DocBook XSL Stylesheets v1.74.0 <http://docbook.sf.net/>
   \"      Date: 06/18/2010
  [...]
   \" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   \" SH - level-one heading that works better for non-TTY output
   \" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  .de1 SH
   \" put an extra blank line of space above the head in non-TTY output
  .if t \{\
  .sp 1
  .\}
  .sp \\n[PD]u
  .nr an-level 1
  .set-an-margin
  .nr an-prevailing-indent \\n[IN]
  .fi
  .in \\n[an-margin]u
  [...]

Not only do the comments put in by the autogenerator explicitly state
that the purpose of the low-level stuff is *gratuitous styling
tweaks*, and not only are there syntax errors in the code (for example,
incorrect comment syntax), but the code deliberately sends any kind
of portability to hell by assuming that, if some formatter declares
groff compatibility by implementing .de1, it also uses one specific
version of the groff man(7) macros, so we can happily go ahead and
use non-public macros (.set-an-margin) and non-public registers.
Obviously, this is likely to break with the next release of groff.

Still regarding (b), here is a stinking gem from the Tcl manuals.
Instead of using .TP or .Bl -tag or some other high-level macro,
their tools choose to define their own tabbing:

  .\"     # define tabbing values for .AP
  .de AS
  .nr )A 10n
  .if !"\\$1"" .nr )A \\w'\\$1'u+3n
  .nr )B \\n()Au+15n
  .\"
  .if !"\\$2"" .nr )B \\w'\\$2'u+\\n()Au+3n
  .nr )C \\n()Bu+\\w'(in/out)'u+2n
  ..
  .\"     # Start an argument description
  .de AP
  [...]
  .ta \\n()Au \\n()Bu
  [...]

In its full glory, the definition of .AP is about three times
the length of .AS.  Needless to say, such stuff breaks
horribly when run through nroff.  Even with the latest version
of groff, that is.
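
As a rough sketch of the high-level alternative, an argument
description could be written with .TP alone.  The argument name and
the description text below are hypothetical and not copied from any
Tcl manual:

```roff
.\" Hypothetical argument entry using only standard man(7) macros;
.\" the macro package chooses the tag width, no .ta or registers.
.TP
.BI "Tcl_Interp *" interp " (in)"
Interpreter to use for error reporting.
```

This gives up the three-column alignment that .AS and .AP try to
build, but leaves indentation and line breaking to the formatter.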

> or a consumer of man pages who values presentation rather than the
> ability to look at man pages on small phone screens (one of the cited
> advantages of using html).

So, my point is that while the theoretical possibility of hindering
high-quality typesetting of manuals does exist, that problem hardly
occurs in practice, while crappy code breaking even plain terminal
output is everyday business.  The poor consumer, as a rule, will
have no idea what is going on and rather throw their hands up in
dismay than ask for the finer points of typography.

That said, maybe i would still like to preserve the option to write
really good stuff, involving careful use of .ie t ... .el, which is
harder than it sounds.  Even though it hardly ever happens in practice.
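
A tiny sketch of the kind of conditional meant here, as a hypothetical
fragment not taken from any real manual.  Both branches must result
in equivalent content, with only the presentation differing:

```roff
.\" Half a line of vertical space on a typesetter, but a full
.\" line at the terminal, where fractional lines do not exist.
.ie t .sp 0.5v
.el .sp 1v
```

Even a case as small as this has to be checked with both nroff and
troff, and that is the step that tends to get skipped.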

[...]
> This seems to be the difference between Ingo and Eric's approach.

It's not the only difference.  Maybe the other one is even more
important:  Eric thinks that man(7) stands a chance to become
the semantic manual markup language of the future (even though
traditional man(7) contained almost no semantic macros whatsoever
and even though man-ext adoption, so far, is negligible), while
i think that it would be best for the future of man(7) to
deprecate man-ext and maintain it as a purely backward-compat
tool to render legacy manuals.

> Ingo is correct in saying we should be trying to win hearts and minds
> of man page authors to use macros which include semantic information,
> but Eric says we must stop any man pages which include presentation
> markup which Doclifter specifically can't handle, from being
> displayable by groff.  The choice then is either those naughty man
> pages get re-written, or they die since neither groff, Doclifter nor
> mdoc can display them.  Either way, Doclifter can then claim to be
> 100% compatible with all man pages which it is possible to display.

Ouch, that's somewhat distorted.  Eric has already won quite a few
hearts to avoid abuse of low-level hacks that hinder semantic
analysis (and believe me, such hacks are typically not typographical
masterpieces, but instead they usually hinder portability and
sometimes clean terminal rendering just as much as semantic analysis).
Eric doesn't want to deprecate the worst low-level hacks because
Doclifter is deficient, but because *no* program will be able to
make head or tail of such stuff.  And the pages will not need to
be rewritten, only patched.  Usually, such patches are rather small.
And the goal isn't to let Doclifter look like a hero, but to have
manuals that allow all of the following: (1) clean rendering to the
terminal, (2) fine typography with troff, (3) portability,
and (4) semantic analysis.  Other tools will profit *more*
than Doclifter because they contain less artificial intelligence.

It is clear that restricting abuse of low-level hacks in manuals
would help to improve the average formatting quality of manuals
across *all* output media, and that it would be much harder to
achieve as much without any enforcement.  I'm not completely
sure it should be done, it feels a bit strange, as i said, a bit
like a technical solution to a social problem, but the suggestion
is clearly neither absurd nor selfish.

Yours,
  Ingo


