Re: nesting lists in man(7) pages

From: Ingo Schwarze
Subject: Re: nesting lists in man(7) pages
Date: Wed, 11 Aug 2021 14:37:58 +0200

Hi Branden,

G. Branden Robinson wrote on Wed, Aug 11, 2021 at 01:16:45AM +1000:
> At 2021-08-04T13:33:30+0200, Ingo Schwarze wrote:

>> I can confirm Branden's observation that lists are where people (and
>> even more so, tools automatically generating man(7) code) often
>> produce low-quality man(7) code.  I'm not quite sure why, though.

> I'm _so_ tempted to say, "failure to understand the target language",

Sounds possible, even plausible.

> but I reckon I should actually step up and try to improve one of the
> generator tools, like pod2man

That would be a noble deed because pod2man(1) is a very useful tool
and produces almost decent man(7) code in most respects, though i
don't doubt it can be made even better.  Also, helping Russ Allbery
would feel like giving something back to a venerable greybeard
of outstanding merit.

> or docbook-to-man, before making this claim.

I pity you if you try, but you would be a hero if you succeeded.

Feedback regarding mandoc -Tman is also welcome.  I admit that
maintenance of it has petered out a bit during the last few years.
It appears that systems which do not support mdoc(7), and hence
require distribution of auto-generated man(7) pages in release
tarballs alongside the original mdoc(7) code, are becoming less and
less common, so mandoc -Tman seems to be seeing less and less use.

>> The .IP macro seems almost as good as .Bl -bullet/-dash to me,

> You mentioned somewhere that we might infer link targets from IP tags.
> That's technically true, but if they're used as recommended in
> groff_man_style(7), they _won't_ convey semantic information of use.
> They really will just be numbers in a list, or bullet symbols or the
> like.  That's one reason my style advice reads that way--to strengthen
> TP's semantic value.

Fair enough.

Indeed, in mandoc(1), man_validate.c function post_IP(), which checks
for content to automatically tag, does not tag the recommended .IP
content you describe above:

 * Skip leading whitespace, dashes, backslashes, and font escapes,
 * then create a tag if the first following byte is a letter.
 * Priority is high unless whitespace is present.

And in fact, it also does not tag if that first character is
a character escape sequence, for example \(bu.
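
For illustration only, here is a rough sketch in C of the rule quoted
above.  It is not the code from man_validate.c: the function name
ip_tag_prio() and the enum names are made up for this example, only
one-letter font escapes such as \fB are recognized, and it assumes that
"whitespace is present" refers to whitespace among the skipped leading
bytes, which the quoted comment does not spell out.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Illustrative names, chosen for this sketch. */
enum tag_prio { TAG_NONE, TAG_WEAK, TAG_STRONG };

/*
 * Hypothetical helper: decide whether the head string of an .IP
 * would get a tag.  On success, *start points at the first letter.
 */
static enum tag_prio
ip_tag_prio(const char *cp, const char **start)
{
	enum tag_prio prio = TAG_STRONG;

	assert(cp != NULL && start != NULL);
	*start = NULL;
	for (;;) {
		switch (*cp) {
		case ' ':
		case '\t':
			/* Assumption: leading whitespace lowers priority. */
			prio = TAG_WEAK;
			cp++;
			break;
		case '-':
			cp++;
			break;
		case '\\':
			if (cp[1] == 'f' && cp[2] != '\0')
				cp += 3;	/* font escape like \fB */
			else if (cp[1] == '-')
				cp += 2;	/* escaped dash \- */
			else
				return TAG_NONE;	/* e.g. \(bu */
			break;
		default:
			/* Tag only if the first remaining byte is a letter. */
			if (isalpha((unsigned char)*cp)) {
				*start = cp;
				return prio;
			}
			return TAG_NONE;
		}
	}
}
```

With this sketch, a head of "\(bu" or "1." yields no tag, while a head
like "\fB\-\-verbose\fR" is tagged starting at the "v".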

>> Sure, .TP/.IP lists cannot nest, or is there another problem with them
>> i'm missing?

> Au contraire!  You can nest them if you use .RS/.RE.  My struggles with
> mixing .RS and .IP were in fact the issue that inflicted me on all of
> you[1][2].
> $ ./build/test-groff -Tutf8 -man -rLL=72n EXPERIMENTS/

Technically, that's not nesting, but one .IP list, then a second,
indented .IP list, then another .IP list:

   $ man -l -T tree
      IP (block) *10:2
        IP (head) 10:2
            \(bu (text) 10:5
        IP (body) 10:2
            Hire gnomes. (text) *11:1.
      RS (block) *12:2
        RS (head) 12:2
        RS (body) 12:2
            IP (block) *13:2
              IP (head) 13:2
                  \(dg (text) 13:5
              IP (body) 13:2
                  Collect underpants. (text) *14:1.
            IP (block) *15:2
              IP (head) 15:2
                  \(dg (text) 15:5
              IP (body) 15:2
                  ??? (text) *16:1.
            IP (block) *17:2
              IP (head) 17:2
                  \(dg (text) 17:5
              IP (body) 17:2
                  Profit! (text) *18:1.
      IP (block) *20:2
        IP (head) 20:2
            \(bu (text) 20:5
        IP (body) 20:2
            Retire to tropical paradise. (text) *21:1.

You see that the .RS is *not* inside the "Hire gnomes" .IP,
but on the same level, interrupting the list.  That's not a mandoc
implementation detail.  It's crucial because if a parser or formatter
put the .RS inside the .IP, you would get double indentation,
which would clearly be wrong.
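
The block structure shown in the tree above can be modelled with a toy
sketch in C.  This is not how mandoc parses man(7); block_depths() is a
hypothetical helper that only looks at the macro names.  The point it
illustrates is that a new .IP or .RS ends the open .IP, so only .RS
ever increases the depth.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not mandoc code: assign a block depth to each
 * macro in a flat sequence of "IP", "RS", and "RE" requests.
 * Returns 0 on success, -1 on an unknown macro name.
 */
static int
block_depths(const char *const *macros, size_t n, int *depths)
{
	int rs_level = 0;	/* number of currently open .RS blocks */
	size_t i;

	assert(macros != NULL && depths != NULL);
	for (i = 0; i < n; i++) {
		if (strcmp(macros[i], "IP") == 0) {
			/* A new .IP ends any open .IP; it never nests. */
			depths[i] = rs_level;
		} else if (strcmp(macros[i], "RS") == 0) {
			/* .RS also ends the open .IP, so it is a sibling. */
			depths[i] = rs_level++;
		} else if (strcmp(macros[i], "RE") == 0) {
			/* .RE closes the innermost .RS, if there is one. */
			if (rs_level > 0)
				rs_level--;
			depths[i] = rs_level;
		} else
			return -1;
	}
	return 0;
}
```

For the sequence IP, RS, IP, IP, IP, RE, IP, the first .IP and the .RS
both end up at depth 0, and only the three .IP blocks between .RS and
.RE are at depth 1.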

This is not terminological hairsplitting, but a very real problem
for HTML rendering:

   $ man -l -T html
  <p class="Pp">We have a foolproof scheme for getting rich.</p>
  <ul class="Bl-bullet">
    <li>Incorporate in Cayman Islands.</li>
    <li>Acquire venture capital.</li>
    <li>Hire gnomes.</li>
  <div class="Bd-indent">
  <dl class="Bl-tag">
    <dd>Collect underpants.</dd>
  <dl class="Bl-tag">
    <dd>Retire to tropical paradise.</dd>

Changing the HTML formatter to put the inner list into the "Hire
gnomes" <li> would be quite a challenge.  By contrast, doing just
that for nested .Bl in mdoc(7) is trivial.


