Re: Accessibility of man pages (was: Playground pager lsp(1))


From: Eli Zaretskii
Subject: Re: Accessibility of man pages (was: Playground pager lsp(1))
Date: Sat, 08 Apr 2023 16:42:22 +0300

> Date: Sat, 8 Apr 2023 15:02:59 +0200
> Cc: dirk@gouders.net, linux-man@vger.kernel.org, help-texinfo@gnu.org,
>  nabijaczleweli@nabijaczleweli.xyz, g.branden.robinson@gmail.com,
>  groff@gnu.org
> From: Alejandro Colomar <alx.manpages@gmail.com>
> 
> If you want to know how symlinks are dereferenced by find(1):
> 
> $ man find | grep sym.*link | head -n1
>        The  -H,  -L  and  -P  options control the treatment of symbolic links.

That's because the text appears verbatim in the man page.  Suppose the
person in question doesn't think about "symbolic links", but has
something else in mind, for example, "dereference".  (Why?  Because
they just happened to see that term in some article, and wanted to
know what Find does with it.  Or for some other reason.)  Then
they will not find the description of Find's symlink behavior by
searching for "dereference".

Do you see the crucial issue here?  Indexing can tag some text with
topics which do not appear verbatim in the text, but instead
anticipate what people could have in mind when they are searching for
that text without knowing what it says, exactly.
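
To make the difference concrete, here is a minimal sketch in Python.  It
is only an illustration: the paragraph id and the index entries are
invented for the example and are not taken from any real manual or from
how any existing tool stores its index.

```python
# Hypothetical data: one sentence from find(1), plus index entries that
# an indexer might attach to it.  The id and the entries are made up.
PARAGRAPHS = {
    "find-symlinks": "The -H, -L and -P options control the treatment "
                     "of symbolic links.",
}

INDEX = {
    "symbolic links": "find-symlinks",
    "symlinks": "find-symlinks",
    "dereference": "find-symlinks",
}


def text_search(term):
    """Return ids of paragraphs whose text contains the term verbatim."""
    term = term.lower()
    return [pid for pid, text in PARAGRAPHS.items() if term in text.lower()]


def index_search(term):
    """Return ids of paragraphs tagged with an index entry matching the term."""
    term = term.lower()
    return sorted({pid for entry, pid in INDEX.items() if term in entry.lower()})


print(text_search("symbolic links"))  # found: the words are in the text
print(text_search("dereference"))     # not found: the word is not in the text
print(index_search("dereference"))    # found: the index anticipated the term
```

The text search can only match what the author happened to write; the
index lookup matches what the indexer expected readers to look for.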

> >> After this patch, if you apropos "system" or "sysctl", you'll see
> >> proc(5) pop up in your list.
> > 
> > This literally adds the text to what the reader will see.  It makes
> > the text longer and thus more difficult to read and parse, and there's
> > a limit to how many key phrases you can add like this.
> 
> If a page has too many topics, consider splitting the page (I agree
> that proc(5) is asking for that job).

Indexing can tag any paragraph of text, not just the entire page.  A
page cannot usefully have too many keywords in its title, but it _can_
benefit from different keywords for different paragraphs.

> >  By contrast,
> > Texinfo lets you add any number of index entries that point to the
> > same text.  A random example from the Emacs manual:
> > 
> >   @cindex arrow keys
> >   @cindex moving point
> >   @cindex movement
> >   @cindex cursor motion
> >   @cindex moving the cursor
> 
> Using consistent language across pages helps for these things.

There's no consistency when we want to be friendly to different people
with vastly different backgrounds and cultural preferences.  Good
indexing will anticipate any "inconsistent" habits.  And, once again,
since the index entries don't appear in the text presented to the
reader, the text remains consistent even if the index entries draw
from different inconsistent sources.
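
As a rough sketch of that separation (a toy in Python, not how makeinfo
or any Info reader actually works; the one-line paragraph is a
placeholder I made up, only the @cindex entries come from the Emacs
example quoted above):

```python
# Toy Texinfo-like source.  The @cindex lines are the ones quoted above;
# the paragraph text is a placeholder invented for this example.
SOURCE = """\
@cindex arrow keys
@cindex moving point
@cindex movement
@cindex cursor motion
@cindex moving the cursor
Placeholder paragraph about moving point.
"""


def split_index(source):
    """Separate @cindex lines from the text the reader actually sees."""
    entries, body = [], []
    for line in source.splitlines():
        if line.startswith("@cindex "):
            entries.append(line[len("@cindex "):].strip())
        else:
            body.append(line)
    return entries, "\n".join(body)


entries, visible = split_index(SOURCE)
print(visible)   # the paragraph only, with no index entries mixed in
print(entries)   # five different phrases, all pointing at that paragraph
```

Adding a sixth or a tenth entry changes nothing in what the reader
sees; it only adds another way to arrive at the same place.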

> > Texinfo has:
> > 
> >   - chapters
> >   - sections
> >   - subsections
> >   - subsubsections
> >   - unnumbered variants of the above (unnumberedsubsec etc.)
> >   - appendices (appendix, appendixsubsec etc.)
> >   - headings (majorheading, chapheading, subheading, etc.)
> > 
> > More importantly, all those have meaningful names, not just standard
> > labels like "DESCRIPTION" or "Conversions".
> 
> "Conversions" is not a standard subsection.  It's about conversion
> specifiers; something exclusive to sscanf(3).  However, sections and
> above are standardized, and I believe that's good, so that you can
> have some a-priori expectations of the organization of a page.

But it then makes it impossible to add sections with meaningful names,
if those names aren't standardized.

> >  So when you see them in
> > TOC or any similar navigation aid, you _know_, at least approximately,
> > what each section is about.
> 
> I know a priori that if I'm reading sscanf(3)'s SYNOPSIS, I'll find
> the function prototype for it.  Or if I read printf(3)'s ATTRIBUTES
> I'll find the thread-safety of the function.

SYNOPSIS is at least approximately self-describing (although some
non-native English speakers might stumble on it).  But how would a
random reader know that ATTRIBUTES will describe thread-safety, for
example?  I wouldn't.  Isn't it better to have a section named "Thread
Safety" instead?

> text search has false positives, like anything else.  But having good
> tools for handling text is the key to solving the problem.  grep(1)
> and sed(1) are your friends when reading man pages.

Modern documentation is not plain text (even if we ignore
compression), so tools which just search the text have limitations,
sometimes serious ones.
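
One small illustration, using the find(1) line quoted near the top of
this message: man justified that line by padding the gaps between words
with extra spaces, so a literal search for the option list as one would
naturally type it does not match.  The Python below is only a sketch;
the query string and the normalize helper are my own assumptions, not
part of any existing tool.

```python
import re

# The line as man rendered it in the find(1) example quoted earlier:
# justification padded the gaps between words with extra spaces.
rendered = ("The  -H,  -L  and  -P  options control the treatment "
            "of symbolic links.")

# What a reader would naturally type, with single spaces.
query = "-H, -L and -P"


def normalize(s):
    """Collapse every run of whitespace into a single space."""
    return re.sub(r"\s+", " ", s).strip()


print(query in rendered)             # False: the padding breaks the match
print(query in normalize(rendered))  # True once the padding is removed
```

And this is just whitespace; markup, hyphenation and line breaks cause
the same kind of trouble for tools that only see the rendered text.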


