Re: Playground pager lsp(1)

help-texinfo

From:	Alejandro Colomar
Subject:	Re: Playground pager lsp(1)
Date:	Sat, 8 Apr 2023 00:01:08 +0200
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.9.1
Hi Eli,

On 4/6/23 10:11, Eli Zaretskii wrote:
>> Date: Thu, 6 Apr 2023 03:10:59 +0200
>> Cc: dirk@gouders.net, linux-man@vger.kernel.org, help-texinfo@gnu.org
>> From: Alejandro Colomar <alx.manpages@gmail.com>
>>
>>> This last sentence is a misunderstanding.  The goal of Texinfo is not
>>> to improve the man pages.  Texinfo is a completely different approach
>>> to software documentation, which allows one to write large books and then
>>> produce various on-line and off-line formats to read and efficiently
>>> search those books.
>>
>> "The manual was intended to be typeset; some detail is sacrificed on
>> terminals." (man(1), _Unix Time-Sharing System Programmer's Manual_,
>> Eighth Edition, Volume 1, February 1985)
>>
>> You mean books like this one?  Courtesy of groff(1)'s Deri James =)
>> <https://mirrors.edge.kernel.org/pub/linux/docs/man-pages/book/man-pages-6.04.01.pdf>
>>
>> Or maybe you prefer HTML?
>> <https://man7.org/linux/man-pages/man1/intro.1.html>
> 
> No, I mean books like "GNU Emacs Manual" or "Debugging with GDB"
> (https://shop.fsf.org/collection/books-docs).  Or "War and Peace", for
> that matter.
> 
>> As to efficiency, I'm not going to open that melon, because we're
>> both very biased to be efficient on the formats we each maintain.
>> I'll just say that I don't see an objective winner in those terms.
> 
> How do you find the description of, say, "dereference symbolic link"
> (to take just a random example from the Emacs manual) when the actual
> text of the manual includes neither this string nor matches for any
> related regular expressions, like "dereference.*link"?

$ apropos link | grep sym | head -n5
readlink (2)         - read value of a symbolic link
readlinkat (2)       - read value of a symbolic link
sln (8)              - create symbolic links
symlink (2)          - make a new name for a file
symlink (7)          - symbolic link handling

I bet you're looking for readlink(2) and symlink(7), aren't you?

> 
> The way Info does it is to use the index (which should be present in
> any respectable reference document) to find description of the
> corresponding subject.  The indexing, which is done by the author of
> the document, if it's a good indexing, should include index entries
> that specify subjects the reader could have in mind when he/she is
> looking for this kind of information.

We do that too in man(7).  For example, we improved the "index" for
proc(5) recently, after наб lost some time without finding proc(5)
in the list of pages that were interesting for the topic at hand:


commit 2e1c1a57f138eedd35b7b2a825002fddb12d240f
Author: наб <nabijaczleweli@nabijaczleweli.xyz>
Date:   Sat Apr 1 00:04:52 2023 +0200

    proc.5: NAME: Add "system information, and sysctl"
    
    procfs hosts a whole host of information about the system, as well as
    sysctls; proc(5) hosts a description of a lot of sysctls, and at present
    there's no way to find that out.
    
    Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
    Cc: Jakub Wilk <jwilk@jwilk.net>
    Signed-off-by: Alejandro Colomar <alx@kernel.org>

diff --git a/man5/proc.5 b/man5/proc.5
index 521402fe8..233cc1c9d 100644
--- a/man5/proc.5
+++ b/man5/proc.5
@@ -36,7 +36,7 @@
 .\"
 .TH proc 5 (date) "Linux man-pages (unreleased)"
 .SH NAME
-proc \- process information pseudo-filesystem
+proc \- process information, system information, and sysctl pseudo-filesystem
 .SH DESCRIPTION
 The
 .B proc


After this patch, if you apropos "system" or "sysctl", you'll see
proc(5) pop up in your list.
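
To make the mechanism concrete, here is a minimal sketch in plain shell.
It does not query the real whatis database; it only imitates, roughly,
what apropos(1) does (match a keyword against each page's name and
one-line description), using the old and new NAME lines from the patch
above.  The function name fake_apropos and the two variables are made up
for this illustration.

```sh
#!/bin/sh
# Old and new one-line descriptions of proc(5), taken from the patch above,
# written the way apropos(1) would list them.
old_name='proc (5)             - process information pseudo-filesystem'
new_name='proc (5)             - process information, system information, and sysctl pseudo-filesystem'

# fake_apropos KEYWORD DBLINE: hypothetical stand-in for apropos(1).
# Prints DBLINE if KEYWORD matches it (case-insensitively);
# prints nothing and returns non-zero otherwise.
fake_apropos() {
    printf '%s\n' "$2" | grep -i -e "$1"
}

# With the old NAME line, the keyword "sysctl" finds nothing.
fake_apropos sysctl "$old_name" || echo 'old NAME: no match for "sysctl"'

# With the new NAME line, proc(5) shows up.
fake_apropos sysctl "$new_name"
```

On a real system the equivalent check would simply be
"apropos sysctl | grep proc", once the updated page is installed and the
database has been rebuilt.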

> 
> The corresponding index-searching commands of Info readers are a
> primary means for finding information quickly and efficiently,
> avoiding too many false positives and also avoiding frustrating
> misses, i.e., searches that fail to find anything pertinent.

That's no different than apropos(1).  The only problem is when a
man page feels like a one-page book.  But if you split the book
into several pages, then the index is useful to know which page
you want.

> 
> So this is not about objectivity, this is about features that either
> are present in the documentation system or are absent.  I prefer the
> Info format to the HTML format of the same manual for this single
> reason: HTML browsers don't have the index searching capabilities
> (this is hopefully about to change; see the JS support in
> latest Texinfo), and that issue alone was enough to avert me from
> HTML, because I cannot afford wasting time on looking up information I
> cannot find instantly.

Yep, I also prefer man(1) over HTML man pages for similar reasons :).
I can do whatis(1) and apropos(1) (although some man-pages websites
have this capability too, but then I can't grep those results in the
browser).

> 
>> About variety of output formats, anything that can be produced by
>> groff(1), man(7) can be translated.  And groff(1) can do many formats.
> 
> Groff (and any other typesetting program) can be used for writing any
> kind of documents.  I'm not talking about the processors, I'm talking
> about the design of the documentation system as a whole and about what
> the products actually look like.  IOW, I'm talking about the man pages
> produced by the typesetter, not about what can be done with the
> typesetter.
> 
>>> Man pages have no means of specifying structure
>>
>> .SH, .SS, .TP, .TQ, and very soon (hopefully weeks not months) .MR
> 
> These provide just one level.

We have many levels:

book:           /opt/local/foobar/man/
volume:         man2/, man3/, ...
chapter:        man3/, man3type/, ...
page:           sscanf(3)
section:        sscanf(3)/DESCRIPTION
subsection:     sscanf(3)/DESCRIPTION/Conversions
tags:           sscanf(3)/DESCRIPTION/Conversions/n

Branden, I now remember your wondering about MR and linking to
specific locations in a page...  Maybe we could use such a URI-like
syntax for that.  I guess it's not yet taken by any software, so we
should be free to define paths in the 'man:' scheme to mean this?
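
As a rough sketch of what such paths could look like to a tool, the
following shell function splits a reference into the levels listed above.
The 'man:PAGE(SECTION)/HEADING/SUBHEADING/TAG' form is purely
hypothetical (nothing defines it today), and split_man_ref is a made-up
name used only for illustration.

```sh
#!/bin/sh
# split_man_ref REF: split a *hypothetical* 'man:' reference of the form
#   man:PAGE(SECTION)/HEADING/SUBHEADING/TAG
# into one "level: value" line per component.  This is not an existing
# standard; it only illustrates the idea floated above.
split_man_ref() {
    ref=${1#man:}                 # drop the scheme prefix, if present
    printf '%-11s %s\n' 'page:' "${ref%%/*}"
    case $ref in
        */*) rest=${ref#*/} ;;    # something follows the page name
        *)   return 0 ;;          # bare page reference, nothing more to do
    esac
    for level in section subsection tag; do
        printf '%-11s %s\n' "$level:" "${rest%%/*}"
        case $rest in
            */*) rest=${rest#*/} ;;   # move on to the next component
            *)   break ;;             # that was the last component
        esac
    done
}

split_man_ref 'man:sscanf(3)/DESCRIPTION/Conversions/n'
```

A bare 'man:sscanf(3)' yields only the page line, so shorter references
would stay valid under this layout.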

> 
> And how frequently are they used in actual man pages out there, even
> when available?

Used in source man(7)?  Always.

> 
>> Those can be used to produce very precise links such as this one
>> (one of my favourite references when reviewing man-pages patches):
>> <https://mirrors.edge.kernel.org/pub/linux/docs/man-pages/book/man-pages-6.04.01.pdf#pdf%3Abm11886>
> 
> It's full of mojibake when I try reading it here.  But anyway: what
> structure do you have there?  It looks like just a long sequence of
> separate man pages.

There's a navigation panel on the left in most (all?) PDF readers.
You can use that to navigate to the page you want, and get hyperlinks
to pages or their contents.

> 
>>> and hyper-links except
>>> by loosely-coupling pages via "SEE ALSO" cross-references at the end;
>>> they have no means of quickly and efficiently finding some specific
>>> subject except by text search (which usually produces a lot of false
>>> positives).
>>
>> I guess you mean searching from the command line by the name of the
>> parameter to a function, or what?
> 
> No, I mean looking up a specific subject of interest without having to
> search/read through the entire document.

See symlink above.

> 
>> I would be interested in a more detailed description of what you
>> want to be able to search in current pages (hopefully ones that I
>> maintain, so I can speak of them) that you can't find easily?  Maybe
>> I can help making something more accessible.
> 
> See above, the example of using index-searching commands.

Yep.  I hope my answer about symlinks satisfied you.

Cheers,
Alex

> 
>>> By contrast, Texinfo documents have sectioning structure, have
>>> cross-references that can appear where you need them and point
>>> anywhere else in the document (or into another document).
>>
>> This was discussed as a possible extension to '.MR'.  We're just not
>> sure that there's a real need for that in manual pages (although
>> there's not a consensus on that regard, and Branden, who I'm sure
>> is reading this, may jump in at any moment :).
> 
> Cannot say about man pages, but in a serious documentation of any
> computer software you always need cross-references, because you cannot
> make any description self-contained without repeating the same stuff
> over and over and over again.
> 
> Here's a short example from a random place in the Emacs Lisp
> Reference manual:
> 
>      When an editing command returns to the editor command loop, Emacs
>   automatically calls ‘set-buffer’ on the buffer shown in the selected
>   window (*note Selecting Windows::).  This is to prevent confusion: it
>   ensures that the buffer that the cursor is in, when Emacs reads a
>   command, is the buffer to which that command applies (*note Command
>   Loop::).  Thus, you should not use ‘set-buffer’ to switch visibly to a
>   different buffer; for that, use the functions described in *note
>   Switching Buffers::.
> 
> The three places which say "see SOMETHING" are cross-references
> to other parts of the manual.  Without being able to cross-reference
> there, the text would have to explain what it means by "selected
> window", what it means by "commands" and "command loop", and mention
> explicitly the functions to switch to a buffer which are already
> described in detail elsewhere.  This allows readers who already know
> about those subjects to read the text without having to skip large
> amounts of unnecessary information, while also allowing readers who
> are not sure they know about that to be able to follow the link, read
> there, and then come back to the same place to continue reading.
> 
>>>  They also
>>> have indexing and commands that allow the reader to use the index in
>>> order to find the subject he/she is interested in very quickly and
>>
>> You mean whatis(1) and apropos(1)?
> 
> No.  These perform text searches on the titles of the man pages, and
> are therefore limited to what is in the title.  Indexing is much more
> powerful, and works on the topics in the index (which, as explained
> above, could contain text not present anywhere else in the document).
> And every respectable Info manual has an index (some have several
> indices).  See above about the commands which use the index.
> 
>>> accurately, even if the text of the index entry doesn't appear
>>> anywhere in the manual.
>>
>> man pages have several ways:
>>
>> -  Including keywords in the NAME section.
>> -  Link pages.
>> -  TH line.
> 
> This is not enough, IME.  You need a way of "tagging" a chunk of text
> as describing, or being pertinent to, a particular subject, even if
> that subject does not appear literally in the text the reader will
> see.  That's because when readers are after some specific material,
> they don't always have in mind the exact words used in the manual for
> describing that material, they could have some alternative phrases in
> their minds.  Good indexing anticipates this in advance, and provides
> index entries for those alternative phrases, allowing readers to find
> stuff quickly.
> 
>> Of course, this is for the terminal.  For PDF or HTML, you can
>> get hyperlinks to any subsection (and in the future maybe even
>> tagged paragraphs) within a page.
> 
> In Info, references to any paragraph are available since long ago.
> They are invaluable in some situations, especially when some section
> is very long and you want to point to a very specific part thereof.
> 
>>> How can you document a large and flexible software package, such as
>>> GDB or Texinfo or Emacs, in man pages?
>>
>> git is a huge program, yet its man pages are quite useful.
> 
> Git is a huge heap of separate commands, with very little to glue them
> together in terms of documented functionalities.  Still, even in Git,
> there's the stuff that belongs to no command in particular, and
> thus is documented in man pages with invented names like
> "gitrevisions", which is impossible to guess in advance for a newbie
> who needs this information.
> 
> Moreover, the introduction material and the explanation of basic
> concepts is not in man pages, but in a separate HTML document ("The
> Git User's Manual"), and likewise the API documentation, which in
> itself is a telltale sign.
> 
> While something like a huge heap of man pages is perhaps borderline
> reasonable for Git, it isn't reasonable for programs which are not
> easily broken into separate independent "pages", like GDB and Emacs.
> The more complex is the system of objects and concepts manipulated by
> the software, the less appropriate is the man-page format for
> describing it.
> 
>> Just split your documentation at the right boundary, which
>> usually requires a good design for your software that allows
>> such division.
> 
> Whether the manual is split or not is immaterial.  Info manuals can
> also be split.  The relevant issue is what the viewer allows the
> reader to do to read these chunks in a reasonable way, using efficient
> commands and features to find related information quickly.
> 
>> The fact that current man(1) implementations don't exploit
>> the whole power of man(7) doesn't mean you can't design a
>> software that does.
> 
> Indeed, it doesn't mean that.  But we are discussing what is there,
> not what could be there in some distant future.
> 
>> I'm sure you could build something similar to info(1) that
>> got man(7) pages as its input.
> 
> No!  The information about subsections, cross-references, and indices
> is missing.  That information must be there to begin with, otherwise
> it cannot be recreated, because it's inserted by humans, not by
> programs.
> 
>>> It isn't missing.  The TOC is presented as top-level menu in each
>>> manual, and large manuals have also the "detailed menu" with all the
>>> sub-nodes spelled out.  In addition, the Emacs Info reader has the
>>> Info-toc command, which presents a structured menu with all the
>>> sectioning levels of a manual even if the manual didn't produce it.
>>
>> Ahh, yes, this is true.  What I found missing is a kind of a map for
>> knowing what I have available for navigating (also the fact that I
>> don't usually run info(1) makes me be a bit fuzzy on detailing what
>> is it that I miss from it).  So, info(1) has a map of the sections
>> available in a page, and does it also have a map of all the pages
>> in the system (or whatever you call your pages, I don't yet really
>> understand the organization of info manuals)?
> 
> Yes, it does.  If you invoke 'info' with no arguments, it will show
> the "directory" of all the installed manuals -- a large menu where
> each manual has at least one line explaining what the manual
> describes.  Some manuals have much more than one line; examples
> include Coreutils and Binutils (which have a line for each individual
> command) and glibc (which has a line for every _function_).

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5