groff
Re: WAYTO: indexed man pages


From: James K. Lowden
Subject: Re: WAYTO: indexed man pages
Date: Wed, 2 Jun 2021 16:29:42 -0400

On Mon, 31 May 2021 14:48:42 -0400
Douglas McIlroy <douglas.mcilroy@dartmouth.edu> wrote:

> > Now that I think of it, the current system is makewhatis/apropos. I
> > often get a ton of noise entries, usually perl modules, but maybe
> > there's a way around that.
> 
> apropos whatever | grep -v  '(3p'

That's what we have, but is it the best we can do?  I bet we agree the
answer to that question is No.  

Try for example 

        apropos color

On my system, with xlib and some perl stuff installed, that yields 178
entries. I can winnow it down to 38 with:

        man -k color | grep -v ^'[xX]' | grep -v :: 

including ppmquant (for X) and ctail (for dot) and 3 items related to
what I want, dircolors.  Three useful entries out of 178 is a 2%
signal/noise ratio.  Ironically, most of the apropos output is not
apropos to the input.  

The problem, I assert, is lack of context.  apropos has no way to know
I'm interested in colors for filenames in ls.  

The first order of business IMO is to navigate large documents by
indexed keyword, something the info reader does tolerably well.  

More sophisticated -- and requiring no new input -- would be an ability
to "zoom out" of the context of one manpage to show related pages that
reference the term. If I'm reading the ls(1) page and don't find what I
want, what's "in the neighborhood"?  Well, dpkg tells us that ls(1) is
part of GNU coreutils. AFAIK, the man system offers no way to ask "what
coreutils manpages reference color?"  A further outer ring of
association can also be derived from the packaging system, namely
packages that depend on the package, or that it depends on, or that are
recommended, subject to the constraint (or not) that they're
installed.  
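
As a rough sketch of the inner ring, assuming a Debian-style system
where dpkg is available and man-db's "man -w" prints a page's path:
find the package that owns a page, list that package's man pages, and
keep those that mention the term.  The names pages_matching and
pkg_apropos are made up for illustration; nothing like them ships with
man.

```sh
# pages_matching TERM: read man page paths on stdin and print those whose
# source mentions TERM, ignoring case.  "gzip -cdf" decompresses .gz files
# and passes uncompressed files through unchanged (GNU gzip behavior).
pages_matching() {
    term=$1
    while IFS= read -r page; do
        [ -f "$page" ] || continue
        if gzip -cdf -- "$page" | grep -qi -e "$term"; then
            printf '%s\n' "$page"
        fi
    done
}

# pkg_apropos PAGE TERM: hypothetical helper.  Ask dpkg which package owns
# PAGE's man page, then search that package's man pages for TERM.
pkg_apropos() {
    path=$(man -w "$1") || return 1
    pkg=$(dpkg -S "$path" | cut -d: -f1)
    [ -n "$pkg" ] || return 1
    dpkg -L "$pkg" | grep '/man/man[0-9]' | pages_matching "$2"
}
```

With those definitions, "pkg_apropos ls color" would pose the coreutils
question above, printing file paths rather than whatis lines.  The
outer ring could presumably be built the same way from the package's
dependency lists, but that part is not sketched here.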

Another basis for "zoom out" could be the kind of work that made Google
rich: citation counts.  If ls(1) references certain documents or
environment variables, what other documents reference those same
documents/variables?  If many do, that's information.  It's not rocket
surgery, either; it's basically what cscope has been doing for 30
years for function calls. 
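
A crude sketch of the citation idea, assuming each page has already
been rendered to plain text in one directory (say with
"man ls | col -b > pages/ls.1"; that layout and the names xrefs and
co_cited are invented here).  It treats every name(section) token as a
citation and ranks the other pages by how many citations they share
with the page in hand.  Environment variables would need a second
pattern.

```sh
# xrefs: read rendered man page text on stdin and print each distinct
# name(section) token once, e.g. dircolors(1) or File::Find(3pm).
# The page's own header, such as LS(1), is picked up too.
xrefs() {
    grep -oE '[A-Za-z0-9_.:+-]+\([1-9][a-z]*\)' | sort -u
}

# co_cited DIR PAGE: DIR holds one rendered text file per page.  Print
# "count page" for every other page that shares at least one reference
# with PAGE, highest count first.
co_cited() {
    dir=$1 page=$2
    wanted=$(mktemp) || return 1
    xrefs < "$dir/$page" > "$wanted"
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        other=${f##*/}
        [ "$other" = "$page" ] && continue
        # Count this page's references that also appear in PAGE's list.
        n=$(xrefs < "$f" | grep -cxFf "$wanted")
        if [ "$n" -gt 0 ]; then
            printf '%d %s\n' "$n" "$other"
        fi
    done | sort -k1,1nr -k2
    rm -f "$wanted"
}
```

Something like "co_cited pages ls.1" would then list the neighbors of
ls(1).  A raw shared-reference count is the simplest possible score; a
real tool would want to discount targets that nearly every page cites.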

ISTM that we rely too heavily on general tools like regular
expressions, and don't exploit information already present in our
systems.  We're training ourselves to create "google-able" terms --
like go-lang for the Go language -- because general purpose search
engines lack context specifiers.  

We also don't leverage the documentation writer's expertise
and *time*: any effort to add index terms to documentation is nothing
next to the thousands or millions of times that page will be read and
searched.  That is why I want to provide authors with macros for index
terms: to let them express their expertise for the benefit of all.  

I don't see how the value of a subject index can be doubted, given that
every large body of information is indexed, be it the Encyclopedia
Britannica or your local library. Nor is technical feasibility a high
bar.  The real obstacle, as ever, is people.  

Where there's a will there's a way.  But: is there a will? 

--jkl
