Re: [Groff] Applications of \c in man pages in the wild [LONG]

From: Ingo Schwarze
Subject: Re: [Groff] Applications of \c in man pages in the wild [LONG]
Date: Mon, 1 May 2017 17:46:30 +0200
User-agent: Mutt/1.6.2 (2016-07-01)

Hi Branden,

G. Branden Robinson wrote on Sun, Apr 30, 2017 at 07:51:26PM -0400:

> some of these categories are going to be hard to recognize without
> a standalone *roff parser, which I don't think exists.

I'm working on that in mandoc, albeit rather slowly.

Mandoc is a four-phase program.  The order of the phases is always
the same, but command line options may cause individual phases to be
skipped:

  phase 1: file selection (e.g. database or file system search)
           [skipped in "man -l"]
  loop over selected files
    phase 2: parser (for roff, mdoc, man, tbl, eqn)
             [skipped in "man -k" unless "-a" is given]
    phase 3: formatter (ascii, utf8, html, ps, pdf, man, markdown)
             [skipped in "man -Tlint"]
  phase 4: pager (usually more(1) or less(1))
           [skipped in "man -c"]
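
As a rough illustration of the phase skipping listed above, here is a
toy model in Python.  It is not mandoc's code: the table, the function
name phases_to_run, and the treatment of options as plain strings are
assumptions made for this sketch, and it encodes only the skips named
above, ignoring any other interaction between options.

```python
# Toy model of the four phases; not mandoc's real code.
# Each phase is paired with the option that, per the list above, skips it.
PHASES = [
    ("file selection", "-l"),
    ("parser", "-k"),          # skipped with -k unless -a is also given
    ("formatter", "-Tlint"),
    ("pager", "-c"),
]


def phases_to_run(options):
    # Return the names of the phases that still run, in their fixed order.
    opts = set(options)
    run = []
    for name, skip_opt in PHASES:
        skipped = skip_opt in opts
        if name == "parser" and "-a" in opts:
            skipped = False    # "-k -a" still runs the parser
        if not skipped:
            run.append(name)
    return run


if __name__ == "__main__":
    print(phases_to_run([]))
    print(phases_to_run(["-l", "-c"]))
    print(phases_to_run(["-Tlint"]))
```

The order of the returned list never changes; options only remove
entries from it, which mirrors the "always the same order" property.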

The output of phase 2 that is passed as input to phase 3 is an
abstract syntax tree, either an mdoc(7) AST or a man(7) AST.

Phase 2 currently consists of three sub-phases:

  phase 2.1: roff(7) preprocessor
             input: roff(7) text file
             output: mdoc(7) or man(7) text file
               no longer containing any low-level roff(7) elements;
               for example, all register, string, and macro
               definitions and interpolations are evaluated
               and expanded, etc.
  phase 2.2: mdoc(7) or man(7) parser
             generating the raw AST
  phase 2.3: mdoc(7) or man(7) normalizing validator
             modifying the AST
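
To make the three sub-phases concrete, here is a much simplified
Python sketch.  It is not how mandoc is implemented: the function
names, the dictionary-based node layout, and the normalization rule
(upper-casing .SH titles) are all invented for illustration, and the
preprocessor handles only .ds definitions and \*[name] interpolations.

```python
import re


def preprocess(lines):
    # Phase 2.1 (toy): evaluate .ds string definitions and expand
    # \*[name] interpolations, so no low-level roff remains.
    strings = {}
    out = []
    for line in lines:
        if line.startswith(".ds "):
            fields = line.split(" ", 2)          # request, name, value
            strings[fields[1]] = fields[2] if len(fields) > 2 else ""
            continue                              # definition is consumed
        out.append(re.sub(r"\\\*\[(\w+)\]",
                          lambda m: strings.get(m.group(1), ""), line))
    return out


def parse(lines):
    # Phase 2.2 (toy): generate the raw AST from macro and text lines.
    ast = []
    for line in lines:
        if line.startswith("."):
            parts = line[1:].split()
            if parts:
                ast.append({"type": "macro", "name": parts[0],
                            "args": parts[1:]})
        else:
            ast.append({"type": "text", "value": line})
    return ast


def validate(ast):
    # Phase 2.3 (toy): modify the AST; the rule here is made up.
    for node in ast:
        if node["type"] == "macro" and node["name"] == "SH":
            node["args"] = [arg.upper() for arg in node["args"]]
    return ast


if __name__ == "__main__":
    source = [".ds pkg mandoc", ".SH Name", r"\*[pkg] \- format manual pages"]
    for node in validate(parse(preprocess(source))):
        print(node)
```

Note that the .ds line leaves no trace in the tree: by the time the
parser runs, the definition has been consumed and only its expansion
is visible.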

Over roughly the last two years, i have unified all the data
structures and node handling utility functions, and i'm now near the
point where the roff(7) preprocessor can slowly begin to evolve
into a real pre-parser: that is, where it can begin to add low-level
roff(7) nodes to the AST in addition to preprocessing the input.

Even in an intermediate state where only some roff constructs will
be parsed into the AST, that concept may start to become useful for
syntactic and semantic analysis of mixed roff(7)/mdoc(7) and mixed
roff(7)/man(7) sources.
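
A hypothetical sketch of what such an analysis could look like, in
Python: a toy pre-parser keeps low-level constructs as "roff" nodes
in the same tree as the man(7) macros, and a small query then finds
macros that directly follow a \c escape.  None of this exists in
mandoc; the node layout and the names preparse and macros_after_c
are assumptions, and only .ds requests and a trailing \c are
recognized.

```python
def preparse(lines):
    # Toy pre-parser: low-level roff constructs become "roff" nodes
    # in the tree instead of vanishing during preprocessing.
    ast = []
    for line in lines:
        if line.startswith(".ds "):
            fields = line.split(" ", 2)          # request, name, value
            value = fields[2] if len(fields) > 2 else ""
            ast.append({"type": "roff", "kind": "request",
                        "name": "ds", "args": [fields[1], value]})
        elif line.startswith("."):
            parts = line[1:].split()
            if parts:
                ast.append({"type": "man", "name": parts[0],
                            "args": parts[1:]})
        elif line.endswith("\\c"):
            # Split a trailing \c off into its own node.
            ast.append({"type": "text", "value": line[:-2]})
            ast.append({"type": "roff", "kind": "escape", "name": "c"})
        else:
            ast.append({"type": "text", "value": line})
    return ast


def macros_after_c(ast):
    # Names of man(7) macros that directly follow a \c escape node.
    return [nxt["name"] for cur, nxt in zip(ast, ast[1:])
            if cur["type"] == "roff" and cur["kind"] == "escape"
            and cur["name"] == "c" and nxt["type"] == "man"]


if __name__ == "__main__":
    page = [".ds pkg mandoc", ".SH NAME", "foo\\c", ".B bar", "plain text"]
    print(macros_after_c(preparse(page)))
```

Because the \c survives as a node next to the macro it affects, the
query is a simple walk over adjacent nodes rather than a text search
over unexpanded input.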

Roff nodes in the AST are not available yet, though.  So far, i
have only established the technological foundation to build them on,
as described above.

