Re: [Groff] Future direction of groff

From: Eric S. Raymond
Subject: Re: [Groff] Future direction of groff
Date: Sat, 8 Feb 2014 19:58:31 -0500
User-agent: Mutt/1.5.21 (2010-09-15)

(Sorry for delayed response; I lost four days to the power outage.)

James K. Lowden <address@hidden>:
> I didn't ask you why it was difficult to conjure structure where none
> was intended.  My question is much simpler: what about the troff
> *model* of presentation prevents it from generating web-digestible
> artifacts?

Two things: Page-centric markup assumptions and presentation-centric markup
assumptions.  These fail badly in any situation where you cannot predict the 
rendering capabilities of the output device at the time you write the markup.
The Web is such a medium.

There are two ways you can work around this.  One is easy but brittle
and extremely failure-prone.  The other is extremely difficult but
correct, and works much better, especially as your document complexity grows.

Under both methods, there are some structural features you can map over
directly.  Document section headings are the most obvious example.
Where they differ is how you handle presentation-level markup.

The brittle method is to map presentation-level troff markup to
presentation-level HTML.  So, for example, \fIfoo\fR -> <i>foo</i>.
This is what grohtml and most standalone programs like man2html do.
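In Python, that brittle mapping amounts to little more than a regex pass. A minimal sketch (deliberately naive -- real troff escape handling is much hairier than this):

```python
import re

def troff_to_html(text):
    """Map troff font escapes straight to presentational HTML tags.

    This is the "brittle" approach: \fIfoo\fR -> <i>foo</i>, and all
    semantic intent behind the italics is lost in the process.
    """
    text = re.sub(r'\\fI(.*?)\\f[RP]', r'<i>\1</i>', text)
    text = re.sub(r'\\fB(.*?)\\f[RP]', r'<b>\1</b>', text)
    return text

print(troff_to_html(r'See \fIgroff\fR and \fBtroff\fR.'))
# -> See <i>groff</i> and <b>troff</b>.
```

Note what the output can't tell you: whether that italic span was a command name, a filename, or emphasis. The regex faithfully preserves the presentation while discarding everything a stylesheet would want to know.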

The problem with this approach is that you get fundamentally
stupid HTML from it. Ideally, you would like things that human beings
read as semantic markup to turn into semantic markup so the stylesheet
and browser can make better rendering choices.  For example,
Unix-style man page references such as "foobar(8)" should, ideally,
turn into document links. Constant-width displays of program code
should turn into <code> rather than just <pre><tt> so they can 
be gray-boxed. man/mm/ms/me lists should map to HTML lists. And so forth.

The troff model cannot give you this kind of semantically sensitive
rendering, because the information required to do that is thrown away
at macroexpansion time.

The difficult but correct thing to do is to recover structural
information by looking for cliches in the source markup *before* it
goes through troff.  That is what doclifter does, and why the
path through Docbook-XML yields better HTML than grohtml or man2html.

This method bypasses troff entirely; the whole point is to capture the
structural information before troff processing can discard it.
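Here is a minimal sketch of that kind of cliche recognition, in Python: spotting the "name(section)" man-page-reference pattern in source text and emitting a semantic link instead of styled text. (The link-target naming scheme below is my illustrative assumption, not what doclifter actually emits.)

```python
import re

# The "name(section)" cliche that human readers parse as a cross-reference.
MANREF = re.compile(r'\b([\w.-]+)\((\d)\)')

def lift_manrefs(text):
    """Recognize man-page references and lift them into semantic links.

    foobar(8) -> <a href="foobar.8.html">foobar(8)</a>
    (The .html naming convention here is hypothetical.)
    """
    return MANREF.sub(r'<a href="\1.\2.html">\1(\2)</a>', text)

print(lift_manrefs('See foobar(8) for details.'))
# -> See <a href="foobar.8.html">foobar(8)</a> for details.
```

The essential move is that the recognition runs on the source markup, where the cliche is still visible; after troff has rendered "foobar(8)" to positioned glyphs, there is nothing left to recognize.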

> You say "groxslfo" is unnecessary work.  In your expert opinion, why?

For the same reasons that writing a calculator program with
Roman-numeral I/O would be unnecessary work: the assumptions on which
low-level troff markup was designed no longer match the reality of how
people need to write documents.  Those assumptions include: (a) there
is a page size, and you know what it is; (b) you know exactly what fonts
will be available; (c) you can predict how the document will re-flow.

The features that can and should be saved from the wreckage are those
which are, or can be massaged into, purely semantic/structural markup. That
kind of markup is still useful in a world of multiple output modes.
Thus: pic, chem, grap, eqn, etc.  

At best, groxslfo would be a convenience for people in whom the troff
way of thinking is deeply ingrained.  This is a small and dwindling group -
I would be quite surprised if there are more than two hundred of us
left on the whole planet, and we're getting old.
                Eric S. Raymond
