Re: [Groff] Regarding HTML rendering

From: Peter Schaffter
Subject: Re: [Groff] Regarding HTML rendering
Date: Fri, 18 Aug 2017 13:08:35 -0400
User-agent: Mutt/1.5.24 (2015-08-30)

On Fri, Aug 18, 2017, Steve Izma wrote:
> It seems to me this argument implies a desire for a
> general-purpose markup language for creating typeset documents,
> which I think is an impossible goal.

A couple of years ago I was consulted by a programmer tackling the
"perfect markup language" problem.  I had myself attempted such
a thing long before Markdown was even around.  Then, as now, I
concluded it isn't possible to create a one-size-fits-all markup
language.  (IIRC, the developer predictably got stuck on the blank
line issue.)

When he first approached me, he asked, "What would your ideal markup
language look like?" My answer was: the mom macros.  As Steve has
pointed out, pure groff is entirely presentational but macros can be
as semantically meaningful as you like.  Mom separates out semantic
macros, stylesheet macros, and presentational macros such that a
well-formed mom document is already unambiguously parsable for
conversion into any SGML, which makes it a markup language.  (Very
likely into HTML5, too, although I haven't tried it.)

I recently published a 100+ page paper online that put groff/mom to
the test.  It threw the worst typesetting nasties at the document:
spaced paragraphs (even bottom margins, anyone?), extensive use of
heads (up to four or five levels), floated material in the form
of images/tables/diagrams, font switches for both semantic and
expressive purposes, embedded inline images.  You'd expect it to be
a parsing nightmare, but the reverse is true.  The formatted paper
is online at

The subject may not be of interest, but the typesetting and
formatting are splendid examples of the sophistication of groff
as a typesetting engine, and mom as a markup language.  In the
body of the paper, there are, I believe, only three low-level
groff requests: one '.fzoom', one unavoidable '.ns', and one
macro re-definition.  Everything else is handled by semantically
meaningful macros, or presentational macros that are easily
identified as such and can be ignored by a parser (e.g. the FLEX
macro).  Expressive formatting is handled inline, either
through open-close tags or named characters whose definitions
include formatting directives, e.g.

  .char \[staccato] \*[IT]staccato\*[PREV]

which avoids clutter but makes "special" words stand out.
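
As a minimal sketch of how the two inline styles compare, assuming a
mom document: the sentence and the word "legato" below are invented
for illustration and are not taken from the paper.  Only the .char
definition and the \*[IT]...\*[PREV] pair come from the text above.

```groff
.\" Hypothetical fragment: the sentence is invented for illustration.
.\" Named character whose definition carries its own formatting.
.char \[staccato] \*[IT]staccato\*[PREV]
.PP
The passage is marked \[staccato] in the autograph,
but \*[IT]legato\*[PREV] in the first edition.
```

Both words should come out in italic.  The named character keeps the
running text free of font escapes, while the open-close pair spells
the switch out at the point of use.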

I've made a copy of the source file available for download at

so people can have a look.  (The associated fonts, sourced files,
and images are not included, so don't try to format the document.
It's just there to demonstrate good mom markup.)

> I think MOM has much more concisely named requests that are meant
> for semantic clarity.

Thanks, Steve, for the best laugh I've had in ages.  Mom concise?
Ya gotta be kidding!

Peter Schaffter
