groff

Re: Puzzle about ms files generated by pandoc


From: G. Branden Robinson
Subject: Re: Puzzle about ms files generated by pandoc
Date: Fri, 8 Jul 2022 20:33:43 -0500

Hi Robert,

At 2022-07-07T08:00:20-0400, Robert Goulding wrote:
> I used pandoc to generate nicely formatted printouts of s.tmac
> (because I want to study it and understand a little better how it
> works). I tried several highlight styles; and found that when I used
> groff to compile the "monochrome" style, it failed with the error
> "groff ms macros require groff extensions; aborting" -- which is odd,
> because of course I'm using groff!
> 
> A partial pdf of the ms macros was generated, but stops the line
> before the source reads "groff ms macros require groff extensions;
> aborting" -- so that is surely significant. But when I looked at the
> files generated for monochrome and for another highlight style
> ("kate"), I couldn't see where the error might be. I've attached both
> - can anyone spot why groff may be failing *as if* groff were not
> being used in the groff -ms macros?

I was able to reproduce your problem with the following command.

$ groff -ms -Tpdf ms-monochrome.ms >| m.pdf
groff ms macros require groff extensions; aborting

The first thing I did was to enable warnings.  I encourage everyone to
do this.  :)

$ groff -ww -ms -Tpdf ms-monochrome.ms -z
troff: ms-monochrome.ms:126: warning: macro '.' not defined
troff: ms-monochrome.ms:127: warning: missing closing delimiter
groff ms macros require groff extensions; aborting

This alone is enough to make me suspect a usage problem.

Here's what these lines look like.

$ sed -n '126,127p' ms-monochrome.ms
\*[FunctionTok ".."]
\*[FunctionTok ".if"]\*[StringTok " !"]\*[OtherTok "\[rs]n(.g"]\*[StringTok " "]\*[CharTok "\[rs]"]

Okay, that looks moderately hairy.  Whatever generated this is using
groff parameterized strings whose arguments are themselves exhibits of
groff syntax.  This strikes me as a good way to get into trouble; some
care will be required.

Evidently the "kate" and "monochrome" files define the `FunctionTok`
string differently.  How?

$ grep 'ds FunctionTok' *.ms
ms-kate.ms:.ds FunctionTok \\m[644a9b]\\$1\\m[]
ms-monochrome.ms:.ds FunctionTok \\$1

So "monochrome", which fails, simply returns the argument unprocessed,
whereas "kate" brackets it with color escapes.

Suddenly this looks like a familiar problem.  I play a hunch.

$ groff -ww -ms -Tpdf ms-monochrome-gbr.ms -z
troff: ms-monochrome-gbr.ms:142: warning: macro 'CH' not defined
troff: ms-monochrome-gbr.ms:2252: warning: macro 'pdfsync' not defined

We got further.  What did I do?

$ diff -u ms-monochrome.ms ms-monochrome-gbr.ms
--- ms-monochrome.ms    2022-07-08 19:58:25.396932753 -0500
+++ ms-monochrome-gbr.ms        2022-07-08 20:13:07.088864226 -0500
@@ -18,9 +18,9 @@
 .ds BaseNTok \\$1
 .ds FloatTok \\$1
 .ds ConstantTok \\$1
-.ds CharTok \\$1
+.ds CharTok \&\\$1
 .ds SpecialCharTok \\$1
-.ds StringTok \\$1
+.ds StringTok \&\\$1
 .ds VerbatimStringTok \\$1
 .ds SpecialStringTok \\$1
 .ds ImportTok \\$1
@@ -28,8 +28,8 @@
 .ds DocumentationTok \\f[CI]\\$1\\f[C]
 .ds AnnotationTok \\f[CI]\\$1\\f[C]
 .ds CommentVarTok \\f[CI]\\$1\\f[C]
-.ds OtherTok \\$1
-.ds FunctionTok \\$1
+.ds OtherTok \&\\$1
+.ds FunctionTok \&\\$1
 .ds VariableTok \\$1
 .ds ControlFlowTok \\f[CB]\\$1\\f[C]
 .ds OperatorTok \\$1

I prefixed each of these string definitions with the non-printing input
break escape sequence[1] so that when these strings are interpolated at
the beginning of an input line (or as the consequent of a conditional
request like `if`, `el`, `while`, or `nop`), they won't be
misinterpreted if they start with a control character.
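
The effect can be reproduced outside of pandoc's output with a minimal
sketch like the following.  The string names PlainTok and SafeTok and
the file name input-break-demo.roff are made up for illustration; the
two definitions are modeled on the ones in the diff above.

```shell
# Write a small roff input file.  PlainTok and SafeTok are hypothetical
# names; only SafeTok starts its definition with \&.
cat > input-break-demo.roff <<'EOF'
.ds PlainTok \\$1
.ds SafeTok \&\\$1
\*[PlainTok ".nosuchmacro"]
.br
\*[SafeTok ".nosuchmacro"]
EOF

# Format the file with all warnings enabled, if groff is installed.
if command -v groff >/dev/null 2>&1; then
    groff -ww -Tutf8 input-break-demo.roff
fi
```

With a groff that behaves like the one used above, the PlainTok line
should draw a warning about an undefined macro, because the
interpolated text begins with a control character, while the SafeTok
line should be set as ordinary text.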

The remaining problems revealed by -ww appear to me to be unrelated.
Some stuff like ".rm CH" seems dubious.  It is better to give that
string an empty definition than to remove it.  I also see use of
non-groff macros like `pdfsync`.  I assume they supply a definition of
it somewhere.
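
As a rough sketch of the ".rm CH" suggestion, a filter like this one
turns the removal into an empty definition.  The function name
empty_instead_of_remove is hypothetical, and it assumes the request
appears alone on its line, as it seems to in the generated file.

```shell
# Hypothetical helper: read ms source on standard input and replace a
# line consisting of ".rm CH" with ".ds CH", which defines CH as an
# empty string instead of removing it.
empty_instead_of_remove() {
    sed 's/^\.rm CH[[:space:]]*$/.ds CH/'
}
```

It could be used as `empty_instead_of_remove < in.ms > out.ms`, where
both file names are placeholders.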

Is pandoc responsible for generating this output?  If so, it sounds
like a bug report is in order.  They _really_ need to fix those string
definitions to start with `\&` unconditionally.  Those that already
start with `\f` or `\m` escape the problem, but the prefix is harmless
there, and it might be a less intrusive change for their groff code
generator to start all of them with `\&` regardless.
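
Until the generator is fixed, already-generated files can be patched
after the fact.  The following is only a sketch: the function name
protect_tok_strings is hypothetical, and it assumes that every
highlighting string is defined on one line of the form ".ds SomethingTok
...", as in the diff above.

```shell
# Hypothetical helper (not part of pandoc or groff): read ms source on
# standard input and insert \& at the start of every "*Tok" string
# definition that does not already begin with \&.  Other lines pass
# through unchanged.
protect_tok_strings() {
    sed '/^\.ds [A-Za-z]*Tok \\&/!s/^\(\.ds [A-Za-z]*Tok \)/\1\\\&/'
}
```

For example, `protect_tok_strings < ms-monochrome.ms > patched.ms`,
where patched.ms is a placeholder name.  Running the filter a second
time leaves the definitions as they are, since lines that already
start with `\&` are skipped.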

Also, I think the generator should declare itself in a preamble comment.

If any pandoc hackers are monitoring this list, please speak up.  We can
help.

Regards,
Branden

[1] Ouch!  Who has a voodoo doll of me?
