Re: Puzzle about ms files generated by pandoc
From: Robert Goulding
Subject: Re: Puzzle about ms files generated by pandoc
Date: Sat, 9 Jul 2022 01:00:57 -0400

That's quite some detective work!
Yes, this is code generated by pandoc. I made a simple markdown file, ms.md:
````roff
<contents of s.tmac>
````
And then processed it with:

$ pandoc --highlight-style=monochrome -s -o ms-monochrome.ms ms.md

and that's the file I posted, unaltered.
If there are no pandoc hackers here, with your permission I'll report your
findings to the pandoc bug list. (One of the other highlight styles,
haddock, gives rise to a similar error, so something has clearly gone awry
with their templates for a couple of the styles).
Thanks for looking into this with such thoroughness! - Robert.
On Fri, Jul 8, 2022 at 9:33 PM G. Branden Robinson <
g.branden.robinson@gmail.com> wrote:
> Hi Robert,
>
> At 2022-07-07T08:00:20-0400, Robert Goulding wrote:
> > I used pandoc to generate nicely formatted printouts of s.tmac
> > (because I want to study it and understand a little better how it
> > works). I tried several highlight styles; and found that when I used
> > groff to compile the "monochrome" style, it failed with the error
> > "groff ms macros require groff extensions; aborting" -- which is odd,
> > because of course I'm using groff!
> >
> > A partial pdf of the ms macros was generated, but stops the line
> > before the source reads "groff ms macros require groff extensions;
> > aborting" -- so that is surely significant. But when I looked at the
> > files generated for monochrome and for another highlight style
> > ("kate"), I couldn't see where the error might be. I've attached both
> > - can anyone spot why groff may be failing *as if* groff were not
> > being used in the groff -ms macros?
>
> I was able to reproduce your problem with the following command.
>
> $ groff -ms -Tpdf ms-monochrome.ms >| m.pdf
> groff ms macros require groff extensions; aborting
>
> The first thing I did was to enable warnings. I encourage everyone to
> do this. :)
>
> $ groff -ww -ms -Tpdf ms-monochrome.ms -z
> troff: ms-monochrome.ms:126: warning: macro '.' not defined
> troff: ms-monochrome.ms:127: warning: missing closing delimiter
> groff ms macros require groff extensions; aborting
>
> This alone is enough to make me suspect a usage problem.
>
> Here's what these lines look like.
>
> $ sed -n '126,127p' ms-monochrome.ms
> \*[FunctionTok ".."]
> \*[FunctionTok ".if"]\*[StringTok " !"]\*[OtherTok "\[rs]n(.g"]\*[StringTok " "]\*[CharTok "\[rs]"]
>
> Okay, that looks moderately hairy. Whatever generated this is using
> groff parameterized strings whose arguments are themselves exhibits of
> groff syntax. This strikes me as a good way to get into trouble; some
> care will be required.
>
> Evidently the "kate" and "monochrome" files define the `FunctionTok`
> string differently. How?
>
> $ grep 'ds FunctionTok' *.ms
> ms-kate.ms:.ds FunctionTok \\m[644a9b]\\$1\\m[]
> ms-monochrome.ms:.ds FunctionTok \\$1
>
> So "monochrome", which fails, simply returns the argument unprocessed,
> whereas "kate" brackets it with color escapes.
>
> Suddenly this looks like a familiar problem. I play a hunch.
>
> $ groff -ww -ms -Tpdf ms-monochrome-gbr.ms -z
> troff: ms-monochrome-gbr.ms:142: warning: macro 'CH' not defined
> troff: ms-monochrome-gbr.ms:2252: warning: macro 'pdfsync' not defined
>
> We got further. What did I do?
>
> $ diff -u ms-monochrome.ms ms-monochrome-gbr.ms
> --- ms-monochrome.ms 2022-07-08 19:58:25.396932753 -0500
> +++ ms-monochrome-gbr.ms 2022-07-08 20:13:07.088864226 -0500
> @@ -18,9 +18,9 @@
> .ds BaseNTok \\$1
> .ds FloatTok \\$1
> .ds ConstantTok \\$1
> -.ds CharTok \\$1
> +.ds CharTok \&\\$1
> .ds SpecialCharTok \\$1
> -.ds StringTok \\$1
> +.ds StringTok \&\\$1
> .ds VerbatimStringTok \\$1
> .ds SpecialStringTok \\$1
> .ds ImportTok \\$1
> @@ -28,8 +28,8 @@
> .ds DocumentationTok \\f[CI]\\$1\\f[C]
> .ds AnnotationTok \\f[CI]\\$1\\f[C]
> .ds CommentVarTok \\f[CI]\\$1\\f[C]
> -.ds OtherTok \\$1
> -.ds FunctionTok \\$1
> +.ds OtherTok \&\\$1
> +.ds FunctionTok \&\\$1
> .ds VariableTok \\$1
> .ds ControlFlowTok \\f[CB]\\$1\\f[C]
> .ds OperatorTok \\$1
>
> I prefixed each of these string definitions with the non-printing input
> break escape sequence[1] so that when these strings are interpolated at
> the beginning of an input line (or as the consequent of a conditional
> request like `if`, `el`, `while`, or `nop`), they won't be
> misinterpreted if they start with a control character.
>
> The remaining problems revealed by -ww appear to me to be unrelated.
> Some stuff like ".rm CH" seems dubious. It is better to give that
> string an empty definition than to remove it. I see use of non-groff
> macros like `pdfsync`. I assume they supply a definition of the last
> somewhere.
>
> Is pandoc responsible for generating this output? If so it sounds
> like a bug report is in order. They _really_ need to fix those string
> definitions to start with `\&` unconditionally. Those that already
> start with `\f` or `\m` will escape the problem, but it's harmless and
> might be a less intrusive change for their groff code generator to start
> all of them with `\&` regardless.
>
> Also, I think the generator should declare itself in a preamble comment.
>
> If any pandoc hackers are monitoring this list, please speak up. We can
> help.
>
> Regards,
> Branden
>
> [1] Ouch! Who has a voodoo doll of me?
>
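
For anyone who wants to patch an already-generated file in the meantime,
here is a minimal sketch of the change Branden describes above, written as
two shell filters. It assumes the string definitions look like the ones in
his diff (`.ds <Name>Tok <body>`, one per line). The function names
`find_unguarded_tok` and `guard_tok_strings` are made up for illustration;
they are not part of pandoc or groff.

```shell
# List (with line numbers) the token string definitions whose body begins
# with the raw argument \\$1. These are the ones that can be misread when
# the argument itself starts with a control character.
# Reads the named files, or standard input if none are given.
find_unguarded_tok() {
    grep -n '^\.ds [A-Za-z]*Tok \\\\\$1' "$@"
}

# Prefix the body of every token string definition with \& unless it
# already starts with one. Reads ms source on standard input and writes
# the patched source to standard output.
guard_tok_strings() {
    sed '/^\.ds [A-Za-z]*Tok \\&/!s/^\(\.ds [A-Za-z]*Tok\) /\1 \\\&/'
}
```

Something like `guard_tok_strings < ms-monochrome.ms > ms-monochrome-fixed.ms`
(the output name is only an example) should then give a file that can be
checked again with `groff -ww -ms -Tpdf ... -z`. The second filter guards
every definition rather than only the four in the diff, following the
suggestion that the prefix is harmless on strings that already begin with
`\f` or `\m`.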
--
Robert Goulding
Director, John J. Reilly Center for Science, Technology, and Values;
Director, Program in History and Philosophy of Science;
Assoc. Professor, Program of Liberal Studies,
Fellow, Medieval Institute,
University of Notre Dame.