.nr PS 10
.nr GROWPS 3
.DA 2017-09-05
.TL
lowdown \(em simple markdown translator
.AU
Kristaps Dzonsons
.SH 1
lowdown \(em simple markdown translator
.LP
\fIlowdown\fP is a Markdown translator producing HTML5 and \fIroff\fP documents
in the \fBms\fP and \fBman\fP formats.  It doesn\(cqt require XSLT, Python, or
external libraries \(en it\(cqs just clean, secure, 
.pdfhref W -D http://opensource.org/licenses/ISC open source
C code with no dependencies.
Its canonical documentation is the 
.pdfhref W -D lowdown.1.html lowdown(1)
manpage
with the library interface at 
.pdfhref W -A "." -D lowdown.3.html lowdown(3)
.LP
\fIlowdown\fP started as a fork of
.pdfhref W -D https://github.com/hoedown/hoedown hoedown
to add sandboxing
.pdfhref W -P "(" -A "," -D http://man.openbsd.org/pledge pledge(2)
.pdfhref W -A "," -D https://www.freebsd.org/cgi/man.cgi?query=capsicum&sektion=4 capsicum(4)
or
.pdfhref W -A ")" -D https://developer.apple.com/legacy/library/documentation/Darwin/Reference/ManPages/man3/sandbox_init.3.html sandbox_init(3)
and \fIroff\fP output to securely generate PDFs on
.pdfhref W -D http://www.openbsd.org OpenBSD
with just
.pdfhref W -A "." -D http://man.openbsd.org/mandoc mandoc(1)
.LP
Want an example?  For starters: this page, 
.pdfhref W -A "." -D index.md index.md
The
Markdown input is rendered as an HTML5 fragment using \fIlowdown\fP, then
processed further using
.pdfhref W -A "." -D https://kristaps.bsd.lv/sblg sblg
You
can also see it as 
.pdfhref W -A "," -D index.pdf index.pdf
generated by
.pdfhref W -D https://www.gnu.org/s/groff/ groff(1)
from \fBms\fP output.  Another
example is the GitHub 
.pdfhref W -D README.md README.md
rendered as
.pdfhref W -D README.html README.html
or 
.pdfhref W -A "." -D README.pdf README.pdf
.LP
To get \fIlowdown\fP, just 
.pdfhref W -A "," -D snapshots/lowdown.tar.gz download
.pdfhref W -A "," -D snapshots/lowdown.tar.gz.sha512 verify
unpack, run \f[CR]./configure\fR,
then run \f[CR]doas make install\fR (or use \f[CR]sudo\fR).  \fIlowdown\fP is a
.pdfhref W -D https://bsd.lv BSD.lv
project.
.pdfhref W -D https://brew.sh Homebrew
users can use BSD.lv\(cqs
.pdfhref W -A "." -D https://github.com/kristapsdz/homebrew-repo tap
.LP
If you can help it, however,
don\(cqt use Markdown.  Why? Read 
.pdfhref W -A "" -D https://undeadly.org/cgi?action=article&sid=20170304230520 Ingo\(cqs comments on Markdown
for a good explanation.
.SH 2
Output
.LP
Of course, \fIlowdown\fP supports the usual HTML output. Specifically, it
produces HTML5 in XML mode.  You can use \fIlowdown\fP to create either a
snippet or standalone HTML5 document.
.LP
It also supports outputting to the \fBms\fP macros, originally
implemented for the \fIroff\fP typesetting package of Version 7 AT&T UNIX.
This way, you can have elegant PDF and PS output by using any modern
\fItroff\fP system such as 
.pdfhref W -A "." -D https://www.gnu.org/s/groff groff(1)
.LP
Furthermore, it supports the \fBman\fP macros, also from Version 7
AT&T UNIX.  Beyond the usual \fItroff\fP systems, this is also supported by
.pdfhref W -A "." -D https://mdocml.bsd.lv mandoc
.LP
You may be tempted to write 
.pdfhref W -D https://man.openbsd.org manpages
in
Markdown, but please don\(cqt: use 
.pdfhref W -A "," -D https://man.openbsd.org/mdoc mdoc(7)
instead \(em it\(cqs built for that purpose!  The \fBman\fP output is for
technical documentation only (section 7).
.LP
Both the \fBms\fP and \fBman\fP output modes disallow images and
equations.  Images are excluded by definition (although \fBms\fP might have a
future with some elbow grease); equations are excluded because of the
(not insurmountable) complexity of converting LaTeX to
.pdfhref W -A "." -D https://man.openbsd.org/eqn eqn(7)
.LP
You can control output features by using the \fB-D\fP (disable feature)
and \fB-E\fP (enable feature) flags documented in
.pdfhref W -A "." -D lowdown.1.html lowdown.1.html
.SH 2
Input
.LP
Beyond the basic Markdown syntax support, \fIlowdown\fP supports the
following Markdown features and extensions:
.RS
.IP \(bu
autolinking
.IP \(bu
fenced code
.IP \(bu
tables
.IP \(bu
superscripts
.IP \(bu
footnotes
.IP \(bu
disabled inline HTML
.IP \(bu
\(lqsmartypants\(rq
.IP \(bu
metadata
.RE
.LP
You can control which parser features are used by using the \fB-d\fP
(disable feature) and \fB-e\fP (enable feature) flags documented in
.pdfhref W -A "." -D lowdown.1.html lowdown.1.html
.SH 2
Examples
.LP
I usually use \fIlowdown\fP to write
.pdfhref W -D https://kristaps.bsd.lv/sblg sblg
articles when I\(cqm too lazy to
write in proper HTML5.
(For those not in the know, 
.pdfhref W -D https://kristaps.bsd.lv/sblg sblg
is a
simple tool for knitting together blog articles into a blog feed.)
This basically means wrapping the output of \fIlowdown\fP in the elements
indicating a blog article.
I do this in my Makefiles:
.DS
.ft CR
\&.md.xml:
     ( echo "<?xml version=\e"1.0\e" encoding=\e"UTF-8\e" ?>" ; \e
       echo "<article data-sblg-article=\e"1\e">" ; \e
       echo "<header>" ; \e
       echo "<h1>" ; \e
       lowdown -X title $< ; \e
       echo "</h1>" ; \e
       echo "<aside>" ; \e
       lowdown -X htmlaside $< ; \e
       echo "</aside>" ; \e
       echo "</header>" ; \e
       lowdown $< ; \e
       echo "</article>" ; ) >$@
.ft
.DE
.LP
If you just want a straight-up HTML5 file, use standalone mode:
.DS
.ft CR
lowdown -s -o README.html README.md
.ft
.DE
.LP
Standalone mode can use the document\(cqs metadata to populate the title,
CSS file, and so on.
.LP
The troff output modes work well to make PS or PDF files, although they
will omit graphics and equations.
PIC support may be added later, but even then it
will only cover specific types of graphics.
The extra groff arguments in the following invocation are for UTF-8
processing (\fB-k\fP and \fB-Dutf8\fP), tables (\fB-t\fP), and clickable links
(\fB-mpdfmark\fP).
.DS
.ft CR
lowdown -s -Tms README.md | \e
  groff -k -Dutf8 -t -ms -mpdfmark > README.ps
.ft
.DE
.LP
On OpenBSD or other BSD systems, you can run \fIlowdown\fP within the base
system to produce PDF or PS files via 
.pdfhref W -A ":" -D http://mdocml.bsd.lv mandoc
.DS
.ft CR
lowdown -s -Tman README.md | mandoc -Tpdf > README.pdf
.ft
.DE
.LP
Read 
.pdfhref W -D lowdown.1.html lowdown(1)
for details on running the system.
.SH 2
Library
.LP
\fIlowdown\fP is also available as a library, 
.pdfhref W -A "." -D lowdown.3.html lowdown(3)
This effectively wraps around everything invoked by
.pdfhref W -A "," -D lowdown.1.html lowdown(1)
so it\(cqs basically the same but... a
library.
.SH 2
Testing
.LP
The canonical Markdown test, such as that found in the original
.pdfhref W -D https://github.com/hoedown/hoedown hoedown
sources, will not
currently work with \fIlowdown\fP because of the mandatory \(lqsmartypants\(rq and
other extensions.
.LP
I\(cqve extensively run 
.pdfhref W -D http://lcamtuf.coredump.cx/afl/ AFL
against the
compiled sources with no failures \(em definitely a credit to
the 
.pdfhref W -D https://github.com/hoedown/hoedown hoedown
authors (and those
from whom they forked their own sources).  I regularly run the system
through 
.pdfhref W -A "," -D http://valgrind.org/ valgrind
also without issue.
.LP
\fIlowdown\fP has a 
.pdfhref W -A "" -D https://scan.coverity.com/projects/lowdown Coverity
registration for static analysis.
.SH 2
Hacking
.LP
Want to hack on \fIlowdown\fP?  Of course you do.  (Or maybe you should
focus on better PS and PDF output for
.pdfhref W -A ".)" -D http://mdocml.bsd.lv mandoc(1)
.LP
First, start in
.pdfhref W -A "." -D https://github.com/kristapsdz/lowdown/blob/master/library.c library.c
(The 
.pdfhref W -A "" -D https://github.com/kristapsdz/lowdown/blob/master/main.c main.c
file is just a caller to the library interface.)
Both the renderer (which renders the parsed document contents in the
output format) and the document (which generates the parse AST) are
initialised there.
.LP
The parse is started in
.pdfhref W -A "." -D https://github.com/kristapsdz/lowdown/blob/master/document.c document.c
It is preceded by meta-data parsing, if applicable, which occurs before
document parsing but after the BOM.
The document is parsed into an AST (abstract syntax tree) that describes
the document as a tree of nodes, each node corresponding to an input token.
Once the entire tree has been generated, the AST is passed into the
front-end renderers, which construct output depth-first.
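.LP
The depth-first construction can be pictured with a minimal sketch in C.
It is an illustration only, not the code in \fIlowdown\fP: the
\f[CR]struct node\fR layout, the \f[CR]N_*\fR node types, and the
\f[CR]render\fR and \f[CR]buf_puts\fR functions are all hypothetical
names, the buffer is a fixed-size string assumed to be non-empty in
size, and text is copied without any escaping.
.DS
.ft CR
#include <stdio.h>
#include <string.h>

/* Hypothetical node types; the real parser has many more. */
enum ntype { N_ROOT, N_HEADER, N_STRONG, N_TEXT };

struct node {
    enum ntype   type;
    const char  *text;   /* set for N_TEXT only */
    struct node *child;  /* first child, or NULL */
    struct node *next;   /* next sibling, or NULL */
};

/* Append s to the string in buf (size sz > 0), truncating if full. */
static void
buf_puts(char *buf, size_t sz, const char *s)
{
    size_t len = strlen(buf);

    if (len < sz - 1)
        snprintf(buf + len, sz - len, "%s", s);
}

/* Render n and its following siblings, children first. */
void
render(const struct node *n, char *buf, size_t sz)
{
    for ( ; n != NULL; n = n->next) {
        switch (n->type) {
        case N_TEXT:
            buf_puts(buf, sz, n->text);
            break;
        case N_HEADER:
            buf_puts(buf, sz, "<h2>");
            render(n->child, buf, sz);
            buf_puts(buf, sz, "</h2>");
            break;
        case N_STRONG:
            buf_puts(buf, sz, "<strong>");
            render(n->child, buf, sz);
            buf_puts(buf, sz, "</strong>");
            break;
        default:
            /* Containers such as the root add nothing themselves. */
            render(n->child, buf, sz);
            break;
        }
    }
}
.ft
.DE
.LP
Calling \f[CR]render\fR on the root of a tree with an empty buffer
walks each node once, so the output of a child always lands between
the opening and closing markup of its parent.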
.LP
There are three renderers supported:
.pdfhref W -D https://github.com/kristapsdz/lowdown/blob/master/html.c html.c
for
HTML5 output,
.pdfhref W -D https://github.com/kristapsdz/lowdown/blob/master/nroff.c nroff.c
for
\fB-ms\fP and \fB-man\fP output,
and a debugging renderer
.pdfhref W -A "." -D https://github.com/kristapsdz/lowdown/blob/master/tree.c tree.c
.LP
A note on \(lqreal text\(rq.
.LP
The only times that input is passed directly into the output renderer are
when the \f[CR]normal_text\fR callback is invoked and for blockcode, codespan,
raw HTML, or hyperlink components.  In both the HTML5 and roff renderers,
you can see how the input is properly escaped by passing it into
.pdfhref W -A "." -D https://github.com/kristapsdz/lowdown/blob/master/escape.c escape.c
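.LP
To illustrate the kind of escaping meant here, the following is a
minimal sketch in C for the HTML5 case.
It is not the interface of the real escaping code: the names
\f[CR]esc_html\fR and \f[CR]entity\fR are hypothetical, the output goes
into a caller-supplied buffer, and only four characters are handled.
A roff counterpart would instead have to guard backslashes and control
characters at the start of a line.
.DS
.ft CR
#include <stddef.h>
#include <string.h>

/* Return the HTML entity for c, or NULL if c needs no escaping. */
static const char *
entity(char c)
{
    switch (c) {
    case '<':
        return "&lt;";
    case '>':
        return "&gt;";
    case '&':
        return "&amp;";
    case '"':
        return "&quot;";
    default:
        return NULL;
    }
}

/*
 * Copy src into dst (of size sz), escaping as it goes.
 * Returns 0 on success or -1 if dst is too small.
 */
int
esc_html(char *dst, size_t sz, const char *src)
{
    size_t      used = 0, len;
    const char *ent;

    if (sz == 0)
        return -1;
    for ( ; *src != 0; src++) {
        ent = entity(*src);
        len = ent != NULL ? strlen(ent) : 1;
        /* Always leave room for the terminating NUL. */
        if (used + len >= sz)
            return -1;
        if (ent != NULL)
            memcpy(dst + used, ent, len);
        else
            dst[used] = *src;
        used += len;
    }
    dst[used] = 0;
    return 0;
}
.ft
.DE
.LP
Keeping the escaping in one place like this means each renderer
callback that receives raw input needs only a single call before
appending to its output.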
.LP
After the document has been fully rendered into an output buffer, that
buffer is passed through a \(lqsmartypants\(rq pass; there is one for
each renderer type.
.SH 3
Example
.LP
For example, consider the following:
.DS
.ft CR
## Hello **world**
.ft
.DE
.LP
First, the outer block (the subsection) would begin parsing.  The parser
would then step into the subcomponent: the header contents.  It would
then render the subcomponents in order: first the regular text \(lqHello\(rq,
then a bold section.  The bold section would be its own subcomponent
with its own regular text child, \(lqworld\(rq.
.LP
When run through the \fB-Ttree\fP output, it would generate:
.DS
.ft CR
LOWDOWN_ROOT
  LOWDOWN_DOC_HEADER
  LOWDOWN_HEADER
    LOWDOWN_NORMAL_TEXT
      data: 6 Bytes: Hello 
    LOWDOWN_DOUBLE_EMPHASIS
      LOWDOWN_NORMAL_TEXT
        data: 5 Bytes: world
  LOWDOWN_DOC_FOOTER
.ft
.DE
.LP
This tree would then be passed into a front-end, such as the HTML5
front-end with \fB-Thtml\fP.  The nodes would be appended into a buffer,
which would then be passed back into the subsection parser.  It would
paste the buffer into \f[CR]<h2>\fR blocks (in HTML5) or a \f[CR].SH\fR block (troff
outputs).
.LP
Finally, the subsection block would be fitted into whatever context it
was invoked within.
.SH 2
Known Issues (or, How You Can Help)
.LP
There are some known issues, mostly in PDF (\fB-Tms\fP and \fB-Tman\fP)
output.
.LP
Foremost, there needs to be a font modifier stack, as this feature is
not supported directly in the roff language.
For example, if one writes *foo **bar** baz*, the output will be
confused because this translates to \efIfoo \efBbar\efP baz\efP, and
the \efP escape only recalls the single previous font.
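.LP
One possible shape for such a font stack is sketched below in C.
This is an assumption about how it might be done, not existing
\fIlowdown\fP code: \f[CR]struct fontstack\fR, \f[CR]font_push\fR,
\f[CR]font_pop\fR, and \f[CR]font_current\fR are hypothetical names.
The idea is to remember every open font change and always select an
absolute font (R, I, B, or BI) instead of relying on the roff
\(lqprevious font\(rq escape.
.DS
.ft CR
#include <stddef.h>

#define FONT_MAX 32

enum font { FONT_ITALIC, FONT_BOLD };

/* Hypothetical stack of the font changes currently open. */
struct fontstack {
    enum font stack[FONT_MAX];
    size_t    depth;
};

/* Name of the roff font combining everything on the stack. */
static const char *
font_current(const struct fontstack *fs)
{
    int    bold = 0, italic = 0;
    size_t i;

    for (i = 0; i < fs->depth; i++) {
        if (fs->stack[i] == FONT_BOLD)
            bold = 1;
        else
            italic = 1;
    }
    if (bold && italic)
        return "BI";
    if (bold)
        return "B";
    return italic ? "I" : "R";
}

/* Open a font change; returns the font to select, NULL if full. */
const char *
font_push(struct fontstack *fs, enum font f)
{
    if (fs->depth == FONT_MAX)
        return NULL;
    fs->stack[fs->depth++] = f;
    return font_current(fs);
}

/* Close the innermost font change; returns the font to restore. */
const char *
font_pop(struct fontstack *fs)
{
    if (fs->depth > 0)
        fs->depth--;
    return font_current(fs);
}
.ft
.DE
.LP
For *foo **bar** baz*, a renderer using this sketch would select I,
then BI, then I again, and finally R, emitting each name in an
absolute font escape rather than a \(lqprevious font\(rq one.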
.LP
Second, there needs to be logic to handle when a link is the first or
last component of a font change.  For example, *[foo](...)* will put
the font markers on different lines.
.LP
In all modes, the \(lqsmartypants\(rq formatting should be embedded in
document output \(em not in a separate step as implemented in the
original sources.
.LP
Lastly, I\(cqd like a full reference of the Markdown language accepted as a
manpage.  Markdown is incredibly inconsistent, so a simple, readable
document would be very helpful.