Markdown

An itemised list of the esoteric difficulties involved in bullet points



A simple, rustic markup

I would like to write up my research using markdown, because this means I can produce a web page or a journal article without wading through the varied and depressing markup sludges that each of these necessitates on its own.

Well, writing it in markdown is a vexing alternative to such sludge that nearly works! Which is more than most things do, so I recommend it despite the vague miasma of pragmatic compromise that hangs over it, as the alternative is an uncompromising choice of dire crapbaskets.

This is notionally a general markdown page, but the standard tooling cleaves ever more closely to pandoc, and pandoc is converging on the CommonMark standard. Ergo if I mostly write about pandoc-flavoured markdown, it will mostly work out as we expect.

pandoc

As close as we get to a reference markdown implementation.

installing pandoc

I install pandoc via homebrew. If you are using RStudio, you already have an installation inside RStudio. You can access that installation by letting your shell know about the path to it. On macOS this looks like

export PATH=$PATH:/Applications/RStudio.app/Contents/MacOS/pandoc

conda, the Python package manager, will obediently install it also. The default version was ancient last time I checked, though. Consider using the conda-forge version.

conda install -c conda-forge pandoc

You can also install it via, e.g., a Linux package manager, but I do not recommend this, as it tends to be an even more elderly version and the improvements in recent pandoc versions are great. You could also compile it from source, but this is laborious because it is written in Haskell, a semi-obscure language with hefty installation requirements of its own. There are probably other options, but I don’t know them.

pandoc tricks

John MacFarlane’s pandoc tricks are the canonical tricks, as John MacFarlane is the boss of pandoc, which is nearly the same as being the boss of markdown.

Document metadata

Use YAML blocks.
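A minimal sketch of what such a block looks like at the top of a document, and of how one might peel it off in Python. The field values are made up for illustration and `split_yaml_block` is a hypothetical helper, not part of pandoc. It only handles a block at the very start of the file, whereas pandoc itself also accepts YAML blocks further down.

```python
DOC = """\
---
title: An itemised list of esoteric difficulties
author: A. N. Author
date: 2020-01-01
---

# A simple, rustic markup

Body text goes here.
"""


def split_yaml_block(text):
    """Return (yaml_text, body) for a document that opens with a YAML block.

    The block starts with a line of three hyphens and ends with a line of
    three hyphens or three dots. If there is no such block, yaml_text is "".
    """
    lines = text.splitlines(keepends=True)
    if not lines or lines[0].strip() != "---":
        return "", text
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() in ("---", "..."):
            return "".join(lines[1:i]), "".join(lines[i + 1:])
    return "", text  # opening fence never closed: treat it all as body


yaml_text, body = split_yaml_block(DOC)
print(yaml_text)
```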

Headers and macros

You want fancy mathematical macros, or a latex preamble? Something more elaborate still?

Modify a template to include a custom preamble, e.g. for a custom document type. Here’s how you change that globally:

pandoc -D latex > ~/.pandoc/templates/default.latex

Or locally:

pandoc -D latex > template.latex
pandoc --template=template.latex …

If you only want some basic macros, a document type alteration is probably overkill. Simply prepend a header file:

pandoc -H _macros.tex chapter_splitting.md -o chapter_splitting.pdf

NB: pandoc will expand basic LaTeX macros even in HTML output, all by itself.
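For illustration, a header file such as the `_macros.tex` named above might hold nothing more than a few definitions like these. The contents are hypothetical, and they assume the template already loads `amsmath` and `amssymb`, as the default pandoc LaTeX template does.

```latex
% _macros.tex: hypothetical contents, for illustration only
\newcommand{\R}{\mathbb{R}}                          % the reals
\newcommand{\E}{\mathbb{E}}                          % expectation
\newcommand{\norm}[1]{\left\lVert #1 \right\rVert}   % a norm with scaling bars
```

A file passed with `-H` is pasted into the LaTeX header verbatim, so the expansion into HTML mentioned above presumably applies to macros defined in the markdown source itself rather than to ones in such a file.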

There are many other pandoc template tricks.

Cross references and citations

As discussed also in my citation guide, I use pandoc-citeproc. See also the relevant bit of the pandoc manual.

Cross references are supported by pandoc-crossref or some combination of pandoc-fignos, pandoc-eqnos etc.

You invoke that with the following flags (order important):

pandoc -F pandoc-crossref -F pandoc-citeproc file.md -o file.html

The resulting syntax is

\[ x^2 \] {#eq:label}

for labels and, for references,

@fig:label
@eq:label
@tbl:label

or

[@fig:label1;@fig:label2;…]
[@eq:label1;@eq:label2;…]
[@tbl:label1;@tbl:label2;…]

etc.

Annoyingly, RMarkdown, while still using pandoc AFAICT, does this slightly differently:

See equation \@ref(eq:linear)

\begin{equation}
a + bx = c  (\#eq:linear)
\end{equation}
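If a document needs to move from the RMarkdown convention to the pandoc-crossref one, the references at least can be rewritten mechanically. What follows is a rough sketch rather than a complete converter: `to_crossref` is a hypothetical helper, the prefix table assumes that only the table prefix differs between the two systems (`tab` versus `tbl`), and label definitions such as `(\#eq:linear)` are left alone.

```python
import re

# bookdown-style label prefixes mapped to pandoc-crossref ones.
# Assumption: only the table prefix differs (tab vs tbl).
PREFIXES = {"fig": "fig", "eq": "eq", "tab": "tbl"}

BOOKDOWN_REF = re.compile(r"\\@ref\((\w+):([\w-]+)\)")


def to_crossref(text):
    """Rewrite \\@ref(kind:label) references as @kind:label."""
    def swap(match):
        kind, label = match.group(1), match.group(2)
        if kind not in PREFIXES:
            return match.group(0)  # unknown prefix: leave untouched
        return "@{}:{}".format(PREFIXES[kind], label)
    return BOOKDOWN_REF.sub(swap, text)


print(to_crossref(r"See equation \@ref(eq:linear)"))
```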

Citations can either be rendered by pandoc itself or passed through to some BibTeX nightmare, if you feel that diacritics and other non-English typography are an insidious plot by malevolent agencies.

Citekeys per default look like BibTeX, and indeed BibTeX citations seem to pass through.

\cite{heyns_foo_2014,heyns_bar_2015}

They are rendered in the output by pandoc-citeproc, a pandoc filter which is installed separately.

The preferred pandoc-citeproc format seems to be something with an @ sign and/or occasional square brackets:

Blah blah [see @heyns_foo_2014, pp. 33-35; also @heyns_bar_2015, ch. 1].
But @heyns_baz_2016 says different things again.
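To check which keys a draft actually cites, a regular expression gets most of the way there. This is a sketch only: the pattern is a rough approximation of pandoc's citekey grammar, not the real thing, `citekeys` is a hypothetical helper, and it will also pick up pandoc-crossref references such as `@fig:label`.

```python
import re

# Rough approximation of a pandoc citekey: "@" followed by letters, digits
# and underscores, with single internal ":", "." or "-" allowed.
# The lookbehind skips things like email addresses.
CITEKEY = re.compile(r"(?<![\w.])@([A-Za-z0-9_]+(?:[:.\-][A-Za-z0-9_]+)*)")


def citekeys(markdown_text):
    """Return the distinct citekeys in order of first appearance."""
    seen = []
    for key in CITEKEY.findall(markdown_text):
        if key not in seen:
            seen.append(key)
    return seen


TEXT = """Blah blah [see @heyns_foo_2014, pp. 33-35; also @heyns_bar_2015, ch. 1].
But @heyns_baz_2016 says different things again."""

print(citekeys(TEXT))
```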

This is how you output it.

# Using the CSL transform

pandoc -F pandoc-citeproc --csl=apa.csl --bibliography=bibliography.bib \
    -o document.pdf document.md
# or using biblatex and the traditionalist workflow.

pandoc --biblatex --bibliography=bibliography.bib \
    -o document.tex document.md
latexmk document
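Since the order of the filters matters, it can be convenient to assemble the command in one place rather than retyping it. A minimal sketch, assuming the same flags as the commands above: `pandoc_argv` is a hypothetical helper that only builds the argument list and does not run anything.

```python
import shlex


def pandoc_argv(source, output, bibliography, csl=None, crossref=False):
    """Assemble a pandoc command line as a list of arguments.

    pandoc-crossref, when requested, is placed before pandoc-citeproc,
    since filter order matters.
    """
    argv = ["pandoc"]
    if crossref:
        argv += ["-F", "pandoc-crossref"]
    argv += ["-F", "pandoc-citeproc"]
    if csl is not None:
        argv.append("--csl=" + csl)
    argv += ["--bibliography=" + bibliography, "-o", output, source]
    return argv


argv = pandoc_argv("document.md", "document.pdf", "bibliography.bib",
                   csl="apa.csl", crossref=True)
# Print the command in a form that can be pasted into a shell.
print(" ".join(shlex.quote(arg) for arg in argv))
```

The list can be handed to `subprocess.run(argv, check=True)` to actually run the conversion. Recent pandoc releases fold citeproc into pandoc itself behind a `--citeproc` flag, so on those versions the `-F pandoc-citeproc` pair would presumably be swapped for that.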

If you want your reference section numbered, you need some magic:

## References

::: {#refs}
:::

Aside: CSL is close to being good for use on websites, but has a flaw: it does not support links, in the sense that there is no general way in the standard to tell a CSL renderer where to put links. There is a hack that may support your use case, although it is not ideal for mine. This is not the same as saying links are impossible; rather, it means that if you want something different you need to write your own CSL processor with some idiosyncratic URL handling built in, which presupposes that you have access to the source code of whatever tool you use and would like to spend time maintaining a fork of it. Fundamentally, the creators of this tool imagine that we are only using it for writing stuff to be printed out on paper.

Tables

Too many types. I usually find the pipe tables easiest since they don’t need me to align text. They look like

| Right | Left | Default | Center |
|------:|:-----|---------|:------:|
|   12  |  12  |    12   |    12  |
|  123  |  123 |   123   |   123  |
|    1  |    1 |     1   |     1  |
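Because nothing has to line up, pipe tables are also easy to generate from data. A small sketch, with `pipe_table` as a hypothetical helper; it does not escape `|` characters inside cells, which a real version would need to do.

```python
def pipe_table(headers, rows, aligns=None):
    """Render rows as a pipe table.

    aligns holds one of "left", "right", "center" or "default" per column.
    Cells are not padded, since pipe tables do not need aligned source text.
    """
    rules = {"left": ":---", "right": "---:", "center": ":---:", "default": "---"}
    if aligns is None:
        aligns = ["default"] * len(headers)

    def line(cells):
        return "| " + " | ".join(str(cell) for cell in cells) + " |"

    # The second line carries the alignment markers for each column.
    out = [line(headers), "|" + "|".join(rules[a] for a in aligns) + "|"]
    out.extend(line(row) for row in rows)
    return "\n".join(out)


print(pipe_table(
    ["Right", "Left", "Default", "Center"],
    [[12, 12, 12, 12], [123, 123, 123, 123], [1, 1, 1, 1]],
    aligns=["right", "left", "default", "center"],
))
```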

Figures, algorithms, etc

panflute-filters is a bunch of useful filters:

pandoc-figures: figures with captions and backmatter support
pandoc-tables: tables with captions, backmatter support, csv support
pandoc-algorithms: support for tex algorithm packages
pandoc-tex: replace arbitrary tex templates

Write your own filters

The scripting API includes Haskell, an embedded Lua interpreter, SDKs for other languages, and probably a free massage voucher. The intermediate representation can be serialised to JSON, so if you are especially passionate about some other language, e.g. Python, you can use anything that handles JSON, or any text data processing trick.
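A minimal sketch of the JSON route in Python, using only the standard library. The `shout` action and the file name `shout.py` below are made up for illustration; a real filter would more likely lean on a helper library such as panflute than walk the tree by hand.

```python
#!/usr/bin/env python3
import json
import sys


def walk(node, action):
    """Apply action to every element dict in a pandoc JSON AST, bottom-up."""
    if isinstance(node, list):
        return [walk(item, action) for item in node]
    if isinstance(node, dict):
        node = {key: walk(value, action) for key, value in node.items()}
        if "t" in node:  # AST elements carry their type under the "t" key
            return action(node)
        return node
    return node


def shout(element):
    """Example action: upper-case the text of every Str element."""
    if element["t"] == "Str":
        return {"t": "Str", "c": element["c"].upper()}
    return element


def run_filter(stream_in, stream_out):
    """Read a JSON document, transform its blocks, write it back out."""
    document = json.load(stream_in)
    document["blocks"] = walk(document["blocks"], shout)
    json.dump(document, stream_out)


# pandoc calls a JSON filter with the output format as its first argument,
# so only act as a filter when such an argument is present.
if __name__ == "__main__" and len(sys.argv) > 1:
    run_filter(sys.stdin, sys.stdout)
```

Saved as, say, `shout.py` and made executable, it would be invoked like any other filter: `pandoc -F ./shout.py file.md -o file.html`.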

Converting to markdown

When in doubt, use pandoc. Here are some special cases.

in-browser

Both these tools have strengths and weaknesses, so I keep them both open in browser tabs all the time.

Deploying at home?

There is also a bookmarklet that just markdownifies links:

Clipboard to markdown

Mostly, the trick of remembering the flags for markdown.

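# Read LaTeX from the clipboard, convert it to markdown, write it back.
# NB: newer pandoc versions spell --atx-headers as --markdown-headings=atx.
xclip -out -selection clipboard |
  pandoc -f latex -t markdown+tex_math_single_backslash \
    --atx-headers |
  xclip -selection clipboard &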

reStructuredText to Markdown

There are also reST-specific converters which circumvent some of pandoc’s limitations: A python option leveraging the reST infrastructure is rst_to_md:

This writer lets you convert reStructuredText documents to Markdown with Docutils. The package includes a writer and translator along with a command-line tool for doing conversions.

pip install git+https://github.com/sixty-north/rst_to_md
rst-to-md module_1.rst > chapter_1.md

It was missing some needful things, e.g. math markup support. Nonetheless, rst_to_md has the right approach. As evidence, I added math support in my own fork. It took 15 minutes.

Too simple? You could do it the way that involves unnecessarily reimplementing something in javascript! rst2mdown is restructuredtext for node.js. I will not be trying this for myself.

Markdown editors

There are many.

Academic markdown kits

Long form

Books and theses.

Tom Pollard’s PhD thesis shows you how to plug all these bits together. Mat Lipson’s fork makes this work for my university, UNSW Sydney. Chester Ismay’s Thesisdown does it for RMarkdown, which was adapted for UNSW by James Goldie.

See also periodicpoint/arabica: A sound and versatile pandoc LaTeX boilerplate to produce academic books using Markdown files featuring YAML, KOMA-Script, BibLaTeX and CSL

Academic papers

Manubot is a workflow and set of tools for the next generation of scholarly publishing. Write your manuscript in markdown, track it with git, automatically convert it to .html, .pdf, or .docx, and deploy it to your destination of choice.

Instructions here.

Simpler, periodicpoint/robusta: A sound and versatile pandoc LaTeX boilerplate to produce academic articles using Markdown files featuring YAML, KOMA-Script, BibLaTeX and CSL

