knitr/RMarkdown etc

February 2, 2020 — April 30, 2022

knitr is the R-based entrant in the scientific workbook race, combining the code that creates your data analysis with the text, keeping them in sync in perpetuity.

Figure 1

It is frequently used in the form of RMarkdown, which takes markdown as the input format instead of e.g. LaTeX. I most often use it in the form of blogdown, which is the engine that drives this blog. There are several pieces in this toolchain with a complicated relationship, but the user can ignore most of this complexity. The result is such fanciness as automatic rendering and caching of graphs, an interactive notebook UI, and nearly first-class support for Python and Julia, plus mediocre support for other languages.

A newer and in some respects simpler system is quarto.

Here are some guides:

For an intro to the various ways to build this into a full reproducible research workflow via, say, a scientific workbook, see the excellent reproducible analysis workshop.

1 Quarto

See quarto.

2 Pro Tips

Multi-part documents are supported, even for LaTeX. The keyword is ‘child’ documents.
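
A minimal sketch of the child mechanism, assuming two hypothetical files intro.Rmd and methods.Rmd sitting next to the parent document. An otherwise empty chunk carrying the child option asks knitr to knit that file and splice the result in at that point.

````rmarkdown
---
title: "Parent document"
output: html_document
---

<!-- intro.Rmd and methods.Rmd are hypothetical file names -->

```{r intro, child="intro.Rmd"}
```

```{r methods, child="methods.Rmd"}
```
````

The children are knitted as part of the parent, so, as far as I can tell, objects created in an earlier child are visible to later ones.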

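You can add custom CSS classes to a chunk's output via chunk options, specifically class.output=c("myclass"). Per the knitr docs this applies to the text output block; class.source does the same for the echoed source code.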
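
Here is a small sketch of class.output in an HTML document. The class name myclass is the one from the text; the CSS rule is made up purely for illustration. As I read the knitr documentation, the class lands on the chunk's text output block.

````rmarkdown
<!-- The styling below is an arbitrary illustration, not a recommended theme -->
```{css, echo=FALSE}
.myclass {
  background-color: #f5f5f5;
  border-left: 3px solid #999999;
}
```

<!-- The printed summary gets class "myclass" in the HTML output -->
```{r, class.output="myclass"}
summary(cars)
```
````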

To render a document from the command line, run

R -e "rmarkdown::render('script.Rmd', output_file='output.html')"

There are quirks depending on which markup format you use. LaTeX is just LaTeX. Markdown (via the RMarkdown package) does not have a unified standard for all the bells and whistles you might want. You can include graphics via native markdown or via R itself. The latter is more powerful, if more circuitous, and handles e.g. automatic resizing.
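
To make the two routes concrete, here is a sketch that includes the same image both ways. The path figures/diagram.png is hypothetical. The R route uses knitr::include_graphics, with sizing and captioning delegated to chunk options.

````rmarkdown
<!-- figures/diagram.png is a hypothetical path -->

Native markdown, no control over size beyond what the output format gives you:

![A diagram](figures/diagram.png)

Via R, with knitr handling width, alignment and caption:

```{r diagram, echo=FALSE, out.width="60%", fig.align="center", fig.cap="A diagram"}
knitr::include_graphics("figures/diagram.png")
```
````

The chunk version costs a few more lines, but the same options then apply whether the target is HTML or LaTeX.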

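RMarkdown equation references are supported but weird. The syntax below is bookdown's, so it needs a bookdown output format such as bookdown::html_document2.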

See equation \@ref(eq:linear)

\begin{equation}
a + bx = c  (\#eq:linear)
\end{equation}

Some miscellaneous tips:

  • I constantly need the chunk options docs
  • How do you customise the rmarkdown pandoc output by e.g. adding extensions? There are two ways: Markdown variants and the md_extensions option.
  • Might stargazer be useful? It generates marked-up and formatted tables for tabular data, but seems not to be generic about output formats.
  • tufte is an Edward-Tufte-compliant stylesheet. (cran link) tint is an alternate version.
  • For online web-friendly (e.g. teaching) purposes, there is literate interactive coding using shiny and LaTeX via shinyTex.

3 Customizing pandoc
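
The two routes mentioned in the tips above, sketched as a single YAML header. The particular extensions and the gfm variant are only examples; which ones you want depends on your pandoc version and taste.

````rmarkdown
---
title: "Extensions demo"
output:
  html_document:
    # md_extensions: switch one pandoc markdown extension off (-) and another on (+)
    md_extensions: -autolink_bare_uris+hard_line_breaks
  md_document:
    # variant: choose the markdown flavour that md_document writes out
    variant: gfm
---
````

With more than one output format listed, rmarkdown::render uses the first by default; pass output_format = "all" to build every format.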

4 Tables

I use flextable for tables. For some reason, when I installed it, the dependency gdtools had trouble finding cairo, so I had to run R as

PKG_CONFIG_PATH=/usr/lib/x86_64-linux-gnu/pkgconfig R

then

install.packages("flextable")

5 Slides

Can RMarkdown make slides? Absolutely. See rmarkdown slides.

6 Editor support

RStudio has intimate RMarkdown integration. AFAICT nothing else supports interactive editing nearly as well, but Yihui shows you how to make it work.

AFAICT stencila is a hosted GUI for reproducible research in RMarkdown. It costs USD 39/month.

7 Citations

Citations are laundered through either biblatex or pandoc-citeproc. Configuration is via blogdown header.
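
A sketch of the citeproc route, assuming a hypothetical BibTeX file refs.bib containing an entry with the key xie2015, and an optional, equally hypothetical CSL style file. bibliography, csl and link-citations are standard pandoc metadata fields.

````rmarkdown
---
title: "Citations demo"
bibliography: refs.bib          # hypothetical BibTeX file
csl: my-journal-style.csl       # hypothetical CSL style; omit for pandoc's default
link-citations: true
---

<!-- xie2015 is a hypothetical key that must exist in refs.bib -->
knitr is described in [@xie2015].
````

For the biblatex route with PDF output, rmarkdown's pdf_document format takes a citation_package option that accepts biblatex.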

8 Enhanced MS Office output

RMarkdown supports MS Word and PowerPoint output natively. More features can be enabled through officedown, which uses the officer suite to do various fancy formatting beyond the usual reach of pandoc.
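
Switching to officedown is mostly a change of output format in the header. A minimal sketch, assuming the officedown package is installed:

````rmarkdown
---
title: "Report"
output:
  # native alternatives would be word_document or powerpoint_presentation
  officedown::rdocx_document: default
---

```{r setup, include=FALSE}
library(officedown)
```
````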

9 Building websites etc

See blogdown.