Quarto integrated website system
Academic blog publishing that is easy on me, albeit hard on my computer
December 1, 2023 — December 3, 2024
Quarto includes its own website system, which supplements pandoc’s inbuilt toolchain with a JavaScript-based build system using standard HTML tools such as Bootstrap, Sass and EJS.
Notably, the site you are reading right now is built using Quarto’s native website system. I have no way of confirming this, but I suspect that my blog site is the largest Quarto site on the internet, in terms of the number of pages and words, and probably the frequency of updates. As such, it is a stress-test of the Quarto website system. As the maybe-biggest Quarto user on earth, I can report that
- Quarto is capable of handling a million-word website like this, but
- not smoothly.
Let us break down the good and the bad.
1 Vibes
Does enough of what I want that I use it, despite qualms. It was probably not really designed for websites as hefty as this one and the performance issues it has reflect that.
Quarto is more opinionated than blogdown if I use the built-in website system. In principle, I could build my own different website system, but I jumped ship to this project so I could avoid distracting myself with that kind of yak shaving. If I happen to like a 2-3 column layout blog with standard features (search, overview by date, conventional-looking index), everything is easy. The price is that deviating from this layout is difficult, poorly documented, and surprisingly complicated. For example, it is not trivial to vary the CSS framework from the default Bootstrap.
Opinionated is not bad per se, although there are a few opinions I disagree with.
That said, there are some worrying signs of code chaos. Quarto websites will not win the Grug Brained Developer seal of approval. On the forums, we learn that the code has some band-aid bits, e.g. there are two colliding template systems in use whose relationship is under-documented. A core developer has left the project and would like a minimalist holiday. The code chaos is, however, not yet worse than other systems I have tried.
The overall theming and site structuring is somewhat less flexible than Hugo, the backend used by blogdown, but the integration with said backend is better. Quarto leverages many more features of pandoc than was possible with blogdown, which leads to many well-supported advanced typographical features. That means things like citations and cross-references work without much pain. The overall experience is somewhat better on net, since much of the flexibility of Hugo was useless to me in any case, hidden behind feature mismatch.
If one wished to use the Quarto engine to experiment with quirky, alternate features (such as the content ranking, recommendation or the “constantly updated” indexing systems as seen on this site) then I used to think one was, AFAICT, out of luck, but it turns out we can use custom listings. That seems to do about 80% of what I want, albeit buggily. YOLO! Let us 80/20 it.
Quarto websites are hefty, and slow to build. Since I am not a web developer but rather an academic, this price seems acceptable to me for my specific use case — the opinionated default is pretty close to what I want — but this might not be the optimal trade-off if your own needs differ.
The fact that I am mentioning these things on my (Quarto) blog rather than fixing them should be taken as a sign that I still think Quarto is a net win over the alternatives. These friction points are annoying, but it would take a week or two in expectation to make substantial progress on fixing any one of them, and the fix in each case is not valuable enough for me to do that.
There were some things that were too annoying, and I got fixes for those already either by my own efforts or from the very helpful community.
2 Community support
- quarto website discussions on github has an active community.
- mcanouil/awesome-quarto: websites is a good place to go to find worked examples of the less-documented stuff.
3 Quarto websites are slow to load
Quarto websites can be enormous compared to the equivalent blogdown site, in various senses. For the reader, browser memory usage by all the JavaScript wizardry etc is substantial. Even though they look like small, efficient static sites, the actual cost of all the bells and whistles behaviour adds up.
Some progress has been made on making Quarto sites smaller to download.
The listings, in particular, can be huge. When I migrated this website to Quarto, the front-page download went from less than 1MB to 135MB, which is too much for 3 paragraphs and a list of the titles of the last 10 blog posts. These days, it can be made better by using lazy-loading and thumbnailing, and the former is enabled by default.
3.1 Thumbnails
Quarto does not thumbnail images, it turns out. This means you get a glistening high-resolution copy of every image in your listings, which is not ideal for performance, especially on a site like this one with a lot of images.
I assume that the bloat arises because the listing on that front page, in order to provide dynamic sorting etc, loads essentially all the posts on the blog, no matter how old, and their associated images, at full resolution. Solutions are evolving. Background:
- Discussion: A Quarto extension that optimizes your images pre-render
- abhiaagarwal/optimize-images: pre-processes Quarto figures and generates optimized versions
- danmackinlay/quarto-thumbnail: Try to reduce Quarto listing bloat with thumbnail images (turns out this approach won’t work, but it was a fun experiment)
None of these extensions worked for me, so I wrote a custom script that postprocesses the site and creates thumbnails by a heinous hack, which does work.
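To give a flavour of what such post-processing involves, here is a minimal Python sketch of the HTML-rewriting half only: it points every local image in a rendered listing page at a thumbnail copy. It is not the script used for this site. The `thumbs` directory, the function names, and the assumption that the thumbnail files have already been generated by some image tool are all hypothetical.

```python
import re

# Matches the src attribute of an <img> tag. A regex is crude, but enough for a sketch.
IMG_SRC = re.compile(r'(<img\b[^>]*?\bsrc=")([^"]+)(")', re.IGNORECASE)


def thumbnail_path(src: str, thumb_dir: str = "thumbs") -> str:
    """Map an image path to its (hypothetical) thumbnail location."""
    if src.startswith(("http://", "https://", "data:", "//")):
        return src  # leave remote and inline images untouched
    if src.startswith("/"):
        return f"/{thumb_dir}{src}"  # site-root-relative path
    return f"{thumb_dir}/{src}"  # page-relative path


def use_thumbnails(html: str, thumb_dir: str = "thumbs") -> str:
    """Rewrite each local <img src="..."> in a listing page to its thumbnail."""
    return IMG_SRC.sub(
        lambda m: m.group(1) + thumbnail_path(m.group(2), thumb_dir) + m.group(3),
        html,
    )
```

Something like this would run over the listing pages in `_site` after a render. Actually resizing the images is a separate step and is deliberately left out here.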
4 Quarto websites are slow to build
tl;dr a typical CLI invocation of blogdown took about 1 second. A typical CLI invocation of `quarto render` for this site takes about 17 minutes (up from 12) and rising. Uploading the files incrementally to the server takes an extra 5 minutes on top of that.
I am surprised how much I miss the speed and efficiency of blogdown, with its smugly high-speed Hugo backend. I honestly thought site build speed was not a thing I cared about until I did not have it. Switching from blogdown to Quarto website made my site muuuuuuuuch slower and the friction of the slow build process became a constant annoyance. To build the 1000+ posts on this site typically took Hugo a few seconds. I miss that now that I do not have it, and spend a lot more time coaxing results out of a stubborn build process. For reference, I probably update this site 10 times a day. My computer is constantly grinding away trying to get this thing on the internet. You think AI uses a lot of power? You should see quarto.
There are various tricks to make rendering go faster, such as caching the code execution, but ultimately `quarto render` is still slow compared to Hugo, even with the most aggressive cache settings, and sometimes the cache gets corrupted and I have to start over anyway.
AFAICT the problem is partly that the Quarto website engine is slower than Hugo, and partly that Quarto is re-rendering too many pages every time (in the sense of converting markdown to HTML, not of executing code inside the markdown, which we can avoid by using the cache and freeze facilities). Blogdown+Hugo was smart enough to only re-render the things it needed to, and reused the HTML from before, so there were few things it needed to. I think? Or maybe Hugo is just much faster because it is a compiled binary that doesn’t arse around with JavaScript and pandoc and stuff. Or both. Knowing which will not change much for me so I will not investigate further for now.
UPDATE: according to a core dev:
In generalities, our runtime is roughly spent 2/3s inside Pandoc, and 1/3 in Deno. Our Pandoc filtering infrastructure is pretty extensive, and some of our early decisions there have performance consequences that we didn’t foresee: we’re now working on them. In Deno, our performance profile is relatively flat, and so the work is going to be more of the “continued small fixes” kind.
I suspect the friction might be that the default Quarto workflow favours a small number of immaculate, unique snowflake documents, whereas I am more of a sit on the snow machine and make a blizzard kind of guy.
The simplest workaround for a quick incremental build seems to be to use `quarto preview --no-serve`, which only renders the recently changed things, and so is much faster than rendering 1000+ things. `quarto preview --no-serve` is still not that fast. It takes 32 seconds on this machine, on a typical invocation, to decide what part of this blog to incrementally render, which is already 5 or 6 times longer than blogdown took to finish an incremental render and build the site.
I can leave `quarto preview` running, which amortises the start-up time but has its own problems; see my many frustrated notes on the quarto website preview server, which, spoiler, might be fine for small websites, but is not ok for this one. It is too slow, memory-hungry, and unreliable for large sites. Avoid it and use a normal file server.
5 Accelerate deployment
When deploying from the git repository to a static website host such as Netlify, the build time on their server from the source is unfortunately prohibitive due to the above-mentioned slow build process, which is no less slow on the server. I only get a few free build hours per month, which would restrict me to a weekly publication schedule.
We can economise on server build-time by not requiring their server to do any rendering work. There are two options for that.
Firstly (not recommended): committing the raw site HTML/JS and serving that. This leads to a huge repository and horrible diffs and also tends to crash the preview server during merges. Merging can be made easier via the git merge theirs trick, but it is still a pain.
Alternatively, don’t publish from a git provider but rather use the publish command to upload files directly from the local machine to their server.
Except that the default way of invoking it is needy: it asks for confirmation and messes with the browser and so on. Meh. Passing the `--no-prompt` and `--no-browser` flags, as in the full command below, is better.
This seemingly copies (in my case) thousands of files to the server every time I deploy, which feels like a waste of network time, but at least it is saving me human time managing the merge failures and server compute budget.
I have noticed that `quarto render` often fails, either crashing early (ERROR: Directory not empty (os error 66): remove 'livingthing/_site etc) or producing malformed HTML, and it is safest to 1) only run it for deployment without the cache and 2) only upload if the render was successful, otherwise the entire website will be broken. Putting this all together, here is the command I use to publish this blog. Fish shell:
killall deno
rm -rf _site
quarto render --cache-refresh ; and quarto publish --no-render --no-prompt --no-browser
The whole thing takes about 22 minutes at the moment.
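The important part of that one-liner is the `; and`: the publish step only runs if the render step exits successfully. For anyone not using fish, the same gating can be sketched in Python. This is an illustrative sketch, not what I run; the function names are made up for the example, and the two command lists are copied from the fish command above.

```python
import subprocess
from typing import Callable, Sequence

RENDER = ["quarto", "render", "--cache-refresh"]
PUBLISH = ["quarto", "publish", "--no-render", "--no-prompt", "--no-browser"]


def run_command(cmd: Sequence[str]) -> int:
    """Run a command and return its exit status."""
    return subprocess.run(list(cmd)).returncode


def render_then_publish(run: Callable[[Sequence[str]], int] = run_command) -> bool:
    """Publish only if the render step exits with status 0."""
    if run(RENDER) != 0:
        return False  # leave the live site alone rather than upload a broken build
    return run(PUBLISH) == 0
```

Calling `render_then_publish()` from the project directory would run both steps. Note that this only guards against a non-zero exit status; it does nothing to detect a render that "succeeds" but produces malformed HTML.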
6 Theming
Custom HTML theming is not too bad for simple CSS tweaks. Although the documentation is brusque, this part mostly “just works” in the sense that if I guess what to do, my guess usually ends up being correct.
See
- HTML Theming
- More About Quarto Themes
- What layout do we support in the native themes? See HTML Page Layout
- I found myself additionally poking around in the bootstrap/scss/_variables.scss file trying to sift through a huge list of CSS variables to work out which one does what I want.
6.1 Template mechanics
General notes: there are two parallel template systems, Pandoc Templates and EJS Templates, which have a confusing and AFAICT undocumented separation of responsibilities.
- Although the pandoc templates are mentioned under the `journal` format, they are universal and apply to all formats. (Bigger lesson: the `journal` format documentation seems to function as the “generic advanced quarto” documentation and is much more general than you might assume.)
- EJS templates are `website` format specific.
- Although both EJS and pandoc formats include `partial` templates, these partials are not compatible or connected and have a different syntax. I suspect that means that if I wish to customise the metadata in a listing, and in a specific page, I will end up implementing it in two different syntaxes, in two different template systems.
- The relationship can be complicated; for example, even though the HTML templates are rendered by pandoc, the website system performs major surgery on them by a combination of EJS templating and javascript post-hoc modification. Discovering which line of HTML output is generated by which system is a forensic operation.
Gotchas:
For some reason I do not understand, in EJS templates it is best to wrap even HTML templates in raw HTML block markup:
```{=html}
<table>
<tbody>
<tr>
<th scope="row">Hello</th>
<td><strong style="background-color:purple; border-radius: 9px; padding: 5px;">text</strong></td>
<td>1</td>
</tr>
</tbody>
</table>
```
Symptoms of not doing that include batshit crazy bananas fuckery of an unpredictable nature, except when sometimes it just totally works as expected.
6.2 Listings
The next level of sophistication after customising CSS is customising content overviews.
Index pages are called “listings”, and customisation of listings is supported, and reasonably powerful, but fragile; the errors that I get if I do something wrong are utterly baffling. See Document Listings for the basics and Custom Listings to get fancy.
tl;dr: various things about them are not obvious to me. Here are some discussions I am having about them:
- Why does date formatting go bad when I try to customize listings? (that was a bug)
- Why do I seem to require a custom listing template to get dynamic sorting to work with custom metadata? (that was a typo)
If you can set up what you want using just front matter YAML config, things are simple. OTOH, for this blog I needed to use custom listing templates, and that got complicated.
Until recently the template development workflow was stilted, since “resource files” such as custom templates were not watched in preview mode; as of v1.5 they are actually watched. Before that, changing a template necessitated a lot of wrangling the broken preview server to display updates.
The default templates are sophisticated and when overriding them, stuff can go weird. For example, I have my own page listing, and that means it is ignorant of all the fancy tricks that quarto does to enable lazy-loading of images. So I hard-coded in lazy-loading of thumbnails and also the thumbnail height, which I happen to know.
<% if (item.image) { %>
<div class="thumbnail">
<a href="<%- item.path %>" class="no-external">
<p><img src="<%- item.image %>" loading="lazy" class="thumbnail-image" height="320"></p>
</a>
</div>
<% } %>
When I one day change the thumbnail logic on the site, everything will break and it will baffle the crap out of me.
6.3 Individual pages
OK, what if we do not want to change CSS style, OR a custom listing, but do something more complicated, like change the layout of a single page?
At the single page level we need to know about the (at least) two interacting template systems involved in the websites per default, EJS and the pandoc template system. Poking around the code reveals that their interaction is messy and non-obvious to an outsider. Some stuff is generated by the lower level pandoc templates, but these are then thoroughly transformed by the EJS website-mashing system. It isn’t really clear what to update to accomplish what goal.
There is a system of `template-partials` which should allow us to override small bits of the page for minor adjustments, but documentation is incomplete. Custom templates are mentioned under HTML Options, and there is some incomplete documentation at Template partials, but it seems that the best reference of how to use them is the source code or perhaps user forums. Templates for individual pages are complex; AFAICT the default HTML page for a single post is the pandoc HTML template but then there is a whole bunch of EJS stuff that gets smushed into that granddaddy pandoc template in a non-trivial manner. AFAICS, you can override the pandoc stuff by defining a custom template or template-partial, but the EJS stuff is more of a look-but-do-not-touch thing that we modify through settings, unless we are talking about a listings page in which case there is an EJS API which we are invited to fiddle with using a different syntax. Got it?
Gotcha: pandoc templates seem to include the similar-looking `html.template` and `template.html`. Which to use? AFAICT it is `html.template`; the other one is, I think, a copy of the pandoc default template, kept around for reference.
I am currently tracking the following forum discussions for help trying to improve the display of metadata on this blog:
- How might I display custom metadata fields in individual pages on my blog?
- How can I add a section to where the meta (title/author/date) information is placed in HTML documents?
- Extensible / customizable quarto website templates (e.g. permit customise EJS templates)
- Help with Template Partials
After a while I settled on the following for `title-block.html`:
<header id="title-block-header">
$if(title)$<h1 class="title">$title$</h1>$endif$
$if(subtitle)$
<p class="subtitle">$subtitle$</p>
$endif$
$for(author)$
<p class="author">$author$</p>
$endfor$
$if(date)$
<p class="date"><span class="created">$date$ </span>$if(date-modified)$
<span class="modified">— $date-modified$</span>
$endif$</p>
$endif$
<span class="ratings">
<span class="rating rating-usefulness-${if(usefulness)}${ usefulness }${else}0${endif}"></span>
<span class="rating rating-certainty-${if(certainty)}${ certainty }${else}0${endif}"></span>
<span class="rating rating-novelty-${if(novelty)}${ novelty }${else}0${endif}"></span>
<span class="rating rating-polish-${if(polish)}${ polish }${else}0${endif}"></span>
</span>
$if(audience)$
<div class="audience">
<span class="notification-title">Assumed audience:</span>
<p>$audience$</p>
</div>
$endif$
$if(content-warning)$
<div class="content-warning">
<span class="notification-title">Content warning:</span>
<p>$content-warning$</p>
</div>
$endif$
$if(abstract)$
<div class="abstract">
<div class="abstract-title">$abstract-title$</div>
$abstract$
</div>
$endif$
</header>
6.4 Bootstrap, bootswatch, dark mode
There is a hairball of tangled theming and variable systems involved in choosing the styling of the page. I am trying desperately not to understand it, but unfortunately it is obtrusive. The key thing to realise is that there are SCSS variables that set the theme, and also CSS variables that set the theme. Working out which one to use to change what, and whose variables get propagated where, is a kind of specialist engineering, in which the Bootstrap themes clash with the underlying CSS technology. I am not a neophyte to CSS; I’ve been doing it reluctantly for decades. This must be pure torture for people who do not have that background.
For one example, the navigation headers, for some unknowable reason, are not controlled by the SCSS variables. They decided that they are in “dark mode” and made themselves illegibly pale, even though I do not mention dark mode anywhere on the site and all the relevant colours in my stylesheet are dark. After trying to change many variable names to fix them, I settled upon this SCSS:
.navbar {
// --bs-navbar-color: #050505;
// --bs-nav-link-color: #050505;
// --bs-navbar-color: $body-color;
// --bs-nav-link-color: $body-color;
font-family: $headings-font-family;
background-image: $bg-shaded-image, $bg-image;
background-color: $bg-color;
color: $body-color;
.navbar-brand{
// Cannot fucking work out where the header color gets set to something dumb
color: $body-color;
}
// Trying to eliminate that fucking pale header color die die die
.navbar-nav .nav-link {
// color: var(--bs-body-color) !important;
color: $body-color !important;
}
}
I have a vague suspicion that this leaves a half-digested bolus of undigestible CSS rules clogging the browser, but I have run out of care.
I switched different lines of this declaration on and off mindlessly until it worked. Key point: I will never support “light” and “dark” modes for this site. If that is your passion, write your own stylesheet.
7 Tips
8 Search
Quarto has a built-in search but it gets unwieldy for a big blog like this.
I use the algolia search. Setup was not obvious. For some reason the algolia crawler did not work as it used to with blogdown. I needed to manually upload the quarto search index to the algolia backend, which is not obvious, nor documented.
Here is what I did (works on macOS). First, install the algolia CLI and jq. Then, after a render, we can do this:
jq --compact-output '.[] | .text |= .[0:9000]' _site/search.json | \
algolia objects import danmackinlay_quarto -F -
The `jq` command is necessary to split the search index into separate records, and also to truncate them to fit inside algolia’s 10000 character limit.
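For readers who would rather not use jq, the same split-and-truncate step can be sketched in Python. This is a minimal sketch rather than the command used for this blog; it assumes `search.json` is a JSON array of objects whose `text` field is a string, and the names `to_records` and `convert` are purely illustrative.

```python
import json
from typing import Iterable, Iterator


def to_records(index: Iterable[dict], max_chars: int = 9000) -> Iterator[str]:
    """Yield one compact JSON line per search record, with `text` truncated."""
    for record in index:
        record = dict(record)  # shallow copy, so the caller's data is untouched
        text = record.get("text")
        if isinstance(text, str):
            record["text"] = text[:max_chars]
        yield json.dumps(record, ensure_ascii=False, separators=(",", ":"))


def convert(path: str = "_site/search.json") -> str:
    """Read the rendered search index and return newline-delimited JSON."""
    with open(path, encoding="utf-8") as f:
        return "\n".join(to_records(json.load(f)))
```

Printing the result of `convert()` to stdout and piping it into the same `algolia objects import` command should then stand in for the jq half of the pipeline. Truncating at 9000 characters rather than 10000 leaves some headroom for the other fields in each record.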
9 Code matters
9.1 Supporting javascript
The keyword for injecting extra content, such as scripts, into the page is include; for example, `include-in-header` or `include-after-body`.
9.2 Migrating from blogdown
A few people found it easy. See
- Migrating from Hugo to Quarto
- Switching to Quarto from Blogdown includes some red-hot hacks such as simply renaming `.html` files to `.qmd`
My blog, as I keep on mentioning, is sprawling and chaotic, and for me it was not easy. I found it best to script a migration.
The script is on github. You are free to use it under the MIT license.
9.3 Example _quarto.yml
Putting all that together, for this site, we get
project:
  type: website
  output-dir: _site
  resources:
    - keybase.txt
    - "*.bib"
    - notebook/*.yaml
    - post/*.yaml
website:
  title: "The Dan MacKinlay family of variably-well-considered enterprises"
  site-url: https://danmackinlay.name
  favicon: _theme/logo.png
  twitter-card:
    creator: "@dan_mackinlay"
  open-graph: true
  navbar:
    right:
      - text: About
        file: about.qmd
      - text: Currently
        file: notebook/currently.qmd
      - text: Incoming
        file: notebook/incoming.qmd
      - text: Blogroll
        file: notebook/blogroll
      - text: Blog
        file: post.qmd
      - text: Notebook
        file: notebook.qmd
      - text: Everything
        file: everything.qmd
  search:
    algolia:
      index-name: danmackinlay_quarto
      application-id: LNWYJ42WO6
      search-only-api-key: a038347e5450a6426f008faf22c1a4c4
      show-logo: true
format:
  html:
    template-partials:
      - /_theme/metadata.html
    theme:
      - cosmo
      - style.scss
    html-math-method: mathjax
    strip-comments: true
    max-width: 1400px
    code-fold: true
    code-line-numbers: true
    toc: true
    number-sections: true
execute:
  freeze: true
  cache: true