Citation management

On PDFcocking

The genealogy of evidence is important, and there are many important ideas about how we could track it, especially with advances in technology. However, this page is not about that propagation of certainty, but rather its shabby proxy: citations in actually-existing academic publishing.

In particular, here I answer for myself: How can I get journal-ready citations in the 19th-century-style format required by my journals with the greatest possible degree of modern convenience? Fast-forwarding citation conventions themselves all the way to the state-of-the-1940s-art or beyond must fall to someone else with time.

There are many moving pieces in the modern citation workflow: importing references into a database, managing references within that database, rendering a bibliography in a document, etc.

After trying too many alternatives at great cost of time, I have settled upon Zotero to manage most of those steps. I also use BibLaTeX to render bibliographies in LaTeX articles and pandoc to render bibliographies on this blog and other webby things.

Zotero, BibLaTeX and pandoc are all open source, powerful and hackable.

The most complicated bit is Zotero, which is my main interface and working tool. It does my article importing, management, note-taking, syncing etc. Pretty much everything apart from rendering the final document. More on this below. It could be more user-friendly; but then the competitors set the bar so low that this is hardly a criticism. Slightly more user-friendly but substantially less hackable is Mendeley, a closed-source reference manager that I would not judge you for using. Since I have no patience for things that cannot be automated, Zotero is an easy winner for me; you may wish to try both in case you have different tolerances than I.

Both are excellent at importing citations from your web browser as you work, at maintaining them in a database, and at exporting them in whatever format you choose.

All other options that I have tried are abysmal and I can say nothing but I told you so if you try them and they give you grief.

“What kind of grief?” you might ask. But please don’t. Many days of lost work migrating from discontinued software in ways that are too tedious to recall.

Bibliographic database

The main part. I use Zotero, which makes this mostly simple. I have tried other options and they are tiresome. With Zotero, I visit an article in my browser, and a button appears in the browser to enable me to import the article into my literature database. I click it and it magically appears in my database, with all the metadata and citation information and a copy of the PDF. End of story. If you prefer a more complicated relationship to citations than that, that is on you.

NOW! we have a healthy database of references! How do we get them into documents?


Pandoc also supports citations. This is mostly targeted at rendering markdown to other formats, but it will also work for LaTeX as a substitute for BibTeX/BibLaTeX.

See the following write-ups.

The preferred pandoc-citeproc format is something with an @ sign and/or occasional square brackets:

Blah blah [see @heyns_foo_2014, pp. 33-35; also @heyns_bar_2015, ch. 1].
But @heyns_baz_2016 says different things again.

This is how you output it.

# Using the CSL transform (document.md is a placeholder for your input file)

pandoc -F pandoc-citeproc --csl="APA" --bibliography=bibliography.bib \
  document.md -o document.pdf

# Or using biblatex and the traditionalist workflow

pandoc --biblatex --bibliography=bibliography.bib \
  document.md -o document.tex

If you are using RMarkdown, this will be done automatically for you. Nifty.
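
For orientation, here is a rough hand-written sketch of what the markdown citations above look like as BibLaTeX commands. This is my own approximation, not pandoc's verbatim output, which may differ in detail; the keys are the ones from the markdown example.

```latex
% Hand-written BibLaTeX counterparts of the markdown citations above.
% Assumption: this approximates, but is not copied from, pandoc's output.

% [see @heyns_foo_2014, pp. 33-35; also @heyns_bar_2015, ch. 1]
Blah blah \autocites[see][33--35]{heyns_foo_2014}[also][ch.~1]{heyns_bar_2015}.

% @heyns_baz_2016 used in running text
But \textcite{heyns_baz_2016} says different things again.
```

The bracketed markdown form maps onto a parenthetical citation with a prenote and a locator per key, while a bare @key in running text maps onto a textual citation.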

To integrate with Zotero, you need to set Preferences > Export > Default Format to the Better BibTeX citation key quick copy.

See the pandoc manual and the pandoc-citeproc manual. See also the markdown/pandoc page for more on this.

Gotcha: HTML output formats will render nice citations, except with butt-ugly naked URLs instead of hyperlinks, because hyperlinks are not supported, and not even in scope yet. Please vote for those GitHub issues.


Oh, BibTeX. The classic LaTeX-compatible citation system.

None of that faffing about making web-friendly citations is useful if you are working with academics, who don’t regard words on the internet as a real thing. Your words must be behind a paywall where no-one can read them to count as significant. Moreover, they must have been rendered harder to analyse by running them through LaTeX to obfuscate them into a PDF, which probably also entails using BibTeX to do the citation stuff.

If you are starting from Zotero, you can use Better BibTeX to make this less painful. To procedurally handle annoying BibTeX problems in the BibTeX files themselves (i.e. the BibTeX file is not in Zotero due to some co-worker finding databases suspect and emasculating) I sometimes use bibtool.

BibTeX is ancient and has accreted deep strata of fossilised rules, but it does work if you move carefully and don’t touch anything. Imitate the rituals of your predecessors and the gods will reward you.
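
For reference, a minimal sketch of the classic ritual, assuming a file called bibliography.bib containing an entry with the key latexcompanion (both names are placeholders) and the stock plain style rather than whatever your journal hands you:

```latex
% Minimal classic BibTeX document; file and key names are placeholders.
\documentclass{article}

\begin{document}

As argued in \cite{latexcompanion}, citations are hard.

\bibliographystyle{plain}    % stock style; journals usually supply their own .bst
\bibliography{bibliography}  % reads bibliography.bib; no file extension here

\end{document}
```

The traditional incantation to compile this is pdflatex, then bibtex, then pdflatex twice more, each run on the document name.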


a.k.a. biber, because the BibLaTeX system is factored into a couple of distinct packages, but you more or less need to get the lot to reap the benefits. I assume there is a reason for this. It looks similar to BibTeX in that it interacts with LaTeX documents, but is better in various ways, mostly to do with being modern: it has fewer baffling misfeatures than BibTeX, full Unicode support, support in Beamer slides, etc. BibLaTeX can, indeed, handle non-English names and URLs, bringing it up to speed with 1999. BibTeX can do some of that, but no one knows how BibTeX works, because style files are written in their own special programming language that it is worth no one’s time to learn.

As such, BibLaTeX is a low-key upgrade from the winding steeplechase of character set errors that is BibTeX. It is notably not better in the ecological sense of being widely used by the various technologically moribund conferences/journals to which you might want to submit your paper, because many such organisations believe, presumably, that these accursed foreigners will eventually despair of their disconcerting languages and their filthy habits of smearing diacritical marks all over their names, if only we wait long enough. You often need to use alternate BibLaTeX styles made by passionate bibliography rendering fans to approximate some journal or other, e.g. here is an IEEE-like one. If the journal is steadfastly committed to 1980s technology, you can convert between them using biblatex2bibitem, which generates BibTeX-style citations. Or you can use Alexander Terenin’s hack to shoehorn BibLaTeX into arXiv and ICML and other locations without editing the style files.

The configuration options are manifold. Here is one set I like.

\usepackage[
style=authoryear,   % author-year style in references
citestyle=authoryear-comp, % compact author-year in cites
autocite=inline,  % parens for citations
date=year,  % no, I don’t care what month
url=false,       % clickable urls in ref, turn off for printing
uniquename=false, % turn off auto-disambiguate
backref=true,     % auto backrefs in ref.
datezeros=true,   % dates with leading zeros
maxcitenames=1, % et al. with two or more authors
%indexing,       % to create an index of persons
%defernumbers=true,   % numbers in any bibliography
backend=biber       % use biber for compiling
]{biblatex}

These are confusingly documented in the manual, but obvious from the cheat sheet.
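
To show where such options live, here is a minimal sketch of a complete document. It assumes a bibliography.bib file with an entry keyed latexcompanion (both placeholder names) and uses only two options to keep it short.

```latex
% Minimal BibLaTeX document; file and key names are placeholders.
\documentclass{article}

\usepackage[
  style=authoryear, % author-year style in references
  backend=biber     % use biber for compiling
]{biblatex}
\addbibresource{bibliography.bib} % unlike BibTeX, the .bib extension is required

\begin{document}

Citations are hard \autocite[12]{latexcompanion}.

\printbibliography

\end{document}
```

Compiling this takes a pdflatex run, then a biber run, then another pdflatex run, each on the document name.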

The citation command has fancy options.

\cite[see][page 12]{latexcompanion}

More generally, there is AFAICS no reason not to use the plural version \cites, and in fact, since \cite makes assumptions about formatting, one should I suppose prefer to use \autocites.


\autocite also takes multiple keys in a single argument, but seems to sort them and doesn’t take per-key page numbers or other locator arguments:

\autocite{Smith,Jones}  % will be sorted to \autocite{Jones,Smith}
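
If each key needs its own locator, the multicite form seems to be the way to go. A small sketch, reusing the placeholder keys Smith and Jones from above, with made-up page numbers:

```latex
% Each key gets its own optional [prenote][postnote] pair.
% Smith and Jones are placeholder keys; the page numbers are invented.
\autocites[see][12]{Smith}[45--50]{Jones}
```

A lone optional argument, as for Jones here, is read as the postnote, i.e. the page locator.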

Compare and contrast with the BibTeX config:


and plainer cite command:


To mention

JabRef, BibDesk.

To avoid

There have been various other options such as Papers (meh) and Sente (defunct due to being crappy) and (sigh) Endnote. I won’t link or refer to those further here, for the reason that I’ve already lost too much data that way, and I don’t intend to lose more. Since all citation software is, basically, awful, it is crucial that whichever application you choose is one that you can get your data out of when you find a less awful option or when it implodes from awfulness. The one that is best at letting you keep all your citations even if you ditch it is Zotero. Also it’s probably the least awful.

Still, if you’re unswervingly dedicated to trying other things for yourself, my advice for any closed-source tool in this domain would be the same: Try it and see how well you can import and export data, en masse, because that’s what you’ll have to do if the company goes bankrupt or gets bought by Google and shut down, or by Yahoo and accidentally set on fire, or by Facebook and you are only allowed to use it if you click on ads promoting sports shoes for 18 minutes out of every hour or whatever unpaid market research work they allocate to you.

All the alternatives apart from Mendeley and Zotero have failed the test of preserving my precious data when I migrated to a different software package, so using those other packages is putting my work in the uncaring hands of an unaccountable third party. To actually extract my data from Sente, for example, I had to burn a whole working week turning their malformed markup into valid XML (which is a specialty that I don’t care about and no one should ever need to care about), and I still couldn’t work out how to parse some of it. Then many other things went wrong. Also, Mendeley has started behaving suspiciously since they were bought by Elsevier.

If you mostly care about LaTeX output, the usual lowest common denominator amongst academic collaborators, you might be able to survive on BibDesk or JabRef or just editing a plain Bib(La)TeX file, but I for one could not bear to give up the browser integration of Zotero, which has saved me innumerable hours of painstaking pointless typing, and can output BibTeX just fine for the use of BibDesk lovers.

See also code editors, academic writing workflow.

🏗 Complain about the entire structure of citations in the electronic age (keep it short though, because everyone is tired of complaining about it, and at least it’s better than the general howling void of unsourced internet media.)

🏗 Apologise for accidentally complaining at length despite my stated aim of keeping it short. 😏

docutils citations

a.k.a. Citations in ReST.

I no longer recommend this. For all the laudable design goals and extensibility of ReST, it’s not where the community is. They are all using markdown.

But if you are keen, the docs say:

Standard ReST citations are supported, with the additional feature that they are “global”, i.e. all citations can be referenced from all files.

I can add:

For your comfort and convenience these citations will be rendered as born-obsolete fugly 1995-esque hard-coded HTML tables that no-one in the entire internet has managed to whip into anything other than an eyesore in a decade of vain struggle.

More resources: