Zotero

The adequate citation tool

2019-12-01 — 2025-07-14

Wherein Zotero is described as a citation manager into which articles are imported via a browser button, PDFs are captured when available, and HTTP‑pull .bib exports are provided by Better BibTeX for automation.

academe
collective knowledge
computers are awful
faster pussycat
how do science
workflow

Zotero — my weapon of choice for citation management.

Figure 1

The value proposition of Zotero is best demonstrated with the browser plugin:

  1. I visit an article in my browser.
  2. A button appears in the menu bar that imports the article into my literature database.
  3. I click the button and the article’s metadata magically appears in my database…
  4. … with a copy of the PDF if I have access (and another button to search for free copies if I don’t).

After a while, I accumulate a big, interactive, searchable database of references with a nice user interface and browsing tools. From there I can do all kinds of useful stuff: export to various formats, such as BibTeX. I can render bibliographies directly in my word processor. Read the manual for a comprehensive list. Or the other manual.

Zotero has an API that lets me query, read and write bibliographic entries in my database, so it’s easy to script automatic updates and such. I don’t feel like I’m locking my data away with an untrusted party when I rely on it. Of course, one can always try to migrate data between tools by importing and exporting to BibTeX or similar, but if I have URLs in my article metadata, or if I work with people who have diacritics in their names, this leads to trouble. The Zotero API doesn’t have those problems. Moreover, some other apps (Mendeley, …) already use the Zotero API, so I know it’s battle-tested.
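A minimal sketch of what scripting against the Web API can look like, assuming Python and only the standard library. The user ID and API key are placeholders, and the helper names `build_items_request` and `fetch_titles` are made up for illustration. The endpoint shape and the two headers follow the public Zotero Web API (version 3) as I understand it.

```python
import json
import urllib.request

API_ROOT = "https://api.zotero.org"


def build_items_request(user_id, api_key, limit=25):
    """Build (but do not send) a request for items in a user library."""
    url = f"{API_ROOT}/users/{user_id}/items?format=json&limit={limit}"
    headers = {
        "Zotero-API-Version": "3",
        "Zotero-API-Key": api_key,
    }
    return urllib.request.Request(url, headers=headers)


def fetch_titles(request):
    """Send the request and return each item's title (needs network access)."""
    with urllib.request.urlopen(request) as response:
        items = json.load(response)
    return [item["data"].get("title", "") for item in items]


if __name__ == "__main__":
    # Placeholder credentials: substitute a real user ID and API key.
    req = build_items_request("123456", "YOUR_API_KEY", limit=5)
    print(req.full_url)
```

Building the request separately from sending it keeps the URL and headers easy to inspect before anything touches the network.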

1 Installing

1.1 Windows or macOS or iOS

Easy — use the standard installers.

1.2 Linux

More tedious. retorquere’s repo of deb installers is a simple option on Debian/Ubuntu.

For non-Ubuntu systems… try a packaged version. There is a snap-app package. The weird paths within snaps mean I must do a lot of configuration to migrate to it if I wasn’t previously using a Snap version. It began to feel like yak shaving, so I didn’t finish the migration and can’t report on it. There is a cross-platform flatpak zotero, with the usual caveats about flatpak.

flatpak install flathub org.zotero.Zotero
flatpak override --user --filesystem=/PATH/TO/ZOTEROFOLDER \
    org.zotero.Zotero

All of the above options have been maintained only intermittently. For me, it’s been worth manually installing this app, since I use it all day and want it to work reliably.

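cd ~
# Save under a predictable name; the download URL redirects to a versioned tarball.
wget -O zotero.tar.bz2 'https://www.zotero.org/download/client/dl?channel=release&platform=linux-x86_64'
# GNU tar detects the compression automatically.
tar -xf zotero.tar.bz2
cd ~/Zotero_linux-x86_64/
./set_launcher_icon
# Make sure the link targets exist before symlinking into them.
mkdir -p ~/bin ~/.local/share/applications
ln -s ~/Zotero_linux-x86_64/zotero ~/bin/
ln -s ~/Zotero_linux-x86_64/zotero.desktop ~/.local/share/applications/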

1.3 Android

See the mobile app section.

1.4 Don’t install; use the web app

There’s a web app with some useful features we can use without installing the full Zotero app: ZoteroBib: Fast, free bibliography generator.

ZoteroBib helps you build a bibliography instantly from any computer or device, without creating an account or installing any software. It’s brought to you by the team behind Zotero, the powerful open-source research tool recommended by thousands of universities worldwide, so you can trust it to help you seamlessly add sources and produce perfect bibliographies. If you need to reuse sources across multiple projects or build a shared research library, we recommend using Zotero instead.

2 Tablet/e-reader

We can read on the zotero.org site, which is fine but not ideal. A native app is preferable.

On iOS, there’s a native app that’s pretty good. On Android, not so much.

A beta version of the official Zotero Android app is available from the Google Play Store, though testing slots are currently limited. See the announcement for more information.

The Zotero mobile page tracks the latest updates.

  • Official iOS app (check App Store)
  • Zoo
  • ZotEZ² claims to support non-standard sync
  • zandy
  • Papership (third-party iOS) looks neat.

2.1 Syncing files to mobile

Given that none of the contenders seem good at a glance, I prefer no client at all. I’ve used the Zotero plugin Zotfile to synchronize a folder full of attachments to my tablet. It’s not perfect, but it’s easy and robust, and many apps exist to browse a folder of PDFs. NB: Zotfile is discontinued and will eventually stop working. Maybe it can be replaced by retorquere/zotero-opds, which would synchronize via OPDS. Hmm, that seems to have stalled. Watch this space for updates (e.g. MuiseDestiny/zotero-file, a reimplementation).

Settings note: I use Zotero’s rename feature; formerly I used the rename string {%av}{%y }{%t}.

3 BetterBibTeX

Better BibTeX, a.k.a. BBT, smooths the BibTeX workflow in Zotero, and because BibTeX integrates with markdown via pandoc, that’s a double win.

3.1 Citation keys

The main trick is that BBT creates sensible, user-accessible citation keys for my references. If I use them consistently to refer to my sources, life goes well for me.

The citation keys are generated magically by a format string. Of course, there’s no guarantee my colleagues will agree on a sensible standard for citation keys, but that problem is perennial.

My citation key format strings for BibTeX are in the old syntax (there is a new syntax; I need to migrate).

[Auth:fold:nopunctordash][year][title:fold:nopunctordash:capitalize:skipwords:select,1,1]

TODO: check how this handles diacritical marks.

So. Say we’d like to cite the classic

Ingrid Daubechies (1988) Orthonormal bases of compactly supported wavelets. Communications on Pure and Applied Mathematics, 41(7), 909–996. DOI.

The citation key formula above ensures that I can refer to this work as Daubechies1988Orthonormal. I can then cite it in LaTeX as \cite{Daubechies1988Orthonormal}, in markdown as @Daubechies1988Orthonormal, and presumably in other systems that use some other syntax.
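To make the pattern concrete, here is a rough Python approximation of what that format string does: fold diacritics, strip punctuation and dashes, append the year, then append the first title word that is not a skip word. This is a sketch only, not BBT's implementation. The `SKIP_WORDS` list is a tiny hypothetical stand-in for BBT's real one, and the folding relies on Unicode decomposition, so letters that do not decompose (such as ø) are simply dropped here and may well be handled differently by BBT.

```python
import re
import unicodedata

# Hypothetical stop-word list; BBT's real skip-word list is much longer.
SKIP_WORDS = {"a", "an", "the", "of", "on", "in", "and", "for", "to"}


def fold(text):
    """Strip diacritics by decomposing and dropping combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def no_punct(text):
    """Drop everything except ASCII letters, digits and spaces."""
    return re.sub(r"[^A-Za-z0-9 ]", "", text)


def citekey(last_name, year, title):
    """Approximate an AuthorYearFirstword citation key."""
    author = no_punct(fold(last_name)).replace(" ", "")
    author = author[:1].upper() + author[1:]
    words = [w for w in no_punct(fold(title)).split() if w.lower() not in SKIP_WORDS]
    first_word = words[0].capitalize() if words else ""
    return f"{author}{year}{first_word}"


if __name__ == "__main__":
    print(citekey("Daubechies", 1988, "Orthonormal bases of compactly supported wavelets"))
```

A quick experiment like this is also one way to explore how accented names behave before trusting the keys in a shared document.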

3.2 Exporting BibTeX files via BBT

See also URL export of a BibTeX file.

Zotero also generates the necessary bibliography databases for any given folder (e.g. a .bib file) and can optionally keep that file updated as I add more articles to Zotero. That’s a nifty, magical feature, but I found it easier to write my own script to do this on my own schedule rather than rely on magic.

I’m fond of excluding the following BibLaTeX fields from export to keep the outputs small and dense.

file,abstract,note,keywords,lccn,annotation,issn,copyright,howpublished,primaryclass,langid,license,eprintclass,urldate,shortjournal,source,title-short

BBT supports HTTP-pull export, meaning that accessing an up-to-date bib file is a matter of an HTTP request. e.g. http://127.0.0.1:23119/better-bibtex/collection?/1/citation_management.biblatex pulls all the records in the citation_management collection and this works from my local copy of Zotero, so it’s fast and reliable compared to the cloud Zotero server API, which depends upon my internet connection etc. I use this feature all the time, mostly to handle referencing in this very blog that you are now reading. In fact, I automated it with a custom BibTeX export script.
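A minimal sketch of such a pull in Python (not the custom export script mentioned above). It rebuilds the URL shown in the text and writes the response to disk. The function names are invented for illustration, and I am assuming that the `1` in the path identifies the library and that the file extension selects the export format.

```python
import urllib.request
from pathlib import Path

BBT_ROOT = "http://127.0.0.1:23119/better-bibtex"


def pull_export_url(collection, library_id=1, fmt="biblatex"):
    """Compose a pull-export URL for one collection (library_id meaning is assumed)."""
    return f"{BBT_ROOT}/collection?/{library_id}/{collection}.{fmt}"


def pull_to_file(url, destination):
    """Fetch the export from the running Zotero client and write it to disk."""
    with urllib.request.urlopen(url) as response:
        text = response.read().decode("utf-8")
    Path(destination).write_text(text, encoding="utf-8")
    return len(text)


if __name__ == "__main__":
    # Printing the URL needs no running Zotero; pull_to_file does.
    print(pull_export_url("citation_management"))
```

Zotero has to be running with Better BibTeX installed for `pull_to_file` to succeed, since the request goes to the local client rather than the cloud service.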

The pull export used to work better for me if I set the advanced preference extensions.zotero.translators.better-bibtex.sorted to true so that the reference sorting was consistent; otherwise the order seemed to be random each time. These days it seems fine as it is.

Recently I have noticed that exporting as yaml rather than bib is more consistent across platforms for some reason, so I have been using that as the canonical storage format. Either format works fine for blogdown, which is what this blog uses.

Emiliano even supports scripting via JavaScript now.

Caution: CSL chooses to suck for websites

CSL is close to being good for use on websites, but has a flaw: it does not support links, in the sense that there is no general way in the standard to tell a CSL renderer where to anchor a hyperlink, how to style it, etc. There is a hack that may support some use cases, although it is not ideal for mine. This is not to say links are impossible; merely that CSL regards the web as a second-class participant in the world of information and affords it merely clunky second-class support. If I want something different from the fairly ugly default, I need to write my own CSL processor with some idiosyncratic, site-specific custom URL handling built in, which presupposes I have access to the source code of whatever tool I use and would like to spend time maintaining a fork of it.

I interpret this to mean that the creators of CSL imagine that we are only using it for writing stuff to be printed out on paper, and that it is not worth introducing any conveniences for the mere passing fad that is internet-based communication. This feels reactionary (I literally have not seen a paper journal for years now), but I guess that really nice paper citations are what these passionate volunteers have identified as the first stop on their journey. I choose to jump out and swim the last mile to the beach alone rather than spend any more time lobbying to change the destination of the ship.

3.3 Interactive citation finding

BBT supports a Cite-as-you-write interaction: a GUI popup citation finder for generic editors. I don’t use it because I export my citations to a BibLaTeX file on disk using HTTP-pull (above), and the editors I use already support smart citation natively for that file.

citr provides an interactive citation finder for RStudio that works with BetterBibTeX.

install.packages("citr")

I do not use it because I do not like RStudio.

There is a new citation processor, citeproc-rs, in recent Zotero. Evaluation TBD.

4 Pro tips

4.1 Use an existing folder of PDFs

Integrate with an existing folder of PDFs? Do not wish to use Zotero’s storage system? Richard Zach points out

… you keep your PDF directory synced across computers (e.g., if it lives in your Dropbox), linking the PDFs is just as good. If you add a PDF, Zotero will look up the metadata for you and add a reference to your database.

I did this for a few years. Plus: this is cheaper than Zotero’s cloud storage. Minus: it won’t sync to the Zotero web or mobile apps.

See also Ilya Kashnitsky, Zotero hacks: unlimited synced storage and its smooth use with rmarkdown.

4.2 Quick copy shortcut breaks

Occasionally (for me, frequently) the Quick copy keyboard shortcut breaks. By default, this is Ctrl-Shift-C. Restarting Zotero should fix it. Alternatively, we can enter the following in the developer console:

await Zotero.Translators.reinit()

It doesn’t solve a related problem that arises on Ubuntu 20.04, where Ctrl-Shift-C seems a contentious key combination and does weird stuff that’s never what I want. [TODO clarify] It launches a terminal in VS Code even though I disabled that shortcut. It never does anything in Zotero. This is annoying but not crippling, so I won’t go down a rabbit hole of GNOME keyboard shortcut debugging. I reassigned it to Ctrl-Shift-B, and it seems to work better so far.

4.3 Why are my dates in USA format?

On macOS, Zotero is sometimes sycophantically eager to use the American date system, despite US English being nowhere in my localization preferences. This makes me look like I think there are 31 months in the year when I use it in Australia. The following settings I found on the user forum seem to help:

intl.regional_prefs.use_os_locales: true
intl.locale.requested: en-GB
intl.accept_languages: en-GB, en

4.4 URL export

Zotero exposes URLs for public groups that I can, for example, include in overleaf documents.

The zotero url trick was proposed by one christian_moedrup_legaard on this page

HT Laurence Davies.
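I have not confirmed that this is exactly the trick described on that page, but one way to get such a URL is the Web API's BibTeX export for a group library. A small sketch, assuming a public group and a placeholder group ID; the helper name `group_bibtex_url` is made up.

```python
from urllib.parse import urlencode

API_ROOT = "https://api.zotero.org"


def group_bibtex_url(group_id, limit=100):
    """Compose a Web API URL that returns a group's items as BibTeX."""
    query = urlencode({"format": "bibtex", "limit": limit})
    return f"{API_ROOT}/groups/{group_id}/items?{query}"


if __name__ == "__main__":
    # 12345 is a placeholder; use the numeric ID of a real public group.
    print(group_bibtex_url(12345))
```

The API returns items in pages, so a large group may need more than one such URL, which limits how convenient this is for big shared libraries.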

5 Hacking

I promised that Zotero’s selling point was its hackability. So: I should mention how we go about hacking it. One confusing thing was that Zotero has two parts that work together.

  1. There’s the Zotero service, run on a George Mason University server farm somewhere.
  2. There’s a client, the Zotero app, that runs on my machine. Confusingly, it can also run its own local web server.

Both these parts have their own APIs; many tasks can be accomplished through either, and some only through one. There are pluses and minuses to each.

5.1 Server side

The server side of Zotero has an internet-facing Web API. Its virtues are that

  1. I can use my language of choice, not just JavaScript, to do things.
  2. It’s simple; there’s no messing about with the complicated build chain of JavaScript apps.

However, it

  1. is slow,
  2. is limited, and
  3. doesn’t have access to the UI, just the server-side data.

Nonetheless, it’s an easy way to get certain stuff done, such as mass updating of tags or spelling or whatever. I use it for data-cleaning, e.g. I have a script that walks through the collection, looks for suspect citation IDs and deletes them.
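Walking a whole library through the Web API means paging, because results come back in chunks. The sketch below shows only the paging and filtering logic, with the network call injected as a function so the loop can be tried without credentials. `fetch_page` and `is_suspect` are hypothetical callables, not part of any Zotero library, and this is not the data-cleaning script mentioned above.

```python
def iter_items(fetch_page, page_size=100):
    """Yield every item by calling fetch_page(start, limit) until a short page arrives."""
    start = 0
    while True:
        page = fetch_page(start, page_size)
        yield from page
        if len(page) < page_size:
            break
        start += page_size


def suspect_items(items, is_suspect):
    """Collect the items that a caller-supplied predicate flags for cleanup."""
    return [item for item in items if is_suspect(item)]


if __name__ == "__main__":
    # Stand-in for the real library: 250 fake items served in slices.
    library = [{"key": f"ITEM{i}", "title": "" if i % 50 == 0 else "ok"} for i in range(250)]
    fetch = lambda start, limit: library[start:start + limit]
    flagged = suspect_items(iter_items(fetch), lambda item: not item["title"])
    print(len(flagged))
```

In a real script `fetch_page` would wrap an HTTP request using the `start` and `limit` query parameters, and the flagged items would be reviewed before anything is deleted.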

I’m currently writing a new script to sync tags and collections (i.e. putting an article in a collection called gp_regression will also tag it with gp_regression). My justification is that the UIs are deficient in different ways: e.g. I can’t search for articles by intersection using folders (i.e. being in two folders) but I can with tags. On the other hand, actually assigning tags is gruelling because the UI requires way too much dragging and dropping of tiny widgets. Zotero 7 makes it easier to view other collections a given article is in, which is half the battle.
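The core of such a sync is a small pure function: given an item's data and a lookup from collection key to collection name, work out the tag list the item should end up with. A sketch under the assumption that item data looks like the Web API's JSON, where "collections" is a list of collection keys and "tags" is a list of objects with a "tag" field. The function name and the sample keys are invented.

```python
def tags_from_collections(item_data, collection_names):
    """Return the item's tags plus one tag per collection it belongs to.

    item_data: dict shaped like the "data" object of an API item (assumed).
    collection_names: dict mapping collection key -> collection name.
    The input is not modified; existing tag objects are copied as they are.
    """
    merged = [dict(entry) for entry in item_data.get("tags", [])]
    present = {entry["tag"] for entry in merged}
    for key in item_data.get("collections", []):
        name = collection_names.get(key)
        if name and name not in present:
            merged.append({"tag": name})
            present.add(name)
    return merged


if __name__ == "__main__":
    data = {"collections": ["ABCD1234"], "tags": [{"tag": "wavelets"}]}
    print(tags_from_collections(data, {"ABCD1234": "gp_regression"}))
```

Keeping this step free of network calls makes it easy to check the merge rule before wiring it to requests that actually write back to the library.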

5.2 Client side

The client has a JavaScript API, as it’s essentially a JavaScript app. I could use this to develop plugins and such, but actually I don’t have time, because this is yak shaving.

Mini-trick: I can execute JavaScript manually for one-off processes, e.g. batch editing.

I briefly tried to use the client API to write a custom exporter for Zotero. I didn’t finish because Better BibTeX does what I wanted.

Nonetheless, if I wanted to reinvent the wheel, or do something new, here are overview docs, detail docs, and all the code.

One problem I had when I last tried this was writing output formats that need a unique citekey reference to items in the bibliography. Here is a simple example which deals with that citekey issue (albeit with an outdated version of the citekey system). Here is a soothing walk-through of the whole process. Once again, BBT solves that problem, so I won’t spend any more thought on it.

6 Blog integration

tl;dr. There are many over-engineered solutions to get Zotero citations into a blog. I use blogdown with Better BibTeX export. Read on for other alternatives.

6.1 Via BetterBibTeX and blogdown

See BetterBibTex export.

6.2 Zotero-mdnotes

Zotero-mdnotes is

A Zotero plugin to export item metadata and notes as markdown files.

Possibly a competitor: the more featureful windingwind/zotero-better-notes.

6.3 Via CSL

CSL is a citation-rendering mini-language used by modern journals and software to express house style. And I can use it, too.

Figure 2: Citation export workflow

This is a slightly weird way to get plain-text citations out; CSL is a system for instructing citation software how to render citations in a rich-text word processor, but it can be forced to pump out plain text or markdown. It’s robust and simple. There’s a CSL editor online so this is easy-ish. One catch is that I wanted to get my BibTeX citation keys out to refer to them. But BibTeX keys are not accessible to CSL, so this doesn’t work. AFAICT, this is built into the CSL spec.

CSL has a citation-label variable, but it doesn’t correspond to the BibTeX keys generated by BBT, which is unsatisfactory.

  • Here is my docutils/ReST style file, restructuredtext.csl, which renders citations as plain text with ReST markup, including anchor links.

    It’s ugly, because it has to battle with a grumpy rich-text XML infrastructure to render plain text, but it gets the job done without any coding, and is robust against software changes.

  • Similarly, here is my Markdown style file, markdown.csl, which likewise renders citations as plain text with Markdown markup, including anchor links.

  • Emiliano Heyns has a BibTeX key CSL hack, which renders Emiliano’s particular preferred citekey \{LeDT06} by reimplementing the BibTeX key generation (sadly, it doesn’t quite match mine.)

6.4 Dynamic bibliography generation in situ

Erik Hetzner’s zotxt can avoid the need to create .bib files, rendering bibliographies directly by querying the Zotero app rather than creating an intermediate file. I don’t mind keeping the intermediate file around because it’s shareable, so I haven’t pursued this.

7 Better in-zotero lookup

The default Zotero lookup engine is Google Scholar with strict date matching and strict author first name matching, which is not usually what I want. I revised the engines.json file to include better search. On macOS, for me this is

code ~/Zotero/locate/engines.json

Instead of just using the default search

[
    {
        "_name": "Google Scholar",
        "_alias": "Google Scholar",
        "_description": "Google Scholar Search",
        "_icon": "file:///Users/mac581/Zotero/locate/Google%20Scholar%20Search.ico",
        "_hidden": false,
        "_urlTemplate": "https://scholar.google.com/scholar?as_q=&as_epq={z:title}&as_occt=title&as_sauthors={rft:aufirst?}+{rft:aulast?}&as_ylo={z:year?}&as_yhi={z:year?}&as_sdt=1.&as_sdtp=on&as_sdtf=&as_sdts=22&",
        "_urlParams": [],
        "_urlNamespaces": {
            "rft": "info:ofi/fmt:kev:mtx:journal",
            "z": "http://www.zotero.org/namespaces/openSearch#",
            "": "http://a9.com/-/spec/opensearch/1.1/"
        },
        "_iconSourceURI": "https://scholar.google.com/favicon.ico"
    }
]

I added the following bonus entries:

    {
        "_name": "Relaxed Google Scholar",
        "_alias": "Relaxed Google Scholar",
        "_description": "Google Scholar Search with sloppy matching",
        "_icon": "https://scholar.google.com/favicon.ico",
        "_hidden": false,
        "_urlTemplate": "https://scholar.google.com/scholar?as_q=&as_epq={z:title?}&as_occt=title&as_sauthors={rft:aulast?}",
        "_urlParams": [],
        "_urlNamespaces": {
            "rft": "info:ofi/fmt:kev:mtx:journal",
            "z": "http://www.zotero.org/namespaces/openSearch#"
        },
        "_iconSourceURI": "https://scholar.google.com/favicon.ico"
    },
    {
        "_name": "Semantic Scholar",
        "_alias": "SemanticScholar",
        "_description": "Search on Semantic Scholar by title and first author's last name",
        "_icon": "https://www.semanticscholar.org/favicon.ico",
        "_hidden": false,
        "_urlTemplate": "https://www.semanticscholar.org/search?q=\"{z:title?}\"+{rft:aulast?}",
        "_urlParams": [],
        "_urlNamespaces": {
            "rft": "info:ofi/fmt:kev:mtx:journal",
            "z": "http://www.zotero.org/namespaces/openSearch#"
        },
        "_iconSourceURI": "https://www.semanticscholar.org/favicon.ico"
    }
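Since engines.json is a JSON array, the bonus entries have to end up inside the same top-level list as the default engine. Rather than hand-editing, the merge can be scripted. A sketch, assuming the file is a plain list of engine objects keyed by `_name` as in the listing above; the function names are mine, and I would back up the file before running anything like this against it.

```python
import json


def merge_engines(existing, extra):
    """Return existing engines plus any extra ones whose _name is not already present."""
    names = {engine["_name"] for engine in existing}
    merged = list(existing)
    for engine in extra:
        if engine["_name"] not in names:
            merged.append(engine)
            names.add(engine["_name"])
    return merged


def update_engines_file(path, extra):
    """Read engines.json, merge in the extra engines, and write the result back."""
    with open(path, encoding="utf-8") as handle:
        existing = json.load(handle)
    merged = merge_engines(existing, extra)
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(merged, handle, indent=4)
    return merged


if __name__ == "__main__":
    default = [{"_name": "Google Scholar"}]
    bonus = [{"_name": "Semantic Scholar"}, {"_name": "Google Scholar"}]
    print([engine["_name"] for engine in merge_engines(default, bonus)])
```

Matching on `_name` makes the script safe to rerun, because entries that are already present are left alone.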

8 Promising plugins

Auditioning: