Marimo

A Python visual notebook that works more like I imagined scientific notebooks should

2024-11-04 — 2025-02-02

faster pussycat

premature optimization

python

Assumed audience:

People interactively developing code on annoying remote clusters

marimo is a Python-specific alternative computational notebook that solves many pain points of Jupyter (HT Jean-Michel Perraud).

1 Value proposition

The FAQ explains it well (quoted below), but I can summarise. tl;dr: Marimo is a differently imperfect compromise between the needs for reproducibility and reliability. Its abstractions are less likely to spill on my trousers than Jupyter's, while it stays more interactive than a pure Python script.

marimo solves problems in reproducibility, maintainability, interactivity, reusability, and shareability of notebooks.

Reproducibility. In Jupyter notebooks, the code you see doesn’t necessarily match the outputs on the page or the program state. If you delete a cell, its variables stay in memory, which other cells may still reference; users can execute cells in arbitrary order. This leads to widespread reproducibility issues. One study analysed 10 million Jupyter notebooks and found that 36% of them weren’t reproducible.

In contrast, marimo guarantees that your code, outputs, and program state are consistent, eliminating hidden state and making your notebook reproducible. marimo achieves this by intelligently analysing your code and understanding the relationships between cells and automatically re-running cells as needed.

Maintainability. marimo notebooks are stored as pure Python programs (.py files). This lets you version them with git; in contrast, Jupyter notebooks are stored as JSON and require extra steps to version.

Interactivity. marimo notebooks come with UI elements that are automatically synchronised with Python (like sliders, dropdowns); e.g., scrub a slider and all cells that reference it are automatically re-run with the new value. This is difficult to get working in Jupyter notebooks.

Reusability. marimo notebooks can be executed as Python scripts from the command line (since they’re stored as .py files). In contrast, this requires extra steps to do for Jupyter, such as copying and pasting the code out or using external frameworks. In the future, we’ll also let you import symbols (functions, classes) defined in a marimo notebook into other Python programs/notebooks, something you can’t easily do with Jupyter.

Shareability. Every marimo notebook can double as an interactive web app, complete with UI elements, which you can serve using the marimo run command. This isn’t possible in Jupyter without substantial extra effort.
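
To make the hidden-state complaint concrete, here is a minimal sketch in plain Python (neither Jupyter nor marimo is involved). It mimics a classic notebook kernel as one shared namespace; the `kernel` dict and the two cell strings are made up for illustration.

```python
# A toy "kernel": one namespace shared by every cell, as in a classic notebook.
kernel = {}

cells = {
    "cell_a": "threshold = 10",
    "cell_b": "filtered = [x for x in range(20) if x > threshold]",
}

# Run both cells once, in page order.
for source in cells.values():
    exec(source, kernel)

# Now "delete" cell_a from the page. Its variable is still in memory...
del cells["cell_a"]

# ...so re-running cell_b still works, although nothing visible defines `threshold`.
exec(cells["cell_b"], kernel)
still_works = "threshold" in kernel

# A fresh kernel (what a collaborator gets) fails on the same visible code.
fresh_kernel = {}
try:
    exec(cells["cell_b"], fresh_kernel)
    fresh_run_error = None
except NameError as err:
    fresh_run_error = err
```

The page and the memory have drifted apart: the author's session runs fine while a fresh session raises `NameError`. That drift is the thing marimo's dependency tracking is meant to rule out.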

The prices we pay:

Marimo is less widely supported. Jupyter is everywhere.
Unlike Jupyter, Marimo does not store the output of cells, so you can’t see the output of a cell without running it (unless you introduce your own explicit caching). This is a loss, true, but that supposed “feature” of Jupyter has caused me more pain than joy, so I do not miss it. [traumatic flashback to purging a gigabyte-sized notebook from my git repo]
The “topological” execution order of cells can be confusing because it is not what Python traditionally does, although it is the only way to keep a notebook consistent. Note that notebook cells can appear in any order on the page, but they may execute in a totally different and sometimes surprising order (e.g. if you made a typo and defined a variable somewhere foolish).
The browser UI is pretty good (better than Jupyter's), but not quite as good as my VS Code setup.
There is a marimo VS Code integration, but it is a bit janky: not as fancy as the Jupyter integration, and not as good as the native browser version (with all due respect to the developer!). Make sure you use their recommended settings.
To keep execution order deterministic and names consistent, you can’t change the referent of a variable between cells. That would be fine in a functional language but is kind of tedious in Python, whose programming patterns depend on rebinding names; we end up with lots of awkwardly named things like experiment1, experiment2, etc. There are patterns to work around it but they are not idiomatic.
… Not sure yet. I’ll note problems as I discover them.
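
One workaround for the no-redefinition rule in the list above can be sketched in plain Python: keep intermediate names inside a function, and collect variants in a single dict rather than minting a new global per run. `run_experiment` and its toy "loss" are hypothetical stand-ins, not anything marimo provides.

```python
# Hypothetical stand-in for whatever an "experiment" is in your notebook.
def run_experiment(learning_rate: float, steps: int = 3) -> list:
    """Toy experiment: shrink a loss value geometrically."""
    loss = 1.0  # local name, so it never collides with names in other cells
    history = []
    for _ in range(steps):
        loss *= 1.0 - learning_rate
        history.append(round(loss, 4))
    return history

# One global name instead of experiment1, experiment2, ...
experiments = {lr: run_experiment(lr) for lr in (0.5, 0.1)}
```

Only `run_experiment` and `experiments` become notebook-level names here. marimo also documents that names prefixed with an underscore are local to their cell, which is another escape hatch; it is not shown because it only means something inside a marimo notebook.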

Places where marimo’s trade-offs are likely to be worthwhile for me:

Development of code on HPC clusters where we want interactivity and persistence.
Sharing code in a literate/exploratory way, i.e. with colleagues or students.
Maybe building dashboards?

Places where I might prefer Jupyter:

If I were working on some system that uses Jupyter but bans Marimo. This might arise in situations like Google Colab, where Jupyter is the primary interface, or in other turnkey data-science systems.

Places where I would use neither:

When I am working inside VS Code on my local machine and have my IDE set up just how I like it, and have no need to share with others. Then I will use that and plot using the local GUI infrastructure and the local AI coding assistants.

Note that this leaves little niche for Jupyter in my life.

2 Installation

pip install marimo

See Getting Started with marimo for other options.

3 IDE integration

IDE integration is marimo’s weak suit for me thus far. I use VS Code for Python — the marimo extension exists but is somewhat janky.

I tend not to use it, and run marimo notebooks in the browser instead.

As per this GitHub issue, I needed the following config for VS Code to find the marimo interpreter and stop it from beachballing forever:

{
  "python.defaultInterpreterPath": "${workspaceFolder}/.venv/bin/python",
  "marimo.marimoPath": "${workspaceFolder}/.venv/bin/marimo",
  "marimo.debug": true
}

If your local Python environment lives somewhere other than .venv, change these paths to match.

The extension seems to be incompatible with ruff auto-linting.

4 Markdown

There is rich markdown support. Nice. It might not behave as expected: markdown is generated by Python code, so markdown cells are not rendered until they are executed, which is not what classic notebooks such as Mathematica or Jupyter do. On the other hand, it has plus sides, like you can add Python code to your markdown.

I could not see it documented, but for markdown to work, the first cell in the notebook should say

import marimo as mo

Symptoms of not doing this: the error NameError("name 'mo' is not defined").

5 Remote access

Marimo runs a web app that can be accessed remotely. One can forward connections manually. Pro-tip: if you launch marimo from a VS Code Remote session, VS Code will set up the tunnel automatically.

6 Debugging

For some reason it’s only documented in an image on LinkedIn, but interactive debuggers are supported.

import pdb; pdb.set_trace()  # classic form
breakpoint()  # built-in equivalent since Python 3.7; either one works

7 Distributing marimo notebooks

7.1 In Python packages

You already know how to do this via setuptools.

7.2 As environments with self-contained requirements

Nifty! See the following intros

This takes advantage of the PEP 723 inline metadata mechanism, where a code comment at the top of a Python file can list package dependencies (and their versions).

I tried this out by installing marimo using uv:
uv tool install --python=3.12 marimo
Then grabbing one of their example notebooks:
wget 'https://raw.githubusercontent.com/marimo-team/spotlights/main/001-anywidget/tldraw_colorpicker.py'
And running it in a fresh dependency sandbox like this:
marimo run --sandbox tldraw_colorpicker.py
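
For the curious, here is roughly what that inline metadata looks like and how a tool can read it. This is a sketch: the regular expression is the one given in PEP 723's reference implementation, while the notebook header and the `extract_script_metadata` helper are hypothetical and not taken from marimo or uv.

```python
import re
from typing import Optional

# Regular expression from the PEP 723 reference implementation.
PEP723_BLOCK = re.compile(
    r"(?m)^# /// (?P<type>[a-zA-Z0-9-]+)$\s(?P<content>(^#(| .*)$\s)+)^# ///$"
)

# Hypothetical notebook header; the dependency list is made up for illustration.
EXAMPLE_NOTEBOOK = '''\
# /// script
# requires-python = ">=3.12"
# dependencies = [
#     "marimo",
#     "anywidget",
# ]
# ///

import marimo
'''

def extract_script_metadata(source: str) -> Optional[str]:
    """Return the TOML text of the `script` metadata block, or None if absent."""
    for match in PEP723_BLOCK.finditer(source):
        if match.group("type") == "script":
            lines = match.group("content").splitlines(keepends=True)
            # Strip the leading "# " (or a bare "#") from each comment line.
            return "".join(
                line[2:] if line.startswith("# ") else line[1:] for line in lines
            )
    return None

metadata_toml = extract_script_metadata(EXAMPLE_NOTEBOOK)
```

`metadata_toml` is plain TOML text, so on Python 3.11+ it can be handed to `tomllib.loads` to get the dependency list as a Python object. Because the block is only comments, the file remains an ordinary Python script for anything that ignores it.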

7.3 In the browser

It can run (purely) in the browser without installing Python.

8 Tips

8.1 Dotenv

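dotenv is weird in marimo. The incantation that works is to search for the .env file starting from the current working directory:

import dotenv
dotenv.load_dotenv(dotenv.find_dotenv(usecwd=True))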

8.2 Caching some outputs

There is a native cache that will cache the output of a cell if we want it to.

8.3 Extra UI widgets

Extra UI widgets? koaning/wigglystuff: A collection of creative AnyWidgets for Python notebook environments.

8.4 Quarto integration

Prototype Quarto integration: marimo-quarto.

8.5 Execution order

Execution order is worth reading about: marimo ensures that cells are executed in a consistent order to maintain reproducibility. This means that if you modify a cell, marimo will automatically re-run any dependent cells to ensure the notebook’s state is consistent. That means cells are not executed in the order you see them on the page, so e.g. you can put boilerplate imports at the end of the notebook. I would not do that because why introduce weirdness?
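
To see how an execution order can fall out of the code itself, here is a toy sketch using only the standard library. It is emphatically not marimo's implementation (which has to handle far more cases); the cell names and contents are invented, and the analysis only looks at simple variable names.

```python
import ast
from graphlib import TopologicalSorter

# Hypothetical cells in page order; the setup cell deliberately comes last.
cells = {
    "plot": "summary = f'{name}: {total}'",
    "compute": "total = sum(values)",
    "setup": "values = [1, 2, 3]\nname = 'demo'",
}

def defs_and_refs(source: str) -> tuple:
    """Return (names the cell assigns, names it reads but does not assign)."""
    defs, refs = set(), set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            (defs if isinstance(node.ctx, ast.Store) else refs).add(node.id)
    return defs, refs - defs

analysis = {cell: defs_and_refs(src) for cell, src in cells.items()}

# Which cell defines each name? (A real tool must reject duplicate definitions.)
defined_in = {name: cell for cell, (defs, _) in analysis.items() for name in defs}

# Map each cell to the cells it depends on; builtins such as `sum` have no definer.
graph = {
    cell: {defined_in[name] for name in refs if name in defined_in}
    for cell, (_, refs) in analysis.items()
}

run_order = list(TopologicalSorter(graph).static_order())

# Execute in dependency order rather than page order.
namespace = {}
for cell in run_order:
    exec(cells[cell], namespace)
```

Here `run_order` comes out as setup, compute, plot, even though the page lists them the other way round. The `defined_in` mapping also hints at why redefining a name in two cells is forbidden: the graph would no longer have a single answer for "who provides this variable?".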

9 File format

The file format of marimo is clever. It uses Python code to encode Python code. That might not sound revolutionary, but Jupyter used JSON to encode Python code and that has created an ongoing quagmire.

In marimo, there are Jupyter-like cells, but they get their functionality via decorators. They also execute like normal Python code when needed. Here is an example of what a marimo notebook looks like on the inside:

import marimo

__generated_with = "0.9.32"
app = marimo.App(width="medium")

@app.cell
def __():
    import marimo as mo
    return (mo,)

@app.cell
def __():
    print("Hello world")
    return

@app.cell
def __(mo):
    mo.md(
        r"""
        ## Markdown is supported

        You can write in **bold**.
        """
    )
    return

if __name__ == "__main__":
    app.run()
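
The decorator trick is less magical than it looks. Below is a toy sketch of the idea, assuming nothing about marimo's internals: `ToyApp` is a hypothetical stand-in for `marimo.App`, and for simplicity its cells return a dict of outputs, whereas real marimo cells return a tuple and the names are read from the source.

```python
import inspect

class ToyApp:
    """Hypothetical stand-in for marimo.App; not marimo's implementation."""

    def __init__(self):
        self._cells = []

    def cell(self, func):
        # Registering is all the decorator does; the function stays a plain function.
        self._cells.append(func)
        return func

    def run(self) -> dict:
        namespace = {}
        pending = list(self._cells)
        while pending:
            # A cell is ready once every one of its parameter names has been produced.
            ready = [
                f for f in pending
                if all(p in namespace for p in inspect.signature(f).parameters)
            ]
            if not ready:
                raise RuntimeError("unresolved or cyclic cell inputs")
            for func in ready:
                inputs = {p: namespace[p] for p in inspect.signature(func).parameters}
                namespace.update(func(**inputs) or {})
                pending.remove(func)
        return namespace

app = ToyApp()

@app.cell
def _(greeting, audience):  # first on the page, but it needs two inputs
    return {"message": f"{greeting}, {audience}!"}

@app.cell
def _():
    return {"greeting": "Hello"}

@app.cell
def _():
    return {"audience": "world"}

result = app.run()
```

The function parameters declare what a cell needs and the return value declares what it provides, so the file is both valid Python and a description of the dependency graph. That is the property that makes a plain .py file workable as a notebook format.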

Most notebooks can be exported to vanilla Python scripts with the marimo export script command.

marimo export script your_notebook.py -o your_notebook_script.py

NB This doesn’t work if crazy asynchronous stuff is going on.