Assumed audience:

People who run code on multiple machines

Figure 1

Tips for configuring ML and other apps.

There are many, many tools to load environment variables from local files. A good generic resource is the configuration chapter of the “Twelve-Factor App”. But that includes eleven more factors than I personally care about because I am not a web developer; I just want the environment config part.

Previously this page was about the Python tool dotenv. But actually, why bother restricting ourselves to Python? Let’s configure environment variables from the shell where we actually need them. I have now switched to direnv, which does everything I need.

1 direnv

direnv is a small shell extension that automatically loads and unloads environment variables depending on the directory you are in. Put a file called .envrc in your project root and direnv will evaluate it whenever you cd into that directory. Leave the directory, and those variables are removed from your shell.

1.1 Installation

1.1.1 Generic (rootless, works everywhere)

The official install script downloads a release binary from GitHub and places it in a directory on your $PATH:

curl -sfL https://direnv.net/install.sh | bash

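Unless bin_path is set, the script installs into the first writable directory it finds on your $PATH. To install into ~/.local/bin, create that directory and run export bin_path="$HOME/.local/bin" before running the script.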

Make sure ~/.local/bin is in your $PATH:

  • Bash / Zsh

    echo 'export PATH="$HOME/.local/bin:$PATH"' >> ~/.bashrc   # or ~/.zshrc
  • Fish

    fish_add_path $HOME/.local/bin   # fish 3.2+; avoids creating a universal PATH variable

1.1.2 Package managers

brew install direnv      # homebrew
sudo apt install direnv  # debian etc

1.1.3 Hook into the shell and verify

After installation, hook direnv into your shell so it runs on every prompt:

# Bash
echo 'eval "$(direnv hook bash)"' >> ~/.bashrc

# Zsh
echo 'eval "$(direnv hook zsh)"' >> ~/.zshrc

# Fish
echo 'direnv hook fish | source' >> ~/.config/fish/config.fish

Restart your shell afterwards.

Check that direnv is installed and hooked correctly:

direnv version     # should print the installed version
direnv status      # should report that no .envrc is loaded if none is active

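If direnv status shows errors, or direnv stays silent when you cd into a directory containing an .envrc, double-check that you restarted your shell and that the hook line is present in your ~/.bashrc, ~/.zshrc, or fish config. With a working hook, entering a directory with a not-yet-allowed .envrc prints a message that the file is blocked.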

1.2 Using .envrc

A minimal .envrc looks like this:

export DATA_PATH="$HOME/data"                        # set a var
export RESULTS_DIR="${RESULTS_DIR:-/tmp/results}"    # set a var if not set
  • If RESULTS_DIR is unset or empty when we enter the directory, it defaults to /tmp/results.
  • If we export it beforehand (export RESULTS_DIR=/scratch/me), that value takes precedence.
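
On the consuming side, a script can apply the same fallback rule itself, which is handy when it might also be run outside the direnv-managed directory. A minimal sketch, assuming the variable names from the .envrc above; the helper names env_or_default and results_dir are made up for illustration:

```python
import os
from pathlib import Path


def env_or_default(name, default, environ=os.environ):
    """Mimic the shell's ${NAME:-default}: fall back when NAME is unset OR empty."""
    value = environ.get(name, "")
    return value if value else default


def results_dir(environ=os.environ):
    """Where results go: RESULTS_DIR if set and non-empty, else /tmp/results."""
    return Path(env_or_default("RESULTS_DIR", "/tmp/results", environ))


if __name__ == "__main__":
    print(results_dir())
```

Note that plain os.environ.get("RESULTS_DIR", "/tmp/results") is slightly different: it returns an empty string when the variable is set but empty, whereas the :- form in the shell falls back in that case too.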

We activate it with:

direnv allow

This whitelists the .envrc. If I later edit it (or pull changes from git), I must re-run direnv allow. I can force a reload at any time with direnv reload.

1.3 Benefits

  • Variables are set before you run commands in that directory.
  • Different projects can have different .envrc files without clashing.
  • Defaults can be layered: project-specific defaults, common fallbacks, and user overrides.
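
The layering idea in the last bullet can be pictured as a lookup through several mappings in priority order. This is only an illustration of the precedence, not how direnv works internally; the three dictionaries and their values are hypothetical:

```python
from collections import ChainMap

# Hypothetical layers, lowest priority first.
common_fallbacks = {"RESULTS_DIR": "/tmp/results", "DATA_PATH": "/data"}
project_defaults = {"DATA_PATH": "/home/username/data"}
user_overrides = {"RESULTS_DIR": "/scratch/me"}

# ChainMap searches its mappings left to right, so the first one wins.
config = ChainMap(user_overrides, project_defaults, common_fallbacks)

if __name__ == "__main__":
    for key in sorted(config):
        print(f"{key}={config[key]}")
```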

1.4 Pitfalls

  • Environment is tied to the current directory. If I cd somewhere else, the variables are automatically unloaded. Processes I have already started keep the environment they inherited, but my interactive shell does not.
  • If I open an interactive subshell with the direnv hook active and then cd, the variables may be cleared inside that subshell too. Non-interactive shells are fine.
  • Because .envrc is a Bash script, it can run arbitrary code. That is why direnv refuses to load it until I have reviewed it and run direnv allow.
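
The first pitfall follows from how processes inherit their environment: a child gets a copy at launch, and later changes in the parent do not reach it. A small sketch of that behaviour, where setting and deleting a variable in the parent stands in for direnv loading and unloading it; DEMO_DATA_PATH is a made-up variable name:

```python
import os
import subprocess
import sys

# Child program: wait for a line on stdin, then report its view of the variable.
CHILD_CODE = (
    "import os, sys; sys.stdin.readline(); "
    "print(os.environ.get('DEMO_DATA_PATH', '<unset>'))"
)


def child_view_after_unload():
    """Start a child while the variable is set, unset it in the parent, then ask the child."""
    os.environ["DEMO_DATA_PATH"] = "/home/username/data"  # stand-in for entering the directory
    child = subprocess.Popen(
        [sys.executable, "-c", CHILD_CODE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    del os.environ["DEMO_DATA_PATH"]  # stand-in for leaving the directory
    out, _ = child.communicate("go\n")
    return out.strip()


if __name__ == "__main__":
    print(child_view_after_unload())
```

The child still reports the value it was started with, even though the parent no longer has the variable.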

In practice, direnv gives us the simplicity of .env files, but integrated into the shell, making configuration language-agnostic and convenient for both Python scripts and general tools.

2 Python dotenv

One system I have used is dotenv. dotenv allows easy configuration through OS environment variables or a .env text file, which it looks for in the script's directory and then in each parent directory in turn.

There are lots of packages with similar names but dissimilar functions.

pip install python-dotenv # or
conda install -c conda-forge python-dotenv

Also similar: henriquebastos/python-decouple and sloria/environs. Dynaconf is sophisticated and comes closer to a full configuration system like Hydra, and as such is too much for me.

Let us imagine we are using basic dotenv for now for concreteness. Then we can be indifferent to whether a setting came from a config file or an environment variable.

import os
from dotenv import load_dotenv
# Load variables from .env; by default, variables already set in the
# environment are not overridden.
load_dotenv()
# From here on, application code reads os.environ / os.getenv as if the
# values had come from the actual environment.
# Substituting variables into a path:
DATA_FILE_PATH = os.path.expandvars('$DATA_PATH/$DATA_FILE')
# Getting a variable with a default fallback:
FAVOURITE_PIZZA_TOPPING = os.getenv('FAVOURITE_PIZZA_TOPPING', 'cheese')

The .env file is just a text file with lines like:

DATA_PATH=/home/username/data
DATA_FILE=foo.csv
FAVOURITE_PIZZA_TOPPING=anchovies
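
To make the semantics concrete, here is a rough standard-library sketch of what loading such a file amounts to. It is not python-dotenv's actual implementation, which also handles quoting, export prefixes and variable interpolation. The function names parse_env_text and apply_env are made up for illustration:

```python
import os


def parse_env_text(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def apply_env(values, environ=os.environ, override=False):
    """Copy parsed values into environ; existing variables win unless override=True."""
    for key, value in values.items():
        if override or key not in environ:
            environ[key] = value


if __name__ == "__main__":
    sample = "DATA_PATH=/home/username/data\nDATA_FILE=foo.csv\n"
    env = {"DATA_FILE": "bar.csv"}  # pretend this was already exported
    apply_env(parse_env_text(sample), env)
    print(env)
```

The override flag mirrors the default behaviour of load_dotenv, where a value already exported in the shell takes precedence over the one in the file.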

There is a CLI too; its most useful feature is executing arbitrary stuff with the correct environment variables set.

pip install "python-dotenv[cli]"
dotenv run -- python my_cool_script.py

This works for any command, not only Python scripts, but it needs a Python install and an explicit wrapper on every invocation. So actually, why not just use direnv?