Python packaging, versioning and isolating


How do you install the right versions of everything for some python code you are developing? How do you deploy it sustainably? How do you share it with others?

Not so hard, but confusing and chaotic, because many long-running disputes are only now being resolved.

General

pip

The default python package installer. It is best spelled

python -m pip install package_name
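One reason this spelling is preferred, as far as I can tell, is that it ties pip to one specific interpreter rather than to whichever pip script happens to come first on the PATH. A minimal sketch of the same idea from inside python; pip_install_command is a hypothetical helper name, not part of pip:

```python
import sys


def pip_install_command(package_name):
    """Build a pip command bound to the interpreter running this script."""
    # sys.executable is the absolute path of the current python interpreter,
    # so the package lands in the environment this interpreter belongs to.
    return [sys.executable, "-m", "pip", "install", package_name]


command = pip_install_command("package_name")
print(" ".join(command))
# To actually run it: subprocess.run(command, check=True)
```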

Pro tip: pipx:

pip is a general-purpose package installer for both libraries and apps with no environment isolation. pipx is made specifically for application installation, as it adds isolation yet still makes the apps available in your shell: pipx creates an isolated environment for each application and its associated packages.
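To make the "one isolated environment per application" idea concrete, here is a rough sketch using the standard-library venv module. It is not how pipx is implemented; make_app_env and the app names are made up for illustration, and with_pip=False is only there to keep the demo quick:

```python
import tempfile
import venv
from pathlib import Path


def make_app_env(root, app_name):
    """Create one isolated virtual environment per application (hypothetical helper)."""
    env_dir = Path(root) / app_name
    # with_pip=False keeps the sketch fast; a real installer needs pip inside the env
    venv.EnvBuilder(with_pip=False).create(env_dir)
    return env_dir


# Each app gets its own directory, so their dependencies cannot clash
root = tempfile.mkdtemp(prefix="apps-")
env_a = make_app_env(root, "app_a")
env_b = make_app_env(root, "app_b")
print(env_a)
print(env_b)
```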

Anaconda

The distribution you use if you want to teach a course in numerical python without dicking around with a 5 hour install process.

Setup

Download e.g. Linux x64 Miniconda, from the download page.

bash Miniconda3-latest-Linux-x86_64.sh
# login/logout here
# or do something like `exec bash -` if you are fancy
# Less aggressive conda
conda config --set auto_activate_base false
# conda for fish users
conda init fish

Conda has a slightly different packaging workflow. See, e.g., Tim Hopper’s workflow, which explains this environment.yml malarkey, or the creators’ rationale and manual.

The upshot for the end user is that if I want to install something with tricky dependencies like ViTables, I do this:

conda install pytables=3.2
conda install pyqt=4

Aside: I use fish shell, so need to do some extra setup. Specifically, I add the line

source (conda info --base)/etc/fish/conf.d/conda.fish

into ~/.config/fish/config.fish.

For jupyter compatibility one needs

conda install nb_conda_kernels

Care and feeding

NB: Conda will fill up your hard disk if not regularly disciplined via conda clean.

conda clean -pt
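To see whether a clean-up is due, it helps to measure how much space a directory tree is eating. A small sketch with the standard library only; dir_size_bytes is a hypothetical helper, and which directory you point it at (for example the package cache inside your conda install) is an assumption about your setup:

```python
import os
import tempfile


def dir_size_bytes(path):
    """Sum the sizes of all regular files below path, skipping symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            file_path = os.path.join(dirpath, name)
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total


# Demo on a throwaway directory rather than a real conda install
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "fake_package.tar.bz2"), "wb") as f:
    f.write(b"x" * 2048)
demo_size = dir_size_bytes(demo_dir)
print(demo_size)
```

Running it before and after conda clean gives a rough idea of how much was reclaimed.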

One exports the current conda environment config, by convention, into environment.yml.

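# Save the current environment's spec
conda env export > environment.yml
# Recreate the environment from that file, e.g. on another machine
conda env create -f environment.yml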
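For orientation, a hand-written environment.yml is usually much shorter than an exported one: a name, some channels, and a list of dependencies. The sketch below renders such a file as text. The environment name, channel and package pins are placeholders I invented, and render_environment_yml is a hypothetical helper, not a conda API:

```python
def render_environment_yml(name, channels, dependencies):
    """Render a minimal conda environment file as text (illustrative only)."""
    lines = [f"name: {name}", "channels:"]
    lines += [f"  - {channel}" for channel in channels]
    lines.append("dependencies:")
    lines += [f"  - {dependency}" for dependency in dependencies]
    return "\n".join(lines) + "\n"


# Placeholder values; substitute whatever your project actually needs
text = render_environment_yml("myenv", ["conda-forge"], ["python=3.8", "numpy"])
print(text)
```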

No MKL

I might also want to avoid installing the gigantic MKL library, not being a fan. You can usually disable it on request:

conda create -n pynomkl python nomkl

This does not always work; the packagers evidently do not test it often. Worth trying, however. Between the various versions and installed copies, MKL alone was using about 10GB total on my mac when I last checked. I also try to reduce the number of copies of MKL by starting from miniconda as my base anaconda distribution, cautiously adding things as I need them.

Local environment

A local environment folder, kept inside the project, is more isolated than keeping all environments somewhere global.

conda config --set env_prompt '({name})' 
conda env create --prefix ./env/myenv -f environment_linux.yml 
conda activate ./env/myenv

Gotcha: in fish shell the first line needs to be

conda config --set env_prompt '\({name}\)' 

I am not sure why. AFAIK, fish command substitution does not happen inside strings.

Either way, this will add the line

env_prompt: ({name})

to .condarc.

venv

venv is now a built-in python virtual environment system in python 3. It doesn’t support python 2, but it fixes various problems; e.g. it supports framework python on macOS, which is important for GUIs. It is covered by the python docs in the python virtual environment introduction. It has a higher-level, er, …wrapper (?) called pipenv.

# Create venv
python3 -m venv ~/.virtualenvs/learning_gamelan_keras_2
# Use venv from fish
source ~/.virtualenvs/learning_gamelan_keras_2/bin/activate.fish
# Use venv from bash
source ~/.virtualenvs/learning_gamelan_keras_2/bin/activate
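After sourcing one of those activate scripts, it is easy to lose track of whether the environment is really in effect. A quick check from inside python: in a venv, sys.prefix points at the environment while sys.base_prefix still points at the base installation. The function name in_virtualenv is my own:

```python
import sys


def in_virtualenv():
    """Return True when this interpreter is running inside a venv."""
    # Outside a venv the two prefixes are identical
    return sys.prefix != sys.base_prefix


print(sys.executable)  # which python binary is actually running
print(in_virtualenv())
```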

Python environment management management

One suggestion I’ve heard is to use pyenv, which eases and automates switching between all the other python environments created by virtualenv, python.org python, OS python, anaconda python etc.

BUT WHO MANAGES THE VIRTUALENV MANAGER MANAGER? What is going on?

Logan Jones explains:

  • pyenv manages multiple versions of Python itself.
  • virtualenv/venv manages virtual environments for a specific Python version.
  • pyenv-virtualenv manages virtual environments across varying versions of Python.

Anyway, pyenv compiles a custom version of python and as such is extremely isolated from everything else. Here is an introduction with emphasis on my area: Intro to Pyenv for Machine Learning.

Of course, because this is a python packaging solution, it immediately becomes complicated and confusing when you try to interact with the rest of the ecosystem, e.g.,

Attention: This plugin is different from pyenv-virtualenv, which provides extended commands like pyenv virtualenv 3.4.1 project_name to directly help out with managing virtualenvs. pyenv-virtualenvwrapper helps in interacting with virtualenvwrapper, but pyenv-virtualenv provides more convenient commands, where virtualenvs are first-class pyenv versions, that can be (de)activated. That’s to say, pyenv and virtualenvwrapper are still separated while pyenv-virtualenv is a nice combination.

Huh. I am already too bored to think.

However, I did nut out a sequence of commands which installs tensorflow into an isolated pyenv virtualenv:

brew install pyenv pyenv-virtualenv
pyenv install 3.8.6
pyenv virtualenv 3.8.6 tf2.4
pyenv activate tf2.4
pip install --upgrade pip wheel
pip install 'tensorflow-probability>=0.12' 'tensorflow<2.5' jupyter
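The quoted specifiers in that last line are version constraints. As a rough illustration of what 'tensorflow<2.5' and 'tensorflow-probability>=0.12' mean, here is a deliberately naive comparison that only understands dotted integers. Real pip follows the full version-specifier rules (pre-releases, wildcards and so on), which this sketch ignores; parse_version and satisfies are hypothetical names:

```python
import operator

# Comparison operators this toy checker understands
OPERATORS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "==": operator.eq,
}


def parse_version(text):
    """Turn '2.4.1' into (2, 4, 1); dotted integers only."""
    return tuple(int(part) for part in text.split("."))


def satisfies(installed, op, bound):
    """Check a constraint such as installed < bound, padding with zeros."""
    a, b = parse_version(installed), parse_version(bound)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return OPERATORS[op](a, b)


print(satisfies("2.4.1", "<", "2.5"))     # tensorflow<2.5
print(satisfies("0.12.2", ">=", "0.12"))  # tensorflow-probability>=0.12
```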

For fish shell you need to add some special lines to config.fish:

set -x PYENV_ROOT $HOME/.pyenv
set -x PATH $PYENV_ROOT/bin $PATH
## fish <3.1
# status --is-interactive; and . (pyenv init -|psub)
# status --is-interactive; and . (pyenv virtualenv-init -|psub)
## fish >=3.1
status --is-interactive; and pyenv init - | source
status --is-interactive; and pyenv virtualenv-init - | source
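The reason the PATH line matters: the shell runs the first matching executable it finds on the PATH, and pyenv works by getting its own directories in ahead of the system ones. A toy model of that lookup, using made-up directories and fake executables; find_on_path is a hypothetical helper:

```python
import os
import tempfile


def find_on_path(name, path_dirs):
    """Return every executable called name, in PATH order; the first one wins."""
    matches = []
    for directory in path_dirs:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            matches.append(candidate)
    return matches


# Two throwaway directories, each holding a fake "python" executable
first_dir = tempfile.mkdtemp()
second_dir = tempfile.mkdtemp()
for directory in (first_dir, second_dir):
    fake = os.path.join(directory, "python")
    with open(fake, "w") as f:
        f.write("#!/bin/sh\n")
    os.chmod(fake, 0o755)

found = find_on_path("python", [first_dir, second_dir])
print(found[0])  # the copy in first_dir shadows the one in second_dir
```

To inspect a real shell, pass os.environ["PATH"].split(os.pathsep) as path_dirs.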
