Python cluster computing

Parallel computing, wherein a head process spawns workers executing some python function

August 23, 2016 — October 23, 2022

computers are awful
concurrency hell
distributed
number crunching
premature optimization
workflow

Various cluster options.

1 SSH tunnel management

Not a cluster thing per se, but I need to do it from time to time:
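For instance, the third-party sshtunnel package forwards a remote port to a local one from inside python. A minimal sketch; the hostname, username and ports are placeholders:

    from sshtunnel import SSHTunnelForwarder

    # Forward localhost:8888 on the remote host to localhost:8888 here,
    # e.g. to reach a jupyter server running on a cluster head node.
    with SSHTunnelForwarder(
        'head-node.example.com',
        ssh_username='me',
        remote_bind_address=('127.0.0.1', 8888),
        local_bind_address=('127.0.0.1', 8888),
    ) as tunnel:
        print('tunnel open on local port', tunnel.local_bind_port)
        # ... talk to the remote service via localhost:8888 ...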

2 Pathos

Pathos

[…] is a framework for heterogeneous computing. It primarily provides the communication mechanisms for configuring and launching parallel computations across heterogeneous resources. Pathos provides stagers and launchers for parallel and distributed computing, where each launcher contains the syntactic logic to configure and launch jobs in an execution environment. Some examples of included launchers are: a queue-less MPI-based launcher, an ssh-based launcher, and a multiprocessing launcher. Pathos also provides a map-reduce algorithm for each of the available launchers, thus greatly lowering the barrier for users to extend their code to parallel and distributed resources. Pathos provides the ability to interact with batch schedulers and queuing systems, thus allowing large computations to be easily launched on high-performance computing resources.

It integrates well with your jupyter notebook, which is the main thing; but, much like jupyter notebooks themselves, you are on your own when it comes to reproducibility, and might want to use it in concert with one of the other solutions here to achieve that.
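For basic use, pathos mirrors the stdlib multiprocessing pool API. A minimal sketch, assuming pathos is installed; the pool size of 4 is an arbitrary choice:

    from pathos.multiprocessing import ProcessingPool

    # pathos serializes with dill, so lambdas and closures also work,
    # unlike stdlib multiprocessing.
    def square(x):
        return x ** 2

    pool = ProcessingPool(nodes=4)       # 4 worker processes
    print(pool.map(square, range(10)))   # [0, 1, 4, 9, ...]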

3 joblib

See the joblib/dask.distributed section below.

4 Dask

dask seems to parallelize certain python tasks well and claims to scale up elastically. It’s purely for python.
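A minimal sketch of the dask.delayed idiom; the functions here are arbitrary placeholders:

    from dask import delayed

    def inc(x):
        return x + 1

    # Build a lazy task graph; nothing runs until .compute() is called.
    parts = [delayed(inc)(i) for i in range(10)]
    total = delayed(sum)(parts)

    print(total.compute())  # executes the graph on a local scheduler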

5 Dispy

dispy (HT cpill) seems to be a python solution if I have a mess of machines lying around to borrow.

dispy is a comprehensive, yet easy to use framework for creating and using compute clusters to execute computations in parallel across multiple processors in a single machine (SMP), among many machines in a cluster, grid or cloud. dispy is well suited for data parallel (SIMD) paradigm where a computation (Python function or standalone program) is evaluated with different (large) datasets independently with no communication among computation tasks (except for computation tasks sending Provisional/Intermediate Results or Transferring Files to the client).
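The canonical usage pattern looks roughly like this. A sketch only; node discovery and networking details are elided, and the computation is a placeholder:

    import dispy

    # The computation shipped to each node; imports go inside the function
    # because it runs in a fresh process on the remote machine.
    def compute(n):
        import time
        time.sleep(n)
        return n * n

    cluster = dispy.JobCluster(compute)           # finds nodes on the LAN by default
    jobs = [cluster.submit(i) for i in range(4)]
    for job in jobs:
        print(job())                              # blocks until that job's result arrives
    cluster.close()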

6 ipython native

IPython spawning overview: ipyparallel is the built-in jupyter option, less pluggable than the alternatives but easy to get going.
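A minimal sketch, assuming a local cluster has already been started with `ipcluster start -n 4`:

    import ipyparallel as ipp

    rc = ipp.Client()                # connect to the running ipcluster
    view = rc.load_balanced_view()   # dynamic load balancing across engines

    def square(x):
        return x ** 2

    print(view.map_sync(square, range(10)))  # [0, 1, 4, 9, ...]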

7 joblib/dask.distributed

joblib is a simple python scientific computing library with basic map-reduce and some nice caching that integrate well together. Not fancy, but super easy, which is what an academic usually wants, since fancy would imply we have a personnel budget.

>>> from math import sqrt
>>> from joblib import Parallel, delayed
>>> Parallel(n_jobs=2)(delayed(sqrt)(i ** 2) for i in range(10))
[0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]

dask.distributed is a similar project which expands on joblib to handle networked computer clusters, and also does load management even without a cluster. In fact, it integrates with joblib, as shown below.
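For instance, recent joblib versions can hand their work to a dask.distributed scheduler via the dask backend. A sketch; the local Client() here stands in for a connection to a real cluster:

    from math import sqrt

    from dask.distributed import Client
    from joblib import Parallel, delayed, parallel_backend

    client = Client()  # local scheduler and workers; point at a real cluster instead

    with parallel_backend('dask'):
        # Same joblib code as above, now dispatched to dask workers.
        print(Parallel()(delayed(sqrt)(i ** 2) for i in range(10)))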

8 pytorch

has special needs. See pytorch distributed.