submitit
A less horrible way to parallelize on HPC clusters
October 9, 2021 — December 4, 2024
My current go-to option for Python HPC jobs. If anyone has ever told you to run some code by creating a bash script full of esoteric cruft like #SBATCH -N 10, then you have experienced how user-hostile classic compute clusters are. The wading-through-punctuation-swamp experience of submitting jobs via vanilla SLURM or TORQUE is slow and unwieldy.
submitit wraps all that nonsense up tidily and keeps the cluster compute workflow uncluttered. This makes executing code on the cluster feel much more like classic parallel code execution in Python, which is to say, annoying-but-tolerable.
1 Examples
submitit looks like this in basic form:
import submitit

def add(a, b):
    return a + b

# executor is the submission interface
executor = submitit.AutoExecutor(folder="log_test")
# set timeout in min, and partition for running the job
executor.update_parameters(
    timeout_min=1,
    slurm_partition="dev",
)
job = executor.submit(add, 5, 7)  # will compute add(5, 7)
print(job.job_id)  # ID of your job
output = job.result()  # waits for completion and returns output
assert output == 12  # 5 + 7 = 12... your addition was computed in the cluster
The docs could be better; mostly, work it out by reading the examples.
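For instance, the README demonstrates map_array, which fans one function over many argument sets; as far as I can tell it batches them into a single SLURM job array, which is kinder to the scheduler than a loop of submit calls:

import submitit

def add(a, b):
    return a + b

executor = submitit.AutoExecutor(folder="log_test")
executor.update_parameters(timeout_min=1, slurm_partition="dev")

# arguments are zipped element-wise: this queues 4 jobs as one job array
a = [1, 2, 3, 4]
b = [10, 20, 30, 40]
jobs = executor.map_array(add, a, b)
print([job.result() for job in jobs])  # [11, 22, 33, 44]

If your cluster admins frown on queue-flooding, I believe slurm_array_parallelism caps how many of those array jobs run at once.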
Here is a more advanced pattern I use when running a bunch of experiments via submitit:
import bz2

import cloudpickle
import submitit

job_name = "my_cool_job"

def expensive_calc(a):
    return a + 1

# executor is the submission interface (logs are dumped in the folder)
executor = submitit.AutoExecutor(folder="log_test")
# set timeout in min, and partition for running the job
executor.update_parameters(
    timeout_min=1,
    slurm_partition="dev",  # misc slurm args
    tasks_per_node=4,  # careful: this means 4 tasks per job, not 4 cores (see Gotchas)
    mem_gb=8,  # memory in GB
)

# submit 10 jobs:
jobs = [executor.submit(expensive_calc, num) for num in range(10)]

# but actually maybe we want to come back to those jobs later, so let’s save them to disk
with bz2.open(job_name + ".job.pkl.bz2", "wb") as f:
    cloudpickle.dump(jobs, f)

# We can quit this python session and do something else now

# Resume after quitting
with bz2.open(job_name + ".job.pkl.bz2", "rb") as f:
    jobs = cloudpickle.load(f)

# wait for all jobs to finish
[job.wait() for job in jobs]

res_list = []
# # Alternatively, append these results to a previous run
# with bz2.open(job_name + ".result.pkl.bz2", "rb") as f:
#     res_list.extend(cloudpickle.load(f))

# Examine job outputs for failures etc
fail_ids = [job.job_id for job in jobs if job.state not in ('DONE', 'COMPLETED')]
res_list.extend([job.result() for job in jobs if job.job_id not in fail_ids])
failures = [job for job in jobs if job.job_id in fail_ids]
if failures:
    print("failures")
    print("===")
    for job in failures:
        print(job.state, job.stderr())

# Save results to disk
with bz2.open(job_name + ".result.pkl.bz2", "wb") as f:
    cloudpickle.dump(res_list, f)
Cool feature: the spawning script only needs to survive as long as it takes to put jobs on the queue, and then it can die. Later on we can reload those jobs from disk as if it all happened in the same session.
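To make that concrete, here is a sketch of a quick, non-blocking status check from a brand-new session, assuming (as I read the API) that job.done() returns immediately rather than waiting the way job.result() does:

import bz2

import cloudpickle

# reload the job handles saved by the earlier session
with bz2.open("my_cool_job.job.pkl.bz2", "rb") as f:
    jobs = cloudpickle.load(f)

# job.done() just checks state; it does not block like job.result()
pending = [job.job_id for job in jobs if not job.done()]
print(f"{len(jobs) - len(pending)} finished, {len(pending)} still queued or running")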
2 Gotchas
2.1 tasks per node
If I set tasks_per_node to 4 and submit 10 jobs, I get 40 jobs running on the cluster. This was unexpected to me, but see the examples.
I am not sure what the use case for this is. Forking multiple tasks with the same argument is maybe slightly quicker to transmit, but requires more free CPUs to be allocated, and breaks the tidiness of the calling convention. I cannot think of a circumstance where this would be a net win.
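A minimal sketch of the behaviour, assuming submitit.JobEnvironment and job.results() work as in the library's multi-task examples; each submission below produces one SLURM job containing four copies of the function, distinguishable only by their rank:

import submitit

def whoami(tag):
    # inside the job, JobEnvironment reports which task this copy is
    env = submitit.JobEnvironment()
    return f"{tag}: task {env.global_rank} of {env.num_tasks}"

executor = submitit.AutoExecutor(folder="log_test")
executor.update_parameters(timeout_min=1, slurm_partition="dev", tasks_per_node=4)

job = executor.submit(whoami, "probe")  # one submit call...
print(job.results())  # ...but four results, one per task, all with the same argument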
2.2 DebugExecutor considered harmful
When developing a function using submitit, it might seem convenient to reach for DebugExecutor to work out why things are crashing. But DebugExecutor has radically different semantics masquerading as the same interface. Code runs sequentially in-process, so DebugExecutor is blocking, and you can expect asynchronous code to behave differently when run in DebugExecutor than in AutoExecutor.
If we don’t want to debug concurrency problems, but just the function itself, it might be easier to execute the function in the normal way. executor = DebugExecutor(); executor.submit(func, *args, **kwargs) does not get us much that we do not already get from running func(*args, **kwargs) directly, except a drop-in replacement for submitit.AutoExecutor.
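If the drop-in property is all we want, a toggle like this sketch is about the extent of the payoff (assuming DebugExecutor takes the same folder argument as the other executors, as it does in the versions I have used):

import submitit

def expensive_calc(a):
    return a + 1

DEBUG = True  # flip to False to submit to the cluster instead

if DEBUG:
    # runs in-process and blocking
    executor = submitit.DebugExecutor(folder="log_test")
else:
    executor = submitit.AutoExecutor(folder="log_test")
    executor.update_parameters(timeout_min=1, slurm_partition="dev")

job = executor.submit(expensive_calc, 41)  # identical call site either way
print(job.result())  # 42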
What DebugExecutor does get us is one small extra gift that I did not ask for and do not relish: it always invokes pdb or ipdb, the interactive Python debuggers, upon errors. As such, DebugExecutor cannot be run from VS Code, because VS Code does not support those interactive debuggers. I dislike this feature strongly; even though I love debuggers, I find it best to invoke them myself, thank you very much.
2.3 Weird argument handling
submitit believes that the slurm_mem argument requisitions memory in GB, but slurm interprets it as MB by default (see --mem). There is an alternate argument, mem_gb, which does allocate GB of memory.
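A minimal sketch of the unambiguous route, contrasted with the trap described above:

import submitit

executor = submitit.AutoExecutor(folder="log_test")
# unambiguous: requests 8 GB of memory
executor.update_parameters(mem_gb=8)
# risky: per the above, slurm may read this as 8 MB rather than 8 GB
# executor.update_parameters(slurm_mem=8)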
3 hydra
Pro tip: submitit integrates with hydra.
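For example, with the hydra-submitit-launcher plugin installed, a hydra app can be pushed onto the cluster from the command line. This is a sketch assuming a recent hydra version; the launcher override shown is the plugin's documented one as far as I recall:

import hydra
from omegaconf import DictConfig

@hydra.main(version_base=None, config_path=None, config_name=None)
def main(cfg: DictConfig) -> None:
    # whatever the experiment does, configured by hydra overrides
    print(cfg)

if __name__ == "__main__":
    main()

# then launch sweeps on the cluster with something like:
#   pip install hydra-submitit-launcher
#   python my_app.py hydra/launcher=submitit_slurm --multirun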