Configuring machine learning experiments

October 20, 2021 — May 22, 2024

A dual problem to experiment tracking is experiment configuration: how can I nicely define experiments and the parameters that make them go?

1 Hydra

If you are working in Python, this does more or less everything. See my Hydra page. It is too heavy for some uses, though.

2 Gin

gin-config configures default parameters in a useful way for ML experiments. It is a bit more limited than Hydra, but also lighter.

3 ml-metadata

ML Metadata (MLMD) is a library for recording and retrieving metadata associated with ML developer and data scientist workflows. MLMD is an integral part of TensorFlow Extended (TFX), but is designed so that it can be used independently.

Every run of a production ML pipeline generates metadata containing information about the various pipeline components, their executions (e.g. training runs), and resulting artifacts (e.g. trained models). In the event of unexpected pipeline behavior or errors, this metadata can be leveraged to analyze the lineage of pipeline components and debug issues. Think of this metadata as the equivalent of logging in software development.

MLMD helps you understand and analyze all the interconnected parts of your ML pipeline instead of analyzing them in isolation and can help you answer questions about your ML pipeline such as:

  • Which dataset did the model train on?
  • What were the hyperparameters used to train the model?
  • Which pipeline run created the model?
  • Which training run led to this model?

See MLMD guide.

4 DrWatson.jl

As mentioned under experiment tracking, DrWatson automatically attaches code versions to simulations and does some other work to keep simulations tracked and reproducible. Special feature: it works with Julia, which is my other major language.

5 Configuration.jl

Another Julia entrant.

6 AllenNLP Param

AllenNLP’s Param system is a kind of introductory training-wheels configuration system, but I do not recommend it in practice. It comes with a lot of baggage: installing it will slurp in a large number of fragile and fussy dependencies for language parsing. Once I had used it for a while, I realised all the reasons I would want a better system, which Hydra provides.


Why use an external library for this? I could, of course, roll my own. I have done that a few times. It is a surprisingly large amount of work, though, and remarkably easy to get wrong, and there are perfectly good tools to do it already.
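For a sense of how quickly the home-rolled version grows, here is a minimal standard-library sketch of just one feature: nested defaults plus `key=value` overrides. Everything in it (`ExperimentConfig`, `OptimizerConfig`, `apply_override`, `parse_overrides`) is hypothetical, and it already contains a classic trap, flagged in the comments.

```python
from dataclasses import asdict, dataclass, field, fields


@dataclass
class OptimizerConfig:
    name: str = "adam"
    lr: float = 1e-3


@dataclass
class ExperimentConfig:
    seed: int = 0
    epochs: int = 10
    optimizer: OptimizerConfig = field(default_factory=OptimizerConfig)


def apply_override(cfg, dotted_key, raw_value):
    """Set a possibly nested field from a string such as 'optimizer.lr'."""
    *parents, leaf = dotted_key.split(".")
    target = cfg
    for name in parents:
        target = getattr(target, name)
    if leaf not in {f.name for f in fields(target)}:
        raise KeyError(f"unknown config key: {dotted_key}")
    current = getattr(target, leaf)
    # Coerce the string to the type of the existing default.
    # Pitfall: this is wrong for booleans, since bool("False") is True.
    setattr(target, leaf, type(current)(raw_value))


def parse_overrides(cfg, argv):
    """Apply a list of 'key=value' strings to cfg and return it."""
    for arg in argv:
        key, _, value = arg.partition("=")
        apply_override(cfg, key, value)
    return cfg


cfg = parse_overrides(ExperimentConfig(), ["optimizer.lr=0.01", "epochs=3"])
print(asdict(cfg))
```

This still lacks config files, lists, optional values, validation, sweeps and any record of what was actually run, which is roughly the list of things the libraries above exist to handle.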