Configuring machine learning experiments

A dual problem to experiment tracking is experiment configuration: how can I nicely define experiments and the parameters which make them go?


Hydra is my go-to. If you are working in Python, it does more or less everything and is the least annoying to work with, IMO. See my Hydra page.


ML Metadata (MLMD) is a library for recording and retrieving metadata associated with ML developer and data scientist workflows. MLMD is an integral part of TensorFlow Extended (TFX), but is designed so that it can be used independently.

Every run of a production ML pipeline generates metadata containing information about the various pipeline components, their executions (e.g. training runs), and resulting artifacts (e.g. trained models). In the event of unexpected pipeline behavior or errors, this metadata can be leveraged to analyze the lineage of pipeline components and debug issues. Think of this metadata as the equivalent of logging in software development.

MLMD helps you understand and analyze all the interconnected parts of your ML pipeline instead of analyzing them in isolation and can help you answer questions about your ML pipeline such as:

  • Which dataset did the model train on?
  • What were the hyperparameters used to train the model?
  • Which pipeline run created the model?
  • Which training run led to this model?
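
To make those questions concrete, here is a minimal sketch of lineage bookkeeping in plain Python. It is not the MLMD API; the class names (`Artifact`, `Execution`, `LineageStore`) and the example run names are all hypothetical, and MLMD's real data model is richer. It only shows the kind of records that let you answer the questions above.

```python
from dataclasses import dataclass, field


@dataclass
class Artifact:
    name: str
    kind: str  # e.g. "dataset" or "model"


@dataclass
class Execution:
    name: str                 # e.g. a training run
    pipeline_run: str         # the pipeline run this execution belongs to
    hyperparameters: dict
    inputs: list = field(default_factory=list)   # names of consumed artifacts
    outputs: list = field(default_factory=list)  # names of produced artifacts


class LineageStore:
    """Toy in-memory store; a hypothetical stand-in, not MLMD itself."""

    def __init__(self):
        self.artifacts = {}
        self.executions = []

    def add_artifact(self, artifact):
        self.artifacts[artifact.name] = artifact

    def record(self, execution):
        self.executions.append(execution)

    def producer_of(self, artifact_name):
        # Which execution (training run) led to this artifact?
        for execution in self.executions:
            if artifact_name in execution.outputs:
                return execution
        raise KeyError(f"no execution produced {artifact_name!r}")

    def training_datasets(self, model_name):
        # Which datasets did the model train on?
        run = self.producer_of(model_name)
        return [n for n in run.inputs if self.artifacts[n].kind == "dataset"]


store = LineageStore()
store.add_artifact(Artifact("reviews-v2", "dataset"))
store.add_artifact(Artifact("sentiment-model", "model"))
store.record(
    Execution(
        name="train-0042",
        pipeline_run="nightly-2024-01-01",
        hyperparameters={"lr": 0.001, "epochs": 10},
        inputs=["reviews-v2"],
        outputs=["sentiment-model"],
    )
)

run = store.producer_of("sentiment-model")
print(run.name, run.pipeline_run, run.hyperparameters)
print(store.training_datasets("sentiment-model"))
```

The point of a real metadata store is that these records are written automatically by the pipeline rather than by hand, as in this toy.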

See MLMD guide.


As mentioned under experiment tracking, DrWatson automatically attaches code versions to simulations and does some other work to keep simulations tracked and reproducible. Special feature: it works with Julia, which is my other major language.

AllenNLP Params

AllenNLP's Params system is a kind of introductory training-wheels configuration system, but I do not recommend it in practice. It comes with a lot of baggage: installing it will slurp in a large number of fragile and fussy dependencies for language parsing. Once I had used it for a while, I realised all the reasons I would want a better system, which Hydra provides.


Why use an external library for this? I could, of course, roll my own. I have done that a few times. It is a surprisingly large amount of work, though, and remarkably easy to get wrong, and there are perfectly good tools that do it already.
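
For a sense of what rolling your own involves, here is a minimal sketch using only the standard library: nested dataclasses for defaults, plus `key=value` overrides with dotted keys. All names (`ExperimentConfig`, `OptimizerConfig`, `apply_override`, `load_config`) are hypothetical and invented for illustration. Even this toy has sharp edges, which is rather the point.

```python
import json
from dataclasses import asdict, dataclass, field, fields, is_dataclass


@dataclass
class OptimizerConfig:
    name: str = "adam"
    lr: float = 1e-3


@dataclass
class ExperimentConfig:
    seed: int = 0
    epochs: int = 10
    optimizer: OptimizerConfig = field(default_factory=OptimizerConfig)


def apply_override(cfg, override):
    """Apply one 'dotted.key=value' override in place."""
    key, sep, raw = override.partition("=")
    if not sep:
        raise ValueError(f"expected key=value, got {override!r}")
    *parents, leaf = key.split(".")
    target = cfg
    for name in parents:
        if not hasattr(target, name):
            raise KeyError(f"unknown config key: {key}")
        target = getattr(target, name)
    if not is_dataclass(target) or leaf not in {f.name for f in fields(target)}:
        raise KeyError(f"unknown config key: {key}")
    current = getattr(target, leaf)
    if is_dataclass(current):
        raise KeyError(f"{key} is a config group, not a value")
    # Coerce the string to the type of the current default value.
    # Known gap: bool("false") is True, so boolean fields would need
    # special handling that this sketch does not have.
    setattr(target, leaf, type(current)(raw))


def load_config(overrides):
    cfg = ExperimentConfig()
    for override in overrides:
        apply_override(cfg, override)
    return cfg


if __name__ == "__main__":
    cfg = load_config(["epochs=50", "optimizer.lr=0.01"])
    print(json.dumps(asdict(cfg), sort_keys=True))
```

This still lacks config files, composition of config groups, parameter sweeps, validation beyond type coercion, and per-run output directories, which is roughly the list of things a dedicated tool takes off your hands.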
