Experiment tracking for machine learning

Experiment tracking, specialised for ML and in particular for neural nets, is its own sub-field these days. This is the nuts-and-bolts end of reproducibility in AI: we want to reproduce results even though model fitting is complicated, computation steps are long and slow, and the code itself keeps changing through a messy development process.

Neptune reviews a few options, including their own product.

Weights and Biases

A more full-featured system for tracking model training.

For my purposes the most handy entry point is their Experiment Tracking.

Track and visualize experiments in real time, compare baselines, and iterate quickly on ML projects

Use the wandb Python library to track machine learning experiments with a few lines of code. If you're using a popular framework like PyTorch or Keras, we have lightweight integrations.

You can then review the results in an interactive dashboard or export your data to Python for programmatic access using our Public API.
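The init → log → review loop that wandb describes can be illustrated with a stdlib-only sketch. This is not wandb's implementation; the `Run` class and the JSON-lines file layout here are invented to show the pattern:

```python
import json
import time
from pathlib import Path

class Run:
    """Minimal stand-in for a tracker run: records the config once,
    then appends one metrics row per step to a JSON-lines file."""

    def __init__(self, project: str, config: dict, root: str = "runs"):
        self.dir = Path(root) / project / str(int(time.time() * 1000))
        self.dir.mkdir(parents=True, exist_ok=True)
        (self.dir / "config.json").write_text(json.dumps(config))
        self._log = (self.dir / "metrics.jsonl").open("a")
        self._step = 0

    def log(self, metrics: dict):
        # Each call appends a timestamped step, like tracker log calls do.
        self._log.write(json.dumps({"step": self._step, **metrics}) + "\n")
        self._step += 1

    def finish(self):
        self._log.close()

# Usage mirrors the usual tracker pattern: init, log in the training loop, finish.
run = Run("demo", config={"lr": 0.01, "epochs": 3})
for epoch in range(3):
    run.log({"loss": 1.0 / (epoch + 1)})
run.finish()
```

Because everything lands in plain files, "export your data for programmatic access" is just reading them back; hosted trackers add the dashboard, comparison, and collaboration layers on top of this kind of record.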

Test Tube

Tracking is one of the strengths of Test Tube:

williamFalcon/test-tube: Python library to easily log experiments and parallelize hyperparameter search for neural networks

It also has some useful parallelism management, especially for HPC systems.

DrWatson

For Julia there is DrWatson, which automatically attaches code versions to simulations and does other bookkeeping to keep simulations tracked and reproducible.


ML Metadata (MLMD)

Is this still current?

ML Metadata (MLMD) is a library for recording and retrieving metadata associated with ML developer and data scientist workflows. MLMD is an integral part of TensorFlow Extended (TFX), but is designed so that it can be used independently.

Every run of a production ML pipeline generates metadata containing information about the various pipeline components, their executions (e.g. training runs), and resulting artifacts (e.g. trained models). In the event of unexpected pipeline behavior or errors, this metadata can be leveraged to analyze the lineage of pipeline components and debug issues. Think of this metadata as the equivalent of logging in software development.

MLMD helps you understand and analyze all the interconnected parts of your ML pipeline instead of analyzing them in isolation. It can help you answer questions about your pipeline such as:

  • Which dataset did the model train on?
  • What were the hyperparameters used to train the model?
  • Which pipeline run created the model?
  • Which training run led to this model?
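These lineage queries amount to walking a graph in which executions link their input artifacts to their output artifacts. A toy sketch of that idea, with a store structure invented purely for illustration (MLMD's real API is richer and proto-based):

```python
# Toy lineage store: each execution records its parameters, inputs, and
# outputs, so "which dataset did the model train on?" is a graph walk.
# (Structure invented for illustration; MLMD's actual API differs.)
executions = [
    {"name": "train-run-7",
     "params": {"lr": 0.01, "batch_size": 32},
     "inputs": ["dataset-v2"],
     "outputs": ["model-v5"]},
    {"name": "eval-run-3",
     "params": {},
     "inputs": ["model-v5", "dataset-v2-test"],
     "outputs": ["eval-report-3"]},
]

def producers(artifact: str):
    """Return the executions whose outputs include the given artifact."""
    return [e for e in executions if artifact in e["outputs"]]

train = producers("model-v5")[0]
print(train["inputs"])   # datasets the model was trained on
print(train["params"])   # hyperparameters used for that run
```

With records like these, answering the four questions above is a matter of following `inputs`/`outputs` edges backwards from the artifact you care about.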


A dual problem to experiment tracking is experiment configuration: how do I even set up those parameters in the first place?
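One common answer is to declare the hyperparameters once, in a single structured object, so the same object both configures the run and gets handed to the tracker for logging. A minimal sketch using a stdlib dataclass (the field names are illustrative):

```python
from dataclasses import dataclass, asdict

@dataclass
class Config:
    # Hyperparameters declared once, in one place, with defaults.
    lr: float = 0.01
    batch_size: int = 32
    epochs: int = 10

cfg = Config(lr=0.001)   # override per experiment
print(asdict(cfg))       # the same dict can be logged by a tracker
```

Tools like Hydra and Sacred elaborate on this idea: config composition, command-line overrides, and automatic capture of the resulting configuration alongside the run.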
