Tensorboard is the de facto standard debugging/tracking tool. It is easy-ish to install, hard to modify, and works well enough (but not on our internal cluster). I recommend reading Li Yin’s explanation. It looks like it is closely coupled to tensorflow, because it used to be, but these days it may be installed kinda-sorta separately. I tend to find it slightly annoying; it clutters up with crap pretty easily. The flip side is a strength: since it works by writing files to the filesystem, it can get around the horrible network lockdowns that we often experience in HPC hell. The torch manual shows a worked example, but the best walk-through IMO is Derek Mwiti’s TensorBoard Tutorial for neptune.ai, which is regularly updated for new technology and across platforms.
There are two parts to using tensorboard:
- Writing data that tensorboard can read.
- Running tensorboard to visualize the data.
Part 1 looks like this in pytorch.
```python
# torch
import datetime
import os

from torch.utils.tensorboard import SummaryWriter

LOG_DIR = "debug/"
# One timestamped subdirectory per run, so runs stay separate in tensorboard
SUBLOG_PATH = os.path.join(
    LOG_DIR, datetime.datetime.now().strftime("%Y%m%d-%H%M%S"))
writer = SummaryWriter(log_dir=SUBLOG_PATH)

# net, data, trainloader, inputs, labels, running_loss, epoch, i and
# plot_classes_preds all come from the surrounding training loop (not shown)
writer.add_graph(net, data)

# ...log the running loss
writer.add_scalar(
    'training loss',
    running_loss / 1000,
    epoch * len(trainloader) + i)

# ...log a Matplotlib Figure showing the model's predictions on a
# random mini-batch
writer.add_figure(
    'predictions vs. actuals',
    plot_classes_preds(net, inputs, labels),
    global_step=epoch * len(trainloader) + i)
writer.close()
```
See also torch.utils.tensorboard API Docs. There is lots of cool stuff that can be logged, like 3D meshes and audio files.
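The snippet above leans on a training loop that is not shown. For a self-contained taste of the API, here is a minimal sketch that logs a made-up loss curve, a histogram of random "weights" and a short sine tone. The `write_demo_run` helper, the tag names and all the numbers are invented for illustration, and it assumes `torch` and `tensorboard` are both installed.

```python
import datetime
import glob
import math
import os
import tempfile

import torch
from torch.utils.tensorboard import SummaryWriter


def write_demo_run(log_root):
    """Write fake scalars, histograms and one audio clip; return the run dir."""
    run_dir = os.path.join(
        log_root, datetime.datetime.now().strftime("%Y%m%d-%H%M%S"))
    writer = SummaryWriter(log_dir=run_dir)
    for step in range(20):
        # Fake, exponentially decaying "loss"
        writer.add_scalar("demo/loss", math.exp(-0.2 * step), step)
        # Random values standing in for a layer's weights
        writer.add_histogram("demo/weights", torch.randn(1000), step)
    # One second of a 440 Hz sine; add_audio takes shape (1, L), values in [-1, 1]
    sample_rate = 8000
    t = torch.arange(sample_rate) / sample_rate
    tone = torch.sin(2 * math.pi * 440.0 * t).unsqueeze(0)
    writer.add_audio("demo/tone", tone, global_step=0, sample_rate=sample_rate)
    writer.close()
    return run_dir


# Write into a throwaway directory and list the event file that appears
run_dir = write_demo_run(tempfile.mkdtemp())
event_files = glob.glob(os.path.join(run_dir, "events.out.tfevents.*"))
print(run_dir, event_files)
```

Everything lands in a single `events.out.tfevents.*` file inside the run directory, which is the file-based design that makes tensorboard usable behind locked-down networks.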
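Part 2, minimally, is running `tensorboard --logdir debug/` from the command line and opening the URL it prints (http://localhost:6006/ by default).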
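If you would rather start the server from a Python script than from a shell, a thin wrapper around the `tensorboard --logdir` command line is enough. The sketch below is one way to do it under a few assumptions: the helper names `tensorboard_command` and `launch_tensorboard` are made up here, the `tensorboard` executable is expected on the `PATH`, and the log directory is made absolute so the server does not depend on the current working directory.

```python
import os
import shutil
import subprocess


def tensorboard_command(log_dir, port=6006):
    """Build the argv list for `tensorboard --logdir <log_dir> --port <port>`."""
    return ["tensorboard", "--logdir", os.path.abspath(log_dir),
            "--port", str(port)]


def launch_tensorboard(log_dir, port=6006):
    """Start tensorboard in the background; return the Popen handle, or None
    if no tensorboard executable is found on the PATH."""
    if shutil.which("tensorboard") is None:
        return None
    return subprocess.Popen(tensorboard_command(log_dir, port))


# Only build and show the command here; call launch_tensorboard("debug/")
# to actually start the server, and .terminate() the handle when done.
cmd = tensorboard_command("debug/")
print(" ".join(cmd))
```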
Supposedly we can run tensorboard part 2 inside VS Code too. This has never worked for me; the tensorboard instance never sees any log data and just sits there grinning bashfully. I cannot find anyone else on the internet who has experienced this problem despite much searching, so I gave up. Maybe Using TensorBoard in Notebooks works better? Running it from the command line is fine.
Handy trick: Projector visualises embeddings:
TensorBoard has a built-in visualizer, called the Embedding Projector, for interactive visualization and analysis of high-dimensional data like embeddings. It is meant to be useful for developers and researchers alike. It reads from the checkpoint files where you save your tensorflow variables. Although it’s most useful for embeddings, it will load any 2D tensor, potentially including your training weights.
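The quote above describes the tensorflow checkpoint route. From pytorch, the equivalent entry point is, as far as I can tell, `SummaryWriter.add_embedding`, which writes any 2D tensor plus optional per-row labels in the format the Projector reads. A minimal sketch with fake data follows; the two clusters, the label strings and the tag name are all invented for illustration, and it assumes `torch` and `tensorboard` are installed.

```python
import os
import tempfile

import torch
from torch.utils.tensorboard import SummaryWriter

log_dir = tempfile.mkdtemp()
writer = SummaryWriter(log_dir=log_dir)

# 100 fake points in 16 dimensions, in two offset clusters
features = torch.cat([torch.randn(50, 16), torch.randn(50, 16) + 3.0])
labels = ["cluster_a"] * 50 + ["cluster_b"] * 50

# mat must be 2D (N, D); metadata supplies one label per row
writer.add_embedding(features, metadata=labels, tag="demo_embedding")
writer.close()

# The projector config and the tensor/metadata files now live under log_dir
print(sorted(os.listdir(log_dir)))
```

Pointing tensorboard at `log_dir` should then offer the embedding in the Projector tab, where the labels can be used to colour the points.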
Data from tensorboard experiments may be loaded back into python as a dataframe.
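One local route, offered as an assumption on my part rather than the canonical method, is tensorboard's own `EventAccumulator`, which parses the event files in a run directory. The sketch below writes a tiny fake run and reads its scalars back into a long-format pandas DataFrame; the `scalars_to_dataframe` helper and the `demo/loss` tag are hypothetical names, and it assumes `torch`, `tensorboard` and `pandas` are installed.

```python
import tempfile

import pandas as pd
from tensorboard.backend.event_processing.event_accumulator import (
    EventAccumulator,
)
from torch.utils.tensorboard import SummaryWriter


def scalars_to_dataframe(run_dir):
    """Read every scalar tag in one run directory into a long DataFrame."""
    acc = EventAccumulator(run_dir)
    acc.Reload()  # actually parse the event files on disk
    rows = [
        {"tag": tag, "step": ev.step,
         "wall_time": ev.wall_time, "value": ev.value}
        for tag in acc.Tags()["scalars"]
        for ev in acc.Scalars(tag)
    ]
    return pd.DataFrame(rows, columns=["tag", "step", "wall_time", "value"])


# Write a tiny fake run so there is something to read back
run_dir = tempfile.mkdtemp()
writer = SummaryWriter(log_dir=run_dir)
for step in range(4):
    writer.add_scalar("demo/loss", 1.0 / 2 ** step, step)
writer.close()

df = scalars_to_dataframe(run_dir)
print(df[["tag", "step", "value"]])
```

The long format (one row per tag and step) pivots easily with `df.pivot(index="step", columns="tag", values="value")` when several metrics are logged.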
There are also community options.