Implementing neural nets


The internet is full of guides to training neural nets. Here are some selected highlights.

Michael Nielsen has a free online textbook with code examples in Python. Christopher Olah’s visual explanations make many things clear.

Andrej Karpathy’s popular, unromantic, messy guide to training neural nets in practice has a lot of tips that people tend to rediscover the hard way if they do not get them from him. (I did.)

    It is allegedly easy to get started with training neural nets. Numerous libraries and frameworks take pride in displaying 30-line miracle snippets that solve your data problems, giving the (false) impression that this stuff is plug and play. … Unfortunately, neural nets are nothing like that. They are not “off-the-shelf” technology the second you deviate slightly from training an ImageNet classifier.

NN Software

I have used

I could use any of the other autodiff systems, such as…

  • Intel’s ngraph, which compiles neural nets especially for CPUs
  • Collaboratively build, visualize, and design neural nets in the browser
  • Python: Theano (now defunct) was a trailblazer
  • Lua: Torch (in practice deprecated in favour of pytorch)
  • MATLAB/Python: Caffe claims to be a “de facto standard”
  • Python/C++: Paddlepaddle is Baidu’s NN machine
  • Minimalist C++: tiny-dnn is a C++11 implementation of deep learning. It is suitable for deep learning on limited-compute, embedded systems and IoT devices.
  • JavaScript: see javascript machine learning
  • Julia: various options

Tracking experiments

See experiment tracking in ML.

Monitoring progress


NB: I last configured tensorboard manually in 2019; this section may be out of date.

Tensorboard is a de facto debugging tool standard. I recommend reading Li Yin’s explanation.
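Tensorboard treats each subdirectory of its log directory as a separately named run. A minimal sketch of the folder convention the commands below rely on (the `.logs` suffix is just this page’s habit, and the run names are hypothetical):

```python
import tempfile
from pathlib import Path

# Hypothetical runs; each one gets its own folder of event files,
# and tensorboard will plot each folder as its own curve.
root = Path(tempfile.mkdtemp())
for run in ("baseline", "bigger-lr"):
    (root / f"{run}.logs").mkdir()

print(sorted(d.name for d in root.glob("*.logs")))
# ['baseline.logs', 'bigger-lr.logs']
```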


tensorboard --logdir=path/to/log-directory

or, more usually,

tensorboard --logdir=name1:/path/to/logs/1,name2:/path/to/logs/2 --host=localhost

or, lazily, (bash)

tensorboard --logdir=$(ls -dm *.logs |tr -d ' \n\r') --host=localhost
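For illustration, here is what that command substitution expands to, using two hypothetical demo folders: `ls -m` emits a comma-and-space separated list, and `tr` strips the spaces and newlines.

```shell
# Hypothetical demo folders, just to show the expansion.
dir=$(mktemp -d)
mkdir -p "$dir/a.logs" "$dir/b.logs"
cd "$dir"
logdirs=$(ls -dm *.logs | tr -d ' \n\r')
echo "$logdirs"   # a.logs,b.logs
```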


or, in fish:

tensorboard --logdir=(string join , (for f in *.logs; echo (basename $f .logs):$f; end)) --host=localhost

In fact, that sometimes does not work so well for me. Tensorboard reeeeally wants you to specify your folder names explicitly.

#!/usr/bin/env python3

from pathlib import Path
from subprocess import run
import sys

p = Path('./')

logdirstring = '--logdir=' + ','.join([
  str(d)[:-5] + ":" + str(d)  # strip the '.logs' suffix to name each run
  for d in p.glob('*.logs')
])

proc = run(
  ['tensorboard', logdirstring, '--host=localhost'] + sys.argv[1:]
)
  • Projector visualises embeddings:

    TensorBoard has a built-in visualizer, called the Embedding Projector, for interactive visualization and analysis of high-dimensional data like embeddings. It is meant to be useful for developers and researchers alike. It reads from the checkpoint files where you save your tensorflow variables. Although it’s most useful for embeddings, it will load any 2D tensor, potentially including your training weights.

Weights and biases

Weights and Biases is a more full-featured model-training-tracking system; it also includes interesting visualisations.

Pre-computed/trained models

Managing axes

In practice, a lot of the work of managing deep learning code comes down to remembering which axis is which.

Alexander Rush argues you want a NamedTensor. Implementations:

I have found Einops to solve all my needs in practice.
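For a sense of what that buys you, here is roughly what two common einops patterns do, sketched in plain NumPy (the einops calls are shown in comments; the shapes are hypothetical):

```python
import numpy as np

x = np.arange(8 * 3 * 4 * 4).reshape(8, 3, 4, 4)  # (batch, channel, height, width)

# einops: rearrange(x, 'b c h w -> b (c h w)') -- flatten each example
flat = x.reshape(x.shape[0], -1)

# einops: rearrange(x, 'b c h w -> b h w c') -- channels-last layout
channels_last = x.transpose(0, 2, 3, 1)

print(flat.shape)           # (8, 48)
print(channels_last.shape)  # (8, 4, 4, 3)
```

The named axis patterns document intent in a way that a bare `transpose(0, 2, 3, 1)` does not, which is the whole point.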

Scaling up

See Gradient descent at scale.
