Implementing neural nets

October 14, 2016 — January 27, 2023



The internet is full of guides to training neural nets. Here are some selected highlights.

Michael Nielsen has a free online textbook with code examples in Python. Christopher Olah’s visual explanations make many things clear.

Andrej Karpathy’s popular, unromantic, messy guide to training neural nets in practice has a lot of tips that people tend to rediscover the hard way if they do not get them from him. (I did.)

It is allegedly easy to get started with training neural nets. Numerous libraries and frameworks take pride in displaying 30-line miracle snippets that solve your data problems, giving the (false) impression that this stuff is plug and play. … Unfortunately, neural nets are nothing like that. They are not “off-the-shelf” technology the second you deviate slightly from training an ImageNet classifier.

2 Profiling and performance optimisation

3 NN Software

I have used

I could use any of the other autodiff systems, such as…


3.1 Compiled

See edge ml for a discussion of compiled NNs.

4 Tracking experiments

See experiment tracking in ML.

5 Configuring experiments

See configuring experiments; in practice I use hydra for everything.

6 Pre-computed/trained models

7 Managing axes

A lot of the time, managing deep learning code comes down to remembering which axis is which. In practice, I have found that Einstein summation convention meets all my needs.
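As a rough illustration of why Einstein convention helps with axis bookkeeping, here is a minimal sketch using NumPy's `einsum`. The array names, shapes, and axis letters (`b` for batch, `t` for time, `f` for features, `h` for hidden units) are made up for this example. The point is only that each axis is named in the subscript string, so nobody has to remember which positional axis gets contracted.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shapes: a batch of sequences and a dense layer's weights.
batch, time, features, hidden = 2, 5, 4, 3
x = rng.standard_normal((batch, time, features))
w = rng.standard_normal((features, hidden))

# Apply the weights at every time step: contract over the feature axis f.
hidden_states = np.einsum("btf,fh->bth", x, w)

# Pairwise similarity between time steps within each sequence:
# contract over f, keep both time axes (q indexes queries, k indexes keys).
scores = np.einsum("bqf,bkf->bqk", x, x)

# The same operations written positionally, for comparison.
hidden_manual = x @ w
scores_manual = x @ np.swapaxes(x, -1, -2)

print(hidden_states.shape, scores.shape)
```

The positional versions compute the same thing, but the reader has to work out that `swapaxes(x, -1, -2)` swaps the time and feature axes; the `einsum` strings say so directly.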

However, there are alternatives. Alexander Rush argues for NamedTensor. Implementations:

8 Scaling up

See Gradient descent at scale.

9 Incoming