Function space versus weight space in Neural Nets
2024-10-15 — 2024-10-15
On the tension between representing what a neural network does in function space versus in weight space. We “see” the outputs of neural networks as functions, but those functions are generated by some inscrutable parameterization in terms of weights, which is more abstruse yet more tractable to learn in practice. Why might that be?
When we can learn directly in function space, many things work better in various senses (see, e.g., GP regression), yet such methods rarely dominate in messy practice. Why not? When can we operate in function space at all? Sometimes we really want to, e.g. in operator learning. And how can we translate between the two representations?
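To make the contrast concrete, here is a minimal numpy sketch (illustrative only; all names, hyperparameters, and the toy data are my own choices, not from the note) fitting the same 1-D regression problem two ways: GP regression operates directly in function space via a kernel and a single linear solve, while a small neural net parameterizes a function through weights and has to reach it by gradient descent.

```python
# Function-space vs weight-space learning on a toy 1-D regression problem.
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(20, 1))             # training inputs
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=20)  # noisy targets
Xs = np.linspace(-3, 3, 200)[:, None]            # test grid

# --- Function space: GP regression with an RBF kernel (closed form) --------
def rbf(A, B, ell=1.0, sigma=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return sigma**2 * np.exp(-0.5 * d2 / ell**2)

noise = 0.1**2
K = rbf(X, X) + noise * np.eye(len(X))
Ks = rbf(Xs, X)
gp_mean = Ks @ np.linalg.solve(K, y)             # posterior mean, one solve

# --- Weight space: one-hidden-layer tanh net trained by gradient descent ---
H = 50
W1 = rng.normal(size=(1, H)); b1 = np.zeros(H)
W2 = rng.normal(size=(H, 1)) / np.sqrt(H); b2 = np.zeros(1)

def forward(x):
    h = np.tanh(x @ W1 + b1)
    return h, (h @ W2 + b2)[:, 0]

lr = 1e-2
for _ in range(5000):
    h, pred = forward(X)
    err = pred - y                               # dL/dpred for 0.5 * MSE
    gW2 = h.T @ err[:, None] / len(X)
    gb2 = err.mean(keepdims=True)
    dh = err[:, None] * W2.T * (1 - h**2)        # backprop through tanh
    gW1 = X.T @ dh / len(X)
    gb1 = dh.mean(0)
    W1 -= lr * gW1; b1 -= lr * gb1; W2 -= lr * gW2; b2 -= lr * gb2

_, nn_mean = forward(Xs)
print("max |GP mean - NN prediction| on test grid:", np.abs(gp_mean - nn_mean).max())
```

The GP gets its predictor in one linear-algebra step but pays cubically in the number of data points; the net pays per optimization step instead, which is part of why weight-space methods scale to messy practice even though they are harder to reason about.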
Singular learning theory is very interested in connecting weight-space optima with the functions a neural net actually learns.
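One reason that connection is nontrivial is that the weight-to-function map is many-to-one. A tiny numpy check (my own illustration, not from the note, assuming a one-hidden-layer tanh net): permuting hidden units, or flipping the sign of a tanh unit together with its outgoing weight, changes the weight vector but not the function computed.

```python
# Two weight-space symmetries that leave the computed function unchanged.
import numpy as np

rng = np.random.default_rng(1)
H = 8
W1, b1 = rng.normal(size=(1, H)), rng.normal(size=H)
W2, b2 = rng.normal(size=(H, 1)), rng.normal(size=1)

def f(x, W1, b1, W2, b2):
    return np.tanh(x @ W1 + b1) @ W2 + b2

x = rng.normal(size=(100, 1))

# Symmetry 1: permute the hidden units (and their outgoing weights).
perm = rng.permutation(H)
W1p, b1p, W2p = W1[:, perm], b1[perm], W2[perm, :]

# Symmetry 2: flip the sign of one unit and its outgoing weight (tanh is odd).
W1f, b1f, W2f = W1.copy(), b1.copy(), W2.copy()
W1f[:, 0] *= -1; b1f[0] *= -1; W2f[0, :] *= -1

print(np.allclose(f(x, W1, b1, W2, b2), f(x, W1p, b1p, W2p, b2)))  # True
print(np.allclose(f(x, W1, b1, W2, b2), f(x, W1f, b1f, W2f, b2)))  # True
```

These degeneracies (and the continuous ones that arise at non-generic weights) are exactly the kind of structure singular learning theory tries to account for when relating weight-space geometry to learned functions.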
See also low rank GPs, partially Bayes NNs, neural tangent kernels, functional regression, functional inverse problems, overparameterization, wide limits of NNs…