Neural networks incorporating basis decompositions.
Why might you want to do this? For one, it is a different lens through which to analyze neural nets’ mysterious success. For another, it gives you interpolation for free. There are possibly other reasons - perhaps the right basis gives you better priors for understanding a partial differential equation? Or something else?
Neural networks with continuous basis functions
Closer to my own interests: Can I learn neural networks which are grid-free, i.e. which can be resampled? Can I use continuous bases in the computation of a neural net? This is very useful in things like learning PDEs. The virtue of these approaches is that they do not depend (much?) upon the scale of some grid. Possibly this naturally leads to us being able to sample the problem very sparsely. It also might allow us to interpolate sparse solutions. In addition, analytic basis functions are easy to differentiate; we can use autodiff to find their local gradients, even deep ones.
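To make the resampling-plus-autodiff point concrete, here is a minimal sketch (the JAX choice and all names are mine, not anything canonical): a function stored as coefficients against an analytic basis can be evaluated at arbitrary off-grid points and differentiated exactly through the basis.

```python
# Minimal sketch: a function represented in a fixed sinusoidal basis can be
# resampled anywhere and differentiated with autodiff.
import jax
import jax.numpy as jnp

def basis(x, n_terms=8):
    """Evaluate sine/cosine basis functions at a scalar location x in [0, 1]."""
    k = jnp.arange(1, n_terms + 1)
    return jnp.concatenate([jnp.sin(2 * jnp.pi * k * x),
                            jnp.cos(2 * jnp.pi * k * x)])

def f(x, coeffs):
    """Grid-free function: an inner product of coefficients with basis values."""
    return jnp.dot(coeffs, basis(x))

coeffs = jax.random.normal(jax.random.PRNGKey(0), (16,))
# Resample at arbitrary (non-grid) points...
xs = jnp.array([0.11, 0.37, 0.9])
values = jax.vmap(f, in_axes=(0, None))(xs, coeffs)
# ...and take exact local gradients through the analytic basis.
dfdx = jax.vmap(jax.grad(f), in_axes=(0, None))(xs, coeffs)
```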
There are various other ways to do native interpolation; one hack uses the implicit representation method, which is clever, but not plausible for my purposes, where something better behaved, like a basis-function representation, is more helpful.
Specifically, I would like to do Bayesian inference, which looks extremely hard through an implicit net but only very hard through a basis decomposition.
In practice, how would I do this?
Using a well-known basis, such as an orthogonal polynomial or Fourier basis, it is easy to create a layer which encodes your input in that basis. After all, that is just an inner product. That is what methods like that of Li et al. (2020) exploit.
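Concretely, the encoding step is nothing more than inner products with the basis functions. A toy sketch (my own naming, a crude quadrature, and dense matrices where an efficient implementation such as Li et al.'s would use an FFT):

```python
# Sketch of a basis-projection "layer": project a sampled signal onto a
# truncated Fourier basis (an inner product), act linearly on the
# coefficients, and synthesize back. A toy, not Li et al.'s actual code.
import jax.numpy as jnp

def make_fourier_basis(grid, n_modes):
    """Rows are basis functions evaluated on the sample grid; shape (2*n_modes, n)."""
    k = jnp.arange(1, n_modes + 1)[:, None]
    return jnp.concatenate([jnp.sin(2 * jnp.pi * k * grid[None, :]),
                            jnp.cos(2 * jnp.pi * k * grid[None, :])])

def basis_layer(signal, grid, weights, n_modes=8):
    """Encode by inner products with the basis, transform, decode on the same grid.

    `weights` is a learned (2*n_modes, 2*n_modes) mixing matrix.
    """
    B = make_fourier_basis(grid, n_modes)              # (2*n_modes, n)
    coeffs = B @ signal * (grid[1] - grid[0]) * 2.0    # quadrature-ish inner products
    coeffs = weights @ coeffs                          # learned mixing of modes
    return B.T @ coeffs                                # back to the sample grid
```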
I would probably not attempt to learn an arbitrary sparse basis dictionary in this context, because that does not interpolate naturally, but I can imagine learning a parametric sparse dictionary, such as one built from a simple family of atoms like decaying sinusoids.
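A sketch of what I mean by a parametric dictionary (the decaying-sinusoid parameterization is illustrative and all names are mine): each atom is described by a handful of differentiable parameters, so the dictionary can be learned by gradient descent and still evaluated at arbitrary points.

```python
# Sketch of a parametric dictionary of decaying sinusoids: each atom is
# exp(-a*t) * cos(2*pi*f*t + phi), so the dictionary is defined by a few
# differentiable parameters per atom and still interpolates off-grid.
import jax
import jax.numpy as jnp

def decaying_sinusoid_atom(t, decay, freq, phase):
    return jnp.exp(-decay * t) * jnp.cos(2 * jnp.pi * freq * t + phase)

def dictionary(t, decays, freqs, phases):
    """Evaluate every atom at the (arbitrary) time points t; shape (n_atoms, len(t))."""
    return jax.vmap(decaying_sinusoid_atom, in_axes=(None, 0, 0, 0))(t, decays, freqs, phases)

def reconstruct(t, codes, decays, freqs, phases):
    """Signal as a (sparse) combination of parametric atoms, at any sample points."""
    return codes @ dictionary(t, decays, freqs, phases)
```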
Somewhere in between there are wavelet decompositions. Are they useful to me? Not sure.
Convolutional neural networks as sparse coding
Elad, Papyan, and others have a miniature school of deep learning analysis based on Multi-Layer Convolutional Sparse Coding (Papyan, Romano, and Elad 2017; Papyan et al. 2018; Papyan, Sulam, and Elad 2017; Sulam et al. 2018). This combines sparse basis learning with neural nets, which is cool.
The recently proposed multilayer convolutional sparse coding (ML-CSC) model, consisting of a cascade of convolutional sparse layers, provides a new interpretation of convolutional neural networks (CNNs). Under this framework, the forward pass in a CNN is equivalent to a pursuit algorithm aiming to estimate the nested sparse representation vectors from a given input signal. …Our work represents a bridge between matrix factorization, sparse dictionary learning, and sparse autoencoders, and we analyze these connections in detail.
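To give a flavour of that pursuit interpretation, here is a cartoon of the layered thresholding idea, with dense matrices standing in for the convolutional dictionaries; the code and names are mine, not the authors'.

```python
# Toy version of the "forward pass as pursuit" idea: a cascade of sparse
# coding layers, each estimating its representation by a single
# soft-thresholding step. Dense matrices stand in for the convolutional
# dictionaries of ML-CSC; this is a cartoon, not the authors' algorithm.
import jax.numpy as jnp

def soft_threshold(x, lam):
    return jnp.sign(x) * jnp.maximum(jnp.abs(x) - lam, 0.0)

def layered_pursuit(x, dictionaries, lams):
    """Estimate nested sparse codes gamma_1, ..., gamma_L from the signal x."""
    gamma = x
    codes = []
    for D, lam in zip(dictionaries, lams):
        gamma = soft_threshold(D.T @ gamma, lam)   # analysis + shrinkage, cf. conv + ReLU
        codes.append(gamma)
    return codes
```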
However, as interesting as this sounds, I am not deeply engaged with it, since it does not solve any immediate problems for me.
Not to be confused with implicit representation layers, which are completely different.↩︎