Ensemble Kalman methods for training neural networks

Data assimilation for network weights

September 20, 2022

Bayes
dynamical systems
likelihood free
linear algebra
machine learning
Monte Carlo
neural nets
nonparametric
particle
probability
sciml
signal processing
sparser than thou
statistics
statmech
stochastic processes
time series
uncertainty

\[\renewcommand{\var}{\operatorname{Var}} \renewcommand{\cov}{\operatorname{Cov}} \renewcommand{\dd}{\mathrm{d}} \renewcommand{\bb}[1]{\mathbb{#1}} \renewcommand{\vv}[1]{\boldsymbol{#1}} \renewcommand{\rv}[1]{\mathsf{#1}} \renewcommand{\vrv}[1]{\vv{\rv{#1}}} \renewcommand{\disteq}{\stackrel{d}{=}} \renewcommand{\gvn}{\mid} \renewcommand{\Ex}{\mathbb{E}} \renewcommand{\Pr}{\mathbb{P}} \renewcommand{\one}{\unicode{x1D7D9}}\]


Training neural networks by ensemble Kalman updates instead of SGD. This arises naturally from the dynamical-systems perspective on neural networks. TBD.
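For concreteness, the basic ensemble Kalman inversion update (Iglesias, Law, and Stuart 2013; Schillings and Stuart 2017) applied to network weights looks something like the following; the notation is my own paraphrase rather than any one paper’s.

\[
\vv{\theta}^{(j)}_{n+1} = \vv{\theta}^{(j)}_{n} + C^{\theta G}_{n}\left(C^{G G}_{n} + \Gamma\right)^{-1}\left(\vv{y} + \vv{\xi}^{(j)}_{n} - G\!\left(\vv{\theta}^{(j)}_{n}\right)\right),
\]

where \(G\) maps a weight vector to the network’s predictions on the training inputs, \(\vv{y}\) is the vector of training targets with assumed noise covariance \(\Gamma\), \(C^{\theta G}_{n}\) and \(C^{GG}_{n}\) are empirical (cross-)covariances computed from the ensemble \(\{\vv{\theta}^{(j)}_{n}\}_{j=1}^{J}\), and \(\vv{\xi}^{(j)}_{n}\) are optional observation perturbations. No gradients of \(G\) appear; the ensemble spread stands in for a derivative.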

Claudia Schillings’ filter (Schillings and Stuart 2017) is an elegant variant of the ensemble Kalman filter which manages to look both more general than the original and simpler. Haber, Lucka, and Ruthotto (2018) use it to train neural nets (!) and show a rather beautiful connection to stochastic gradient descent in their section 3.2.
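Here is a minimal sketch of that update loop fitting a toy one-hidden-layer network by ensemble Kalman inversion instead of backprop. Everything in it (architecture, ensemble size, noise level, perturbed observations) is an illustrative assumption, not a reproduction of any of the cited papers’ setups.

```python
# Minimal ensemble Kalman inversion (EKI) sketch for fitting the weights of a
# tiny neural network, in the spirit of Iglesias, Law & Stuart (2013) and
# Kovachki & Stuart (2019). Hyperparameters here are placeholders.
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data
X = rng.uniform(-1, 1, size=(64, 1))
y = np.sin(3 * X[:, 0]) + 0.05 * rng.standard_normal(64)

# A tiny one-hidden-layer network; `theta` is the flattened weight vector.
H = 16
n_params = 1 * H + H + H + 1  # W1, b1, W2, b2

def forward(theta, X):
    W1 = theta[:H].reshape(1, H)
    b1 = theta[H:2 * H]
    W2 = theta[2 * H:3 * H].reshape(H, 1)
    b2 = theta[3 * H]
    return (np.tanh(X @ W1 + b1) @ W2)[:, 0] + b2

J = 200                       # ensemble size
gamma = 0.05 ** 2             # assumed observation-noise variance
ensemble = rng.standard_normal((J, n_params))  # prior ensemble of weight vectors

for step in range(500):
    G = np.stack([forward(th, X) for th in ensemble])   # (J, n_obs) predictions
    dTheta = ensemble - ensemble.mean(axis=0)            # (J, n_params)
    dG = G - G.mean(axis=0)                               # (J, n_obs)
    C_thG = dTheta.T @ dG / J                             # cross-covariance
    C_GG = dG.T @ dG / J                                  # prediction covariance
    K = C_thG @ np.linalg.inv(C_GG + gamma * np.eye(len(y)))  # "Kalman gain"
    # Perturbed-observation update: each member sees a noisy copy of the data.
    y_pert = y + np.sqrt(gamma) * rng.standard_normal((J, len(y)))
    ensemble = ensemble + (y_pert - G) @ K.T

print("train MSE:", np.mean((forward(ensemble.mean(axis=0), X) - y) ** 2))
```

The point, as in Haber, Lucka, and Ruthotto (2018) and Kovachki and Stuart (2019), is that only forward evaluations of the network are needed; no backpropagation appears anywhere.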

1 References

Chada, Chen, and Sanz-Alonso. 2021. “Iterative Ensemble Kalman Methods: A Unified Perspective with Some New Variants.” Foundations of Data Science.
Chada, Iglesias, Roininen, et al. 2018. “Parameterizations for Ensemble Kalman Inversion.” Inverse Problems.
Chen, Dou, Chen, et al. 2022. “A Novel Neural Network Training Framework with Data Assimilation.” The Journal of Supercomputing.
Dunbar, Duncan, Stuart, et al. 2022. “Ensemble Inference Methods for Models With Noisy and Expensive Likelihoods.” SIAM Journal on Applied Dynamical Systems.
Galy-Fajou, Perrone, and Opper. 2021. “Flexible and Efficient Inference with Particles for the Variational Gaussian Approximation.” Entropy.
Guth, Schillings, and Weissmann. 2020. “Ensemble Kalman Filter for Neural Network Based One-Shot Inversion.”
Haber, Lucka, and Ruthotto. 2018. “Never Look Back - A Modified EnKF Method and Its Application to the Training of Neural Networks Without Back Propagation.” arXiv:1805.08034 [Cs, Math].
Haykin, ed. 2001. Kalman Filtering and Neural Networks. Adaptive and Learning Systems for Signal Processing, Communications, and Control.
Huang, Schneider, and Stuart. 2022. “Iterated Kalman Methodology for Inverse Problems.” Journal of Computational Physics.
Iglesias, Law, and Stuart. 2013. “Ensemble Kalman Methods for Inverse Problems.” Inverse Problems.
Kovachki, and Stuart. 2019. “Ensemble Kalman Inversion: A Derivative-Free Technique for Machine Learning Tasks.” Inverse Problems.
Liu, Zhu, and Belkin. 2020. “On the Linearity of Large Non-Linear Models: When and Why the Tangent Kernel Is Constant.” In Advances in Neural Information Processing Systems.
Ritter, Kukla, Zhang, et al. 2021. “Sparse Uncertainty Representation in Deep Learning with Inducing Weights.” arXiv:2105.14594 [Cs, Stat].
Schillings, and Stuart. 2017. “Analysis of the Ensemble Kalman Filter for Inverse Problems.” SIAM Journal on Numerical Analysis.
Taghvaei, and Mehta. 2021. “An Optimal Transport Formulation of the Ensemble Kalman Filter.” IEEE Transactions on Automatic Control.
Venturi, and Li. 2022. “The Mori-Zwanzig Formulation of Deep Learning.”
Wen, and Li. 2022. “Affine-Mapping Based Variational Ensemble Kalman Filter.” Statistics and Computing.
Yegenoglu, Krajsek, Pier, et al. 2020. “Ensemble Kalman Filter Optimizing Deep Neural Networks: An Alternative Approach to Non-Performing Gradient Descent.” In Machine Learning, Optimization, and Data Science.