Nonparametrically learning dynamical systems


Learning differential equations, ordinary or stochastic, with nonparametric neural techniques. This is what, e.g., the famous Vector Institute Neural ODE paper (Chen et al. 2018) did, although I’m not sure it’s as novel as the authors imply, since it builds on a lot of earlier work. Related: analysing a neural net itself as a dynamical system, which is not quite the same thing but crosses over.
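
To make the idea concrete, here is a minimal sketch of the Neural ODE recipe: parameterize the vector field dz/dt = f_θ(z) with a small network and fit it to an observed trajectory. Chen et al. (2018) use the adjoint method; this toy instead backpropagates through a fixed-step Euler solver ("discretize-then-optimize"), and the spiral data, step size, and network width are all illustrative assumptions rather than their implementation.

```python
# Toy "discretize-then-optimize" neural ODE: a small MLP vector field
# trained by backpropagating through an explicit Euler integrator.
import torch

torch.manual_seed(0)

# Ground-truth linear dynamics, used only to simulate toy training data.
A = torch.tensor([[-0.1, 2.0], [-2.0, -0.1]])

def simulate(z0, n_steps, dt=0.05):
    zs = [z0]
    for _ in range(n_steps):
        zs.append(zs[-1] + dt * zs[-1] @ A.T)
    return torch.stack(zs)                   # shape (n_steps + 1, 2)

target = simulate(torch.tensor([2.0, 0.0]), 100)

# Learnable vector field f_theta(z) approximating dz/dt.
f = torch.nn.Sequential(
    torch.nn.Linear(2, 32), torch.nn.Tanh(), torch.nn.Linear(32, 2)
)

def integrate(z0, n_steps, dt=0.05):
    zs = [z0]
    for _ in range(n_steps):
        zs.append(zs[-1] + dt * f(zs[-1]))   # explicit Euler step
    return torch.stack(zs)

opt = torch.optim.Adam(f.parameters(), lr=1e-2)
for step in range(500):
    opt.zero_grad()
    pred = integrate(target[0], 100)
    loss = ((pred - target) ** 2).mean()
    loss.backward()          # gradients flow through every solver step
    opt.step()
```

The adjoint trick of the original paper replaces the stored computation graph with a backwards ODE solve, which saves memory but is otherwise solving the same fitting problem as this sketch.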

Author Duvenaud argues that in some ways the hype ran away with the Neural ODE paper, and credits CasADi (Andersson et al. 2019) with some of the innovations here. There are various laypersons’ introductions and tutorials in this area, including the simple, practical, and magical take in Julia. See also the CasADi example.

Learning an ODE, i.e. a purely deterministic process, feels unsatisfying; we want a model that encodes responses to, and effects of, interactions. It is not ideal to have time-series models that need to encode everything in an initial state.

Also, we would prefer models to be stochastic. Learnable SDEs are probably what we want. I’m particularly interested in jump ODE regression.
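
What a "learnable SDE" means in practice: make the drift and diffusion functions neural networks and fit them from simulated paths. Li et al. (2020) derive scalable adjoint gradients for this; the sketch below takes the blunter route of unrolling an Euler–Maruyama simulation and using pathwise (reparameterized) gradients through the sampled noise. The one-dimensional toy dynamics, network sizes, and mean-matching loss are assumptions for illustration only.

```python
# Toy learnable SDE: neural drift and diffusion, trained by backpropagating
# through an unrolled Euler-Maruyama simulation.
import torch

torch.manual_seed(0)

drift = torch.nn.Sequential(
    torch.nn.Linear(1, 16), torch.nn.Tanh(), torch.nn.Linear(16, 1)
)
diffusion = torch.nn.Sequential(
    torch.nn.Linear(1, 16), torch.nn.Tanh(), torch.nn.Linear(16, 1),
    torch.nn.Softplus(),             # keep the diffusion coefficient positive
)

def euler_maruyama(x0, n_steps, dt=0.01):
    xs, x = [x0], x0
    for _ in range(n_steps):
        dw = torch.randn_like(x) * dt ** 0.5     # Brownian increment
        x = x + drift(x) * dt + diffusion(x) * dw
        xs.append(x)
    return torch.stack(xs)                       # (n_steps + 1, batch, 1)

# Toy objective: match the marginal mean of an Ornstein-Uhlenbeck process
# started at 2, whose mean decays as 2 * exp(-t).
x0 = torch.full((256, 1), 2.0)
t = torch.arange(51, dtype=torch.float32)[:, None, None] * 0.01
params = list(drift.parameters()) + list(diffusion.parameters())
opt = torch.optim.Adam(params, lr=1e-2)
for step in range(300):
    opt.zero_grad()
    path = euler_maruyama(x0, 50)
    loss = ((path.mean(dim=1, keepdim=True) - 2.0 * torch.exp(-t)) ** 2).mean()
    loss.backward()                  # pathwise gradients through the noise
    opt.step()
```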

There are syntheses of these approaches that try to do everything with ODEs, all the time (Rackauckas et al. 2018; Niu, Horesh, and Chuang 2019), and even some tutorial implementations by the indefatigable Chris Rackauckas, as well as a whole MIT course. Chris Rackauckas’ lecture notes christen this development “scientific machine learning”.
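
The flagship construction here is the universal differential equation of Rackauckas et al. (2020): a vector field that is part known mechanism, part neural correction, with only the correction learned. Their work lives in the Julia SciML stack; the PyTorch toy below only conveys the structure, with a hand-picked "known physics" term and a hypothetical cubic nonlinearity for the network to recover.

```python
# Toy universal differential equation: known mechanistic term plus a
# learned neural correction, fitted by backprop through an Euler rollout.
import torch

torch.manual_seed(0)

def known_physics(z):
    return -0.5 * z                  # mechanistic part, assumed known

correction = torch.nn.Sequential(
    torch.nn.Linear(1, 16), torch.nn.Tanh(), torch.nn.Linear(16, 1)
)

def rollout(z0, n_steps, dt=0.05, true_field=None):
    zs = [z0]
    for _ in range(n_steps):
        z = zs[-1]
        dz = (true_field(z) if true_field is not None
              else known_physics(z) + correction(z))
        zs.append(z + dt * dz)
    return torch.stack(zs)

# Simulated data: the known physics plus an unknown cubic term to recover.
target = rollout(torch.tensor([[1.5]]), 80,
                 true_field=lambda z: -0.5 * z - 0.8 * z ** 3)

opt = torch.optim.Adam(correction.parameters(), lr=1e-2)
for step in range(500):
    opt.zero_grad()
    loss = ((rollout(target[0], 80) - target) ** 2).mean()
    loss.backward()
    opt.step()
```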

Learning stochastic partial differential equations, where a whole random field evolves in time, is something of interest to me; see spatiotemporal nets.
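
One tractable entry point is the integro-difference formulation of Zammit-Mangion and Wikle (2020), where the field at the next time step is an integral operator applied to the current field; discretized on a grid, that operator becomes a learnable convolution. The one-dimensional sketch below is a stripped-down illustration of that idea, with grid size, kernel width, and the smoothing target dynamics all chosen arbitrarily.

```python
# Toy learned integro-difference model: the field's transition operator is a
# convolution kernel fitted to observed field snapshots.
import torch

torch.manual_seed(0)

# Learned transition kernel, i.e. a discretized integral operator.
kernel = torch.nn.Conv1d(1, 1, kernel_size=7, padding=3, bias=False)

def evolve(field, n_steps):
    fields = [field]                      # field: (batch, 1, grid)
    for _ in range(n_steps):
        fields.append(kernel(fields[-1]))
    return torch.stack(fields)

# Toy data: a Gaussian bump spreading under a fixed smoothing kernel.
x = torch.linspace(-3, 3, 64)
field0 = torch.exp(-x ** 2)[None, None, :]
smooth = torch.tensor([[[0.05, 0.2, 0.5, 0.2, 0.05]]])
with torch.no_grad():
    target, f = [field0], field0
    for _ in range(20):
        f = torch.nn.functional.conv1d(f, smooth, padding=2)
        target.append(f)
    target = torch.stack(target)

opt = torch.optim.Adam(kernel.parameters(), lr=1e-2)
for step in range(400):
    opt.zero_grad()
    loss = ((evolve(field0, 20) - target) ** 2).mean()
    loss.backward()
    opt.step()
```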

Andersson, Joel A. E., Joris Gillis, Greg Horn, James B. Rawlings, and Moritz Diehl. 2019. “CasADi: A Software Framework for Nonlinear Optimization and Optimal Control.” Mathematical Programming Computation 11 (1): 1–36. https://doi.org/10.1007/s12532-018-0139-4.

Anil, Cem, James Lucas, and Roger Grosse. 2018. “Sorting Out Lipschitz Function Approximation,” November. https://arxiv.org/abs/1811.05381v1.

Arjovsky, Martin, Amar Shah, and Yoshua Bengio. 2016. “Unitary Evolution Recurrent Neural Networks.” In Proceedings of the 33rd International Conference on International Conference on Machine Learning - Volume 48, 1120–8. ICML’16. New York, NY, USA: JMLR.org. http://arxiv.org/abs/1511.06464.

Arridge, Simon, Peter Maass, Ozan Öktem, and Carola-Bibiane Schönlieb. 2019. “Solving Inverse Problems Using Data-Driven Models.” Acta Numerica 28 (May): 1–174. https://doi.org/10.1017/S0962492919000059.

Babtie, Ann C., Paul Kirk, and Michael P. H. Stumpf. 2014. “Topological Sensitivity Analysis for Systems Biology.” Proceedings of the National Academy of Sciences 111 (52): 18507–12. https://doi.org/10.1073/pnas.1414026112.

Chang, Bo, Minmin Chen, Eldad Haber, and Ed H. Chi. 2019. “AntisymmetricRNN: A Dynamical System View on Recurrent Neural Networks.” In Proceedings of ICLR. http://arxiv.org/abs/1902.09689.

Chang, Bo, Lili Meng, Eldad Haber, Lars Ruthotto, David Begert, and Elliot Holtham. 2018. “Reversible Architectures for Arbitrarily Deep Residual Neural Networks.” In. http://arxiv.org/abs/1709.03698.

Chang, Bo, Lili Meng, Eldad Haber, Frederick Tung, and David Begert. 2018. “Multi-Level Residual Networks from Dynamical Systems View.” In Proceedings of ICLR. http://arxiv.org/abs/1710.10348.

Chen, Tianqi, Ian Goodfellow, and Jonathon Shlens. 2015. “Net2Net: Accelerating Learning via Knowledge Transfer,” November. http://arxiv.org/abs/1511.05641.

Chen, Tian Qi, Yulia Rubanova, Jesse Bettencourt, and David K Duvenaud. 2018. “Neural Ordinary Differential Equations.” In Advances in Neural Information Processing Systems 31, edited by S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, and R. Garnett, 6572–83. Curran Associates, Inc. http://papers.nips.cc/paper/7892-neural-ordinary-differential-equations.pdf.

Dupont, Emilien, Arnaud Doucet, and Yee Whye Teh. 2019. “Augmented Neural ODEs,” April. http://arxiv.org/abs/1904.01681.

E, Weinan. 2017. “A Proposal on Machine Learning via Dynamical Systems.” Communications in Mathematics and Statistics 5 (1): 1–11. https://doi.org/10.1007/s40304-017-0103-z.

E, Weinan, Jiequn Han, and Qianxiao Li. 2018. “A Mean-Field Optimal Control Formulation of Deep Learning,” July. http://arxiv.org/abs/1807.01083.

Garnelo, Marta, Dan Rosenbaum, Chris J. Maddison, Tiago Ramalho, David Saxton, Murray Shanahan, Yee Whye Teh, Danilo J. Rezende, and S. M. Ali Eslami. 2018. “Conditional Neural Processes,” July. https://arxiv.org/abs/1807.01613v1.

Garnelo, Marta, Jonathan Schwarz, Dan Rosenbaum, Fabio Viola, Danilo J. Rezende, S. M. Ali Eslami, and Yee Whye Teh. 2018. “Neural Processes,” July. https://arxiv.org/abs/1807.01622v1.

Gholami, Amir, Kurt Keutzer, and George Biros. 2019. “ANODE: Unconditionally Accurate Memory-Efficient Gradients for Neural ODEs,” February. http://arxiv.org/abs/1902.10298.

Grathwohl, Will, Ricky T. Q. Chen, Jesse Bettencourt, Ilya Sutskever, and David Duvenaud. 2018. “FFJORD: Free-Form Continuous Dynamics for Scalable Reversible Generative Models,” October. http://arxiv.org/abs/1810.01367.

Haber, Eldad, Keegan Lensink, Eran Treister, and Lars Ruthotto. 2019. “IMEXnet: A Forward Stable Deep Neural Network.” In International Conference on Machine Learning, 2525–34. PMLR. http://proceedings.mlr.press/v97/haber19a.html.

Haber, Eldad, Felix Lucka, and Lars Ruthotto. 2018. “Never Look Back - A Modified EnKF Method and Its Application to the Training of Neural Networks Without Back Propagation,” May. http://arxiv.org/abs/1805.08034.

Haber, Eldad, and Lars Ruthotto. 2018. “Stable Architectures for Deep Neural Networks.” Inverse Problems 34 (1): 014004. https://doi.org/10.1088/1361-6420/aa9a90.

Haber, Eldad, Lars Ruthotto, Elliot Holtham, and Seong-Hwan Jun. 2017. “Learning Across Scales - A Multiscale Method for Convolution Neural Networks,” March. http://arxiv.org/abs/1703.02009.

Han, Jiequn, Arnulf Jentzen, and Weinan E. 2018. “Solving High-Dimensional Partial Differential Equations Using Deep Learning.” Proceedings of the National Academy of Sciences 115 (34): 8505–10. https://doi.org/10.1073/pnas.1718942115.

Hardt, Moritz, Benjamin Recht, and Yoram Singer. 2015. “Train Faster, Generalize Better: Stability of Stochastic Gradient Descent,” September. http://arxiv.org/abs/1509.01240.

Haro, A. 2008. “Automatic Differentiation Methods in Computational Dynamical Systems: Invariant Manifolds and Normal Forms of Vector Fields at Fixed Points.” IMA Note. http://www.maia.ub.es/~alex/admcds/admcds.pdf.

He, Junxian, Daniel Spokoyny, Graham Neubig, and Taylor Berg-Kirkpatrick. 2019. “Lagging Inference Networks and Posterior Collapse in Variational Autoencoders.” In Proceedings of ICLR. http://arxiv.org/abs/1901.05534.

Jing, Li, Yichen Shen, Tena Dubcek, John Peurifoy, Scott Skirlo, Yann LeCun, Max Tegmark, and Marin Soljačić. 2017. “Tunable Efficient Unitary Neural Networks (EUNN) and Their Application to RNNs.” In PMLR, 1733–41. http://proceedings.mlr.press/v70/jing17a.html.

Li, Xuechen, Ting-Kam Leonard Wong, Ricky T. Q. Chen, and David Duvenaud. 2020. “Scalable Gradients for Stochastic Differential Equations.” In International Conference on Artificial Intelligence and Statistics, 3870–82. PMLR. http://proceedings.mlr.press/v108/li20i.html.

Lu, Lu, Pengzhan Jin, and George Em Karniadakis. 2020. “DeepONet: Learning Nonlinear Operators for Identifying Differential Equations Based on the Universal Approximation Theorem of Operators,” April. http://arxiv.org/abs/1910.03193.

Meng, Qi, Yue Wang, Wei Chen, Taifeng Wang, Zhi-Ming Ma, and Tie-Yan Liu. 2016. “Generalization Error Bounds for Optimization Algorithms via Stability.” In, 10:441–74. http://arxiv.org/abs/1609.08397.

Mhammedi, Zakaria, Andrew Hellicar, Ashfaqur Rahman, and James Bailey. 2017. “Efficient Orthogonal Parametrisation of Recurrent Neural Networks Using Householder Reflections.” In PMLR, 2401–9. http://proceedings.mlr.press/v70/mhammedi17a.html.

Niu, Murphy Yuezhen, Lior Horesh, and Isaac Chuang. 2019. “Recurrent Neural Networks in the Eye of Differential Equations,” April. http://arxiv.org/abs/1904.12933.

Rackauckas, Christopher. 2019. “The Essential Tools of Scientific Machine Learning (Scientific ML).” The Winnower, August. https://doi.org/10.15200/winn.156631.13064.

Rackauckas, Christopher, Yingbo Ma, Vaibhav Dixit, Xingjian Guo, Mike Innes, Jarrett Revels, Joakim Nyberg, and Vijay Ivaturi. 2018. “A Comparison of Automatic Differentiation and Continuous Sensitivity Analysis for Derivatives of Differential Equation Solutions,” December. http://arxiv.org/abs/1812.01892.

Rackauckas, Christopher, Yingbo Ma, Julius Martensen, Collin Warner, Kirill Zubov, Rohit Supekar, Dominic Skinner, and Ali Ramadhan. 2020. “Universal Differential Equations for Scientific Machine Learning,” January. https://arxiv.org/abs/2001.04385v1.

Roeder, Geoffrey, Paul K. Grant, Andrew Phillips, Neil Dalchau, and Edward Meeds. 2019. “Efficient Amortised Bayesian Inference for Hierarchical and Nonlinear Dynamical Systems,” May. http://arxiv.org/abs/1905.12090.

Ruthotto, Lars, and Eldad Haber. 2018. “Deep Neural Networks Motivated by Partial Differential Equations,” April. http://arxiv.org/abs/1804.04272.

Saemundsson, Steindor, Alexander Terenin, Katja Hofmann, and Marc Peter Deisenroth. 2020. “Variational Integrator Networks for Physically Structured Embeddings,” March. http://arxiv.org/abs/1910.09349.

Singh, Gautam, Jaesik Yoon, Youngsung Son, and Sungjin Ahn. 2019. “Sequential Neural Processes,” June. http://arxiv.org/abs/1906.10264.

Şimşekli, Umut, Ozan Sener, George Deligiannidis, and Murat A. Erdogdu. 2020. “Hausdorff Dimension, Stochastic Differential Equations, and Generalization in Neural Networks,” June. http://arxiv.org/abs/2006.09313.

Vorontsov, Eugene, Chiheb Trabelsi, Samuel Kadoury, and Chris Pal. 2017. “On Orthogonality and Learning Recurrent Networks with Long Term Dependencies.” In PMLR, 3570–8. http://proceedings.mlr.press/v70/vorontsov17a.html.

Wang, Chuang, Hong Hu, and Yue M. Lu. 2019. “A Solvable High-Dimensional Model of GAN,” October. http://arxiv.org/abs/1805.08349.

Wiatowski, Thomas, and Helmut Bölcskei. 2015. “A Mathematical Theory of Deep Convolutional Neural Networks for Feature Extraction.” In Proceedings of IEEE International Symposium on Information Theory. http://arxiv.org/abs/1512.06293.

Wiatowski, Thomas, Philipp Grohs, and Helmut Bölcskei. 2018. “Energy Propagation in Deep Convolutional Neural Networks.” IEEE Transactions on Information Theory 64 (7): 1–1. https://doi.org/10.1109/TIT.2017.2756880.

Yıldız, Çağatay, Markus Heinonen, and Harri Lähdesmäki. 2019. “ODE²VAE: Deep Generative Second Order ODEs with Bayesian Neural Networks,” October. http://arxiv.org/abs/1905.10994.

Zammit-Mangion, Andrew, and Christopher K. Wikle. 2020. “Deep Integro-Difference Equation Models for Spatio-Temporal Forecasting.” Spatial Statistics 37 (June): 100408. https://doi.org/10.1016/j.spasta.2020.100408.