Nonparametrically learning dynamical systems



Learning to approximate differential equations with neural nets. Related: analysing a neural net itself as a dynamical system, which is not quite the same but overlaps, and variational state filters.

A deterministic version of this problem is what, e.g., the famous Vector Institute Neural ODE paper (T. Q. Chen et al. 2018) tackled. Co-author Duvenaud argues that in some ways the hype ran away with the Neural ODE paper, and credits CasADi with the innovations.
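To make the deterministic setup concrete, here is a minimal sketch of a neural ODE forward pass: the hidden state follows dz/dt = f(z, t), where f is a small network, and the output is the state at the final time. Everything named here (`make_vector_field`, `odeint_fixed`, the untrained random weights, the fixed-step RK4 solver) is an assumption for illustration; the paper itself uses adaptive solvers and adjoint-based gradients, neither of which is shown.

```python
import math
import random

def make_vector_field(dim, hidden, seed=0):
    """Return f(z, t): a one-hidden-layer tanh network with random, untrained weights."""
    rng = random.Random(seed)
    # The hidden layer sees the state plus the time t, hence dim + 1 inputs.
    w1 = [[rng.gauss(0.0, 0.5) for _ in range(dim + 1)] for _ in range(hidden)]
    w2 = [[rng.gauss(0.0, 0.5) for _ in range(hidden)] for _ in range(dim)]

    def f(z, t):
        inp = list(z) + [t]
        h = [math.tanh(sum(w * x for w, x in zip(row, inp))) for row in w1]
        return [sum(w * x for w, x in zip(row, h)) for row in w2]

    return f

def rk4_step(f, z, t, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = f(z, t)
    k2 = f([zi + 0.5 * dt * k for zi, k in zip(z, k1)], t + 0.5 * dt)
    k3 = f([zi + 0.5 * dt * k for zi, k in zip(z, k2)], t + 0.5 * dt)
    k4 = f([zi + dt * k for zi, k in zip(z, k3)], t + dt)
    return [zi + dt / 6.0 * (p + 2 * q + 2 * r + s)
            for zi, p, q, r, s in zip(z, k1, k2, k3, k4)]

def odeint_fixed(f, z0, t0, t1, n_steps=100):
    """Integrate dz/dt = f(z, t) from t0 to t1 with a fixed step size."""
    z, t = list(z0), t0
    dt = (t1 - t0) / n_steps
    for _ in range(n_steps):
        z = rk4_step(f, z, t, dt)
        t += dt
    return z

f_theta = make_vector_field(dim=2, hidden=8)
z1 = odeint_fixed(f_theta, [1.0, 0.0], 0.0, 1.0)  # the "prediction" at t = 1
```

Note that the whole trajectory is a function of the initial state and the weights alone; nothing enters along the way.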

There are various lay introductions and tutorials in this area, including the simple and practical magical take in Julia. See also the CasADi example.

Learning an ODE, which is a purely deterministic process, feels unsatisfying; we want a model which encodes responses to, and effects of, interactions. It is not ideal to have time series models which need to encode everything in an initial state.

Also, we would prefer models to be stochastic. Learnable SDEs are probably what we want. I’m particularly interested in jump ODE regression.
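As a rough sketch of what the stochastic version looks like computationally, the Euler-Maruyama scheme below simulates dX = f(X, t) dt + g(X, t) dW. In a learnable SDE the drift f and diffusion g would be networks; here they are hypothetical stand-ins (an Ornstein-Uhlenbeck process with made-up constants `theta` and `sigma`), and jumps are not modelled.

```python
import math
import random

def euler_maruyama(drift, diffusion, x0, t0, t1, n_steps, rng):
    """Simulate one path of dX = drift(X, t) dt + diffusion(X, t) dW."""
    dt = (t1 - t0) / n_steps
    x, t = x0, t0
    path = [x]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment ~ N(0, dt)
        x = x + drift(x, t) * dt + diffusion(x, t) * dw
        t += dt
        path.append(x)
    return path

# Hypothetical stand-ins for learned drift and diffusion networks.
theta, sigma = 1.5, 0.3
drift = lambda x, t: -theta * x
diffusion = lambda x, t: sigma

rng = random.Random(42)
paths = [euler_maruyama(drift, diffusion, 1.0, 0.0, 1.0, 200, rng)
         for _ in range(500)]
mean_end = sum(p[-1] for p in paths) / len(paths)  # Monte Carlo estimate of E[X_1]
```

Unlike the ODE case, the same initial state now yields a distribution over trajectories, so fitting has to compare distributions rather than single paths.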

Homework: Duvenaud again, tweeting some explanatory animations.

Note the connection to reparameterization tricks, in that neural ODEs give you cheap differentiable reparameterizations.
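One way to read this, assuming the note refers to the continuous normalizing flow construction in the Neural ODE paper: if a sample evolves as \(\dot{z}(t) = f(z(t), t)\), its log-density obeys the instantaneous change-of-variables formula

\[
\frac{\partial \log p(z(t))}{\partial t} = -\operatorname{tr}\!\left(\frac{\partial f}{\partial z(t)}\right),
\qquad
\log p(z(t_1)) = \log p(z(t_0)) - \int_{t_0}^{t_1} \operatorname{tr}\!\left(\frac{\partial f}{\partial z(t)}\right) dt .
\]

So a base sample \(z(t_0)\) pushed through the ODE is a differentiable reparameterization whose density correction needs only a Jacobian trace rather than a determinant.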

Gu et al. (2021) unify neural ODEs with RNNs.

Questions

How do you do ensemble training for posterior predictives in NODEs? How do you guarantee stability in the learned dynamics?

Incoming

  • Corenflos et al. (2021) describe an optimal transport method for differentiable particle filtering.

  • Campbell et al. (2021) describe variational inference that factors out the unknown parameters.

  • HazyResearch/state-spaces: Sequence Modeling with Structured State Spaces (Gu et al. 2021) did what I was trying to do in learning gamelan, but better.

    Recurrent neural networks (RNNs), temporal convolutions, and neural differential equations (NDEs) are popular families of deep learning models for time-series data, each with unique strengths and tradeoffs in modeling power and computational efficiency. We introduce a simple sequence model inspired by control systems that generalizes these approaches while addressing their shortcomings. The Linear State-Space Layer (LSSL) maps a sequence \(u \mapsto y\) by simply simulating a linear continuous-time state-space representation \(\dot{x} = Ax + Bu\), \(y = Cx + Du\). Theoretically, we show that LSSL models are closely related to the three aforementioned families of models and inherit their strengths. For example, they generalize convolutions to continuous-time, explain common RNN heuristics, and share features of NDEs such as time-scale adaptation. We then incorporate and generalize recent theory on continuous-time memorization to introduce a trainable subset of structured matrices \(A\) that endow LSSLs with long-range memory. Empirically, stacking LSSL layers into a simple deep neural network obtains state-of-the-art results across time series benchmarks for long dependencies in sequential image classification, real-world healthcare regression tasks, and speech. On a difficult speech classification task with length-16000 sequences, LSSL outperforms prior approaches by 24 accuracy points, and even outperforms baselines that use hand-crafted features on 100x shorter sequences.
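The state-space map in that abstract is easy to simulate. Here is a minimal sketch, assuming a diagonal A, scalar input and output, and a bilinear (Tustin) discretization with step `dt`; the paper's structured A matrices and training procedure are not reproduced, and all function names and numbers below are made up for illustration. The same layer is computed two ways, as a recurrence and as a causal convolution, which is the sense in which one model can look like both an RNN and a convolution.

```python
import math

def discretize_bilinear(a, b, dt):
    """Bilinear discretization of x' = diag(a) x + b u with step dt."""
    a_bar = [(1 + 0.5 * dt * ai) / (1 - 0.5 * dt * ai) for ai in a]
    b_bar = [dt * bi / (1 - 0.5 * dt * ai) for ai, bi in zip(a, b)]
    return a_bar, b_bar

def lssl_recurrent(a, b, c, d, u, dt):
    """RNN view: x_k = A_bar x_{k-1} + B_bar u_k, y_k = C x_k + D u_k."""
    a_bar, b_bar = discretize_bilinear(a, b, dt)
    x = [0.0] * len(a)
    ys = []
    for uk in u:
        x = [ab * xi + bb * uk for ab, bb, xi in zip(a_bar, b_bar, x)]
        ys.append(sum(ci * xi for ci, xi in zip(c, x)) + d * uk)
    return ys

def lssl_convolutional(a, b, c, d, u, dt):
    """Convolution view: y = K * u + D u with kernel K_m = C A_bar^m B_bar."""
    a_bar, b_bar = discretize_bilinear(a, b, dt)
    kernel = [sum(ci * ab ** m * bb for ci, ab, bb in zip(c, a_bar, b_bar))
              for m in range(len(u))]
    return [sum(kernel[k - j] * u[j] for j in range(k + 1)) + d * u[k]
            for k in range(len(u))]

# Hypothetical parameters and input signal.
a, b, c, d = [-1.0, -0.2], [1.0, 0.5], [0.7, -0.3], 0.1
u = [math.sin(0.3 * k) for k in range(50)]
y_rnn = lssl_recurrent(a, b, c, d, u, dt=0.1)
y_conv = lssl_convolutional(a, b, c, d, u, dt=0.1)
```

The two outputs agree up to floating-point error; the recurrent form is cheap step by step at inference, while the convolutional form can be computed in parallel over the sequence.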

References

Andersson, Joel A. E., Joris Gillis, Greg Horn, James B. Rawlings, and Moritz Diehl. 2019. “CasADi: A Software Framework for Nonlinear Optimization and Optimal Control.” Mathematical Programming Computation 11 (1): 1–36.
Anil, Cem, James Lucas, and Roger Grosse. 2018. “Sorting Out Lipschitz Function Approximation,” November.
Arridge, Simon, Peter Maass, Ozan Öktem, and Carola-Bibiane Schönlieb. 2019. “Solving Inverse Problems Using Data-Driven Models.” Acta Numerica 28 (May): 1–174.
Babtie, Ann C., Paul Kirk, and Michael P. H. Stumpf. 2014. “Topological Sensitivity Analysis for Systems Biology.” Proceedings of the National Academy of Sciences 111 (52): 18507–12.
Bachouch, Achref, Côme Huré, Nicolas Langrené, and Huyen Pham. 2020. “Deep Neural Networks Algorithms for Stochastic Control Problems on Finite Horizon: Numerical Applications.” arXiv:1812.05916 [Math, q-Fin, Stat], January.
Campbell, Andrew, Yuyang Shi, Tom Rainforth, and Arnaud Doucet. 2021. “Online Variational Filtering and Parameter Learning.” In.
Chang, Bo, Minmin Chen, Eldad Haber, and Ed H. Chi. 2019. “AntisymmetricRNN: A Dynamical System View on Recurrent Neural Networks.” In Proceedings of ICLR.
Chang, Bo, Lili Meng, Eldad Haber, Frederick Tung, and David Begert. 2018. “Multi-Level Residual Networks from Dynamical Systems View.” In Proceedings of ICLR.
Chen, Boyuan, Kuang Huang, Sunand Raghupathi, Ishaan Chandratreya, Qiang Du, and Hod Lipson. 2022. “Automated Discovery of Fundamental Variables Hidden in Experimental Data.” Nature Computational Science 2 (7): 433–42.
Chen, Tian Qi, and David K Duvenaud. n.d. “Neural Networks with Cheap Differential Operators,” 11.
Chen, Tian Qi, Yulia Rubanova, Jesse Bettencourt, and David K Duvenaud. 2018. “Neural Ordinary Differential Equations.” In Advances in Neural Information Processing Systems 31, edited by S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, and R. Garnett, 6572–83. Curran Associates, Inc.
Choromanski, Krzysztof, Jared Quincy Davis, Valerii Likhosherstov, Xingyou Song, Jean-Jacques Slotine, Jacob Varley, Honglak Lee, Adrian Weller, and Vikas Sindhwani. 2020. “An Ode to an ODE.” In Advances in Neural Information Processing Systems. Vol. 33.
Corenflos, Adrien, James Thornton, George Deligiannidis, and Arnaud Doucet. 2021. “Differentiable Particle Filtering via Entropy-Regularized Optimal Transport.” arXiv:2102.07850 [Cs, Stat], June.
Course, Kevin, Trefor Evans, and Prasanth Nair. 2020. “Weak Form Generalized Hamiltonian Learning.” In Advances in Neural Information Processing Systems. Vol. 33.
Dupont, Emilien, Arnaud Doucet, and Yee Whye Teh. 2019. “Augmented Neural ODEs.” arXiv:1904.01681 [Cs, Stat], April.
E, Weinan. 2017. “A Proposal on Machine Learning via Dynamical Systems.” Communications in Mathematics and Statistics 5 (1): 1–11.
E, Weinan, Jiequn Han, and Qianxiao Li. 2018. “A Mean-Field Optimal Control Formulation of Deep Learning.” arXiv:1807.01083 [Cs, Math], July.
Eguchi, Shoichi, and Yuma Uehara. n.d. “Schwartz-Type Model Selection for Ergodic Stochastic Differential Equation Models.” Scandinavian Journal of Statistics n/a (n/a).
Finlay, Chris, Jörn-Henrik Jacobsen, Levon Nurbekyan, and Adam M Oberman. n.d. “How to Train Your Neural ODE: The World of Jacobian and Kinetic Regularization.” In ICML, 14.
Finzi, Marc, Ke Alexander Wang, and Andrew G. Wilson. 2020. “Simplifying Hamiltonian and Lagrangian Neural Networks via Explicit Constraints.” In Advances in Neural Information Processing Systems. Vol. 33.
Garnelo, Marta, Dan Rosenbaum, Chris J. Maddison, Tiago Ramalho, David Saxton, Murray Shanahan, Yee Whye Teh, Danilo J. Rezende, and S. M. Ali Eslami. 2018. “Conditional Neural Processes.” arXiv:1807.01613 [Cs, Stat], July, 10.
Garnelo, Marta, Jonathan Schwarz, Dan Rosenbaum, Fabio Viola, Danilo J. Rezende, S. M. Ali Eslami, and Yee Whye Teh. 2018. “Neural Processes,” July.
Gholami, Amir, Kurt Keutzer, and George Biros. 2019. “ANODE: Unconditionally Accurate Memory-Efficient Gradients for Neural ODEs.” arXiv:1902.10298 [Cs], February.
Ghosh, Arnab, Harkirat Behl, Emilien Dupont, Philip Torr, and Vinay Namboodiri. 2020. “STEER: Simple Temporal Regularization For Neural ODE.” In Advances in Neural Information Processing Systems. Vol. 33.
Gierjatowicz, Patryk, Marc Sabate-Vidales, David Šiška, Lukasz Szpruch, and Žan Žurič. 2020. “Robust Pricing and Hedging via Neural SDEs.” arXiv:2007.04154 [Cs, q-Fin, Stat], July.
Grathwohl, Will, Ricky T. Q. Chen, Jesse Bettencourt, Ilya Sutskever, and David Duvenaud. 2018. “FFJORD: Free-Form Continuous Dynamics for Scalable Reversible Generative Models.” arXiv:1810.01367 [Cs, Stat], October.
Gu, Albert, Isys Johnson, Karan Goel, Khaled Saab, Tri Dao, Atri Rudra, and Christopher Ré. 2021. “Combining Recurrent, Convolutional, and Continuous-Time Models with Linear State Space Layers.” In Advances in Neural Information Processing Systems, 34:572–85. Curran Associates, Inc.
Haber, Eldad, Felix Lucka, and Lars Ruthotto. 2018. “Never Look Back - A Modified EnKF Method and Its Application to the Training of Neural Networks Without Back Propagation.” arXiv:1805.08034 [Cs, Math], May.
Han, Jiequn, Arnulf Jentzen, and Weinan E. 2018. “Solving High-Dimensional Partial Differential Equations Using Deep Learning.” Proceedings of the National Academy of Sciences 115 (34): 8505–10.
Haro, A. 2008. “Automatic Differentiation Methods in Computational Dynamical Systems: Invariant Manifolds and Normal Forms of Vector Fields at Fixed Points.” IMA Note.
Hasani, Ramin, Mathias Lechner, Alexander Amini, Lucas Liebenwein, Aaron Ray, Max Tschaikowski, Gerald Teschl, and Daniela Rus. 2022. “Closed-Form Continuous-Time Neural Networks.” Nature Machine Intelligence 4 (11): 992–1003.
Hasani, Ramin, Mathias Lechner, Alexander Amini, Daniela Rus, and Radu Grosu. 2020. “Liquid Time-Constant Networks.” arXiv:2006.04439 [Cs, Stat], December.
He, Junxian, Daniel Spokoyny, Graham Neubig, and Taylor Berg-Kirkpatrick. 2019. “Lagging Inference Networks and Posterior Collapse in Variational Autoencoders.” In Proceedings of ICLR.
Huh, In, Eunho Yang, Sung Ju Hwang, and Jinwoo Shin. 2020. “Time-Reversal Symmetric ODE Network.” In Advances in Neural Information Processing Systems. Vol. 33.
Huré, Côme, Huyên Pham, Achref Bachouch, and Nicolas Langrené. 2018. “Deep Neural Networks Algorithms for Stochastic Control Problems on Finite Horizon, Part I: Convergence Analysis.” arXiv:1812.04300 [Math, Stat], December.
Jia, Junteng, and Austin R Benson. 2019. “Neural Jump Stochastic Differential Equations.” In Advances in Neural Information Processing Systems 32, edited by H. Wallach, H. Larochelle, A. Beygelzimer, F. d’Alché-Buc, E. Fox, and R. Garnett, 9847–58. Curran Associates, Inc.
Kaul, Shiva. 2020. “Linear Dynamical Systems as a Core Computational Primitive.” In Advances in Neural Information Processing Systems. Vol. 33.
Kelly, Jacob, Jesse Bettencourt, Matthew James Johnson, and David Duvenaud. 2020. “Learning Differential Equations That Are Easy to Solve.” In.
Kidger, Patrick, Ricky T Q Chen, and Terry Lyons. 2020. “‘Hey, That’s Not an ODE’: Faster ODE Adjoints with 12 Lines of Code.” In, 5.
Kidger, Patrick, James Foster, Xuechen Li, Harald Oberhauser, and Terry Lyons. 2020. “Neural SDEs Made Easy: SDEs Are Infinite-Dimensional GANs.” In Advances in Neural Information Processing Systems, 6.
Kidger, Patrick, James Morrill, James Foster, and Terry Lyons. 2020. “Neural Controlled Differential Equations for Irregular Time Series.” arXiv:2005.08926 [Cs, Stat], November.
Kochkov, Dmitrii, Alvaro Sanchez-Gonzalez, Jamie Smith, Tobias Pfaff, Peter Battaglia, and Michael P Brenner. 2020. “Learning Latent Field Dynamics of PDEs.” In, 7.
Kolter, J Zico, and Gaurav Manek. 2019. “Learning Stable Deep Dynamics Models.” In Advances in Neural Information Processing Systems, 9.
Krishnamurthy, Kamesh, Tankut Can, and David J. Schwab. 2020. “Theory of Gating in Recurrent Neural Networks.” In arXiv:2007.14823 [Cond-Mat, Physics:nlin, q-Bio].
Lawrence, Nathan, Philip Loewen, Michael Forbes, Johan Backstrom, and Bhushan Gopaluni. 2020. “Almost Surely Stable Deep Dynamics.” In Advances in Neural Information Processing Systems. Vol. 33.
Li, Xuechen, Ting-Kam Leonard Wong, Ricky T. Q. Chen, and David Duvenaud. 2020. “Scalable Gradients for Stochastic Differential Equations.” In International Conference on Artificial Intelligence and Statistics, 3870–82. PMLR.
Li, Yuhong, Tianle Cai, Yi Zhang, Deming Chen, and Debadeepta Dey. 2022. “What Makes Convolutional Models Great on Long Sequence Modeling?” arXiv.
Lou, Aaron, Derek Lim, Isay Katsman, Leo Huang, Qingxuan Jiang, Ser Nam Lim, and Christopher M. De Sa. 2020. “Neural Manifold Ordinary Differential Equations.” In Advances in Neural Information Processing Systems. Vol. 33.
Louizos, Christos, Xiahan Shi, Klamer Schutte, and Max Welling. 2019. “The Functional Neural Process.” arXiv:1906.08324 [Cs, Stat], June.
Lu, Lu, Pengzhan Jin, and George Em Karniadakis. 2020. “DeepONet: Learning Nonlinear Operators for Identifying Differential Equations Based on the Universal Approximation Theorem of Operators.” arXiv:1910.03193 [Cs, Stat], April.
Lu, Yulong, and Jianfeng Lu. 2020. “A Universal Approximation Theorem of Deep Neural Networks for Expressing Probability Distributions.” In Advances in Neural Information Processing Systems. Vol. 33.
Massaroli, Stefano, Michael Poli, Michelangelo Bin, Jinkyoo Park, Atsushi Yamashita, and Hajime Asama. 2020. “Stable Neural Flows.” arXiv:2003.08063 [Cs, Math, Stat], March.
Massaroli, Stefano, Michael Poli, Jinkyoo Park, Atsushi Yamashita, and Hajime Asama. 2020. “Dissecting Neural ODEs.” In arXiv:2002.08071 [Cs, Stat].
Mhammedi, Zakaria, Andrew Hellicar, Ashfaqur Rahman, and James Bailey. 2017. “Efficient Orthogonal Parametrisation of Recurrent Neural Networks Using Householder Reflections.” In PMLR, 2401–9.
Mishler, Alan, and Edward Kennedy. 2021. “FADE: FAir Double Ensemble Learning for Observable and Counterfactual Outcomes.” arXiv:2109.00173 [Cs, Stat], August.
Morrill, James, Patrick Kidger, Cristopher Salvi, James Foster, and Terry Lyons. 2020. “Neural CDEs for Long Time Series via the Log-ODE Method.” In, 5.
Nguyen, Long, and Andy Malinsky. 2020. “Exploration and Implementation of Neural Ordinary Differential Equations,” 34.
Niu, Murphy Yuezhen, Lior Horesh, and Isaac Chuang. 2019. “Recurrent Neural Networks in the Eye of Differential Equations.” arXiv:1904.12933 [Quant-Ph, Stat], April.
Norcliffe, Alexander, Cristian Bodnar, Ben Day, Jacob Moss, and Pietro Liò. 2020. “Neural ODE Processes.” In.
Oreshkin, Boris N., Dmitri Carpov, Nicolas Chapados, and Yoshua Bengio. 2020. “N-BEATS: Neural Basis Expansion Analysis for Interpretable Time Series Forecasting.” arXiv:1905.10437 [Cs, Stat], February.
Palis, J. 1974. “Vector Fields Generate Few Diffeomorphisms.” Bulletin of the American Mathematical Society 80 (3): 503–5.
Peluchetti, Stefano, and Stefano Favaro. 2019. “Neural SDE - Information Propagation Through the Lens of Diffusion Processes.” In Workshop on Bayesian Deep Learning, 7.
———. 2020. “Infinitely Deep Neural Networks as Diffusion Processes.” In International Conference on Artificial Intelligence and Statistics, 1126–36. PMLR.
Pfau, David, and Danilo Rezende. 2020. “Integrable Nonparametric Flows.” In, 7.
Poli, Michael, Stefano Massaroli, Atsushi Yamashita, Hajime Asama, and Jinkyoo Park. 2020a. “Hypersolvers: Toward Fast Continuous-Depth Models.” In Advances in Neural Information Processing Systems. Vol. 33.
———. 2020b. “TorchDyn: A Neural Differential Equations Library.” arXiv:2009.09346 [Cs], September.
Rackauckas, Christopher. 2019. “The Essential Tools of Scientific Machine Learning (Scientific ML).” The Winnower.
Rackauckas, Christopher, Yingbo Ma, Vaibhav Dixit, Xingjian Guo, Mike Innes, Jarrett Revels, Joakim Nyberg, and Vijay Ivaturi. 2018. “A Comparison of Automatic Differentiation and Continuous Sensitivity Analysis for Derivatives of Differential Equation Solutions.” arXiv:1812.01892 [Cs], December.
Rackauckas, Christopher, Yingbo Ma, Julius Martensen, Collin Warner, Kirill Zubov, Rohit Supekar, Dominic Skinner, Ali Ramadhan, and Alan Edelman. 2020. “Universal Differential Equations for Scientific Machine Learning.” arXiv:2001.04385 [Cs, Math, q-Bio, Stat], August.
Revach, Guy, Nir Shlezinger, Ruud J. G. van Sloun, and Yonina C. Eldar. 2021. “KalmanNet: Data-Driven Kalman Filtering.” In ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 3905–9.
Roeder, Geoffrey, Paul K. Grant, Andrew Phillips, Neil Dalchau, and Edward Meeds. 2019. “Efficient Amortised Bayesian Inference for Hierarchical and Nonlinear Dynamical Systems.” arXiv:1905.12090 [Cs, Stat], May.
Ruthotto, Lars, and Eldad Haber. 2018. “Deep Neural Networks Motivated by Partial Differential Equations.” arXiv:1804.04272 [Cs, Math, Stat], April.
Saemundsson, Steindor, Alexander Terenin, Katja Hofmann, and Marc Peter Deisenroth. 2020. “Variational Integrator Networks for Physically Structured Embeddings.” arXiv:1910.09349 [Cs, Stat], March.
Sanchez-Gonzalez, Alvaro, Jonathan Godwin, Tobias Pfaff, Rex Ying, Jure Leskovec, and Peter W. Battaglia. 2020. “Learning to Simulate Complex Physics with Graph Networks.” In arXiv:2002.09405 [Physics, Stat].
Schmidt, Jonathan, Nicholas Krämer, and Philipp Hennig. 2021. “A Probabilistic State Space Model for Joint Inference from Differential Equations and Data.” arXiv:2103.10153 [Cs, Stat], June.
Shlezinger, Nir, Jay Whang, Yonina C. Eldar, and Alexandros G. Dimakis. 2020. “Model-Based Deep Learning.” arXiv:2012.08405 [Cs, Eess], December.
Şimşekli, Umut, Ozan Sener, George Deligiannidis, and Murat A. Erdogdu. 2020. “Hausdorff Dimension, Stochastic Differential Equations, and Generalization in Neural Networks.” CoRR abs/2006.09313.
Singh, Gautam, Jaesik Yoon, Youngsung Son, and Sungjin Ahn. 2019. “Sequential Neural Processes.” arXiv:1906.10264 [Cs, Stat], June.
Stapor, Paul, Fabian Fröhlich, and Jan Hasenauer. 2018. “Optimization and Uncertainty Analysis of ODE Models Using 2nd Order Adjoint Sensitivity Analysis.” bioRxiv, February, 272005.
Thuerey, Nils, Philipp Holl, Maximilian Mueller, Patrick Schnell, Felix Trost, and Kiwon Um. 2021. Physics-Based Deep Learning. WWW.
Tran, Alasdair, Alexander Mathews, Cheng Soon Ong, and Lexing Xie. 2021. “Radflow: A Recurrent, Aggregated, and Decomposable Model for Networks of Time Series.” In Proceedings of the Web Conference 2021, 730–42. Ljubljana Slovenia: ACM.
Tzen, Belinda, and Maxim Raginsky. 2019. “Neural Stochastic Differential Equations: Deep Latent Gaussian Models in the Diffusion Limit.” arXiv:1905.09883 [Cs, Stat], October.
Vorontsov, Eugene, Chiheb Trabelsi, Samuel Kadoury, and Chris Pal. 2017. “On Orthogonality and Learning Recurrent Networks with Long Term Dependencies.” In PMLR, 3570–78.
Wang, Chuang, Hong Hu, and Yue M. Lu. 2019. “A Solvable High-Dimensional Model of GAN.” arXiv:1805.08349 [Cond-Mat, Stat], October.
Wang, Rui, Robin Walters, and Rose Yu. 2022. “Data Augmentation Vs. Equivariant Networks: A Theory of Generalization on Dynamics Forecasting.” arXiv.
Wang, Sifan, Xinling Yu, and Paris Perdikaris. 2020. “When and Why PINNs Fail to Train: A Neural Tangent Kernel Perspective,” July.
Yang, Liu, Dongkun Zhang, and George Em Karniadakis. 2020. “Physics-Informed Generative Adversarial Networks for Stochastic Differential Equations.” SIAM Journal on Scientific Computing 42 (1): A292–317.
Yıldız, Çağatay, Markus Heinonen, and Harri Lähdesmäki. 2019. “ODE\(^2\)VAE: Deep Generative Second Order ODEs with Bayesian Neural Networks.” arXiv:1905.10994 [Cs, Stat], October.
Zammit-Mangion, Andrew, and Christopher K. Wikle. 2020. “Deep Integro-Difference Equation Models for Spatio-Temporal Forecasting.” Spatial Statistics 37 (June): 100408.
Zhang, Han, Xi Gao, Jacob Unterman, and Tom Arodz. 2020. “Approximation Capabilities of Neural ODEs and Invertible Residual Networks.” arXiv:1907.12998 [Cs, Stat], February.
Zhi, Weiming, Tin Lai, Lionel Ott, Edwin V. Bonilla, and Fabio Ramos. 2022. “Learning Efficient and Robust Ordinary Differential Equations via Invertible Neural Networks.” In International Conference on Machine Learning, 27060–74. PMLR.
