Neural learning dynamical systems

2018-08-13 — 2023-05-22

Learning to approximate differential equations and other interpretable physical dynamics with neural nets. Related: analysing a neural net itself as a dynamical system, which is not quite the same but crosses over, and learning general recurrent dynamics. See also variational state filters. Where the parameters are physically meaningful rather than just weights, we tend to call the problem system identification.

A deterministic version of this problem is what, e.g., the famous Vector Institute Neural ODE paper (T. Q. Chen et al. 2018) tackled. Co-author Duvenaud argues that in some ways the hype ran away with the Neural ODE paper, and credits CasADi (Andersson et al. 2019) with the innovations.
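To make the deterministic setting concrete, here is a minimal sketch of the continuous adjoint gradient that the Neural ODE paper popularized. It is not the paper's implementation: the vector field is a hypothetical one-parameter stand-in (dz/dt = -theta * z) for a neural net, the forward trajectory is stored rather than re-solved backwards, and all function names are illustrative assumptions.

```python
import math

def f(z, theta):
    """Vector field dz/dt = f(z, theta); a one-parameter stand-in for a neural net."""
    return -theta * z

def df_dz(z, theta):
    return -theta

def df_dtheta(z, theta):
    return -z

def rk4_trajectory(z0, theta, T, n):
    """Integrate dz/dt = f from 0 to T with n fixed RK4 steps; return every state."""
    h = T / n
    zs = [z0]
    z = z0
    for _ in range(n):
        k1 = f(z, theta)
        k2 = f(z + 0.5 * h * k1, theta)
        k3 = f(z + 0.5 * h * k2, theta)
        k4 = f(z + h * k3, theta)
        z = z + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        zs.append(z)
    return zs

def adjoint_gradient(z0, theta, T, y, n=2000):
    """dL/dtheta for L = 0.5 * (z(T) - y)**2 via the continuous adjoint."""
    zs = rk4_trajectory(z0, theta, T, n)
    h = T / n
    a = zs[-1] - y  # adjoint at the final time: a(T) = dL/dz(T)
    grad = 0.0
    for i in range(n, 0, -1):
        # Euler step backwards in time for da/dt = -a * df/dz
        a_prev = a + h * a * df_dz(zs[i], theta)
        # Trapezoid rule for dL/dtheta = integral over [0, T] of a * df/dtheta
        grad += 0.5 * h * (a * df_dtheta(zs[i], theta)
                           + a_prev * df_dtheta(zs[i - 1], theta))
        a = a_prev
    return grad

def analytic_gradient(z0, theta, T, y):
    """Closed form for this toy problem, used only as a check."""
    zT = z0 * math.exp(-theta * T)
    return (zT - y) * (-T * zT)

def fit_theta(z0, T, y, theta=0.5, lr=0.5, steps=200):
    """Plain gradient descent on theta using the adjoint gradient."""
    for _ in range(steps):
        theta -= lr * adjoint_gradient(z0, theta, T, y, n=200)
    return theta
```

Calling `fit_theta(2.0, 1.0, 1.0)` fits the decay rate that carries z(0) = 2 to z(1) = 1. In a real neural ODE the scalar `theta` becomes the network weights and the partial derivatives come from automatic differentiation.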

Figure 2: Duvenaud’s NSDE

There are various laypersons’ introductions and tutorials in this area, including the simple and practical magical take in Julia. See also the CasADi example.

Learning an ODE, in particular a purely deterministic process, feels unsatisfying; we want a model which encodes responses to, and effects of, interactions. It is not ideal to have time series models which need to encode everything in an initial state.

Also, we would prefer models to be stochastic. Learnable SDEs are probably what we want. I’m particularly interested in jump ODE regression.
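As a rough illustration of what "learnable SDE" means in practice, the sketch below simulates dX = drift(X) dt + diffusion(X) dW with the Euler-Maruyama scheme. The noise is sampled up front and passed in, so the path is a deterministic function of the parameters and can be differentiated with respect to them. The Ornstein-Uhlenbeck drift, the finite-difference gradient, and all names here are illustrative assumptions, not taken from any particular library.

```python
import math
import random

def euler_maruyama(drift, diffusion, x0, dt, noise):
    """Simulate dX = drift(X) dt + diffusion(X) dW on a fixed grid.

    `noise` holds standard normal draws, one per step. Passing it in makes
    the path a deterministic function of the parameters and the noise.
    """
    xs = [x0]
    x = x0
    sqrt_dt = math.sqrt(dt)
    for eps in noise:
        x = x + drift(x) * dt + diffusion(x) * sqrt_dt * eps
        xs.append(x)
    return xs

def ou_path(theta, sigma, x0, dt, noise):
    """Ornstein-Uhlenbeck example: drift -theta * x, constant diffusion sigma."""
    return euler_maruyama(lambda x: -theta * x, lambda x: sigma, x0, dt, noise)

def pathwise_grad_theta(theta, sigma, x0, dt, noise, delta=1e-5):
    """Central finite difference of the terminal state w.r.t. theta,
    holding the noise fixed (a stand-in for automatic differentiation)."""
    up = ou_path(theta + delta, sigma, x0, dt, noise)[-1]
    down = ou_path(theta - delta, sigma, x0, dt, noise)[-1]
    return (up - down) / (2 * delta)

rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(1000)]
path = ou_path(theta=1.0, sigma=0.5, x0=2.0, dt=0.01, noise=noise)
grad = pathwise_grad_theta(1.0, 0.5, 2.0, 0.01, noise)
```

Setting `sigma` to zero recovers the explicit Euler solution of the corresponding ODE, which is a handy sanity check. Jump terms could be added as an extra increment per step, but that is left out here.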

Homework: Duvenaud again, tweeting some explanatory animations.

Note connection to reparameterization tricks, in that neural ODEs give you cheap differentiable reparameterizations.
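One way to make this connection explicit is the instantaneous change of variables result from T. Q. Chen et al. (2018). The notation below is assumed for illustration: $z(t)$ is the state, $f$ the learned vector field, and $p$ the density of $z(t)$.

```latex
\frac{\mathrm{d}z(t)}{\mathrm{d}t} = f(z(t), t)
\quad\Longrightarrow\quad
\frac{\partial \log p(z(t))}{\partial t}
  = -\operatorname{tr}\!\left(\frac{\partial f}{\partial z(t)}\right),
\qquad
\log p(z(t_1)) = \log p(z(t_0))
  - \int_{t_0}^{t_1} \operatorname{tr}\!\left(\frac{\partial f}{\partial z(t)}\right)\mathrm{d}t .
```

The density update needs only a trace rather than the log-determinant of a Jacobian, which is roughly why the flow gives a cheap, differentiable reparameterization of a base distribution.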

Gu, Johnson, et al. (2021) unify neural ODEs with RNNs.

1 Questions

How do you do ensemble training for posterior predictives in neural ODEs (NODEs)? How do you guarantee stability in the learned dynamics?

2 Recursive estimation

See recursive identification for generic theory of learning under the distribution shift induced by a moving parameter vector.

3 S4

Interesting package of tools from Christopher Ré’s lab, at the intersection of recurrent networks and linear feedback systems. See HazyResearch/state-spaces: Sequence Modeling with Structured State Spaces. I find these aesthetically satisfying, because I spent 2 years of my PhD trying to solve the same problem, and failed. These folks did a better job, so I find it slightly validating that the idea was not stupid.

See Recurrent/convolutional/state-space.
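The link between recurrent networks and linear systems can be sketched with a scalar linear state-space model x' = a x + b u, y = c x, discretized by the bilinear transform. This is a toy under strong assumptions: the state is one-dimensional, the parameters are fixed rather than learned, and the function names are hypothetical. The actual S4 models use structured matrix-valued states and a much faster kernel computation.

```python
def discretize_bilinear(a, b, step):
    """Bilinear (Tustin) discretization of the scalar system x' = a x + b u."""
    denom = 1.0 - 0.5 * step * a
    a_bar = (1.0 + 0.5 * step * a) / denom
    b_bar = step * b / denom
    return a_bar, b_bar

def ssm_recurrent(a_bar, b_bar, c, inputs):
    """Run the model as a linear RNN: x_k = a_bar x_{k-1} + b_bar u_k, y_k = c x_k."""
    x = 0.0
    ys = []
    for u in inputs:
        x = a_bar * x + b_bar * u
        ys.append(c * x)
    return ys

def ssm_kernel(a_bar, b_bar, c, length):
    """Impulse response K_j = c * a_bar**j * b_bar of the same model."""
    return [c * a_bar ** j * b_bar for j in range(length)]

def ssm_convolutional(a_bar, b_bar, c, inputs):
    """Run the model as a causal convolution of the inputs with the kernel."""
    kernel = ssm_kernel(a_bar, b_bar, c, len(inputs))
    return [
        sum(kernel[j] * inputs[k - j] for j in range(k + 1))
        for k in range(len(inputs))
    ]

a_bar, b_bar = discretize_bilinear(a=-1.0, b=1.0, step=0.1)
signal = [1.0, 0.0, 0.5, -0.25, 2.0, 0.0, 0.0, 1.0]
y_rnn = ssm_recurrent(a_bar, b_bar, 0.5, signal)
y_conv = ssm_convolutional(a_bar, b_bar, 0.5, signal)
```

The two outputs agree up to floating-point error. That is the appeal of this model class: the recurrent view gives cheap step-by-step inference, while the convolutional view gives parallel training over a whole sequence.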

4 Incoming

google-research/torchsde: Differentiable SDE solvers with GPU support and efficient sensitivity analysis. (Kidger et al. 2021; X. Li et al. 2020)
Patrick Kidger’s thesis is the current canonical textbook on ODE learning (Kidger 2022).
Corenflos et al. (2021) describe a differentiable particle filter based on entropy-regularized optimal transport.
Campbell et al. (2021) describe variational inference that factors out the unknown parameters.

5 References

Andersson, Gillis, Horn, et al. 2019. “CasADi: A Software Framework for Nonlinear Optimization and Optimal Control.” Mathematical Programming Computation.

Anil, Lucas, and Grosse. 2018. “Sorting Out Lipschitz Function Approximation.”

Arridge, Maass, Öktem, et al. 2019. “Solving Inverse Problems Using Data-Driven Models.” Acta Numerica.

Babtie, Kirk, and Stumpf. 2014. “Topological Sensitivity Analysis for Systems Biology.” Proceedings of the National Academy of Sciences.

Bachouch, Huré, Langrené, et al. 2020. “Deep Neural Networks Algorithms for Stochastic Control Problems on Finite Horizon: Numerical Applications.” arXiv:1812.05916 [Math, q-Fin, Stat].

Brunton, Brunton, Proctor, et al. 2016. “Koopman Invariant Subspaces and Finite Linear Representations of Nonlinear Dynamical Systems for Control.” PLOS ONE.

Campbell, Shi, Rainforth, et al. 2021. “Online Variational Filtering and Parameter Learning.” In.

Chang, Chen, Haber, et al. 2019. “AntisymmetricRNN: A Dynamical System View on Recurrent Neural Networks.” In Proceedings of ICLR.

Chang, Meng, Haber, et al. 2018. “Multi-Level Residual Networks from Dynamical Systems View.” In Proceedings of ICLR.

Chen, Ricky T. Q., and Duvenaud. 2019. “Neural Networks with Cheap Differential Operators.” In Advances in Neural Information Processing Systems.

Chen, Boyuan, Huang, Raghupathi, et al. 2022. “Automated Discovery of Fundamental Variables Hidden in Experimental Data.” Nature Computational Science.

Chen, Tian Qi, Rubanova, Bettencourt, et al. 2018. “Neural Ordinary Differential Equations.” In Advances in Neural Information Processing Systems 31.

Choromanski, Davis, Likhosherstov, et al. 2020. “An Ode to an ODE.” In Advances in Neural Information Processing Systems.

Corenflos, Thornton, Deligiannidis, et al. 2021. “Differentiable Particle Filtering via Entropy-Regularized Optimal Transport.” arXiv:2102.07850 [Cs, Stat].

Course, Evans, and Nair. 2020. “Weak Form Generalized Hamiltonian Learning.” In Advances in Neural Information Processing Systems.

de Brouwer, Simm, Arany, et al. 2019. “GRU-ODE-Bayes: Continuous Modeling of Sporadically-Observed Time Series.” In Advances in Neural Information Processing Systems.

Dupont, Doucet, and Teh. 2019. “Augmented Neural ODEs.” arXiv:1904.01681 [Cs, Stat].

E. 2017. “A Proposal on Machine Learning via Dynamical Systems.” Communications in Mathematics and Statistics.

———. 2021. “The Dawning of a New Era in Applied Mathematics.” Notices of the American Mathematical Society.

Eguchi, and Uehara. n.d. “Schwartz-Type Model Selection for Ergodic Stochastic Differential Equation Models.” Scandinavian Journal of Statistics.

E, Han, and Li. 2018. “A Mean-Field Optimal Control Formulation of Deep Learning.” arXiv:1807.01083 [Cs, Math].

Finlay, Jacobsen, Nurbekyan, et al. n.d. “How to Train Your Neural ODE: The World of Jacobian and Kinetic Regularization.” In ICML.

Finzi, Wang, and Wilson. 2020. “Simplifying Hamiltonian and Lagrangian Neural Networks via Explicit Constraints.” In Advances in Neural Information Processing Systems.

Garnelo, Rosenbaum, Maddison, et al. 2018. “Conditional Neural Processes.” arXiv:1807.01613 [Cs, Stat].

Garnelo, Schwarz, Rosenbaum, et al. 2018. “Neural Processes.”

Gholami, Keutzer, and Biros. 2019. “ANODE: Unconditionally Accurate Memory-Efficient Gradients for Neural ODEs.” arXiv:1902.10298 [Cs].

Ghosh, Behl, Dupont, et al. 2020. “STEER: Simple Temporal Regularization For Neural ODE.” In Advances in Neural Information Processing Systems.

Gierjatowicz, Sabate-Vidales, Šiška, et al. 2020. “Robust Pricing and Hedging via Neural SDEs.” arXiv:2007.04154 [Cs, q-Fin, Stat].

Gilpin. 2023. “Model Scale Versus Domain Knowledge in Statistical Forecasting of Chaotic Systems.” Physical Review Research.

Grathwohl, Chen, Bettencourt, et al. 2018. “FFJORD: Free-Form Continuous Dynamics for Scalable Reversible Generative Models.” arXiv:1810.01367 [Cs, Stat].

Gu, Goel, and Ré. 2021. “Efficiently Modeling Long Sequences with Structured State Spaces.”

Gu, Johnson, Goel, et al. 2021. “Combining Recurrent, Convolutional, and Continuous-Time Models with Linear State Space Layers.” In Advances in Neural Information Processing Systems.

Haber, Lucka, and Ruthotto. 2018. “Never Look Back - A Modified EnKF Method and Its Application to the Training of Neural Networks Without Back Propagation.” arXiv:1805.08034 [Cs, Math].

Han, Jentzen, and E. 2018. “Solving High-Dimensional Partial Differential Equations Using Deep Learning.” Proceedings of the National Academy of Sciences.

Haro. 2008. “Automatic Differentiation Methods in Computational Dynamical Systems: Invariant Manifolds and Normal Forms of Vector Fields at Fixed Points.” IMA Note.

Hasani, Lechner, Amini, et al. 2020. “Liquid Time-Constant Networks.” arXiv:2006.04439 [Cs, Stat].

Hasani, Lechner, Amini, et al. 2022. “Closed-Form Continuous-Time Neural Networks.” Nature Machine Intelligence.

He, Spokoyny, Neubig, et al. 2019. “Lagging Inference Networks and Posterior Collapse in Variational Autoencoders.” In Proceedings of ICLR.

Holzschuh, Vegetti, and Thuerey. 2022. “Score Matching via Differentiable Physics.”

Huh, Yang, Hwang, et al. 2020. “Time-Reversal Symmetric ODE Network.” In Advances in Neural Information Processing Systems.

Huré, Pham, Bachouch, et al. 2018. “Deep Neural Networks Algorithms for Stochastic Control Problems on Finite Horizon, Part I: Convergence Analysis.” arXiv:1812.04300 [Math, Stat].

Jia, and Benson. 2019. “Neural Jump Stochastic Differential Equations.” In Advances in Neural Information Processing Systems 32.

Kaul. 2020. “Linear Dynamical Systems as a Core Computational Primitive.” In Advances in Neural Information Processing Systems.

Kelly, Bettencourt, Johnson, et al. 2020. “Learning Differential Equations That Are Easy to Solve.” In.

Kidger. 2022. “On Neural Differential Equations.”

Kidger, Chen, and Lyons. 2021. “‘Hey, That’s Not an ODE’: Faster ODE Adjoints via Seminorms.” In Proceedings of the 38th International Conference on Machine Learning.

Kidger, Foster, Li, et al. 2021. “Neural SDEs as Infinite-Dimensional GANs.” In Proceedings of the 38th International Conference on Machine Learning.

Kidger, Morrill, Foster, et al. 2020. “Neural Controlled Differential Equations for Irregular Time Series.” arXiv:2005.08926 [Cs, Stat].

Kochkov, Sanchez-Gonzalez, Smith, et al. 2020. “Learning Latent Field Dynamics of PDEs.” In Machine Learning and the Physical Sciences Workshop at the 33rd Conference on Neural Information Processing Systems (NeurIPS).

Kolter, and Manek. 2019. “Learning Stable Deep Dynamics Models.” In Advances in Neural Information Processing Systems.

Krishnamurthy, Can, and Schwab. 2022. “Theory of Gating in Recurrent Neural Networks.” Physical Review X.

Laurent, and von Brecht. 2016. “A Recurrent Neural Network Without Chaos.” arXiv:1612.06212 [Cs].

Lawrence, Loewen, Forbes, et al. 2020. “Almost Surely Stable Deep Dynamics.” In Advances in Neural Information Processing Systems.

Li, Yuhong, Cai, Zhang, et al. 2022. “What Makes Convolutional Models Great on Long Sequence Modeling?”

Li, Xuechen, Wong, Chen, et al. 2020. “Scalable Gradients for Stochastic Differential Equations.” In International Conference on Artificial Intelligence and Statistics.

Louizos, Shi, Schutte, et al. 2019. “The Functional Neural Process.” In Advances in Neural Information Processing Systems.

Lou, Lim, Katsman, et al. 2020. “Neural Manifold Ordinary Differential Equations.” In Advances in Neural Information Processing Systems.

Lu, Lu, Jin, and Karniadakis. 2020. “DeepONet: Learning Nonlinear Operators for Identifying Differential Equations Based on the Universal Approximation Theorem of Operators.” arXiv:1910.03193 [Cs, Stat].

Lu, Yulong, and Lu. 2020. “A Universal Approximation Theorem of Deep Neural Networks for Expressing Probability Distributions.” In Advances in Neural Information Processing Systems.

Lusch, Kutz, and Brunton. 2018. “Deep Learning for Universal Linear Embeddings of Nonlinear Dynamics.” Nature Communications.

Massaroli, Poli, Bin, et al. 2020. “Stable Neural Flows.” arXiv:2003.08063 [Cs, Math, Stat].

Massaroli, Poli, Park, et al. 2020. “Dissecting Neural ODEs.” arXiv:2002.08071 [Cs, Stat].

Mhammedi, Hellicar, Rahman, et al. 2017. “Efficient Orthogonal Parametrisation of Recurrent Neural Networks Using Householder Reflections.” In PMLR.

Mishler, and Kennedy. 2021. “FADE: FAir Double Ensemble Learning for Observable and Counterfactual Outcomes.” arXiv:2109.00173 [Cs, Stat].

Morrill, Kidger, Salvi, et al. 2020. “Neural CDEs for Long Time Series via the Log-ODE Method.” In.

Nguyen, and Malinsky. 2020. “Exploration and Implementation of Neural Ordinary Differential Equations.”

Niu, Horesh, and Chuang. 2019. “Recurrent Neural Networks in the Eye of Differential Equations.” arXiv:1904.12933 [Quant-Ph, Stat].

Norcliffe, Bodnar, Day, et al. 2020. “Neural ODE Processes.” In.

Oreshkin, Carpov, Chapados, et al. 2020. “N-BEATS: Neural Basis Expansion Analysis for Interpretable Time Series Forecasting.” arXiv:1905.10437 [Cs, Stat].

Ott, Katiyar, Hennig, et al. 2020. “ResNet After All: Neural ODEs and Their Numerical Solution.” In.

Palis. 1974. “Vector Fields Generate Few Diffeomorphisms.” Bulletin of the American Mathematical Society.

Peluchetti, and Favaro. 2019. “Neural SDE - Information Propagation Through the Lens of Diffusion Processes.” In Workshop on Bayesian Deep Learning.

———. 2020. “Infinitely Deep Neural Networks as Diffusion Processes.” In International Conference on Artificial Intelligence and Statistics.

Pfau, and Rezende. 2020. “Integrable Nonparametric Flows.” In.

Poli, Massaroli, Yamashita, et al. 2020a. “Hypersolvers: Toward Fast Continuous-Depth Models.” In Advances in Neural Information Processing Systems.

———, et al. 2020b. “TorchDyn: A Neural Differential Equations Library.” arXiv:2009.09346 [Cs].

Rackauckas. 2019. “The Essential Tools of Scientific Machine Learning (Scientific ML).”

Rackauckas, Ma, Dixit, et al. 2018. “A Comparison of Automatic Differentiation and Continuous Sensitivity Analysis for Derivatives of Differential Equation Solutions.” arXiv:1812.01892 [Cs].

Rackauckas, Ma, Martensen, et al. 2020. “Universal Differential Equations for Scientific Machine Learning.” arXiv.org.

Ray, Pinti, and Oberai. 2023. “Deep Learning and Computational Physics (Lecture Notes).”

Revach, Shlezinger, van Sloun, et al. 2021. “KalmanNet: Data-Driven Kalman Filtering.” In ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

Roeder, Grant, Phillips, et al. 2019. “Efficient Amortised Bayesian Inference for Hierarchical and Nonlinear Dynamical Systems.” arXiv:1905.12090 [Cs, Stat].

Ruthotto, and Haber. 2020. “Deep Neural Networks Motivated by Partial Differential Equations.” Journal of Mathematical Imaging and Vision.

Saemundsson, Terenin, Hofmann, et al. 2020. “Variational Integrator Networks for Physically Structured Embeddings.” arXiv:1910.09349 [Cs, Stat].

Sanchez-Gonzalez, Godwin, Pfaff, et al. 2020. “Learning to Simulate Complex Physics with Graph Networks.” In Proceedings of the 37th International Conference on Machine Learning.

Schirmer, Eltayeb, Lessmann, et al. 2022. “Modeling Irregular Time Series with Continuous Recurrent Units.”

Schmidt, Krämer, and Hennig. 2021. “A Probabilistic State Space Model for Joint Inference from Differential Equations and Data.” arXiv:2103.10153 [Cs, Stat].

Schotthöfer, Zangrando, Kusch, et al. 2022. “Low-Rank Lottery Tickets: Finding Efficient Low-Rank Neural Networks via Matrix Differential Equations.”

Shlezinger, Whang, Eldar, et al. 2020. “Model-Based Deep Learning.” arXiv:2012.08405 [Cs, Eess].

Sholokhov, Liu, Mansour, et al. 2023. “Physics-Informed Neural ODE (PINODE): Embedding Physics into Models Using Collocation Points.” Scientific Reports.

Simchoni, and Rosset. 2023. “Integrating Random Effects in Deep Neural Networks.”

Şimşekli, Sener, Deligiannidis, et al. 2020. “Hausdorff Dimension, Stochastic Differential Equations, and Generalization in Neural Networks.” CoRR.

Singh, Yoon, Son, et al. 2019. “Sequential Neural Processes.” arXiv:1906.10264 [Cs, Stat].

Stapor, Fröhlich, and Hasenauer. 2018. “Optimization and Uncertainty Analysis of ODE Models Using 2nd Order Adjoint Sensitivity Analysis.” bioRxiv.

Thuerey, Holl, Mueller, et al. 2021. Physics-Based Deep Learning.

Tran, Mathews, Ong, et al. 2021. “Radflow: A Recurrent, Aggregated, and Decomposable Model for Networks of Time Series.” In Proceedings of the Web Conference 2021.

Tzen, and Raginsky. 2019a. “Theoretical Guarantees for Sampling and Inference in Generative Models with Latent Diffusions.” In Proceedings of the Thirty-Second Conference on Learning Theory.

———. 2019b. “Neural Stochastic Differential Equations: Deep Latent Gaussian Models in the Diffusion Limit.”

Vardasbi, Pires, Schmidt, et al. 2023. “State Spaces Aren’t Enough: Machine Translation Needs Attention.”

Vorontsov, Trabelsi, Kadoury, et al. 2017. “On Orthogonality and Learning Recurrent Networks with Long Term Dependencies.” In PMLR.

Wang, Chuang, Hu, and Lu. 2019. “A Solvable High-Dimensional Model of GAN.” arXiv:1805.08349 [Cond-Mat, Stat].

Wang, Rui, Walters, and Yu. 2022. “Data Augmentation Vs. Equivariant Networks: A Theory of Generalization on Dynamics Forecasting.”

Wang, Sifan, Yu, and Perdikaris. 2020. “When and Why PINNs Fail to Train: A Neural Tangent Kernel Perspective.”

Yang, Zhang, and Karniadakis. 2020. “Physics-Informed Generative Adversarial Networks for Stochastic Differential Equations.” SIAM Journal on Scientific Computing.

Yıldız, Heinonen, and Lähdesmäki. 2019. “ODE²VAE: Deep Generative Second Order ODEs with Bayesian Neural Networks.” arXiv:1905.10994 [Cs, Stat].

Zammit-Mangion, and Wikle. 2020. “Deep Integro-Difference Equation Models for Spatio-Temporal Forecasting.” Spatial Statistics.

Zhang, Gao, Unterman, et al. 2020. “Approximation Capabilities of Neural ODEs and Invertible Residual Networks.” arXiv:1907.12998 [Cs, Stat].

Zhi, Lai, Ott, et al. 2022. “Learning Efficient and Robust Ordinary Differential Equations via Invertible Neural Networks.” In International Conference on Machine Learning.