System identification in continuous time

Learning in continuous ODEs, SDEs and CDEs



Learning the parameters of a dynamical system in continuous time. I am imagining here that we are thinking about a parametric setting; learning some non-parametric approximation to the dynamics is a related but distinct problem.
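To fix notation (a generic formulation of my own, not taken from any one reference below): suppose the state evolves as $\dot{x}(t) = f_\theta(x(t), t)$ for some parameter vector $\theta$, and we observe noisy snapshots $y_i \approx x(t_i)$. A natural estimator is then

$$
\hat{\theta} = \operatorname*{arg\,min}_{\theta} \sum_{i} \bigl\| y_i - x_\theta(t_i) \bigr\|^2,
$$

where $x_\theta$ denotes the solution of the system under parameter $\theta$; much of what follows is some variation on making this optimisation (or a Bayesian analogue of it) tractable.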

Recursive estimation

See recursive identification for generic theory of learning under the distribution shift induced by a moving parameter vector.

Introductory reading

Rackauckas et al. (2018) is a good starting point, and there are even some tutorial implementations by the indefatigable Chris Rackauckas, plus a whole MIT course. Chris Rackauckas’ lecture notes christen this area “scientific machine learning”.

Learning stochastic partial differential equations, where a whole random field evolves in time, is of interest to me; see spatiotemporal nets and spatiotemporal dynamics for more on that theme.

In PDEs

See differentiable PDE solvers for now.

General SDEs

With sparse SDEs

For least-squares system identification see sparse stochastic processes.

Controlled differential equations

TBD

Method of adjoints

A trick in differentiation which happens to be useful for differentiating the likelihood (or other loss functions) of time-evolving systems; see e.g. Errico (1997).

For now, see the method of adjoints in the autodiff notebook.
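To give the flavour (a sketch of the standard continuous adjoint, as popularised in the neural-ODE setting by Chen et al. (2018); my notation, not that of any single reference here): for dynamics $\dot{x}(t) = f_\theta(x(t), t)$ and a scalar loss $L(x(T))$, define the adjoint state $a(t) = \partial L/\partial x(t)$. It obeys an ODE solved backwards in time,

$$
\dot{a}(t)^\top = -\,a(t)^\top \frac{\partial f_\theta}{\partial x}\bigl(x(t), t\bigr),
\qquad
a(T) = \frac{\partial L}{\partial x(T)},
$$

and the parameter gradient falls out as

$$
\frac{dL}{d\theta} = -\int_{T}^{0} a(t)^\top \frac{\partial f_\theta}{\partial \theta}\bigl(x(t), t\bigr)\,dt.
$$

One backwards solve thus yields the gradient with respect to every parameter at once, which is why this trick underlies so much of the neural differential equation machinery below.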

Tools

Python

Diffrax

Diffrax is a JAX-based library providing numerical differential equation solvers.

Features include:

  • ODE/SDE/CDE (ordinary/stochastic/controlled) solvers
  • lots of different solvers (including Tsit5, Dopri8, symplectic solvers, implicit solvers)
  • vmappable everything (including the region of integration)
  • using a PyTree as the state
  • dense solutions
  • multiple adjoint methods for backpropagation
  • support for neural differential equations.

From a technical point of view, the internal structure of the library is pretty cool — all kinds of equations (ODEs, SDEs, CDEs) are solved in a unified way (rather than being treated separately), producing a small tightly-written library.
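As a minimal sketch of how this looks for system identification, here is parameter fitting by differentiating a trajectory-matching loss straight through the solver. The toy decay model, the synthetic data and the training loop are my own illustration, not from the Diffrax docs:

```python
import jax
import jax.numpy as jnp
import diffrax

# Toy model: exponential decay dx/dt = -theta * x, with unknown rate theta.
def vector_field(t, y, theta):
    return -theta * y

def solve(theta, ts):
    """Integrate the ODE and return the state at the observation times ts."""
    sol = diffrax.diffeqsolve(
        diffrax.ODETerm(vector_field),
        diffrax.Tsit5(),
        t0=ts[0],
        t1=ts[-1],
        dt0=0.01,
        y0=jnp.array([1.0]),
        args=theta,
        saveat=diffrax.SaveAt(ts=ts),
    )
    return sol.ys

# Synthetic observations generated with "true" theta = 1.5.
ts = jnp.linspace(0.0, 2.0, 20)
observations = jnp.exp(-1.5 * ts)[:, None]

def loss(theta):
    return jnp.mean((solve(theta, ts) - observations) ** 2)

# diffeqsolve is just another JAX-transformable function, so we can take
# gradients through it and run plain gradient descent on theta.
grad_loss = jax.jit(jax.grad(loss))
theta = 1.0
for _ in range(200):
    theta = theta - 0.5 * grad_loss(theta)
```

The same pattern scales up: swap the hand-written vector field for a neural network and we are training a neural ODE, with the adjoint methods mentioned above controlling how the backward pass is computed.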

torchdyn (docs).

Julia

Chris Rackauckas is a veritable wizard with this stuff; read his blog.

Here is a tour of fun tricks with stochastic PDEs. There is a lot of tooling for this; DiffEqOperators … does something. DiffEqFlux (easy neural ODEs) works with Flux and claims to make neural SDEs simple.

+1 for Julia here.

References

Anderson, Brian D. O. 1982. “Reverse-Time Diffusion Equation Models.” Stochastic Processes and Their Applications 12 (3): 313–26.
Batz, Philipp, Andreas Ruttor, and Manfred Opper. 2017. “Approximate Bayes Learning of Stochastic Differential Equations.” arXiv:1702.05390 [Physics, Stat], February.
Baydin, Atilim Gunes, and Barak A. Pearlmutter. 2014. “Automatic Differentiation of Algorithms for Machine Learning.” arXiv:1404.7456 [Cs, Stat], April.
Beck, Christian, Weinan E, and Arnulf Jentzen. 2019. “Machine Learning Approximation Algorithms for High-Dimensional Fully Nonlinear Partial Differential Equations and Second-Order Backward Stochastic Differential Equations.” Journal of Nonlinear Science 29 (4): 1563–1619.
Chang, Bo, Minmin Chen, Eldad Haber, and Ed H. Chi. 2019. “AntisymmetricRNN: A Dynamical System View on Recurrent Neural Networks.” In Proceedings of ICLR.
Chen, Tian Qi, Yulia Rubanova, Jesse Bettencourt, and David K Duvenaud. 2018. “Neural Ordinary Differential Equations.” In Advances in Neural Information Processing Systems 31, edited by S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, and R. Garnett, 6572–83. Curran Associates, Inc.
Choromanski, Krzysztof, Jared Quincy Davis, Valerii Likhosherstov, Xingyou Song, Jean-Jacques Slotine, Jacob Varley, Honglak Lee, Adrian Weller, and Vikas Sindhwani. 2020. “An Ode to an ODE.” In Advances in Neural Information Processing Systems. Vol. 33.
Dandekar, Raj, Karen Chung, Vaibhav Dixit, Mohamed Tarek, Aslan Garcia-Valadez, Krishna Vishal Vemula, and Chris Rackauckas. 2021. “Bayesian Neural Ordinary Differential Equations.” arXiv:2012.07244 [Cs], March.
Delft, Anne van, and Michael Eichler. 2016. “Locally Stationary Functional Time Series.” arXiv:1602.05125 [Math, Stat], February.
Errico, Ronald M. 1997. “What Is an Adjoint Model?” Bulletin of the American Meteorological Society 78 (11): 2577–92.
Gholami, Amir, Kurt Keutzer, and George Biros. 2019. “ANODE: Unconditionally Accurate Memory-Efficient Gradients for Neural ODEs.” arXiv:1902.10298 [Cs], February.
Gierjatowicz, Patryk, Marc Sabate-Vidales, David Šiška, Lukasz Szpruch, and Žan Žurič. 2020. “Robust Pricing and Hedging via Neural SDEs.” arXiv:2007.04154 [Cs, q-Fin, Stat], July.
Grathwohl, Will, Ricky T. Q. Chen, Jesse Bettencourt, Ilya Sutskever, and David Duvenaud. 2018. “FFJORD: Free-Form Continuous Dynamics for Scalable Reversible Generative Models.” arXiv:1810.01367 [Cs, Stat], October.
Gu, Albert, Isys Johnson, Karan Goel, Khaled Saab, Tri Dao, Atri Rudra, and Christopher Ré. 2021. “Combining Recurrent, Convolutional, and Continuous-Time Models with Linear State Space Layers.” In Advances in Neural Information Processing Systems, 34:572–85. Curran Associates, Inc.
Hirsh, Seth M., David A. Barajas-Solano, and J. Nathan Kutz. 2022. “Sparsifying Priors for Bayesian Uncertainty Quantification in Model Discovery.” Royal Society Open Science 9 (2): 211823.
Jia, Junteng, and Austin R Benson. 2019. “Neural Jump Stochastic Differential Equations.” In Advances in Neural Information Processing Systems 32, edited by H. Wallach, H. Larochelle, A. Beygelzimer, F. d’Alché-Buc, E. Fox, and R. Garnett, 9847–58. Curran Associates, Inc.
Kelly, Jacob, Jesse Bettencourt, Matthew James Johnson, and David Duvenaud. 2020. “Learning Differential Equations That Are Easy to Solve.” In.
Kidger, Patrick, James Morrill, James Foster, and Terry Lyons. 2020. “Neural Controlled Differential Equations for Irregular Time Series.” arXiv:2005.08926 [Cs, Stat], November.
Li, Xuechen, Ting-Kam Leonard Wong, Ricky T. Q. Chen, and David Duvenaud. 2020. “Scalable Gradients for Stochastic Differential Equations.” In International Conference on Artificial Intelligence and Statistics, 3870–82. PMLR.
Li, Yang, and Jinqiao Duan. 2021. “Extracting Governing Laws from Sample Path Data of Non-Gaussian Stochastic Dynamical Systems.” arXiv:2107.10127 [Math, Stat], July.
Li, Yuhong, Tianle Cai, Yi Zhang, Deming Chen, and Debadeepta Dey. 2022. “What Makes Convolutional Models Great on Long Sequence Modeling?” arXiv.
Ljung, Lennart. 2010. “Perspectives on System Identification.” Annual Reviews in Control 34 (1): 1–12.
Lu, Peter Y., Joan Ariño, and Marin Soljačić. 2021. “Discovering Sparse Interpretable Dynamics from Partial Observations.” arXiv:2107.10879 [Physics], July.
Malartic, Quentin, Alban Farchi, and Marc Bocquet. 2021. “State, Global and Local Parameter Estimation Using Local Ensemble Kalman Filters: Applications to Online Machine Learning of Chaotic Dynamics.” arXiv:2107.11253 [Nlin, Physics:physics, Stat], July.
Marelli, D. 2007. “A Functional Analysis Approach to Subband System Approximation and Identification.” IEEE Transactions on Signal Processing 55 (2): 493–506.
Massaroli, Stefano, Michael Poli, Jinkyoo Park, Atsushi Yamashita, and Hajime Asama. 2020. “Dissecting Neural ODEs.” In arXiv:2002.08071 [Cs, Stat].
Morrill, James, Patrick Kidger, Cristopher Salvi, James Foster, and Terry Lyons. 2020. “Neural CDEs for Long Time Series via the Log-ODE Method.” In, 5.
Nabian, Mohammad Amin, and Hadi Meidani. 2019. “A Deep Learning Solution Approach for High-Dimensional Random Differential Equations.” Probabilistic Engineering Mechanics 57 (July): 14–25.
Nguyen, Long, and Andy Malinsky. 2020. “Exploration and Implementation of Neural Ordinary Differential Equations,” 34.
Pham, Tung, and Victor Panaretos. 2016. “Methodology and Convergence Rates for Functional Time Series Regression.” arXiv:1612.07197 [Math, Stat], December.
Pillonetto, Gianluigi. 2016. “The Interplay Between System Identification and Machine Learning.” arXiv:1612.09158 [Cs, Stat], December.
Rackauckas, Christopher. 2019. “The Essential Tools of Scientific Machine Learning (Scientific ML).”
Rackauckas, Christopher, Yingbo Ma, Vaibhav Dixit, Xingjian Guo, Mike Innes, Jarrett Revels, Joakim Nyberg, and Vijay Ivaturi. 2018. “A Comparison of Automatic Differentiation and Continuous Sensitivity Analysis for Derivatives of Differential Equation Solutions.” arXiv:1812.01892 [Cs], December.
Rackauckas, Christopher, Yingbo Ma, Julius Martensen, Collin Warner, Kirill Zubov, Rohit Supekar, Dominic Skinner, Ali Ramadhan, and Alan Edelman. 2020. “Universal Differential Equations for Scientific Machine Learning.” arXiv:2001.04385 [Cs, Math, q-Bio, Stat], August.
Ramsundar, Bharath, Dilip Krishnamurthy, and Venkatasubramanian Viswanathan. 2021. “Differentiable Physics: A Position Piece.” arXiv:2109.07573 [Physics], September.
Särkkä, Simo. 2011. “Linear Operators and Stochastic Partial Differential Equations in Gaussian Process Regression.” In Artificial Neural Networks and Machine Learning – ICANN 2011, edited by Timo Honkela, Włodzisław Duch, Mark Girolami, and Samuel Kaski, 6792:151–58. Lecture Notes in Computer Science. Berlin, Heidelberg: Springer.
Särkkä, Simo, and Arno Solin. 2019. Applied Stochastic Differential Equations. Institute of Mathematical Statistics Textbooks 10. Cambridge; New York, NY: Cambridge University Press.
Schmidt, Jonathan, Nicholas Krämer, and Philipp Hennig. 2021. “A Probabilistic State Space Model for Joint Inference from Differential Equations and Data.” arXiv:2103.10153 [Cs, Stat], June.
Solin, Arno. 2016. “Stochastic Differential Equation Methods for Spatio-Temporal Gaussian Process Regression.” Aalto University.
Song, Yang, Conor Durkan, Iain Murray, and Stefano Ermon. 2021. “Maximum Likelihood Training of Score-Based Diffusion Models.” In Advances in Neural Information Processing Systems.
Song, Yang, Jascha Sohl-Dickstein, Diederik P. Kingma, Abhishek Kumar, Stefano Ermon, and Ben Poole. 2022. “Score-Based Generative Modeling Through Stochastic Differential Equations.” In.
Um, Kiwon, Robert Brand, Yun Fei, Philipp Holl, and Nils Thuerey. 2021. “Solver-in-the-Loop: Learning from Differentiable Physics to Interact with Iterative PDE-Solvers.” arXiv:2007.00016 [Physics], January.
Um, Kiwon, and Philipp Holl. 2021. “Differentiable Physics for Improving the Accuracy of Iterative PDE-Solvers with Neural Networks.” In, 5.
Unser, Michael A., and Pouya Tafti. 2014. An Introduction to Sparse Stochastic Processes. New York: Cambridge University Press.
Wedig, W. 1984. “A Critical Review of Methods in Stochastic Structural Dynamics.” Nuclear Engineering and Design 79 (3): 281–87.
Werbos, Paul J. 1988. “Generalization of Backpropagation with Application to a Recurrent Gas Market Model.” Neural Networks 1 (4): 339–56.
Yıldız, Çağatay, Markus Heinonen, and Harri Lähdesmäki. 2019. “ODE²VAE: Deep Generative Second Order ODEs with Bayesian Neural Networks.” arXiv:1905.10994 [Cs, Stat], October.
Yoshida, Nakahiro. 2022. “Quasi-Likelihood Analysis and Its Applications.” Statistical Inference for Stochastic Processes 25 (1): 43–60.
Zhang, Dongkun, Ling Guo, and George Em Karniadakis. 2020. “Learning in Modal Space: Solving Time-Dependent Stochastic PDEs Using Physics-Informed Neural Networks.” SIAM Journal on Scientific Computing 42 (2): A639–65.
