# Feedback system identification, not necessarily linear

Learning dynamics from data

August 1, 2016 — November 15, 2023

**The order in which this is presented right now makes no sense.**

If I have a system whose future evolution is important to predict, why not try to infer a plausible model instead of a convenient linear one?

To reconstruct the unobserved state, as opposed to the parameters of the process acting upon the state, we do state filtering. These two steps can interact if we are doing simulation-based online parameter inference, as in recursive estimation. (Where exactly is the division between the two?) Alternatively, we might decide the state is unimportant and attempt to estimate only the evolution of the observations. That is the Koopman operator trick.
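As a rough illustration of estimating the evolution of the observations directly, here is a minimal sketch of dynamic mode decomposition, which is commonly described as a finite-dimensional approximation to the Koopman operator. Everything in it is an assumption made for illustration: the function name `fit_dmd_operator`, the toy damped-rotation system, and the choice to treat the observed vector as evolving linearly.

```python
import numpy as np


def fit_dmd_operator(observations):
    """Least-squares linear map A such that observations[t + 1] ≈ A @ observations[t].

    `observations` has shape (n_times, n_dims).
    """
    past = observations[:-1].T    # shape (n_dims, n_times - 1)
    future = observations[1:].T   # shape (n_dims, n_times - 1)
    return future @ np.linalg.pinv(past)


# Hypothetical toy system: a noiseless damped rotation in the plane.
angle, decay = 0.3, 0.95
true_map = decay * np.array([
    [np.cos(angle), -np.sin(angle)],
    [np.sin(angle), np.cos(angle)],
])

trajectory = [np.array([1.0, 0.0])]
for _ in range(50):
    trajectory.append(true_map @ trajectory[-1])
trajectory = np.array(trajectory)

estimated_map = fit_dmd_operator(trajectory)
# The eigenvalue moduli of the fitted map summarise the decay rate of the observations.
eigenvalue_moduli = np.abs(np.linalg.eigvals(estimated_map))
```

In this noiseless toy case the fitted map should match the generating one; with noisy or nonlinear observations we would typically lift the data into a richer dictionary of functions first.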

A compact overview is inserted incidentally in Cosma Shalizi’s review of Fan and Yao (2003), wherein he also recommends Bosq and Blanke (2007), Bosq (1998), and Taniguchi and Kakizawa (2000).

There are many methods. From an engineering/control perspective, we have Brunton, Proctor, and Kutz (2016), which generalises the process for linear time series to a sparse regression version. Other options are indirect inference, or recursive hierarchical generalised linear models, which are an obvious way to generalise linear systems in the same way that GLMs generalise linear models. Kitagawa and Gersch (1996) is popular in a Bayesian context.
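To make the sparse regression idea concrete, here is a minimal sketch using sequentially thresholded least squares over a library of candidate terms. It is not a reproduction of any particular paper's code: the polynomial library, the threshold, the function names, and the logistic-map example are all assumptions, and it regresses a discrete-time map rather than estimated derivatives to keep things short.

```python
import numpy as np


def polynomial_library(x):
    """Candidate terms [1, x, x^2, x^3] evaluated on a scalar series x."""
    return np.column_stack([np.ones_like(x), x, x**2, x**3])


def sparse_regression(library, target, threshold=0.1, n_iterations=10):
    """Sequentially thresholded least squares.

    Repeatedly zero out coefficients smaller than `threshold` in magnitude,
    then refit the remaining ones by ordinary least squares.
    """
    coefficients = np.linalg.lstsq(library, target, rcond=None)[0]
    for _ in range(n_iterations):
        small = np.abs(coefficients) < threshold
        coefficients[small] = 0.0
        active = ~small
        if not active.any():
            break
        coefficients[active] = np.linalg.lstsq(
            library[:, active], target, rcond=None
        )[0]
    return coefficients


# Hypothetical data: the logistic map x[t + 1] = r * x[t] * (1 - x[t]).
growth_rate = 3.7
series = np.empty(200)
series[0] = 0.2
for t in range(len(series) - 1):
    series[t + 1] = growth_rate * series[t] * (1.0 - series[t])

coefficients_hat = sparse_regression(polynomial_library(series[:-1]), series[1:])
```

On this noiseless example the only surviving terms should be the linear and quadratic ones. With noisy data the threshold becomes a tuning parameter that trades sparsity against fit.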

Hefny, Downey, and Gordon (2015):

> We address […] these problems with a new view of predictive state methods for dynamical system learning. In this view, a dynamical system learning problem is reduced to a sequence of supervised learning problems. So, we can directly apply the rich literature on supervised learning methods to incorporate many types of prior knowledge about problem structure. We give a general convergence rate analysis that allows a high degree of flexibility in designing estimators. And finally, implementing a new estimator becomes as simple as rearranging our data and calling the appropriate supervised learning subroutines.
>
> […] More specifically, our contribution is to show that we can use much-more-general supervised learning algorithms in place of linear regression, and still get a meaningful theoretical analysis. In more detail:
>
> - we point out that we can equally well use any well-behaved supervised learning algorithm in place of linear regression in the first stage of instrumental-variable regression;
> - for the second stage of instrumental-variable regression, we generalize ordinary linear regression to its RKHS counterpart;
> - we analyze the resulting combination, and show that we get convergence to the correct answer, with a rate that depends on how quickly the individual supervised learners converge.

State filters are cool for estimating time-varying hidden states given known fixed system parameters. How about learning those parameters of the model generating your states? Classic ways that you can do this in dynamical systems include basic linear system identification, and general system identification. But can you identify the fixed parameters (not just hidden states) with a state filter?

Yes. This is called recursive estimation.
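A minimal sketch of the idea, under strong simplifying assumptions: take a scalar AR(1) series that is observed directly, so the only unknown is the coefficient, and treat that coefficient as a static "state" tracked by a Kalman filter whose observation matrix at each step is the previous observation. The function name, the Gaussian prior, and the known noise variance are assumptions chosen for illustration.

```python
import numpy as np


def recursive_ar1_estimate(series, noise_variance=1.0,
                           prior_mean=0.0, prior_variance=10.0):
    """Kalman filter on the coefficient a of y[t] = a * y[t - 1] + noise.

    The parameter is treated as a state with trivial dynamics a[t] = a[t - 1],
    so the predict step leaves the mean and variance unchanged.
    """
    mean, variance = prior_mean, prior_variance
    history = []
    for previous, current in zip(series[:-1], series[1:]):
        # Update step: the "observation matrix" is the previous observation.
        innovation = current - previous * mean
        innovation_variance = previous**2 * variance + noise_variance
        gain = variance * previous / innovation_variance
        mean = mean + gain * innovation
        variance = (1.0 - gain * previous) * variance
        history.append(mean)
    return np.array(history), variance


# Hypothetical data from an AR(1) process with unit-variance noise.
rng = np.random.default_rng(0)
true_coefficient = 0.8
series = np.zeros(2000)
for t in range(1, len(series)):
    series[t] = true_coefficient * series[t - 1] + rng.normal()

estimates, final_variance = recursive_ar1_estimate(series)
```

Because the model is linear in the parameter, this coincides with recursive least squares. When the state is also hidden, the parameter and state have to be filtered jointly, which is generally nonlinear and needs something like an extended or particle filter.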

### 0.1 Basic Construction

There are a few variations. We start with the basic continuous time state space model.

Here we have an unobserved Markov state process \(x(t)\) on \(\mathcal{X}\) and an observation process \(y(t)\) on \(\mathcal{Y}\). For now they will be assumed to be finite-dimensional vectors over \(\mathbb{R}.\) They will additionally depend upon a vector of parameters \(\theta.\) We observe the process at discrete times \(t(1:T)=(t_1, t_2,\dots, t_T),\) and we write the observations \(y(1:T)=(y(t_1), y(t_2),\dots, y(t_T)).\)

We presume our processes are completely specified by the following conditional densities (which might not have closed-form expressions):

The transition density

\[f(x(t_i)|x(t_{i-1}), \theta)\]

The observation density…

TBC.

## 1 Method of adjoints

A trick in differentiation that happens to be useful for differentiating the likelihood (or other functions) of time-evolving systems using automatic differentiation; see e.g. Errico (1997).
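To show the mechanics, here is a minimal hand-written sketch of a discrete adjoint for a toy problem: an explicit Euler discretisation of \(\dot{x} = -\theta x\) with a squared-error loss against data. The model, the loss, and all function names are assumptions for illustration. The backward loop is what reverse-mode automatic differentiation would do for us automatically.

```python
import numpy as np


def simulate(theta, initial_state, n_steps, step_size):
    """Explicit Euler steps x[t + 1] = x[t] + step_size * (-theta * x[t])."""
    states = np.empty(n_steps + 1)
    states[0] = initial_state
    for t in range(n_steps):
        states[t + 1] = states[t] + step_size * (-theta * states[t])
    return states


def loss(theta, data, initial_state, step_size):
    """Half the sum of squared residuals between simulated states and data."""
    states = simulate(theta, initial_state, len(data) - 1, step_size)
    return 0.5 * np.sum((states - data) ** 2)


def adjoint_gradient(theta, data, initial_state, step_size):
    """Gradient of the loss with respect to theta via one backward pass."""
    n_steps = len(data) - 1
    states = simulate(theta, initial_state, n_steps, step_size)
    residuals = states - data
    adjoint = residuals[-1]  # total derivative of the loss w.r.t. the final state
    gradient = 0.0
    for t in range(n_steps - 1, -1, -1):
        # Partial derivative of x[t + 1] w.r.t. theta is -step_size * x[t].
        gradient += adjoint * (-step_size * states[t])
        # Partial derivative of x[t + 1] w.r.t. x[t] is 1 - step_size * theta.
        adjoint = residuals[t] + (1.0 - step_size * theta) * adjoint
    return gradient


# Hypothetical data generated at theta = 1.5; gradient evaluated at theta = 1.0.
data = simulate(1.5, 1.0, 50, 0.05)
gradient_value = adjoint_gradient(1.0, data, 1.0, 0.05)
```

The cost of the backward pass is comparable to one forward simulation regardless of how many parameters there are, which is the main attraction over finite differences.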

## 2 In particle filters

## 3 Indirect inference

The simulator is a black box and we have access only to its inputs and outputs. This is a popular setting; see simulation-based inference.
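A minimal sketch of the recipe, with everything chosen for illustration: the "black box" is a hypothetical MA(1) simulator, the auxiliary model is an AR(1) regression whose coefficient is easy to compute, and the search is a plain grid search that reuses the same random seed for every candidate parameter.

```python
import numpy as np


def simulate_ma1(theta, n_samples, seed):
    """Hypothetical black-box simulator: y[t] = e[t] + theta * e[t - 1]."""
    rng = np.random.default_rng(seed)
    shocks = rng.normal(size=n_samples + 1)
    return shocks[1:] + theta * shocks[:-1]


def auxiliary_statistic(series):
    """Least-squares AR(1) coefficient, used as a cheap auxiliary model."""
    return np.dot(series[1:], series[:-1]) / np.dot(series[:-1], series[:-1])


def indirect_inference(observed, candidate_thetas, n_simulated=20000, seed=1):
    """Pick the candidate whose simulated auxiliary statistic best matches the data."""
    target = auxiliary_statistic(observed)
    distances = [
        (auxiliary_statistic(simulate_ma1(theta, n_simulated, seed)) - target) ** 2
        for theta in candidate_thetas
    ]
    return candidate_thetas[int(np.argmin(distances))]


# Hypothetical "observed" data generated at theta = 0.5.
observed = simulate_ma1(0.5, 5000, seed=0)
grid = np.linspace(0.0, 0.9, 91)
theta_hat = indirect_inference(observed, grid)
```

The grid is restricted to the range where the auxiliary statistic is monotone in the parameter. Outside such a range the auxiliary model may not identify the parameter, which is a general caveat of the method rather than a quirk of this example.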

## 4 Learning SDEs

- Generalizing Automatic Differentiation to Automatic Sparsity, Uncertainty, Stability, and Parallelism
- How can the general Green’s function of a linear homogeneous differential equation be derived?
- ICERM SciML conf
- JB_grothendieck_proof.pdf
- kar12Supple.pdf
- Kolmogorov Backward Equations
- mitmath/18303: 18.303 - Linear PDEs course
- ModelingToolkit, Modelica, and Modia: The Composable Modeling Future in Julia
- Neural SDEs: Deep Generative Models in the Diffusion Limit - Maxim Raginsky
- Seminars & Workshops in DATA DRIVEN SCIENCE & ENGINEERING

## 5 Tooling

- AaltoML/SDE: Example codes for the book Applied Stochastic Differential Equations
- An Implicit/Explicit CUDA-Accelerated Solver for the 2D Beeler-Reuter Model
- DiffEqFlux.jl – A Julia Library for Neural Differential Equations
- DiffEqFlux.jl: Generalized Physics-Informed and Scientific Machine Learning (SciML) · DiffEqFlux.jl
- DifferentialEquations.jl: Scientific Machine Learning (SciML) Enabled Simulation and Estimation · DifferentialEquations.jl
- Diffrax
- dnncode/Spatio-Temporal-Model-for-SPDE: Statistical Modeling for Spatio-Temporal Data from Physical Convection-Diffusion Processes
- JuliaSim
- NeuralNetDiffEq.jl: A Neural Network solver for ODEs
- Noise Processes · DifferentialEquations.jl
- Parameter Estimation and Bayesian Analysis · DifferentialEquations.jl
- Physics-Informed Neural Networks solver · NeuralPDE.jl
- SciML: Open Source Software for Scientific Machine Learning
- SciML/DiffEqFlux.jl: Universal neural differential equations with O(1) backprop, GPUs, and stiff+non-stiff DE solvers, demonstrating scientific machine learning (SciML) and physics-informed machine learning methods
- SciML/NeuralPDE.jl: Physics-Informed Neural Networks (PINN) and Deep BSDE Solvers of Differential Equations for Scientific Machine Learning (SciML) accelerated simulation
- SciML/SciMLTutorials.jl: Tutorials for doing scientific machine learning (SciML) and high-performance differential equation solving with open source software.
- Solving Systems of Stochastic PDEs and using GPUs in Julia
- willtebbutt/TemporalGPs.jl: Fast inference for Gaussian processes in problems involving time

## 6 Incoming

## 7 References

*arXiv:1802.09064 [Cs, Stat]*.

*Mathematical Programming Computation*.

*Handbook of Econometrics*.

*Journal of Time Series Analysis*.

*Acta Numerica*.

*IEEE transactions on neural networks and learning systems*.

*Advances in Neural Information Processing Systems 28*. NIPS’15.

*arXiv:2002.07928 [Physics, Stat]*.

*Nonparametric Statistics for Stochastic Processes: Estimation and Prediction*. Lecture Notes in Statistics 110.

*Inference and prediction in large dimensions*. Wiley series in probability and statistics.

*The Annals of Applied Statistics*.

*Proceedings of the National Academy of Sciences*.

*Computational Statistics & Data Analysis*.

*Compressed Sensing & Sparse Filtering*. Signals and Communication Technology.

*IEEE Transactions on Medical Imaging*.

*Journal of Time Series Analysis*.

*The Journal of Supercomputing*.

*Advances in Neural Information Processing Systems*.

*Advances in Neural Information Processing Systems 31*.

*Journal of Economic Surveys*.

*Advances in Neural Information Processing Systems*.

*Ecology*.

*Proceedings of the National Academy of Sciences*.

*arXiv:2102.07850 [Cs, Stat]*.

*Advances in Neural Information Processing Systems*.

*Advances in Neural Information Processing Systems*.

*arXiv:1304.5768 [Stat]*.

*Biometrika*.

*Time Series Analysis by State Space Methods*. Oxford Statistical Science Series 38.

*arXiv:1807.01083 [Cs, Math]*.

*Bulletin of the American Meteorological Society*.

*Ocean Dynamics*.

*Data Assimilation - The Ensemble Kalman Filter*.

*IEEE Control Systems*.

*Monthly Weather Review*.

*Nonlinear Time Series: Nonparametric and Parametric Methods*. Springer Series in Statistics.

*Annual Review of Statistics and Its Application*.

*arXiv:1606.08650 [Stat]*.

*ICML*.

*Advances in Neural Information Processing Systems*.

*arXiv:1704.04110 [Cs, Stat]*.

*Hidden Markov Models and Dynamical Systems*.

*arXiv:1902.10298 [Cs]*.

*Advances in Neural Information Processing Systems*.

*arXiv:1810.01367 [Cs, Stat]*.

*Advances in Neural Information Processing Systems*.

*arXiv:1805.08034 [Cs, Math]*.

*Encyclopedia of Biostatistics*.

*NIPS*.

*arXiv:1505.05310 [Cs, Stat]*.

*Journal of The Royal Society Interface*.

*Royal Society Open Science*.

*Review of Financial Studies*.

*International Journal of Systems Science*.

*Monthly Weather Review*.

*The Annals of Statistics*.

*Proceedings of the National Academy of Sciences*.

*Advances in Neural Information Processing Systems 32*.

*arXiv:1805.11122 [Cs, Stat]*.

*Journal of Econometrics*.

*Statistical Science*.

*Nonlinear Time Series Analysis*.

*Annual Review of Statistics and Its Application*.

*IEEE Transactions on Information Theory*.

*Ecological Monographs*.

*Proceedings of the 38th International Conference on Machine Learning*.

*arXiv:2005.08926 [Cs, Stat]*.

*Journal of the American Statistical Association*.

*Journal of Computational and Graphical Statistics*.

*Smoothness Priors Analysis of Time Series*. Lecture notes in statistics 116.

*Inverse Problems*.

*Physical Review. X*.

*Advances In Neural Information Processing Systems*.

*arXiv:1703.08596 [Cs, Math, Stat]*.

*Physica D: Nonlinear Phenomena*.

*arXiv:2107.10127 [Math, Stat]*.

*International Conference on Artificial Intelligence and Statistics*.

*Annual Reviews in Control*.

*Advances in Neural Information Processing Systems*.

*arXiv:2107.10879 [Physics]*.

*SPE Journal*.

*arXiv:2107.11253 [Nlin, Physics:physics, Stat]*.

*arXiv:2002.08071 [Cs, Stat]*.

*Monthly Weather Review*.

*Neural Computation*.

*IEEE Journal of Selected Topics in Signal Processing*.

*arXiv:1612.07197 [Math, Stat]*.

*arXiv:1612.09158 [Cs, Stat]*.

*Uncertainty in Artificial Intelligence : Proceedings of the … Conference. Conference on Uncertainty in Artificial Intelligence*.

*arXiv:2009.09346 [Cs]*.

*Stochastic systems: theory and applications*.

*arXiv:1812.01892 [Cs]*.

*arXiv.org*.

*Journal of Time Series Analysis*.

*arXiv:1905.12090 [Cs, Stat]*.

*IEEE Transactions on Signal Processing*.

*Physical Review E*.

*Journal of Mathematical Imaging and Vision*.

*Journal of Machine Learning Research*.

*SIAM Journal on Numerical Analysis*.

*arXiv:2103.10153 [Cs, Stat]*.

*Journal of Computational Physics*.

*Automatica*, Trends in System Identification.

*The Annals of Applied Statistics*.

*bioRxiv*.

*Monthly Weather Review*.

*Journal of the American Statistical Association*.

*Asymptotic Theory of Statistical Inference for Time Series*. Springer Series in Statistics.

*Journal of Statistical Planning and Inference*.

*arXiv:1905.09883 [Cs, Stat]*.

*An Introduction to Sparse Stochastic Processes*.

*International Conference on Artificial Intelligence and Statistics*.

*Nuclear Engineering and Design*.

*arXiv:1711.11053 [Stat]*.

*Neural Networks*.

*Neural Computation*.

*Bayesian Analysis*.

*Spatial Statistics*.

*arXiv:1907.12998 [Cs, Stat]*.

*Water Resources Research*.