Learning to approximate differential equations with neural nets.
Related: Analysing a neural net itself *as* a dynamical system, which is not quite the same but crosses over.
Variational state filters.

A deterministic version of this problem is what, e.g., the famous Vector Institute Neural ODE paper (T. Q. Chen et al. 2018) did. Duvenaud, one of the authors, argues that in some ways the hype ran away with the Neural ODE paper, and credits CasADi with many of the innovations.
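
To make the idea concrete, here is a minimal sketch of a neural-ODE forward pass: a tiny randomly initialised MLP defines the vector field, and a fixed-step Euler solve plays the role of the network's depth. Everything here (weights, step count, dimensions) is illustrative; Chen et al. use adaptive solvers and the adjoint method for gradients rather than naive Euler.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny MLP defining the vector field f_theta(x); the "depth" of the
# network is replaced by integration time.
W1, b1 = rng.normal(size=(16, 2)) * 0.5, np.zeros(16)
W2, b2 = rng.normal(size=(2, 16)) * 0.5, np.zeros(2)

def f(x):
    """Learned vector field dx/dt = f_theta(x)."""
    h = np.tanh(W1 @ x + b1)
    return W2 @ h + b2

def odeint_euler(x0, t0=0.0, t1=1.0, steps=100):
    """Fixed-step Euler solve of the ODE; purely a sketch."""
    x, dt = x0.copy(), (t1 - t0) / steps
    for _ in range(steps):
        x = x + dt * f(x)
    return x

x0 = np.array([1.0, -1.0])
x1 = odeint_euler(x0)   # the "layer output" is the ODE solution at t1
print(x1.shape)         # (2,)
```

The point of the construction is that `x1` is a smooth function of the weights, so the whole solve can be trained end-to-end like any other layer.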

There are various laypersons’ introductions/tutorials in this area, including the simple and practical magical take in Julia. See also the CasADi example.

Learning an ODE, a purely deterministic process, feels unsatisfying; we want a model which encodes responses and effects of interactions. It is not ideal to have time-series models which need to encode everything in an initial state.

Also, we would prefer models to be stochastic.
Learnable *SDEs* are probably what we want.
I’m particularly interested in jump ODE regression.
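
As a mechanical sketch of what a "learnable SDE" means: Euler–Maruyama simulation of a drift/diffusion pair with free parameters. Here drift and diffusion are a hand-picked Ornstein–Uhlenbeck model rather than neural nets, purely so the simulated statistics can be sanity-checked against closed form; in a neural SDE both would be networks.

```python
import numpy as np

rng = np.random.default_rng(1)

# "Learnable" parameters; here fixed to an OU process for checkability.
theta, mu, sigma = 0.8, 0.0, 0.3

def drift(x):
    return theta * (mu - x)

def diffusion(x):
    return sigma

def euler_maruyama(x0, T=5.0, steps=500, n_paths=1000):
    """Simulate dX = drift(X) dt + diffusion(X) dW for many paths."""
    dt = T / steps
    x = np.full(n_paths, x0, dtype=float)
    for _ in range(steps):
        dW = rng.normal(scale=np.sqrt(dt), size=x.shape)
        x = x + drift(x) * dt + diffusion(x) * dW
    return x

paths = euler_maruyama(x0=2.0)
# OU stationary variance is sigma^2 / (2 theta) = 0.05625; by T=5 the
# simulated ensemble should be close to that, with mean near mu.
print(paths.mean(), paths.var())
```

In the learnable version, gradients of a path-dependent loss flow through exactly these update steps (the Euler–Maruyama map is differentiable in the parameters for fixed noise draws).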

Homework: Duvenaud again, tweeting some explanatory animations.

Note connection to reparameterization tricks, in that neural ODEs give you cheap differentiable reparameterizations.
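
A toy illustration of that connection: for the linear ODE dx/dt = −θx started from base noise ε ∼ N(0, 1), the Euler-discretised solution map is a smooth function of (ε, θ), so a pathwise (reparameterization) gradient estimator falls out by differentiating through the solver steps. All the specific numbers below are illustrative; the linear case is chosen only because the gradient has a closed form to check against.

```python
import numpy as np

rng = np.random.default_rng(3)
theta, T, steps = 0.5, 1.0, 200
dt = T / steps

eps = rng.normal(size=100_000)   # base noise; the sample is a map of it

# Euler solve of dx/dt = -theta*x gives x_T = eps * (1 - theta*dt)**steps,
# a cheap, differentiable map from (eps, theta) to the sample x_T.
x_T = eps * (1 - theta * dt) ** steps
dx_dtheta = eps * steps * (1 - theta * dt) ** (steps - 1) * (-dt)

# Pathwise gradient of E[x_T^2] wrt theta, vs the analytic value
# d/dtheta exp(-2 theta T) = -2 T exp(-2 theta T).
grad_est = np.mean(2 * x_T * dx_dtheta)
grad_true = -2 * T * np.exp(-2 * theta * T)
print(grad_est, grad_true)   # both close to -0.74
```

A nonlinear neural vector field works the same way, except the derivative through the solver is computed by autodiff (or the adjoint method) instead of by hand.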

Gu et al. (2021) unifies neural ODEs with RNNs.

## Questions

How do you do ensemble training to get posterior predictives in NODEs? How do you guarantee stability of the learned dynamics?
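
On the stability question, one standard trick, sketched here only for *linear* dynamics ẋ = Ax, is to parameterize A so its spectrum is forced into the left half-plane: a skew-symmetric part (which only rotates) plus a negative-semidefinite part (which contracts). The particular matrices below are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 8

# Unconstrained learnable parameters.
L = rng.normal(size=(n, n))
M = rng.normal(size=(n, n))

# A = (M - M^T) - L L^T - eps*I. For any eigenpair A v = lam v,
# Re(lam) = Re(v* A v)/|v|^2; the skew part contributes 0 and the
# negative-semidefinite part contributes <= -eps, so Re(lam) <= -eps.
eps = 1e-3
A = (M - M.T) - L @ L.T - eps * np.eye(n)

print(np.linalg.eigvals(A).real.max())  # strictly negative
```

This guarantees stability by construction, for any value of the unconstrained parameters, rather than hoping a penalty term keeps the dynamics tame; the nonlinear analogue (e.g. vector fields with a learned Lyapunov function) is a research topic of its own.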

## Incoming

Patrick Kidger’s thesis is the current canonical textbook on this domain.

Corenflos et al. (2021) describe an optimal transport method.

Campbell et al. (2021) describe variational inference that factors out the unknown parameters.

HazyResearch/state-spaces: Sequence Modeling with Structured State Spaces (Gu et al. 2021) did what I was trying to do in learning gamelan, but better.

From the abstract of Gu et al. (2021):

> Recurrent neural networks (RNNs), temporal convolutions, and neural differential equations (NDEs) are popular families of deep learning models for time-series data, each with unique strengths and tradeoffs in modeling power and computational efficiency. We introduce a simple sequence model inspired by control systems that generalizes these approaches while addressing their shortcomings. The Linear State-Space Layer (LSSL) maps a sequence u ↦ y by simply simulating a linear continuous-time state-space representation ẋ = Ax + Bu, y = Cx + Du. Theoretically, we show that LSSL models are closely related to the three aforementioned families of models and inherit their strengths. For example, they generalize convolutions to continuous-time, explain common RNN heuristics, and share features of NDEs such as time-scale adaptation. We then incorporate and generalize recent theory on continuous-time memorization to introduce a trainable subset of structured matrices A that endow LSSLs with long-range memory. Empirically, stacking LSSL layers into a simple deep neural network obtains state-of-the-art results across time series benchmarks for long dependencies in sequential image classification, real-world healthcare regression tasks, and speech. On a difficult speech classification task with length-16000 sequences, LSSL outperforms prior approaches by 24 accuracy points, and even outperforms baselines that use hand-crafted features on 100x shorter sequences.
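
A minimal sketch of what an LSSL-style layer computes, assuming a bilinear (Tustin) discretization and made-up dimensions and matrices: discretize the continuous state-space model, then run it as a linear recurrence over the input sequence, exactly like an RNN with no nonlinearity in the state update.

```python
import numpy as np

rng = np.random.default_rng(4)
n, T = 4, 50                # state size, sequence length (illustrative)
dt = 0.1

# Continuous-time SSM x' = A x + B u, y = C x + D u (single channel).
A = -np.eye(n) + 0.1 * rng.normal(size=(n, n))
B = rng.normal(size=(n, 1))
C = rng.normal(size=(1, n))
D = rng.normal(size=(1, 1))

# Bilinear (Tustin) discretization of (A, B).
I = np.eye(n)
Ab = np.linalg.solve(I - dt / 2 * A, I + dt / 2 * A)
Bb = np.linalg.solve(I - dt / 2 * A, dt * B)

u = rng.normal(size=T)
x = np.zeros((n, 1))
y = np.empty(T)
for k in range(T):
    x = Ab @ x + Bb * u[k]          # linear recurrence: runs like an RNN
    y[k] = (C @ x + D * u[k]).item()

print(y.shape)  # (50,)
```

Because the recurrence is linear, the same map can equivalently be unrolled into a (long) convolution with a kernel built from powers of `Ab`, which is the RNN/convolution/NDE triple duality the abstract refers to; the special structured `A` matrices with long-range memory are a further refinement not shown here.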

## References

*Mathematical Programming Computation* 11 (1): 1–36.

*Acta Numerica* 28 (May): 1–174.

*Proceedings of the National Academy of Sciences* 111 (52): 18507–12.

*arXiv:1812.05916 [Math, q-Fin, Stat]*, January.

*Proceedings of ICLR*.

*Proceedings of ICLR*.

*Nature Computational Science* 2 (7): 433–42.

*Advances in Neural Information Processing Systems 31*, edited by S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, and R. Garnett, 6572–83. Curran Associates, Inc.

*Advances in Neural Information Processing Systems*. Vol. 33.

*arXiv:2102.07850 [Cs, Stat]*, June.

*Advances in Neural Information Processing Systems*. Vol. 33.

*Advances in Neural Information Processing Systems*. Vol. 32. Curran Associates, Inc.

*arXiv:1904.01681 [Cs, Stat]*, April.

*Communications in Mathematics and Statistics* 5 (1): 1–11.

*Notices of the American Mathematical Society* 68 (04): 1.

*arXiv:1807.01083 [Cs, Math]*, July.

*Scandinavian Journal of Statistics* n/a (n/a).

*ICML*, 14.

*Advances in Neural Information Processing Systems*. Vol. 33.

*arXiv:1807.01613 [Cs, Stat]*, July, 10.

*arXiv:1902.10298 [Cs]*, February.

*Advances in Neural Information Processing Systems*. Vol. 33.

*arXiv:2007.04154 [Cs, q-Fin, Stat]*, July.

*arXiv:1810.01367 [Cs, Stat]*, October.

*Advances in Neural Information Processing Systems*, 34:572–85. Curran Associates, Inc.

*arXiv:1805.08034 [Cs, Math]*, May.

*Proceedings of the National Academy of Sciences* 115 (34): 8505–10.

*IMA Note*.

*Nature Machine Intelligence*4 (11): 992–1003.

*arXiv:2006.04439 [Cs, Stat]*, December.

*Proceedings of ICLR*.

*Advances in Neural Information Processing Systems*. Vol. 33.

*arXiv:1812.04300 [Math, Stat]*, December.

*Advances in Neural Information Processing Systems 32*, edited by H. Wallach, H. Larochelle, A. Beygelzimer, F. d’Alché-Buc, E. Fox, and R. Garnett, 9847–58. Curran Associates, Inc.

*Advances in Neural Information Processing Systems*. Vol. 33.

*Advances In Neural Information Processing Systems*, 6.

*arXiv:2005.08926 [Cs, Stat]*, November.

*Machine Learning and the Physical Sciences Workshop at the 33rd Conference on Neural Information Processing Systems (NeurIPS)*, 7.

*Advances in Neural Information Processing Systems*, 9.

*arXiv:2007.14823 [Cond-Mat, Physics:nlin, q-Bio]*.

*Advances in Neural Information Processing Systems*. Vol. 33.

*International Conference on Artificial Intelligence and Statistics*, 3870–82. PMLR.

*Advances in Neural Information Processing Systems*. Vol. 33.

*arXiv:1906.08324 [Cs, Stat]*, June.

*arXiv:1910.03193 [Cs, Stat]*, April.

*Advances in Neural Information Processing Systems*. Vol. 33.

*arXiv:2003.08063 [Cs, Math, Stat]*, March.

*arXiv:2002.08071 [Cs, Stat]*.

*PMLR*, 2401–9.

*arXiv:2109.00173 [Cs, Stat]*, August.

*arXiv:1904.12933 [Quant-Ph, Stat]*, April.

*arXiv:1905.10437 [Cs, Stat]*, February.

*Bulletin of the American Mathematical Society* 80 (3): 503–5.

*Workshop on Bayesian Deep Learning*, 7.

*International Conference on Artificial Intelligence and Statistics*, 1126–36. PMLR.

*Advances in Neural Information Processing Systems*. Vol. 33.

*arXiv:2009.09346 [Cs]*, September.

*The Winnower*.

*arXiv:1812.01892 [Cs]*, December.

*arXiv:2001.04385 [Cs, Math, q-Bio, Stat]*, August.

*ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)*, 3905–9.

*arXiv:1905.12090 [Cs, Stat]*, May.

*arXiv:1804.04272 [Cs, Math, Stat]*, April.

*arXiv:1910.09349 [Cs, Stat]*, March.

*arXiv:2002.09405 [Physics, Stat]*.

*arXiv:2103.10153 [Cs, Stat]*, June.

*arXiv:2012.08405 [Cs, Eess]*, December.

*CoRR* abs/2006.09313.

*arXiv:1906.10264 [Cs, Stat]*, June.

*bioRxiv*, February, 272005.

*Physics-Based Deep Learning*. WWW.

*Proceedings of the Web Conference 2021*, 730–42. Ljubljana Slovenia: ACM.

*arXiv:1905.09883 [Cs, Stat]*, October.

*PMLR*, 3570–78.

*arXiv:1805.08349 [Cond-Mat, Stat]*, October.

*SIAM Journal on Scientific Computing* 42 (1): A292–317.

*arXiv:1905.10994 [Cs, Stat]*, October.

*Spatial Statistics* 37 (June): 100408.

*arXiv:1907.12998 [Cs, Stat]*, February.

*International Conference on Machine Learning*, 27060–74. PMLR.
