# Variational state filtering

March 19, 2018 — December 8, 2021

Bayes
dynamical systems
linear algebra
optimization
probability
signal processing
state space models
statistics
time series

A placeholder to discuss state filtering and parameter estimation where the unobserved state is quantified by variationally-learned distributions.

Campbell et al. (2021) introduce an elegant method which also performs system identification. I would like to go into more detail about this, but for now I will present the key insight without adequate explanation, for my own benefit. The neat trick is that the variational approximation is in a sense global: it all telescopes into one big variational approximation over the whole state path, rather than a sequence of successive approximations, each of which accumulates a greater error inside the ELBO. Intuitively, this gives us more hope that we can avoid accumulating bias at each filter step.

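Writing $y^{t}$ for the observations $y_{1: t}$ up to time $t$, the objective is the ELBO over the whole state path: $$\max _{\theta, \phi} \mathcal{L}_{t}(\theta, \phi)=\mathbb{E}_{q_{t}^{\phi}\left(x_{1: t}\right)}\left[\log \frac{p_{\theta}\left(x_{1: t}, y^{t}\right)}{q_{t}^{\phi}\left(x_{1: t}\right)}\right]$$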

The key factorization of the variational family runs backward in time: $$q_{t}^{\phi}\left(x_{1: t}\right)=q_{t}^{\phi}\left(x_{t}\right) q_{t}^{\phi}\left(x_{t-1} \mid x_{t}\right) q_{t-1}^{\phi}\left(x_{t-2} \mid x_{t-1}\right) \cdots q_{2}^{\phi}\left(x_{1} \mid x_{2}\right)$$
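To make the telescoping concrete for myself, here is a sketch of the recursion that a backward factorization of this kind permits. It assumes, as is standard for state space models but not spelled out above, that the model has Markov transitions $f_{\theta}(x_{t+1} \mid x_{t})$ and observation densities $g_{\theta}(y_{t+1} \mid x_{t+1})$. The symbols $f_{\theta}$, $g_{\theta}$ and $V_{t}$ are notation I am introducing for illustration, and this is my reading of the structure rather than a statement of the authors' algorithm. Write $q_{t}^{\phi}(x_{1: t-1} \mid x_{t})=\prod_{k=2}^{t} q_{k}^{\phi}(x_{k-1} \mid x_{k})$ and define

$$V_{t}(x_{t})=\mathbb{E}_{q_{t}^{\phi}(x_{1: t-1} \mid x_{t})}\left[\log \frac{p_{\theta}(x_{1: t}, y^{t})}{q_{t}^{\phi}(x_{1: t-1} \mid x_{t})}\right], \qquad \mathcal{L}_{t}(\theta, \phi)=\mathbb{E}_{q_{t}^{\phi}(x_{t})}\left[V_{t}(x_{t})-\log q_{t}^{\phi}(x_{t})\right].$$

Because $\log p_{\theta}(x_{1: t+1}, y^{t+1})=\log p_{\theta}(x_{1: t}, y^{t})+\log f_{\theta}(x_{t+1} \mid x_{t})+\log g_{\theta}(y_{t+1} \mid x_{t+1})$, and the earlier backward factors are reused unchanged, we get

$$V_{t+1}(x_{t+1})=\mathbb{E}_{q_{t+1}^{\phi}(x_{t} \mid x_{t+1})}\left[V_{t}(x_{t})+\log \frac{f_{\theta}(x_{t+1} \mid x_{t})\, g_{\theta}(y_{t+1} \mid x_{t+1})}{q_{t+1}^{\phi}(x_{t} \mid x_{t+1})}\right].$$

So the ELBO over the whole path can be carried forward one step at a time through $V_{t}$, while still being a single bound on the joint rather than a stack of per-step approximations.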

The true smoothing posterior factorizes in the same backward fashion: $$p_{\theta}\left(x_{1: t} \mid y^{t}\right)=p_{\theta}\left(x_{t} \mid y^{t}\right) p_{\theta}\left(x_{t-1} \mid x_{t}, y^{t-1}\right) p_{\theta}\left(x_{t-2} \mid x_{t-1}, y^{t-2}\right) \cdots p_{\theta}\left(x_{1} \mid x_{2}, y^{1}\right)$$
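As a sanity check on the backward factorization of the true posterior, here is a minimal sketch on a toy discrete hidden Markov model, where everything can be enumerated exactly. The model (`mu`, `F`, `G`), the sizes `K` and `T`, and all function names are hypothetical choices for illustration and do not come from Campbell et al. The check compares the brute-force joint posterior with the product of the final filtering marginal and the backward kernels $p(x_{k} \mid x_{k+1}, y^{k}) \propto p(x_{k} \mid y^{k})\, p(x_{k+1} \mid x_{k})$.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
K, T = 3, 4  # number of hidden states and time steps (illustrative sizes)

# Hypothetical HMM: initial law mu, transitions F[i, j] = p(x_{k+1}=j | x_k=i),
# emissions G[i, o] = p(y_k=o | x_k=i) with binary observations.
mu = rng.dirichlet(np.ones(K))
F = rng.dirichlet(np.ones(K), size=K)
G = rng.dirichlet(np.ones(2), size=K)
y = rng.integers(0, 2, size=T)


def brute_force_smoother(mu, F, G, y):
    """Joint posterior p(x_{1:T} | y^T), by enumerating every state path."""
    K, T = len(mu), len(y)
    joint = np.zeros((K,) * T)
    for path in itertools.product(range(K), repeat=T):
        p = mu[path[0]] * G[path[0], y[0]]
        for k in range(1, T):
            p *= F[path[k - 1], path[k]] * G[path[k], y[k]]
        joint[path] = p
    return joint / joint.sum()


def forward_filter(mu, F, G, y):
    """Filtering marginals p(x_k | y^k) for k = 1..T."""
    filt, pred = [], mu
    for obs in y:
        post = pred * G[:, obs]
        post = post / post.sum()
        filt.append(post)
        pred = post @ F  # one-step-ahead prediction
    return filt


def backward_factorized_smoother(mu, F, G, y):
    """Joint posterior built as p(x_T | y^T) * prod_k p(x_k | x_{k+1}, y^k)."""
    K, T = len(mu), len(y)
    filt = forward_filter(mu, F, G, y)
    kernels = []
    for k in range(T - 1):
        B = filt[k][:, None] * F  # B[i, j] proportional to p(x_k=i | y^k) F[i, j]
        kernels.append(B / B.sum(axis=0, keepdims=True))
    joint = np.zeros((K,) * T)
    for path in itertools.product(range(K), repeat=T):
        p = filt[-1][path[-1]]
        for k in range(T - 1):
            p *= kernels[k][path[k], path[k + 1]]
        joint[path] = p
    return joint


exact = brute_force_smoother(mu, F, G, y)
factorized = backward_factorized_smoother(mu, F, G, y)
print(np.allclose(exact, factorized))
```

Note that each backward kernel depends only on observations up to its own time index, which is what lets a variational family with the same shape be extended online without refitting earlier factors.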

## 1 References

Archer, Park, Buesing, et al. 2015. arXiv:1511.07367 [Stat].
Bannister. 2017. Quarterly Journal of the Royal Meteorological Society.
Bayer, and Osendorfer. 2014. arXiv:1411.7610 [Cs, Stat].
Campbell, Shi, Rainforth, et al. 2021. In.
Chung, Kastner, Dinh, et al. 2015. In Advances in Neural Information Processing Systems 28.
Cox, van de Laar, and de Vries. 2019. International Journal of Approximate Reasoning.
Damianou, Titsias, and Lawrence. 2011. In Advances in Neural Information Processing Systems 24.
de Freitas, Niranjan, Gee, et al. 1998. “Sequential Monte Carlo Methods for Optimisation of Neural Network Models.” Cambridge University Engineering Department, Cambridge, England, Technical Report TR-328.
Doerr, Daniel, Schiegg, et al. 2018. arXiv:1801.10395 [Stat].
Drovandi, Pettitt, and McCutchan. 2016. Bayesian Analysis.
Eleftheriadis, Nicholson, Deisenroth, et al. 2017. In Advances in Neural Information Processing Systems 30.
Fabius, and van Amersfoort. 2014. In Proceedings of ICLR.
Föll, Haasdonk, Hanselmann, et al. 2017. arXiv:1711.00799 [Stat].
Fortunato, Blundell, and Vinyals. 2017. arXiv:1704.02798 [Cs, Stat].
Fraccaro, Sønderby, Paquet, et al. 2016. In Advances in Neural Information Processing Systems 29.
Frerix, Kochkov, Smith, et al. 2021. In.
Frigola, Chen, and Rasmussen. 2014. In Advances in Neural Information Processing Systems 27.
Frigola, Lindsten, Schön, et al. 2013. In Advances in Neural Information Processing Systems 26.
Friston. 2008. NeuroImage.
Gorad, Zhao, and Särkkä. 2020. “Parameter Estimation in Non-Linear State-Space Models by Automatic Differentiation of Non-Linear Kalman Filters.” In.
Gu, Ghahramani, and Turner. 2015. In Advances in Neural Information Processing Systems 28.
Hoffman, Blei, Wang, et al. 2013. arXiv:1206.7051 [Cs, Stat].
Hsu, Zhang, and Glass. 2017. In arXiv:1709.07902 [Cs, Eess, Stat].
Karl, Soelch, Bayer, et al. 2016. In Proceedings of ICLR.
Kocijan, Girard, Banko, et al. 2005. Mathematical and Computer Modelling of Dynamical Systems.
Ko, and Fox. 2009. In Autonomous Robots.
Krishnan, Shalit, and Sontag. 2015. arXiv Preprint arXiv:1511.05121.
———. 2017. In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence.
Kulhavý. 1990. Automatica.
Lai, Domke, and Sheldon. 2022. In Proceedings of The 25th International Conference on Artificial Intelligence and Statistics.
Le, Igl, Jin, et al. 2017. arXiv Preprint arXiv:1705.10306.
Ljung. 1998. In Signal Analysis and Prediction. Applied and Numerical Harmonic Analysis.
Loeliger, Dauwels, Hu, et al. 2007. Proceedings of the IEEE.
Louizos, and Welling. 2016. In arXiv Preprint arXiv:1603.04733.
Maddison, Lawson, Tucker, et al. 2017. arXiv Preprint arXiv:1705.09279.
Mattos, Dai, Damianou, et al. 2016. In Proceedings of ICLR.
Mattos, Dai, Damianou, et al. 2017. Journal of Process Control, DYCOPS-CAB 2016.
Naesseth, Linderman, Ranganath, et al. 2017. arXiv Preprint arXiv:1705.11140.
Ranganath, Tran, Altosaar, et al. 2016. In Advances in Neural Information Processing Systems 29.
Ranganath, Tran, and Blei. 2016. In PMLR.
Reller. 2013. Application/pdf.
Rozet, and Louppe. 2023.
Ryder, Golightly, McGough, et al. 2018. arXiv:1802.03335 [Stat].
Särkkä, S., and Hartikainen. 2013. In 2013 IEEE International Workshop on Machine Learning for Signal Processing (MLSP).
Särkkä, Simo, and Nummenmaa. 2009. IEEE Transactions on Automatic Control.
Schmidt, Krämer, and Hennig. 2021. arXiv:2103.10153 [Cs, Stat].
Titsias, and Lawrence. 2010. In Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics.
Turner, Deisenroth, and Rasmussen. 2010. In Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics.