Variational state filtering



A placeholder to discuss state filtering and parameter estimation where uncertainty about the unobserved state is represented by variationally learned distributions.

Campbell et al. (2021) introduce an elegant method which also performs system identification. I would like to have time to go into more detail about this, but for now I will present the key insight without adequate explanation, for my own benefit. The neat trick is that the variational approximation is in a sense global: the objective telescopes into one big variational approximation over the whole state path, rather than a sequence of successive approximations, each of which accumulates additional error inside the ELBO. Intuitively this gives us more hope that we can avoid accumulating bias at each filter step.

\[ \max _{\theta, \phi} \mathcal{L}_{t}(\theta, \phi)=\mathbb{E}_{q_{t}^{\phi}\left(x_{1: t}\right)}\left[\log \frac{p_{\theta}\left(x_{1: t}, y^{t}\right)}{q_{t}^{\phi}\left(x_{1: t}\right)}\right] \]

The key variational factorization runs backward in time: \(q_{t}^{\phi}\left(x_{1: t}\right)=q_{t}^{\phi}\left(x_{t}\right) q_{t}^{\phi}\left(x_{t-1} \mid x_{t}\right) q_{t-1}^{\phi}\left(x_{t-2} \mid x_{t-1}\right) \ldots q_{2}^{\phi}\left(x_{1} \mid x_{2}\right)\)

This mirrors the true backward factorization of the smoothing posterior: \(p_{\theta}\left(x_{t} \mid y^{t}\right) p_{\theta}\left(x_{t-1} \mid x_{t}, y^{t-1}\right) p_{\theta}\left(x_{t-2} \mid x_{t-1}, y^{t-2}\right) \cdots p_{\theta}\left(x_{1} \mid x_{2}, y^{1}\right)\)
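To make the objective concrete, here is a toy numpy sketch of my own (an illustration, not Campbell et al.'s actual algorithm, which amortizes these updates recursively with learned networks). On a two-step linear-Gaussian state-space model everything is closed-form: using \(\mathcal{L}_t = \log p_\theta(y^t) - \mathrm{KL}\left(q_t^\phi \,\|\, p_\theta(x_{1:t}\mid y^t)\right)\), a backward-parameterized Gaussian \(q\) that matches the exact smoothing posterior attains the evidence, and any mismatched \(q\) falls strictly below it. All variable names here (`backward_q_joint`, `elbo_exact`, etc.) are made up for this sketch.

```python
import numpy as np

def gauss_logpdf(x, m, S):
    """Log-density of a multivariate Gaussian N(m, S) at x."""
    k = len(x)
    dm = x - m
    return -0.5 * (k * np.log(2 * np.pi)
                   + np.log(np.linalg.det(S))
                   + dm @ np.linalg.inv(S) @ dm)

def gauss_kl(m_q, S_q, m_p, S_p):
    """KL( N(m_q, S_q) || N(m_p, S_p) ) between multivariate Gaussians."""
    k = len(m_q)
    S_p_inv = np.linalg.inv(S_p)
    dm = m_p - m_q
    return 0.5 * (np.trace(S_p_inv @ S_q) + dm @ S_p_inv @ dm - k
                  + np.log(np.linalg.det(S_p) / np.linalg.det(S_q)))

def backward_q_joint(m2, s2, c, d, s1):
    """Joint Gaussian over (x1, x2) implied by the backward factorization
    q(x2) = N(m2, s2),  q(x1 | x2) = N(c * x2 + d, s1)."""
    mean = np.array([c * m2 + d, m2])
    cov = np.array([[c * c * s2 + s1, c * s2],
                    [c * s2, s2]])
    return mean, cov

# Toy model over two steps:
#   x1 ~ N(0, q0),  x2 = a * x1 + N(0, qv),  y_t = x_t + N(0, r)
a, q0, qv, r = 0.9, 1.0, 0.5, 0.25
y = np.array([0.8, 1.1])

# Joint prior over (x1, x2), exact posterior, and log evidence
S0 = np.array([[q0, a * q0],
               [a * q0, a * a * q0 + qv]])
logZ = gauss_logpdf(y, np.zeros(2), S0 + r * np.eye(2))  # y ~ N(0, S0 + r I)
P = np.linalg.inv(S0) + np.eye(2) / r                    # posterior precision
S_post = np.linalg.inv(P)
m_post = S_post @ (y / r)

# ELBO(q) = log Z - KL(q || posterior). Read the exact backward kernel
# p(x1 | x2, y^1) off the joint posterior (x1 is independent of y2 given x2):
c = S_post[0, 1] / S_post[1, 1]
d = m_post[0] - c * m_post[1]
s1 = S_post[0, 0] - c * c * S_post[1, 1]
m_q, S_q = backward_q_joint(m_post[1], S_post[1, 1], c, d, s1)
elbo_exact = logZ - gauss_kl(m_q, S_q, m_post, S_post)   # recovers logZ

# A mismatched backward parameterization falls strictly below the evidence
m_bad, S_bad = backward_q_joint(0.0, 1.0, 0.0, 0.0, 1.0)
elbo_bad = logZ - gauss_kl(m_bad, S_bad, m_post, S_post)
```

The point of the backward parameterization is visible even in this toy: the joint \(q\) over the whole path is built from the current-time marginal plus backward kernels, so extending the filter one step only adds one new factor rather than re-approximating the whole history.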

References

Archer, Evan, Il Memming Park, Lars Buesing, John Cunningham, and Liam Paninski. 2015. “Black Box Variational Inference for State Space Models.” arXiv:1511.07367 [Stat], November.
Bayer, Justin, and Christian Osendorfer. 2014. “Learning Stochastic Recurrent Networks.” arXiv:1411.7610 [Cs, Stat], November.
Campbell, Andrew, Yuyang Shi, Tom Rainforth, and Arnaud Doucet. 2021. “Online Variational Filtering and Parameter Learning.” In.
Chung, Junyoung, Kyle Kastner, Laurent Dinh, Kratarth Goel, Aaron C Courville, and Yoshua Bengio. 2015. “A Recurrent Latent Variable Model for Sequential Data.” In Advances in Neural Information Processing Systems 28, edited by C. Cortes, N. D. Lawrence, D. D. Lee, M. Sugiyama, and R. Garnett, 2980–88. Curran Associates, Inc.
Cox, Marco, Thijs van de Laar, and Bert de Vries. 2019. “A Factor Graph Approach to Automated Design of Bayesian Signal Processing Algorithms.” International Journal of Approximate Reasoning 104 (January): 185–204.
Damianou, Andreas, Michalis K. Titsias, and Neil D. Lawrence. 2011. “Variational Gaussian Process Dynamical Systems.” In Advances in Neural Information Processing Systems 24, edited by J. Shawe-Taylor, R. S. Zemel, P. L. Bartlett, F. Pereira, and K. Q. Weinberger, 2510–18. Curran Associates, Inc.
Doerr, Andreas, Christian Daniel, Martin Schiegg, Duy Nguyen-Tuong, Stefan Schaal, Marc Toussaint, and Sebastian Trimpe. 2018. “Probabilistic Recurrent State-Space Models.” arXiv:1801.10395 [Stat], January.
Drovandi, Christopher C., Anthony N. Pettitt, and Roy A. McCutchan. 2016. “Exact and Approximate Bayesian Inference for Low Integer-Valued Time Series Models with Intractable Likelihoods.” Bayesian Analysis 11 (2): 325–52.
Eleftheriadis, Stefanos, Tom Nicholson, Marc Deisenroth, and James Hensman. 2017. “Identification of Gaussian Process State Space Models.” In Advances in Neural Information Processing Systems 30, edited by I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, 5309–19. Curran Associates, Inc.
Fabius, Otto, and Joost R. van Amersfoort. 2014. “Variational Recurrent Auto-Encoders.” In Proceedings of ICLR.
Föll, Roman, Bernard Haasdonk, Markus Hanselmann, and Holger Ulmer. 2017. “Deep Recurrent Gaussian Process with Variational Sparse Spectrum Approximation.” arXiv:1711.00799 [Stat], November.
Fortunato, Meire, Charles Blundell, and Oriol Vinyals. 2017. “Bayesian Recurrent Neural Networks.” arXiv:1704.02798 [Cs, Stat], April.
Fraccaro, Marco, Søren Kaae Sønderby, Ulrich Paquet, and Ole Winther. 2016. “Sequential Neural Models with Stochastic Layers.” In Advances in Neural Information Processing Systems 29, edited by D. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, and R. Garnett, 2199–2207. Curran Associates, Inc.
Freitas, J. F. G. de, Mahesan Niranjan, A. H. Gee, and Arnaud Doucet. 1998. “Sequential Monte Carlo Methods for Optimisation of Neural Network Models.” Cambridge University Engineering Department, Cambridge, England, Technical Report TR-328.
Frigola, Roger, Yutian Chen, and Carl Edward Rasmussen. 2014. “Variational Gaussian Process State-Space Models.” In Advances in Neural Information Processing Systems 27, edited by Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger, 3680–88. Curran Associates, Inc.
Frigola, Roger, Fredrik Lindsten, Thomas B Schön, and Carl Edward Rasmussen. 2013. “Bayesian Inference and Learning in Gaussian Process State-Space Models with Particle MCMC.” In Advances in Neural Information Processing Systems 26, edited by C. J. C. Burges, L. Bottou, M. Welling, Z. Ghahramani, and K. Q. Weinberger, 3156–64. Curran Associates, Inc.
Friston, K. J. 2008. “Variational Filtering.” NeuroImage 41 (3): 747–66.
Gorad, Ajinkya, Zheng Zhao, and Simo Särkkä. 2020. “Parameter Estimation in Non-Linear State-Space Models by Automatic Differentiation of Non-Linear Kalman Filters.” In, 6.
Gu, Shixiang, Zoubin Ghahramani, and Richard E Turner. 2015. “Neural Adaptive Sequential Monte Carlo.” In Advances in Neural Information Processing Systems 28, edited by C. Cortes, N. D. Lawrence, D. D. Lee, M. Sugiyama, and R. Garnett, 2629–37. Curran Associates, Inc.
Hoffman, Matt, David M. Blei, Chong Wang, and John Paisley. 2013. “Stochastic Variational Inference.” arXiv:1206.7051 [Cs, Stat] 14 (1).
Hsu, Wei-Ning, Yu Zhang, and James Glass. 2017. “Unsupervised Learning of Disentangled and Interpretable Representations from Sequential Data.” In arXiv:1709.07902 [Cs, Eess, Stat].
Karl, Maximilian, Maximilian Soelch, Justin Bayer, and Patrick van der Smagt. 2016. “Deep Variational Bayes Filters: Unsupervised Learning of State Space Models from Raw Data.” In Proceedings of ICLR.
Ko, Jonathan, and Dieter Fox. 2009. “GP-BayesFilters: Bayesian Filtering Using Gaussian Process Prediction and Observation Models.” Autonomous Robots 27 (1): 75–90.
Kocijan, Juš, Agathe Girard, Blaž Banko, and Roderick Murray-Smith. 2005. “Dynamic Systems Identification with Gaussian Processes.” Mathematical and Computer Modelling of Dynamical Systems 11 (4): 411–24.
Krishnan, Rahul G., Uri Shalit, and David Sontag. 2015. “Deep Kalman Filters.” arXiv Preprint arXiv:1511.05121.
———. 2017. “Structured Inference Networks for Nonlinear State Space Models.” In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2101–9.
Le, Tuan Anh, Maximilian Igl, Tom Jin, Tom Rainforth, and Frank Wood. 2017. “Auto-Encoding Sequential Monte Carlo.” arXiv Preprint arXiv:1705.10306.
Ljung, Lennart. 1998. “System Identification.” In Signal Analysis and Prediction, 163–73. Applied and Numerical Harmonic Analysis. Birkhäuser, Boston, MA.
Loeliger, Hans-Andrea, Justin Dauwels, Junli Hu, Sascha Korl, Li Ping, and Frank R. Kschischang. 2007. “The Factor Graph Approach to Model-Based Signal Processing.” Proceedings of the IEEE 95 (6): 1295–1322.
Louizos, Christos, and Max Welling. 2016. “Structured and Efficient Variational Deep Learning with Matrix Gaussian Posteriors.” In arXiv Preprint arXiv:1603.04733, 1708–16.
Maddison, Chris J., Dieterich Lawson, George Tucker, Nicolas Heess, Mohammad Norouzi, Andriy Mnih, Arnaud Doucet, and Yee Whye Teh. 2017. “Filtering Variational Objectives.” arXiv Preprint arXiv:1705.09279.
Mattos, César Lincoln C., Zhenwen Dai, Andreas Damianou, Guilherme A. Barreto, and Neil D. Lawrence. 2017. “Deep Recurrent Gaussian Processes for Outlier-Robust System Identification.” Journal of Process Control, DYCOPS-CAB 2016, 60 (December): 82–94.
Mattos, César Lincoln C., Zhenwen Dai, Andreas Damianou, Jeremy Forth, Guilherme A. Barreto, and Neil D. Lawrence. 2016. “Recurrent Gaussian Processes.” In Proceedings of ICLR.
Naesseth, Christian A., Scott W. Linderman, Rajesh Ranganath, and David M. Blei. 2017. “Variational Sequential Monte Carlo.” arXiv Preprint arXiv:1705.11140.
Ranganath, Rajesh, Dustin Tran, Jaan Altosaar, and David Blei. 2016. “Operator Variational Inference.” In Advances in Neural Information Processing Systems 29, edited by D. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, and R. Garnett, 496–504. Curran Associates, Inc.
Ranganath, Rajesh, Dustin Tran, and David Blei. 2016. “Hierarchical Variational Models.” In PMLR, 324–33.
Ryder, Thomas, Andrew Golightly, A. Stephen McGough, and Dennis Prangle. 2018. “Black-Box Variational Inference for Stochastic Differential Equations.” arXiv:1802.03335 [Stat], February.
Särkkä, S., and J. Hartikainen. 2013. “Non-Linear Noise Adaptive Kalman Filtering via Variational Bayes.” In 2013 IEEE International Workshop on Machine Learning for Signal Processing (MLSP), 1–6.
Särkkä, Simo, and A. Nummenmaa. 2009. “Recursive Noise Adaptive Kalman Filtering by Variational Bayesian Approximations.” IEEE Transactions on Automatic Control 54 (3): 596–600.
Titsias, Michalis, and Neil D. Lawrence. 2010. “Bayesian Gaussian Process Latent Variable Model.” In Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, 844–51.
Turner, Ryan, Marc Deisenroth, and Carl Rasmussen. 2010. “State-Space Inference and Learning with Gaussian Processes.” In Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, 868–75.
