Kalman filtering of Gaussian Processes

Two classic flavours together: Gaussian processes and state filters / stochastic differential equations.

I am interested here in the trick which makes certain Gaussian process regression problems soluble by making them local, i.e. Markov, with respect to some assumed hidden state, in the same way that Kalman filtering does for Wiener filtering. This means you get to solve a GP as an SDE. The trick is explained in an introductory article by Särkkä, Solin, and Hartikainen (2013), which sits alongside related work (Reece and Roberts 2010; Lindgren, Rue, and Lindström 2011; Särkkä and Hartikainen 2012; Hartikainen and Särkkä 2010; Solin 2016). The state of the art seems to be @. Recent extensions include (Karvonen and Särkkä 2016; Nickisch, Solin, and Grigorievskiy 2018).

The idea is that if the spectral density of your (stationary) covariance kernel is, or can be well approximated by, a rational function, then the kernel can be factorised tractably into a linear state space model. Inference then becomes cheap thanks to the favourable properties of such models: Kalman filtering and smoothing cost time linear in the number of observations, rather than the cubic cost of naive GP regression. That sounds simple enough conceptually; I wonder about the practice. Possibly related, but I have not yet actually read: (Huber 2014).
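To make the trick concrete, here is a minimal sketch for one kernel where the state space form is exact: the Matérn 3/2 kernel, whose spectral density is rational. It is not taken from any of the cited papers' code. The function names, the hyperparameter names (`sigma2`, `ell`, `noise`) and the toy data are all my own assumptions. The sketch computes the GP log marginal likelihood twice, once by a dense Cholesky factorisation and once by a two-dimensional Kalman filter over the hidden state (f, df/dt), and the two should agree up to floating point error.

```python
import numpy as np


def matern32_kernel(t1, t2, sigma2, ell):
    """Matern 3/2 covariance k(r) = sigma2 * (1 + lam*r) * exp(-lam*r), lam = sqrt(3)/ell."""
    lam = np.sqrt(3.0) / ell
    r = np.abs(t1[:, None] - t2[None, :])
    return sigma2 * (1.0 + lam * r) * np.exp(-lam * r)


def gp_log_marginal(t, y, sigma2, ell, noise):
    """Dense GP log marginal likelihood, O(n^3) in the number of observations."""
    K = matern32_kernel(t, t, sigma2, ell) + noise * np.eye(len(t))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L, y)
    return (-0.5 * alpha @ alpha
            - np.log(np.diag(L)).sum()
            - 0.5 * len(t) * np.log(2.0 * np.pi))


def kalman_log_marginal(t, y, sigma2, ell, noise):
    """Same quantity via a Kalman filter, O(n). Assumes t is sorted ascending."""
    lam = np.sqrt(3.0) / ell
    # Stationary covariance of the state x = (f, df/dt).
    P_inf = np.diag([sigma2, lam**2 * sigma2])
    H = np.array([1.0, 0.0])  # we observe f only
    m, P = np.zeros(2), P_inf.copy()
    log_lik, t_prev = 0.0, t[0]
    for tk, yk in zip(t, y):
        dt = tk - t_prev
        # Closed-form matrix exponential of the Matern 3/2 drift matrix
        # F = [[0, 1], [-lam^2, -2*lam]] over a gap of length dt.
        A = np.exp(-lam * dt) * np.array([[1.0 + lam * dt, dt],
                                          [-lam**2 * dt, 1.0 - lam * dt]])
        Q = P_inf - A @ P_inf @ A.T  # process noise that keeps P_inf stationary
        # Predict.
        m, P = A @ m, A @ P @ A.T + Q
        # Update with the scalar observation yk.
        v = yk - H @ m                # innovation
        S = H @ P @ H + noise         # innovation variance
        gain = P @ H / S
        log_lik += -0.5 * (np.log(2.0 * np.pi * S) + v**2 / S)
        m = m + gain * v
        P = P - np.outer(gain, gain) * S
        t_prev = tk
    return log_lik, m, P


# Toy data (hypothetical): irregularly spaced, sorted inputs.
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0.0, 10.0, size=50))
y = np.sin(t) + 0.1 * rng.standard_normal(50)

dense = gp_log_marginal(t, y, sigma2=1.0, ell=1.5, noise=0.01)
filtered, m_last, P_last = kalman_log_marginal(t, y, sigma2=1.0, ell=1.5, noise=0.01)
print(dense, filtered)
```

The filter never forms the n × n kernel matrix; each step touches only 2 × 2 matrices, which is where the linear cost comes from. Kernels without a rational spectral density (the squared exponential, say) would need an approximation step first, and that is not shown here.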

This complements, perhaps, the trick of fast Gaussian process calculations on lattices.

To learn: Is this a classic graphical-model-style decomposition into message passing on a factor graph? Publications like (Cox, van de Laar, and de Vries 2019) suggest that it is, but I need to take a closer look. Is there anything special going on here? It seems so: standard factor graph decompositions are based on discrete nodes in a graph, whereas Gaussian processes give us a function over the entire input space. This particular trick therefore gives us an angle of attack on continuous graphical models, which are of general interest.
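For the purely temporal case, at least, the chain structure can be written down directly. The following is a sketch under assumptions of mine rather than notation from the cited papers: the SDE has been discretised at the observation times t_1 < … < t_n into a linear-Gaussian model with transition matrices A_k, process noise covariances Q_k, an observation vector H and noise variance σ_n². The joint density then factorises along a chain, and the Kalman filter is the forward pass of Gaussian message passing on that chain.

```latex
\begin{align}
p(x_{0:n}, y_{1:n})
  &= p(x_0) \prod_{k=1}^{n} p(x_k \mid x_{k-1})\, p(y_k \mid x_k), \\
p(x_k \mid x_{k-1})
  &= \mathcal{N}(x_k \mid A_k x_{k-1},\, Q_k), \qquad
p(y_k \mid x_k) = \mathcal{N}(y_k \mid H x_k,\, \sigma_n^2), \\
\alpha_k(x_k) := p(x_k, y_{1:k})
  &= p(y_k \mid x_k) \int p(x_k \mid x_{k-1})\, \alpha_{k-1}(x_{k-1})\, \mathrm{d}x_{k-1},
  \qquad \alpha_0(x_0) = p(x_0).
\end{align}
```

Each forward message α_k is an unnormalised Gaussian, so the recursion only has to track a mean, a covariance and a normalising constant. The continuous-index character of the GP survives in how A_k and Q_k depend on the gap t_k − t_{k−1}, so the nodes can be placed anywhere on the time axis rather than on a fixed grid.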

There is another concept, something of a dual to filtering a causal Gaussian process, which instead uses Gaussian processes to define the process dynamics or observation distribution of a state space model. I have no use for that at the moment, but it pops up in the same keyword searches.

Miscellaneous notes towards implementations

Chang, Paul E, William J Wilkinson, Mohammad Emtiyaz Khan, and Arno Solin. 2020. “Fast Variational Learning in State-Space Gaussian Process Models.” In MLSP, 6.

Cox, Marco, Thijs van de Laar, and Bert de Vries. 2019. “A Factor Graph Approach to Automated Design of Bayesian Signal Processing Algorithms.” International Journal of Approximate Reasoning 104 (January): 185–204. https://doi.org/10.1016/j.ijar.2018.11.002.

Cunningham, John P., Krishna V. Shenoy, and Maneesh Sahani. 2008. “Fast Gaussian Process Methods for Point Process Intensity Estimation.” In Proceedings of the 25th International Conference on Machine Learning, 192–99. ICML ’08. New York, NY, USA: ACM Press. https://doi.org/10.1145/1390156.1390181.

Curtain, Ruth F. 1975. “Infinite-Dimensional Filtering.” SIAM Journal on Control 13 (1): 89–104. https://doi.org/10.1137/0313005.

Eleftheriadis, Stefanos, Tom Nicholson, Marc Deisenroth, and James Hensman. 2017. “Identification of Gaussian Process State Space Models.” In Advances in Neural Information Processing Systems 30, edited by I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, 5309–19. Curran Associates, Inc. http://papers.nips.cc/paper/7115-identification-of-gaussian-process-state-space-models.pdf.

Gilboa, E., Y. Saatçi, and J. P. Cunningham. 2015. “Scaling Multidimensional Inference for Structured Gaussian Processes.” IEEE Transactions on Pattern Analysis and Machine Intelligence 37 (2): 424–36. https://doi.org/10.1109/TPAMI.2013.192.

Gorad, Ajinkya, Zheng Zhao, and Simo Särkkä. 2020. “Parameter Estimation in Non-Linear State-Space Models by Automatic Differentiation of Non-Linear Kalman Filters.” In, 6.

Grigorievskiy, Alexander, and Juha Karhunen. 2016. “Gaussian Process Kernels for Popular State-Space Time Series Models.” In 2016 International Joint Conference on Neural Networks (IJCNN), 3354–63. Vancouver, BC, Canada: IEEE. https://doi.org/10.1109/IJCNN.2016.7727628.

Grigorievskiy, Alexander, Neil Lawrence, and Simo Särkkä. 2017. “Parallelizable Sparse Inverse Formulation Gaussian Processes (SpInGP).” In. http://arxiv.org/abs/1610.08035.

Hartikainen, J., and S. Särkkä. 2010. “Kalman Filtering and Smoothing Solutions to Temporal Gaussian Process Regression Models.” In 2010 IEEE International Workshop on Machine Learning for Signal Processing, 379–84. Kittila, Finland: IEEE. https://doi.org/10.1109/MLSP.2010.5589113.

Hensman, James, Nicolas Durrande, and Arno Solin. 2018. “Variational Fourier Features for Gaussian Processes.” Journal of Machine Learning Research 18 (151): 1–52. http://jmlr.org/papers/v18/16-579.html.

Huber, Marco F. 2014. “Recursive Gaussian Process: On-Line Regression and Learning.” Pattern Recognition Letters 45 (August): 85–91. https://doi.org/10.1016/j.patrec.2014.03.004.

Karvonen, Toni, and Simo Särkkä. 2016. “Approximate State-Space Gaussian Processes via Spectral Transformation.” In 2016 IEEE 26th International Workshop on Machine Learning for Signal Processing (MLSP), 1–6. Vietri sul Mare, Salerno, Italy: IEEE. https://doi.org/10.1109/MLSP.2016.7738812.

Lindgren, Finn, Håvard Rue, and Johan Lindström. 2011. “An Explicit Link Between Gaussian Fields and Gaussian Markov Random Fields: The Stochastic Partial Differential Equation Approach.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 73 (4): 423–98. https://doi.org/10.1111/j.1467-9868.2011.00777.x.

Loeliger, Hans-Andrea, Justin Dauwels, Junli Hu, Sascha Korl, Li Ping, and Frank R. Kschischang. 2007. “The Factor Graph Approach to Model-Based Signal Processing.” Proceedings of the IEEE 95 (6): 1295–1322. https://doi.org/10.1109/JPROC.2007.896497.

Nickisch, Hannes, Arno Solin, and Alexander Grigorievskiy. 2018. “State Space Gaussian Processes with Non-Gaussian Likelihood.” In International Conference on Machine Learning, 3789–98. http://proceedings.mlr.press/v80/nickisch18a.html.

Rackauckas, Christopher, Yingbo Ma, Julius Martensen, Collin Warner, Kirill Zubov, Rohit Supekar, Dominic Skinner, and Ali Ramadhan. 2020. “Universal Differential Equations for Scientific Machine Learning,” January. https://arxiv.org/abs/2001.04385v1.

Reece, S., and S. Roberts. 2010. “An Introduction to Gaussian Processes for the Kalman Filter Expert.” In 2010 13th International Conference on Information Fusion, 1–9. https://doi.org/10.1109/ICIF.2010.5711863.

Remes, Sami, Markus Heinonen, and Samuel Kaski. 2017. “Non-Stationary Spectral Kernels.” In Advances in Neural Information Processing Systems 30, edited by I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, 4642–51. Curran Associates, Inc. http://papers.nips.cc/paper/7050-non-stationary-spectral-kernels.pdf.

———. 2018. “Neural Non-Stationary Spectral Kernel,” November. http://arxiv.org/abs/1811.10978.

Saatçi, Yunus. 2012. “Scalable Inference for Structured Gaussian Process Models.” Ph.D., University of Cambridge. https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.610016.

Särkkä, Simo. 2011. “Linear Operators and Stochastic Partial Differential Equations in Gaussian Process Regression.” In Artificial Neural Networks and Machine Learning – ICANN 2011, edited by Timo Honkela, Włodzisław Duch, Mark Girolami, and Samuel Kaski, 6792:151–58. Lecture Notes in Computer Science. Berlin, Heidelberg: Springer. https://doi.org/10.1007/978-3-642-21738-8_20.

———. 2013. Bayesian Filtering and Smoothing. Institute of Mathematical Statistics Textbooks 3. Cambridge, U.K. ; New York: Cambridge University Press. http://citeseerx.ist.psu.edu/viewdoc/download?doi=

Särkkä, Simo, and Jouni Hartikainen. 2012. “Infinite-Dimensional Kalman Filtering Approach to Spatio-Temporal Gaussian Process Regression.” In Artificial Intelligence and Statistics. http://www.jmlr.org/proceedings/papers/v22/sarkka12.html.

Särkkä, Simo, A. Solin, and J. Hartikainen. 2013. “Spatiotemporal Learning via Infinite-Dimensional Bayesian Filtering and Smoothing: A Look at Gaussian Process Regression Through Kalman Filtering.” IEEE Signal Processing Magazine 30 (4): 51–61. https://doi.org/10.1109/MSP.2013.2246292.

Solin, Arno. 2016. “Stochastic Differential Equation Methods for Spatio-Temporal Gaussian Process Regression.” Aalto University. https://aaltodoc.aalto.fi:443/handle/123456789/19842.

Solin, Arno, and Simo Särkkä. 2020. “Hilbert Space Methods for Reduced-Rank Gaussian Process Regression.” Statistics and Computing 30 (2): 419–46. https://doi.org/10.1007/s11222-019-09886-w.

———. 2013. “Infinite-Dimensional Bayesian Filtering for Detection of Quasiperiodic Phenomena in Spatiotemporal Data.” Physical Review E 88 (5): 052909. https://doi.org/10.1103/PhysRevE.88.052909.

———. 2014. “Explicit Link Between Periodic Covariance Functions and State Space Models.” In Artificial Intelligence and Statistics, 904–12. http://proceedings.mlr.press/v33/solin14.html.

Tzinis, Efthymios, Zhepei Wang, and Paris Smaragdis. 2020. “Sudo Rm -Rf: Efficient Networks for Universal Audio Source Separation.” In, 6.

Wilkinson, William J., Michael Riis Andersen, Joshua D. Reiss, Dan Stowell, and Arno Solin. 2019. “End-to-End Probabilistic Inference for Nonstationary Audio Analysis,” January. https://arxiv.org/abs/1901.11436v1.

Wilkinson, William J, Paul E Chang, Michael Riis Andersen, and Arno Solin. 2019. “Global Approximate Inference via Local Linearisation for Temporal Gaussian Processes,” 12.

Wilkinson, W. J., M. Riis Andersen, J. D. Reiss, D. Stowell, and A. Solin. 2019. “Unifying Probabilistic Models for Time-Frequency Analysis.” In ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 3352–56. https://doi.org/10.1109/ICASSP.2019.8682306.