Kalman filtering of Gaussian Processes

Two classic flavours together, Gaussian processes and state filters/ stochastic differential equations.

I am interested here in the trick which makes certain Gaussian process regression problems soluble by making them local, i.e. Markov, with respect to some assumed hidden state, in the same way Kalman filtering does Wiener filtering. This trick is explained in an intro article in S. Särkkä, Solin, and Hartikainen (2013), based on previous work (Reece and Roberts 2010; Lindgren, Rue, and Lindström 2011; Särkkä and Hartikainen 2012; Hartikainen and Särkkä 2010; Solin 2016). Recent extensions include (Karvonen and Särkkä 2016; Nickisch, Solin, and Grigorevskiy 2018). The idea is that if your covariance kernel is, or can be well approximated by, say, a rational function then it is possible to factorise it into a state space model tractably, which makes it cheap due to the favourable properties of such models. That sounds simple enough conceptually; I wonder about the practice. Possibly related, but I have not yet actually read: (Huber 2014).

This complements, perhaps, the trick of fast Gaussian process calculations on lattices.

To learn: Is this a classic graphical model-style decomposition into message passing via factor graph decompositions? Publications like (Cox, van de Laar, and de Vries 2019) are suggestive that it is, but I need to take a better look. So is there anything special going on here? It seems like there is something special here, in that standard factor graph decompositions are based on discrete nodes in a graph, whereas Gaussian processes give us a function over the entire input space; as such, this particular trick gives us an angle of attack for continuous graphical models which are of general interest.

There is another concept which is kind of a dual to filtering of a causal Gaussian process, which uses Gaussian processes to define the process dynamics or observation distribution. I have no use for that at the moment, but it pops up in the same keyword searches.

miscellaneous notes towards implementations

Cox, Marco, Thijs van de Laar, and Bert de Vries. 2019. “A Factor Graph Approach to Automated Design of Bayesian Signal Processing Algorithms.” International Journal of Approximate Reasoning 104 (January): 185–204. https://doi.org/10.1016/j.ijar.2018.11.002.

Cunningham, John P., Krishna V. Shenoy, and Maneesh Sahani. 2008. “Fast Gaussian Process Methods for Point Process Intensity Estimation.” In Proceedings of the 25th International Conference on Machine Learning, 192–99. ICML ’08. New York, NY, USA: ACM Press. https://doi.org/10.1145/1390156.1390181.

Eleftheriadis, Stefanos, Tom Nicholson, Marc Deisenroth, and James Hensman. 2017. “Identification of Gaussian Process State Space Models.” In Advances in Neural Information Processing Systems 30, edited by I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, 5309–19. Curran Associates, Inc. http://papers.nips.cc/paper/7115-identification-of-gaussian-process-state-space-models.pdf.

Gilboa, E., Y. Saatçi, and J. P. Cunningham. 2015. “Scaling Multidimensional Inference for Structured Gaussian Processes.” IEEE Transactions on Pattern Analysis and Machine Intelligence 37 (2): 424–36. https://doi.org/10.1109/TPAMI.2013.192.

Grigorievskiy, Alexander, and Juha Karhunen. 2016. “Gaussian Process Kernels for Popular State-Space Time Series Models.” In 2016 International Joint Conference on Neural Networks (IJCNN), 3354–63. Vancouver, BC, Canada: IEEE. https://doi.org/10.1109/IJCNN.2016.7727628.

Grigorievskiy, Alexander, Neil Lawrence, and Simo Särkkä. 2017. “Parallelizable Sparse Inverse Formulation Gaussian Processes (SpInGP).” In. http://arxiv.org/abs/1610.08035.

Hartikainen, J., and S. Särkkä. 2010. “Kalman Filtering and Smoothing Solutions to Temporal Gaussian Process Regression Models.” In 2010 IEEE International Workshop on Machine Learning for Signal Processing, 379–84. Kittila, Finland: IEEE. https://doi.org/10.1109/MLSP.2010.5589113.

Huber, Marco F. 2014. “Recursive Gaussian Process: On-Line Regression and Learning.” Pattern Recognition Letters 45 (August): 85–91. https://doi.org/10.1016/j.patrec.2014.03.004.

Karvonen, Toni, and Simo Särkkä. 2016. “Approximate State-Space Gaussian Processes via Spectral Transformation.” In 2016 IEEE 26th International Workshop on Machine Learning for Signal Processing (MLSP), 1–6. Vietri sul Mare, Salerno, Italy: IEEE. https://doi.org/10.1109/MLSP.2016.7738812.

Lindgren, Finn, Håvard Rue, and Johan Lindström. 2011. “An Explicit Link Between Gaussian Fields and Gaussian Markov Random Fields: The Stochastic Partial Differential Equation Approach.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 73 (4): 423–98. https://doi.org/10.1111/j.1467-9868.2011.00777.x.

Loeliger, Hans-Andrea, Justin Dauwels, Junli Hu, Sascha Korl, Li Ping, and Frank R. Kschischang. 2007. “The Factor Graph Approach to Model-Based Signal Processing.” Proceedings of the IEEE 95 (6): 1295–1322. https://doi.org/10.1109/JPROC.2007.896497.

Nickisch, Hannes, Arno Solin, and Alexander Grigorevskiy. 2018. “State Space Gaussian Processes with Non-Gaussian Likelihood.” In International Conference on Machine Learning, 3789–98. http://proceedings.mlr.press/v80/nickisch18a.html.

Rackauckas, Christopher, Yingbo Ma, Julius Martensen, Collin Warner, Kirill Zubov, Rohit Supekar, Dominic Skinner, and Ali Ramadhan. 2020. “Universal Differential Equations for Scientific Machine Learning,” January. https://arxiv.org/abs/2001.04385v1.

Reece, S., and S. Roberts. 2010. “An Introduction to Gaussian Processes for the Kalman Filter Expert.” In 2010 13th International Conference on Information Fusion, 1–9. https://doi.org/10.1109/ICIF.2010.5711863.

Remes, Sami, Markus Heinonen, and Samuel Kaski. 2017. “Non-Stationary Spectral Kernels.” In Advances in Neural Information Processing Systems 30, edited by I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, 4642–51. Curran Associates, Inc. http://papers.nips.cc/paper/7050-non-stationary-spectral-kernels.pdf.

———. 2018. “Neural Non-Stationary Spectral Kernel,” November. http://arxiv.org/abs/1811.10978.

Saatçi, Yunus. 2012. “Scalable Inference for Structured Gaussian Process Models.” Ph.D., University of Cambridge. https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.610016.

Särkkä, Simo. 2013. Bayesian Filtering and Smoothing. Institute of Mathematical Statistics Textbooks 3. Cambridge, U.K. ; New York: Cambridge University Press. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.461.4042&rep=rep1&type=pdf.

Särkkä, Simo, and Jouni Hartikainen. 2012. “Infinite-Dimensional Kalman Filtering Approach to Spatio-Temporal Gaussian Process Regression.” In Artificial Intelligence and Statistics. http://www.jmlr.org/proceedings/papers/v22/sarkka12.html.

Särkkä, Simo, A. Solin, and J. Hartikainen. 2013. “Spatiotemporal Learning via Infinite-Dimensional Bayesian Filtering and Smoothing: A Look at Gaussian Process Regression Through Kalman Filtering.” IEEE Signal Processing Magazine 30 (4): 51–61. https://doi.org/10.1109/MSP.2013.2246292.

Solin, Arno. 2016. “Stochastic Differential Equation Methods for Spatio-Temporal Gaussian Process Regression.” Aalto University. https://aaltodoc.aalto.fi:443/handle/123456789/19842.

Solin, Arno, and Simo Särkkä. 2013. “Infinite-Dimensional Bayesian Filtering for Detection of Quasiperiodic Phenomena in Spatiotemporal Data.” Physical Review E 88 (5): 052909. https://doi.org/10.1103/PhysRevE.88.052909.

———. 2014. “Explicit Link Between Periodic Covariance Functions and State Space Models.” In Artificial Intelligence and Statistics, 904–12. http://proceedings.mlr.press/v33/solin14.html.

Wilkinson, William J., Michael Riis Andersen, Joshua D. Reiss, Dan Stowell, and Arno Solin. 2019. “End-to-End Probabilistic Inference for Nonstationary Audio Analysis,” January. https://arxiv.org/abs/1901.11436v1.

Wilkinson, William J, Paul E Chang, Michael Riis Andersen, and Arno Solin. 2019. “Global Approximate Inference via Local Linearisation for Temporal Gaussian Processes,” 12.

Wilkinson, W. J., M. Riis Andersen, J. D. Reiss, D. Stowell, and A. Solin. 2019. “Unifying Probabilistic Models for Time-Frequency Analysis.” In ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 3352–6. https://doi.org/10.1109/ICASSP.2019.8682306.