Multi-output Gaussian process regression



Multi-task learning in GP regression, done by assuming that the vector of outputs is jointly distributed as a multivariate (vector-valued) Gaussian process.
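For concreteness, here is one standard way of writing that assumption down. The notation (p outputs, mean function m, matrix-valued kernel K) is my own choice for illustration and is not taken from any of the sources cited here.

```latex
f : \mathcal{X} \to \mathbb{R}^{p}, \qquad f \sim \mathcal{GP}(m, K),
\\
m(x) = \mathbb{E}[f(x)] \in \mathbb{R}^{p}, \qquad
K(x, x') = \operatorname{Cov}\bigl(f(x), f(x')\bigr) \in \mathbb{R}^{p \times p},
\\
[K(x, x')]_{ab} = \operatorname{Cov}\bigl(f_a(x), f_b(x')\bigr), \qquad a, b \in \{1, \dots, p\}.
```

The off-diagonal entries of K are the cross-covariances between outputs; they are what lets observations of one output inform predictions of another.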

WARNING: Under heavy construction at the moment, and makes no sense.

My favourite introduction here is by Eric Perim, Wessel Bruinsma, and Will Tebbutt: a series of blog posts, spun off from a paper (Bruinsma et al. 2020), that attempt to unify various approaches to defining vector-valued GPs. A unifying approach feels necessary; there is a lot of terminology going on here.

Now that I have this tool I am going to summarise it for myself to get a better understanding. It will probably supplant some of the older material below, and maybe also some of the GP factoring material.

To define in their terms: Co-regionalization…

The essential insight is that in practice we will probably assume a low-rank structure (to be defined) for the cross-covariance matrix, which means that the vector-valued process is some kind of linear mixing of scalar GPs. And there are only so many ways that can be done, as summarised in their diagram:
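To make the "linear mixing of scalar GPs" idea concrete for myself, here is a minimal NumPy sketch. Everything in it is an assumption chosen for illustration: the mixing matrix `H`, the squared-exponential kernels, the lengthscales, and the sizes `n`, `p`, `m`. It is not the construction from Bruinsma et al. (2020), just a generic instance of mixing m independent latent GPs into p outputs, so that the cross-covariance between outputs at a single input is H Hᵀ and has rank at most m.

```python
import numpy as np


def rbf(x, y, lengthscale):
    """Squared-exponential kernel matrix between 1-D input arrays x and y."""
    d = x[:, None] - y[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)


rng = np.random.default_rng(0)
n, p, m = 50, 4, 2            # number of inputs, outputs, latent scalar GPs (m < p)
x = np.linspace(0.0, 1.0, n)
H = rng.normal(size=(p, m))   # hypothetical mixing matrix, one column per latent GP
lengthscales = [0.1, 0.4]     # hypothetical lengthscale for each latent GP

# Draw one sample path from each independent latent GP; z has shape (m, n).
jitter = 1e-6 * np.eye(n)     # small diagonal term for numerical stability
z = np.stack([
    np.linalg.cholesky(rbf(x, x, ls) + jitter) @ rng.normal(size=n)
    for ls in lengthscales
])

# Linear mixing: every output is a weighted sum of the latent processes.
f = H @ z                     # shape (p, n)

# Implied covariance of the stacked outputs (output-major ordering):
#   K_f = sum_i (h_i h_i^T) kron k_i(x, x)
K_f = sum(
    np.kron(np.outer(H[:, i], H[:, i]), rbf(x, x, ls))
    for i, ls in enumerate(lengthscales)
)

# Cross-covariance between outputs at a single input: H H^T, rank at most m.
B = H @ H.T
rank_B = np.linalg.matrix_rank(B)
```

With these settings `B` is a 4×4 matrix of rank 2. That is the low-rank structure in question: four correlated outputs are explained by two scalar GPs.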

[Figure: The Mixing Model Hierarchy summarizes lots of approaches]

We could of course assume non-linear mixing, but then we will be doing some other things, perhaps variational autoencoding or Deep GPs.

Tooling

Most of the GP toolkits do multi-output as well.

Here is one with some interesting documentation. Its description reads:

This repository provides a toolkit to perform multi-output GP regression with kernels that are designed to utilize correlation information among channels in order to better model signals. The toolkit is mainly targeted to time-series, and includes plotting functions for the case of single input with multiple outputs (time series with several channels).

The main kernel corresponds to Multi Output Spectral Mixture Kernel, which correlates every pair of data points (irrespective of their channel of origin) to model the signals. This kernel is specified in detail in Parra and Tobar (2017).

References

Adler, Robert J., and Jonathan E. Taylor. 2007. Random Fields and Geometry. Springer Monographs in Mathematics 115. New York: Springer. https://doi.org/10.1007/978-0-387-48116-6.
Adler, Robert J, Jonathan E Taylor, and Keith J Worsley. 2016. Applications of Random Fields and Geometry Draft. https://robert.net.technion.ac.il/files/2016/08/hrf1.pdf.
Álvarez, Mauricio A., and Neil D. Lawrence. 2011. “Computationally Efficient Convolved Multiple Output Gaussian Processes.” Journal of Machine Learning Research 12 (41): 1459–1500. http://jmlr.org/papers/v12/alvarez11a.html.
Álvarez, Mauricio A., Lorenzo Rosasco, and Neil D. Lawrence. 2012. “Kernels for Vector-Valued Functions: A Review.” Foundations and Trends® in Machine Learning 4 (3): 195–266. https://doi.org/10.1561/2200000036.
Bonilla, Edwin V., Kian Ming A. Chai, and Christopher K. I. Williams. 2007. “Multi-Task Gaussian Process Prediction.” In Proceedings of the 20th International Conference on Neural Information Processing Systems, 153–60. NIPS’07. USA: Curran Associates Inc. http://dl.acm.org/citation.cfm?id=2981562.2981582.
Bruinsma, Wessel, Eric Perim, William Tebbutt, Scott Hosking, Arno Solin, and Richard Turner. 2020. “Scalable Exact Inference in Multi-Output Gaussian Processes.” In International Conference on Machine Learning, 1190–1201. PMLR. http://proceedings.mlr.press/v119/bruinsma20a.html.
Dai, Zhenwen, Mauricio Álvarez, and Neil Lawrence. 2017. “Efficient Modeling of Latent Information in Supervised Learning Using Gaussian Processes.” Advances in Neural Information Processing Systems 30: 5131–39. https://proceedings.neurips.cc/paper/2017/hash/1680e9fa7b4dd5d62ece800239bb53bd-Abstract.html.
Evgeniou, Theodoros, Charles A. Micchelli, and Massimiliano Pontil. 2005. “Learning Multiple Tasks with Kernel Methods.” Journal of Machine Learning Research 6 (Apr): 615–37. http://www.jmlr.org/papers/v6/evgeniou05a.html.
Evgeniou, Theodoros, and Massimiliano Pontil. 2004. “Regularized Multi–Task Learning.” In Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 109–17. KDD ’04. New York, NY, USA: Association for Computing Machinery. https://doi.org/10.1145/1014052.1014067.
Gelfand, Alan, and Sudipto Banerjee. 2010. “Multivariate Spatial Process Models.” In Handbook of Spatial Statistics, edited by Alan Gelfand, Peter Diggle, Montserrat Fuentes, and Peter Guttorp, 495–515. CRC Press. https://doi.org/10.1201/9781420072884-c28.
Gneiting, Tilmann, William Kleiber, and Martin Schlather. 2010. “Matérn Cross-Covariance Functions for Multivariate Random Fields.” Journal of the American Statistical Association 105 (491): 1167–77. https://doi.org/10.1198/jasa.2010.tm09420.
Leibfried, Felix, Vincent Dutordoir, S. T. John, and Nicolas Durrande. 2021. “A Tutorial on Sparse Gaussian Processes and Variational Inference.” arXiv:2012.13962 [cs, Stat], June. http://arxiv.org/abs/2012.13962.
Micchelli, Charles A., and Massimiliano Pontil. 2005a. “Learning the Kernel Function via Regularization.” Journal of Machine Learning Research 6 (Jul): 1099–1125. http://www.jmlr.org/papers/v6/micchelli05a.html.
———. 2005b. “On Learning Vector-Valued Functions.” Neural Computation 17 (1): 177–204. https://doi.org/10.1162/0899766052530802.
Moreno-Muñoz, Pablo, Antonio Artés, and Mauricio Álvarez. 2018. “Heterogeneous Multi-Output Gaussian Process Prediction.” In Advances in Neural Information Processing Systems, edited by S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, and R. Garnett, 31:6711–20. Curran Associates, Inc. https://proceedings.neurips.cc/paper/2018/file/165a59f7cf3b5c4396ba65953d679f17-Paper.pdf.
Moreno-Muñoz, Pablo, Antonio Artés-Rodríguez, and Mauricio A. Álvarez. 2019. “Continual Multi-Task Gaussian Processes.” arXiv:1911.00002 [cs, Stat], October. http://arxiv.org/abs/1911.00002.
Osborne, M. A., S. J. Roberts, A. Rogers, S. D. Ramchurn, and N. R. Jennings. 2008. “Towards Real-Time Information Processing of Sensor Network Data Using Computationally Efficient Multi-Output Gaussian Processes.” In 2008 International Conference on Information Processing in Sensor Networks (ipsn 2008), 109–20. https://doi.org/10.1109/IPSN.2008.25.
Parra, Gabriel, and Felipe Tobar. 2017. “Spectral Mixture Kernels for Multi-Output Gaussian Processes.” In Advances in Neural Information Processing Systems, edited by I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, 30:6681–90. Curran Associates, Inc. https://proceedings.neurips.cc/paper/2017/file/333cb763facc6ce398ff83845f224d62-Paper.pdf.
Schlather, Martin, Alexander Malinowski, Peter J. Menck, Marco Oesting, and Kirstin Strokorb. 2015. “Analysis, Simulation and Prediction of Multivariate Random Fields with Package Random Fields.” Journal of Statistical Software 63 (8): 1. https://doi.org/10.18637/jss.v063.i08.
Seeger, Matthias, Yee-Whye Teh, and Michael I Jordan. 2005. “Semiparametric Latent Factor Models,” 31. http://infoscience.epfl.ch/record/161465.
Stegle, Oliver, Christoph Lippert, Joris Mooij, Neil Lawrence, and Karsten Borgwardt. 2011. “Efficient Inference in Matrix-Variate Gaussian Models with Iid Observation Noise.” In Proceedings of the 24th International Conference on Neural Information Processing Systems, 630–38. NIPS’11. Red Hook, NY, USA: Curran Associates Inc. https://papers.nips.cc/paper/4281-efficient-inference-in-matrix-variate-gaussian-models-with-iid-observation-noise.pdf.
Williams, Christopher, Stefan Klanke, Sethu Vijayakumar, and Kian M. Chai. 2009. “Multi-Task Gaussian Process Learning of Robot Inverse Dynamics.” In Advances in Neural Information Processing Systems 21, edited by D. Koller, D. Schuurmans, Y. Bengio, and L. Bottou, 265–72. Curran Associates, Inc. http://papers.nips.cc/paper/3385-multi-task-gaussian-process-learning-of-robot-inverse-dynamics.pdf.
