Gaussian processes

2016-08-06 — 2021-06-23

Wherein Gaussian processes are presented as probability laws over functions on domains such as R^d, being specified by their mean and covariance kernel, and being employed in regression and spatial inference.

Gaussian

Hilbert space

kernel tricks

Lévy processes

nonparametric

regression

spatial

stochastic processes

time series

Assumed audience:

ML people

“Gaussian processes” are stochastic processes/fields whose values at any finite set of observation locations are jointly Gaussian. The example most familiar to finance and physics people is the Wiener process (Brownian motion), which is a Gauss-Markov process, but there are many others. These processes are convenient because of useful properties of the multivariate Gaussian distribution, e.g., being uniquely specified by first and second moments, closure under linear operations, and compatibility with kernel tricks. Especially famous applications include Gaussian process regression and spatial statistics. Check out Ti’s Interactive visualization for some examples.

Figure 1: Gauss, with what I believe is possibly the telegraph he invented. That is not the Gaussian process I mean here — Gauss, after all, did not invent that — I just think it is cool.

Gaussian processes are, specifically, probability distributions over random functions \(\mathcal{T}\to \mathbb{R}\) for some index (or argument) set \(\mathcal{T}\), often taken to be \(\mathcal{T}:=\mathbb{R}^d\).

We typically work with a mean-zero process, meaning that for every finite set \(\mathbf{f}:=\{f(t_k);k=1,\dots,K\}\) of observations of that process, the joint distribution is mean-zero Gaussian, \[\begin{aligned} f(t) &\sim \operatorname{GP}\left(0, \kappa(t, t';\mathbf{\theta})\right) \\ &\Rightarrow\\ p(\mathbf{f}) &=(2\pi)^{-{\frac {K}{2}}}\det({\boldsymbol {\mathrm{K} }})^{-{\frac {1}{2}}}\,e^{-{\frac {1}{2}}\mathbf {f}^{\!{\mathsf {T}}}{\boldsymbol {\mathrm{K} }}^{-1}\mathbf {f}}\\ &=\mathcal{N}(\mathbf{f};0, \mathrm{K}). \end{aligned}\] where \(\mathrm{K}\) is the covariance (Gram) matrix of the observations, with entries \(\mathrm{K}_{jk}=\kappa(t_j,t_k).\) Here \(\kappa\) is the covariance kernel, with parameters \(\mathbf{\theta}\), which maps a pair of function arguments \((t, t')\) to the covariance of the function values at those arguments. Because the mean is zero and a Gaussian is determined by its first two moments, specifying the kernel fixes every finite-dimensional distribution of the process.
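As a minimal sketch of the finite-dimensional density above, the following builds a Gram matrix and evaluates \(\log p(\mathbf{f})\) through a Cholesky factor. The squared-exponential kernel, the function names `sq_exp_kernel` and `gp_log_density`, and the small diagonal `jitter` are all illustrative assumptions, not something this notebook prescribes.

```python
import numpy as np


def sq_exp_kernel(t, t_prime, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel kappa(t, t') (an illustrative choice).

    t has shape (J, d) and t_prime has shape (K, d); returns a (J, K) matrix.
    """
    sq_dist = np.sum((t[:, None, :] - t_prime[None, :, :]) ** 2, axis=-1)
    return variance * np.exp(-0.5 * sq_dist / lengthscale**2)


def gp_log_density(f, gram, jitter=1e-9):
    """log N(f; 0, K), computed via a Cholesky factor of K + jitter * I."""
    n = gram.shape[0]
    chol = np.linalg.cholesky(gram + jitter * np.eye(n))
    # Solve L a = f, so that f^T K^{-1} f = a^T a.
    alpha = np.linalg.solve(chol, f)
    # log det K = 2 * sum(log(diag(L))).
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))
    return -0.5 * (n * np.log(2.0 * np.pi) + log_det + alpha @ alpha)


t = np.linspace(0.0, 1.0, 5)[:, None]        # K = 5 locations in R^1
gram = sq_exp_kernel(t, t, lengthscale=0.3)  # entries K_jk = kappa(t_j, t_k)
f = np.sin(2.0 * np.pi * t[:, 0])            # an arbitrary vector of values
log_p = gp_log_density(f, gram)
```

Working with the Cholesky factor avoids forming \(\mathrm{K}^{-1}\) explicitly, which is usually the more stable choice when the Gram matrix is close to singular.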

1 Simulation/generation

See GP simulation.
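For orientation, here is a rough sketch of the simplest way to simulate a mean-zero process on a finite grid: factor the Gram matrix and multiply by standard normal noise. The kernel choice, the helper names `sq_exp_gram` and `sample_gp_prior`, and the `jitter` value are assumptions made for illustration; the linked page covers methods that scale better than this cubic-cost approach.

```python
import numpy as np


def sq_exp_gram(t, lengthscale=1.0):
    """Gram matrix of a squared-exponential kernel on 1-D locations t."""
    diff = t[:, None] - t[None, :]
    return np.exp(-0.5 * (diff / lengthscale) ** 2)


def sample_gp_prior(t, n_samples, lengthscale=1.0, jitter=1e-6, seed=0):
    """Draw paths f = L z, where L L^T = K + jitter * I and z ~ N(0, I)."""
    rng = np.random.default_rng(seed)
    gram = sq_exp_gram(t, lengthscale) + jitter * np.eye(t.size)
    chol = np.linalg.cholesky(gram)
    z = rng.standard_normal((t.size, n_samples))
    return (chol @ z).T  # shape (n_samples, len(t))


t = np.linspace(0.0, 5.0, 50)
paths = sample_gp_prior(t, n_samples=3, lengthscale=1.0)
```

Each row of `paths` is one draw of \(\mathbf{f}\) at the grid points; the jitter is there only to keep the factorization numerically positive definite.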

2 Derivatives and integrals

2.1 Integral of a Gaussian process

See stackexchange.
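As a sketch of the standard result, assuming \(\mathcal{T}=\mathbb{R}\), a mean-zero process \(f\) with kernel \(\kappa\), and enough regularity for the integral to exist in mean square: integration is a linear operation, so the running integral is again a mean-zero Gaussian process, jointly Gaussian with \(f\). The symbol \(F\) is introduced here for illustration only.

\[\begin{aligned} F(t) &:= \int_0^t f(s)\,\mathrm{d}s,\\ \operatorname{Cov}\left(F(t), F(t')\right) &= \int_0^t\int_0^{t'} \kappa(s, s')\,\mathrm{d}s'\,\mathrm{d}s,\\ \operatorname{Cov}\left(F(t), f(t')\right) &= \int_0^t \kappa(s, t')\,\mathrm{d}s. \end{aligned}\]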

2.2 Derivative of a Gaussian process

TBD.

For now, see these blog posts:

I am using results from Adler (2010), Adler and Taylor (2007). See also pathwise GPs for some useful results here.
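In the meantime, a sketch of the usual mean-square statement, assuming \(\mathcal{T}=\mathbb{R}\), a mean-zero process with kernel \(\kappa\), and that the mixed partial derivative below exists and is continuous; this is the textbook form, not necessarily the one the cited references use. Differentiation is linear, so \(f'\) is again a mean-zero Gaussian process, jointly Gaussian with \(f\):

\[\begin{aligned} \operatorname{Cov}\left(f'(t), f(t')\right) &= \frac{\partial}{\partial t}\kappa(t,t'),\\ \operatorname{Cov}\left(f'(t), f'(t')\right) &= \frac{\partial^2}{\partial t\,\partial t'}\kappa(t,t'). \end{aligned}\]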

3 Exceedance probabilities

(Adler 2010; Adler and Taylor 2007; Chung 2020; Taylor 2009)

4 Incoming

5 References

Abrahamsen. 1997. “A Review of Gaussian Random Fields and Correlation Functions.”

Adler. 2010. The Geometry of Random Fields.

Adler, and Taylor. 2007. Random Fields and Geometry. Springer Monographs in Mathematics 115.

Agrell. 2019. “Gaussian Processes with Linear Operator Inequality Constraints.” Journal of Machine Learning Research.

Alexanderian. 2015. “A Brief Note on the Karhunen-Loève Expansion.” arXiv:1509.07526 [Math].

Bochner. 1959. Lectures on Fourier Integrals.

Chung. 2020. “Introduction to Random Fields.”

Dym, and McKean. 2008. Gaussian Processes, Function Theory, and the Inverse Spectral Problem. Dover Books on Mathematics.

Kanagawa, and Fukumizu. 2014. “Recovering Distributions from Gaussian RKHS Embeddings.” In Journal of Machine Learning Research.

Kanagawa, Hennig, Sejdinovic, et al. 2018. “Gaussian Processes and Kernel Methods: A Review on Connections and Equivalences.” arXiv:1807.02582 [Cs, Stat].

Khintchine. 1934. “Korrelationstheorie der stationären stochastischen Prozesse.” Mathematische Annalen.

Lange-Hegermann. 2018. “Algorithmic Linearly Constrained Gaussian Processes.” In Proceedings of the 32nd International Conference on Neural Information Processing Systems. NIPS’18.

———. 2021. “Linearly Constrained Gaussian Processes with Boundary Conditions.” In Proceedings of the 24th International Conference on Artificial Intelligence and Statistics. Proceedings of Machine Learning Research.

Liu, and Röckner. 2015. Stochastic Partial Differential Equations: An Introduction.

Long, Wang, Krishnapriyan, et al. 2022. “AutoIP: A United Framework to Integrate Physics into Gaussian Processes.” In Proceedings of the 39th International Conference on Machine Learning.

Lukić, and Beder. 2001. “Stochastic Processes with Sample Paths in Reproducing Kernel Hilbert Spaces.” Transactions of the American Mathematical Society.

Majumdar, and Majumdar. 2019. “On the Conditional Distribution of a Multivariate Normal Given a Transformation – the Linear Case.” Heliyon.

Papoulis. 1984. Probability, Random Variables and Stochastic Processes.

Rasmussen, and Nickisch. 2010. “Gaussian Processes for Machine Learning (GPML) Toolbox.” Journal of Machine Learning Research.

Rasmussen, and Williams. 2006. Gaussian Processes for Machine Learning. Adaptive Computation and Machine Learning.

Taylor. 2009. “Random Fields.”

Yaglom. 1987. Correlation Theory of Stationary and Related Random Functions. Volume II: Supplementary Notes and References. Springer Series in Statistics.

Zhang, Liu, Chen, et al. 2022. “On the Properties of Kullback-Leibler Divergence Between Multivariate Gaussian Distributions.”