Gaussian processes on lattices

\(\renewcommand{\var}{\operatorname{Var}} \renewcommand{\dd}{\mathrm{d}} \renewcommand{\pd}{\partial} \renewcommand{\bb}[1]{\mathbb{#1}} \renewcommand{\vv}[1]{\boldsymbol{#1}} \renewcommand{\mm}[1]{\mathrm{#1}} \renewcommand{\mmm}[1]{\mathrm{#1}} \renewcommand{\cc}[1]{\mathcal{#1}} \renewcommand{\oo}[1]{\operatorname{#1}} \renewcommand{\gvn}{\mid} \renewcommand{\II}[1]{\mathbb{I}\{#1\}} \renewcommand{\inner}[2]{\langle #1,#2\rangle} \renewcommand{\Inner}[2]{\left\langle #1,#2\right\rangle} \renewcommand{\norm}[1]{\| #1\|} \renewcommand{\Norm}[1]{\left\| #1\right\|} \renewcommand{\argmax}{\mathop{\mathrm{argmax}}} \renewcommand{\argmin}{\mathop{\mathrm{argmin}}} \renewcommand{\omp}{\mathop{\mathrm{OMP}}}\)

Gaussian process inference with a stationary kernel is faster if the inputs lie on a regular grid of points. I have not used this trick myself, but I understand it involves various simplifications arising from the structure of the Gram matrix, which under lexical ordering of the input points becomes a Kronecker product of Toeplitz matrices (for kernels that factor across dimensions; this sounds plausible but I have not worked it out on paper). The keyword for this in the ML literature is Kronecker inference (e.g. Saatçi 2012; Flaxman et al. 2015; A. G. Wilson and Nickisch 2015). Another keyword is circulant embedding, the extension of a Toeplitz matrix to a circulant matrix, which enables one to leverage fast Fourier transforms to calculate some quantities of interest, and confers some other nice linear-algebra properties besides. That would imply that this method has its origins in Whittle likelihoods (P. Whittle 1953a; Peter Whittle 1952), if I am not mistaken.
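
To make the Kronecker claim concrete, here is a small numpy check (my own illustration, not taken from any of the cited papers): for a separable (product) kernel evaluated on a lexically ordered 2-d grid, the full Gram matrix is exactly the Kronecker product of the per-dimension Gram matrices, and each factor is Toeplitz because the grid is equally spaced.

```python
import numpy as np

def rbf(x, y, ell=1.0):
    # Squared-exponential kernel on 1-d inputs (an arbitrary illustrative choice).
    return np.exp(-0.5 * (x[:, None] - y[None, :]) ** 2 / ell**2)

# Two 1-d grids; the 2-d input set is their Cartesian product in lexical order.
x1 = np.linspace(0.0, 1.0, 4)
x2 = np.linspace(0.0, 1.0, 5)
K1, K2 = rbf(x1, x1), rbf(x2, x2)

# Full Gram matrix of the product kernel on all 20 grid points.
X = np.array([(a, b) for a in x1 for b in x2])
K_full = np.exp(-0.5 * ((X[:, None, 0] - X[None, :, 0]) ** 2
                        + (X[:, None, 1] - X[None, :, 1]) ** 2))

# Kronecker structure under lexical ordering.
assert np.allclose(K_full, np.kron(K1, K2))
```

Note that this identity needs the kernel to factor over dimensions; a general (non-separable) stationary kernel on a grid gives block-Toeplitz rather than Kronecker structure.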

Lattice GP methods complement, perhaps, the trick of filtering Gaussian processes, which can also exploit structured lattice inputs, although the setup differs between the two.


Simulating Gaussian fields with the desired covariance structure

Following the introduction in (Dietrich and Newsam 1993):

Let’s say we wish to generate a stationary Gaussian process \(Y(x)\) on a set of points \(\Omega=(x_0, x_1,\dots, x_m)\).

Stationary in this context means that the covariance function \(r\) is translation-invariant and depends only on distance, so that it may be written \(r(|x|)\). Without loss of generality, we assume that \(\bb E[Y(x)]=0\) and \(\var[Y(x)]=1\).

The problem then reduces to generating a vector \(\vv y=(Y(x_0), Y(x_1), \dots, Y(x_m) )\sim \mathcal{N}(0, R)\) where \(R\) has entries \(R[p,q]=r(|x_p-x_q|).\)

Note that if \(\vv \varepsilon\sim\mathcal{N}(0, \mm I)\) is an \((m+1)\)-dimensional standard normal random variable and \(\mm A\) satisfies \(\mm A\mm A^T=R\), then \(\vv y=\mm A \vv \varepsilon\) has the required distribution.
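
As a baseline, this generic recipe is a few lines of numpy, taking \(\mm A\) to be the Cholesky factor of \(R\) (the covariance \(r(|x|)=e^{-|x|}\) is a purely illustrative choice; any valid \(r\) works). The catch is that the Cholesky factorization costs \(O(m^3)\), which is what the grid structure lets us avoid.

```python
import numpy as np

rng = np.random.default_rng(0)

# Equally spaced grid and an illustrative stationary covariance r(|x|) = exp(-|x|).
x = 0.5 * np.arange(9)
R = np.exp(-np.abs(np.subtract.outer(x, x)))

# Any A with A A^T = R works; Cholesky is the standard O(m^3) choice.
A = np.linalg.cholesky(R)
y = A @ rng.standard_normal(x.size)   # y ~ N(0, R)
```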

The circulant embedding trick

If we have additional structure, we can calculate this more efficiently.

Suppose further that our points form a regular grid, \(\Omega=(x_0, x_0+h,\dots, x_0+mh)\); that is, equally spaced points on a line.

We know that \(R\) has a Toeplitz structure. Moreover it is non-negative definite, with \(\vv x^T R \vv x \geq 0\;\forall \vv x\): it is a covariance matrix, so \(\vv x^T R \vv x = \var[\vv x^T \vv y]\geq 0.\)
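
Here is a minimal numpy sketch of the circulant embedding recipe of Dietrich and Newsam (1993), assuming an exponential covariance (a case where the embedding happens to be non-negative definite): extend the first row of \(R\) into the first column of a \(2m\)-circulant, diagonalize it with the FFT, and read off two independent real draws from one complex sample.

```python
import numpy as np

rng = np.random.default_rng(1)

m, h = 255, 0.1
r = np.exp(-h * np.arange(m + 1))       # r(|x_p - x_q|) along the grid
n = 2 * m                               # circulant size
c = np.concatenate([r, r[-2:0:-1]])     # first column (r_0,...,r_m,r_{m-1},...,r_1)
lam = np.fft.fft(c).real                # eigenvalues of the circulant embedding
assert lam.min() > -1e-8                # non-negative definite for this covariance

# One FFT of weighted complex white noise yields two independent N(0, R)
# draws, in the real and imaginary parts of the first m+1 entries.
xi = rng.standard_normal(n) + 1j * rng.standard_normal(n)
z = np.fft.fft(np.sqrt(np.maximum(lam, 0.0) / n) * xi)
y1, y2 = z.real[: m + 1], z.imag[: m + 1]

# Sanity check: the top-left block of the circulant really is the Toeplitz R.
idx = (np.arange(n)[:, None] - np.arange(n)[None, :]) % n
R = np.exp(-h * np.abs(np.subtract.outer(np.arange(m + 1), np.arange(m + 1))))
assert np.allclose(c[idx][: m + 1, : m + 1], R)
```

For general covariances the embedding can fail to be non-negative definite, in which case one pads to a larger circulant or truncates the negative eigenvalues (approximately).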

J. T. Wilson et al. (2021) credit the following authors:

Well-known examples of this trend include banded and sparse matrices in the context of one-dimensional Gaussian processes and Gauss–Markov random fields (Durrande et al. 2019), as well as Kronecker and Toeplitz matrices when working with regularly-spaced grids (Dietrich and Newsam 1997; Grace Chan and Wood 1997).

Flaxman et al. (2015) deal with the lattice-structure trick in a non-Gaussian-likelihood setting, using a Laplace approximation.

In regression

OK, so what does this allow us to do with posterior inference in GPs? Apparently quite a lot. The KISS-GP method (A. G. Wilson and Nickisch 2015) leverages something similar, as do Saatçi (2012) and Flaxman et al. (2015).



Abrahamsen, P., V. Kvernelv, and D. Barker. 2018. β€œSimulation of Gaussian Random Fields Using the Fast Fourier Transform (FFT).” In, 2018:1–14. European Association of Geoscientists & Engineers.
Alexanderian, Alen. 2015. β€œA Brief Note on the Karhunen-LoΓ¨ve Expansion.” arXiv:1509.07526 [Math], October.
Chan, Grace, and Andrew T.A. Wood. 1997. β€œAlgorithm AS 312: An Algorithm for Simulating Stationary Gaussian Random Fields.” Journal of the Royal Statistical Society: Series C (Applied Statistics) 46 (1): 171–81.
Chan, G., and A. T. A. Wood. 1999. β€œSimulation of Stationary Gaussian Vector Fields.” Statistics and Computing 9 (4): 265–68.
Choromanski, Krzysztof, and Vikas Sindhwani. 2016. β€œRecycling Randomness with Structure for Sublinear Time Kernel Expansions.” arXiv:1605.09049 [Cs, Stat], May.
Choudhuri, Nidhan, Subhashis Ghosal, and Anindya Roy. 2004. β€œContiguity of the Whittle Measure for a Gaussian Time Series.” Biometrika 91 (1): 211–18.
Cotter, S. L., G. O. Roberts, A. M. Stuart, and D. White. 2013. β€œMCMC Methods for Functions: Modifying Old Algorithms to Make Them Faster.” Statistical Science 28 (3): 424–46.
Davies, Tilman M., and David Bryant. 2013. β€œOn Circulant Embedding for Gaussian Random Fields in R.” Journal of Statistical Software 55 (9).
Dietrich, C. R., and G. N. Newsam. 1993. β€œA Fast and Exact Method for Multidimensional Gaussian Stochastic Simulations.” Water Resources Research 29 (8): 2861–69.
β€”β€”β€”. 1997. β€œFast and Exact Simulation of Stationary Gaussian Processes Through Circulant Embedding of the Covariance Matrix.” SIAM Journal on Scientific Computing 18 (4): 1088–1107.
Durrande, Nicolas, Vincent Adam, Lucas Bordeaux, Stefanos Eleftheriadis, and James Hensman. 2019. β€œBanded Matrix Operators for Gaussian Markov Models in the Automatic Differentiation Era.” In Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics, 2780–89. PMLR.
Ellis, Robert L., and David C. Lay. 1992. β€œFactorization of Finite Rank Hankel and Toeplitz Matrices.” Linear Algebra and Its Applications 173 (August): 19–38.
Erhel, Jocelyne, Mestapha Oumouni, GΓ©raldine Pichot, and Franck Schoefs. n.d. β€œAnalysis of Continuous Spectral Method for Sampling Stationary Gaussian Random Fields,” 26.
Flaxman, Seth, Andrew Gordon Wilson, Daniel B Neill, Hannes Nickisch, and Alexander J Smola. 2015. β€œFast Kronecker Inference in Gaussian Processes with Non-Gaussian Likelihoods.” In, 10.
Gilboa, E., Y. SaatΓ§i, and J. P. Cunningham. 2015. β€œScaling Multidimensional Inference for Structured Gaussian Processes.” IEEE Transactions on Pattern Analysis and Machine Intelligence 37 (2): 424–36.
Graham, Ivan G., Frances Y. Kuo, Dirk Nuyens, Rob Scheichl, and Ian H. Sloan. 2017a. β€œAnalysis of Circulant Embedding Methods for Sampling Stationary Random Fields.” arXiv:1710.00751 [Math], October.
β€”β€”β€”. 2017b. β€œCirculant Embedding with QMC – Analysis for Elliptic PDE with Lognormal Coefficients.” arXiv:1710.09254 [Math], October.
Gray, Robert M. 2006. Toeplitz and Circulant Matrices: A Review. Vol. 2.
Guinness, Joseph, and Montserrat Fuentes. 2016. β€œCirculant Embedding of Approximate Covariances for Inference From Gaussian Data on Large Lattices.” Journal of Computational and Graphical Statistics 26 (1): 88–97.
Heinig, Georg, and Karla Rost. 2011. β€œFast Algorithms for Toeplitz and Hankel Matrices.” Linear Algebra and Its Applications 435 (1): 1–59.
Kroese, Dirk P., and Zdravko I. Botev. 2013. β€œSpatial Process Generation.” arXiv:1308.0399 [Stat], August.
Lang, Annika, and JΓΌrgen Potthoff. 2011. β€œFast Simulation of Gaussian Random Fields.” Monte Carlo Methods and Applications 17 (3).
Loper, Jackson, David Blei, John P. Cunningham, and Liam Paninski. 2021. β€œLinear-Time Inference for Gaussian Processes on One Dimension.” arXiv:2003.05554 [Cs, Stat], October.
Nowak, W., and A. Litvinenko. 2013. β€œKriging and Spatial Design Accelerated by Orders of Magnitude: Combining Low-Rank Covariance Approximations with FFT-Techniques.” Mathematical Geosciences 45 (4): 411–35.
Powell, Catherine E. 2014. β€œGenerating Realisations of Stationary Gaussian Random Fields by Circulant Embedding.” Matrix 2 (2): 1.
Rue, Havard. 2001. β€œFast Sampling of Gaussian Markov Random Fields.” Journal of the Royal Statistical Society. Series B (Statistical Methodology) 63 (2): 325–38.
Rue, HΓ₯vard, and Leonhard Held. 2005. Gaussian Markov Random Fields: Theory and Applications. Monographs on Statistics and Applied Probability 104. Boca Raton: Chapman & Hall/CRC.
SaatΓ§i, Yunus. 2012. β€œScalable inference for structured Gaussian process models.” Ph.D., University of Cambridge.
SaatΓ§i, Yunus, Ryan Turner, and Carl Edward Rasmussen. 2010. β€œGaussian Process Change Point Models.” In Proceedings of the 27th International Conference on International Conference on Machine Learning, 927–34. ICML’10. Madison, WI, USA: Omnipress.
Sigrist, Fabio, Hans R. KΓΌnsch, and Werner A. Stahel. 2015a. β€œSpate : An R Package for Spatio-Temporal Modeling with a Stochastic Advection-Diffusion Process.” Application/pdf. Journal of Statistical Software 63 (14).
β€”β€”β€”. 2015b. β€œStochastic Partial Differential Equation Based Modelling of Large Space-Time Data Sets.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 77 (1): 3–33.
Stroud, Jonathan R., Michael L. Stein, and Shaun Lysen. 2017. β€œBayesian and Maximum Likelihood Estimation for Gaussian Processes on an Incomplete Lattice.” Journal of Computational and Graphical Statistics 26 (1): 108–20.
Teichmann, Jakob, and Karl-Gerald van den Boogaart. 2016. β€œEfficient Simulation of Stationary Multivariate Gaussian Random Fields with Given Cross-Covariance.” Applied Mathematics 7 (17): 2183–94.
Whittle, P. 1952. β€œTests of Fit in Time Series.” Biometrika 39 (3-4): 309–18.
β€”β€”β€”. 1953a. β€œThe Analysis of Multiple Stationary Time Series.” Journal of the Royal Statistical Society: Series B (Methodological) 15 (1): 125–39.
β€”β€”β€”. 1953b. β€œEstimation and Information in Stationary Time Series.” Arkiv FΓΆr Matematik 2 (5): 423–34.
β€”β€”β€”. 1954. β€œOn Stationary Processes in the Plane.” Biometrika 41 (3/4): 434–49.
Whittle, Peter. 1952. β€œSome Results in Time Series Analysis.” Scandinavian Actuarial Journal 1952 (1-2): 48–60.
Wilson, Andrew Gordon, Christoph Dann, and Hannes Nickisch. 2015. β€œThoughts on Massively Scalable Gaussian Processes.” arXiv:1511.01870 [Cs, Stat], November.
Wilson, Andrew Gordon, and Hannes Nickisch. 2015. β€œKernel Interpolation for Scalable Structured Gaussian Processes (KISS-GP).” In Proceedings of the 32nd International Conference on International Conference on Machine Learning - Volume 37, 1775–84. ICML’15. Lille, France.
Wilson, James T., Viacheslav Borovitskiy, Alexander Terenin, Peter Mostowsky, and Marc Peter Deisenroth. 2021. β€œPathwise Conditioning of Gaussian Processes.” Journal of Machine Learning Research 22 (105): 1–47.
Ye, Ke, and Lek-Heng Lim. 2016. β€œEvery Matrix Is a Product of Toeplitz Matrices.” Foundations of Computational Mathematics 16 (3): 577–98.
