# Spatial processes and statistics thereof

July 29, 2011 — January 28, 2022

Statistics on fields whose index sets have more than one dimension of support and, frequently, an implicit 2-norm. Sometimes the fields are also time-indexed. I especially mean processes on a continuous index set with continuous state and undirected interaction. Sometimes these live on fancy manifolds, although often you can get away with plain old Euclidean space, unless you are doing spatial statistics over the entire planet, which turns out to be curved.

Lattice models are frequently considered spatial statistics, but more arbitrary graph structures usually get filed under undirected graphical models/random fields. Spatial point processes get their own notebook.

Often we mean some kind of Gaussian process regression to handle spatial statistics, although the use of these tools in the ML and spatial literatures is weirdly disjoint. There are many other random fields we might also wish to infer that relate to spatial index sets, and these can be taxonomised as I notice their existence.

There are lots of interesting problems with statistics on such fields. Consider the illustrative problem of declustering.
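Declustering can be made concrete with a small sketch. One simple textbook approach is cell declustering: overlay a grid, give every occupied cell the same total weight, and split that weight among the samples in the cell. The function name, cell size, and toy data below are illustrative assumptions, not taken from any of the references here.

```python
import numpy as np

def cell_declustering_weights(sites, cell_size):
    """Weights that give each occupied grid cell the same total weight."""
    # Integer cell index of every sample site.
    cells = np.floor(sites / cell_size).astype(int)
    _, inverse, counts = np.unique(
        cells, axis=0, return_inverse=True, return_counts=True
    )
    inverse = inverse.reshape(-1)  # guard against NumPy-version shape differences
    weights = 1.0 / counts[inverse]  # samples in crowded cells count for less
    return weights / weights.sum()

rng = np.random.default_rng(0)
# Toy data: one background sample per 0.25 x 0.25 cell of the unit square ...
grid = 0.125 + 0.25 * np.arange(4)
background = np.array([(x, y) for x in grid for y in grid])
# ... plus 20 preferentially sampled sites clustered in a high-valued corner.
cluster = rng.uniform(0.08, 0.12, size=(20, 2))

sites = np.vstack([background, cluster])
values = np.concatenate([np.ones(len(background)), np.full(len(cluster), 10.0)])

weights = cell_declustering_weights(sites, cell_size=0.25)
naive_mean = values.mean()
declustered_mean = float(weights @ values)
```

The naive mean is dominated by the oversampled corner, while the declustered mean sits much closer to the background level. The answer depends on the cell size, which is the usual weakness of this heuristic.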

I’m curious about how spatial statistics generalise to high-dimensional fields such as fitness landscapes, loss functions, and embedding of network processes in space…

## 1 Kriging

The spatial statistics name for Gaussian process regression. Many complications arise in the spatial context, as seen in, e.g., Bayesian inverse problems. Various cunning tricks are needed to make spatial GPs practical.
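For orientation, here is a minimal sketch of zero-mean (simple) kriging as plain Gaussian process regression in NumPy. The squared-exponential covariance, the noise variance, and the toy data are assumptions chosen for illustration. The dense Cholesky factorisation costs O(n³) in the number of sites, which is exactly why the cunning tricks are needed.

```python
import numpy as np

def sq_exp_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential covariance between two sets of 2-D sites."""
    # Pairwise squared Euclidean distances, shape (len(a), len(b)).
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return variance * np.exp(-0.5 * d2 / length_scale**2)

def simple_krige(sites, values, targets, noise_var=1e-2, **kernel_args):
    """Posterior mean and variance of a zero-mean GP at the target sites."""
    K = sq_exp_kernel(sites, sites, **kernel_args) + noise_var * np.eye(len(sites))
    K_star = sq_exp_kernel(targets, sites, **kernel_args)
    # Dense Cholesky factorisation: the O(n^3) bottleneck.
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, values))
    mean = K_star @ alpha
    v = np.linalg.solve(L, K_star.T)
    prior_var = kernel_args.get("variance", 1.0)
    var = prior_var - (v**2).sum(axis=0)
    return mean, var

rng = np.random.default_rng(0)
sites = rng.uniform(0.0, 1.0, size=(50, 2))
values = (
    np.sin(4 * sites[:, 0])
    + np.cos(3 * sites[:, 1])
    + 0.1 * rng.standard_normal(50)
)
# One target inside the sampled region, one far outside it.
targets = np.array([[0.5, 0.5], [5.0, 5.0]])
mean, var = simple_krige(sites, values, targets, noise_var=0.01, length_scale=0.3)
```

Far from the data the prediction reverts to the prior: mean zero and variance equal to the prior variance.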

A recent Fixed Rank Kriging paper (Zammit-Mangion and Cressie 2021) summarised some approaches:

Fixed rank kriging (FRK) is a spatial/spatio-temporal modeling and prediction framework that is scaleable, works well with large datasets, and can deal easily with data that have different spatial supports. FRK hinges on the use of a spatial random effects (SRE) model, in which a spatially correlated mean-zero random process is decomposed using a linear combination of spatial basis functions with random coefficients plus a term that captures the random process’ fine-scale variation. Dimensionality reduction through a relatively small number of basis functions ensures computationally efficient prediction, while the reconstructed spatial process is, in general, non-stationary. The SRE model has a spatial covariance function that is always nonnegative-definite and, because any (possibly non-orthogonal) basis functions can be used, it can be constructed so as to approximate standard families of covariance functions. For a detailed treatment of FRK, see Cressie and Johannesson (2008); Cressie, Shi, and Kang (2010); Nychka et al. (2015). […]
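To make the computational point in the passage above concrete, here is a minimal NumPy sketch; it is not the FRK package's implementation. It assumes Gaussian bump basis functions on a regular grid of centres, an identity coefficient covariance `K`, and a diagonal term `d` that lumps together fine-scale and measurement variance. All of these names and values are illustrative. With r basis functions and n observations, the data covariance is `S K Sᵀ + diag(d)`, and the Sherman-Morrison-Woodbury identity reduces the n × n solve to an r × r one.

```python
import numpy as np

def gaussian_basis(sites, centres, scale):
    """Evaluate Gaussian bump basis functions at the sites; shape (n, r)."""
    d2 = ((sites[:, None, :] - centres[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * d2 / scale**2)

def low_rank_solve(S, K, d, z):
    """Solve (S K S^T + diag(d)) x = z using only an r x r dense solve."""
    Dinv_z = z / d
    Dinv_S = S / d[:, None]
    inner = np.linalg.inv(K) + S.T @ Dinv_S  # r x r
    return Dinv_z - Dinv_S @ np.linalg.solve(inner, S.T @ Dinv_z)

rng = np.random.default_rng(1)
n, r = 500, 16
sites = rng.uniform(0.0, 1.0, size=(n, 2))
grid = np.linspace(0.125, 0.875, 4)
centres = np.array([(x, y) for x in grid for y in grid])

S = gaussian_basis(sites, centres, scale=0.25)
K = np.eye(r)                # hypothetical covariance of the basis coefficients
d = np.full(n, 0.05 + 0.01)  # hypothetical fine-scale + measurement variances
z = rng.standard_normal(n)   # stand-in data vector

x_fast = low_rank_solve(S, K, d, z)
# Reference answer via the full n x n system, for comparison only.
x_direct = np.linalg.solve(S @ K @ S.T + np.diag(d), z)

# Posterior mean of the basis coefficients given z.
eta_mean = K @ S.T @ x_fast
```

The cost of `low_rank_solve` is O(n r²) rather than O(n³), which is where the scalability comes from.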

A few variants of FRK have been developed to date, and the one that comes closest to the present software is LatticeKrig. LatticeKrig implements what we call a LatticeKrig (LTK) model, which is made up of Wendland basis functions (that have compact support) decomposing a spatially correlated process. LatticeKrig models use a Markov assumption to construct a precision matrix to describe the dependence between the coefficients of these basis functions. This, in turn, results in efficient computations and the potential use of a large number (> 10,000) of basis functions. LatticeKrig models do not cater for what we term fine-scale-process variation and, instead, the finest scale of the process is limited to the finest resolution of the basis functions used. The package INLA is a general-purpose package for model fitting and prediction. One advantage of INLA is that it contains functionality for fitting Gaussian processes that have covariance functions from the Matérn class (see Lindgren and Rue (2015), for details on the software interface) by approximating a stochastic partial differential equation (SPDE) using a Gaussian Markov random field (GMRF). Specifically, the process is decomposed using basis functions that are triangular ‘tent’ functions, and the coefficients of these basis functions are normally distributed with a sparse precision matrix. Thus, these models, which we term SPDE-GMRF models, share many of the features of LatticeKrig models. A key advantage of INLA over LatticeKrig is that once the spatial or spatio-temporal model is constructed, one has access to all the approximate-inference machinery and likelihood models available within the package.
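The common thread in LatticeKrig and the SPDE-GMRF approach is a sparse precision matrix. The sketch below is a crude regular-lattice stand-in, not INLA's triangulated finite-element construction and not LatticeKrig's Wendland basis. It assumes a precision of the form `(κ² I + G)ᵀ (κ² I + G)` with `G` the lattice graph Laplacian, which loosely mimics a Matérn-like field, and it conditions on noisy observations at a subset of nodes with a sparse solve. The function names and parameter values are assumptions for illustration.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import spsolve

def lattice_laplacian(nx, ny):
    """Sparse graph Laplacian of an nx-by-ny lattice with 4-neighbour edges."""
    def path_laplacian(m):
        main = np.full(m, 2.0)
        main[[0, -1]] = 1.0  # end nodes have a single neighbour
        off = -np.ones(m - 1)
        return sp.diags([off, main, off], [-1, 0, 1])
    return (
        sp.kron(sp.identity(ny), path_laplacian(nx))
        + sp.kron(path_laplacian(ny), sp.identity(nx))
    )

def gmrf_precision(nx, ny, kappa):
    """Sparse precision (kappa^2 I + G)^T (kappa^2 I + G) on the lattice."""
    B = kappa**2 * sp.identity(nx * ny) + lattice_laplacian(nx, ny)
    return (B.T @ B).tocsc()

def gmrf_posterior_mean(Q, obs_idx, obs_values, noise_var):
    """Posterior mean of a zero-mean GMRF given noisy observations at some nodes."""
    n, m = Q.shape[0], len(obs_idx)
    # Sparse selection matrix picking out the observed nodes.
    A = sp.csc_matrix((np.ones(m), (np.arange(m), obs_idx)), shape=(m, n))
    Q_post = (Q + (A.T @ A) / noise_var).tocsc()
    return spsolve(Q_post, (A.T @ obs_values) / noise_var)

nx = ny = 30
Q = gmrf_precision(nx, ny, kappa=0.3)
rng = np.random.default_rng(2)
obs_idx = rng.choice(nx * ny, size=100, replace=False)
obs_values = rng.standard_normal(100)
mu = gmrf_posterior_mean(Q, obs_idx, obs_values, noise_var=0.1)
```

Each row of `Q` has at most 13 non-zero entries regardless of the lattice size, so conditioning stays cheap even with many thousands of nodes.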

Kang and Cressie (2011) develop Bayesian FRK; they keep the spatial basis functions fixed and put a prior distribution on K. The predictive-process approach of Banerjee et al. (2008) can also be seen as a type of Bayesian FRK, where the basis functions are constructed from the postulated covariance function of the spatial random effects and hence depend on parameters (see Katzfuss and Hammerling (2017), for an equivalence argument). An R package that implements predictive processes is spBayes. It allows for multivariate spatial or spatio-temporal processes, and Bayesian inference is carried out using Markov chain Monte Carlo (MCMC), thus allowing for a variety of likelihood models.
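The claim that a predictive process is a kind of fixed rank model can be checked numerically. In the sketch below the basis matrix `S` is the cross-covariance between sites and a set of knots, and the coefficient covariance `K` is the inverse of the knot covariance. The exponential covariance, its range, and the random knot placement are assumptions for illustration; this is not how spBayes is implemented.

```python
import numpy as np

def exp_cov(a, b, range_=0.3, variance=1.0):
    """Exponential covariance between two sets of 2-D locations."""
    d = np.sqrt(((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1))
    return variance * np.exp(-d / range_)

rng = np.random.default_rng(3)
sites = rng.uniform(0.0, 1.0, size=(200, 2))
knots = rng.uniform(0.0, 1.0, size=(25, 2))

# Basis functions come from the covariance function, so they move with its parameters.
S = exp_cov(sites, knots)
C_knots = exp_cov(knots, knots)
K = np.linalg.inv(C_knots)  # covariance of the basis coefficients

C_pp = S @ K @ S.T          # rank-25 predictive-process covariance
C_full = exp_cov(sites, sites)

# What the low-rank model misses; it should be positive semi-definite.
residual = C_full - C_pp
min_residual_eig = np.linalg.eigvalsh(residual).min()

# At the knots themselves the predictive process reproduces the parent covariance.
C_pp_at_knots = C_knots @ K @ C_knots.T
```

The residual is a Schur complement, so the predictive process never overstates the variance of the parent process; the shortfall plays the role of the fine-scale term in the SRE model.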

Spatial processes are unlikely to be truly Gaussian; we often justify approximating them with a Gaussian process by appeal to a Laplace approximation.
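As a sketch of what a Laplace approximation looks like in this setting, assume Poisson counts with a log link over a latent GP on a 1-D transect. The likelihood, kernel, and data are assumptions chosen for illustration. Newton's method finds the posterior mode of the latent field, and the curvature at the mode gives a Gaussian approximation to the posterior.

```python
import numpy as np

def sq_exp_kernel(x, length_scale=0.2, variance=1.0):
    """Squared-exponential covariance over 1-D locations."""
    d2 = (x[:, None] - x[None, :]) ** 2
    return variance * np.exp(-0.5 * d2 / length_scale**2)

def laplace_poisson_gp(K, y, max_iter=100, tol=1e-10):
    """Laplace approximation to p(f | y), y_i ~ Poisson(exp(f_i)), f ~ N(0, K)."""
    n = len(y)
    f = np.zeros(n)
    for _ in range(max_iter):
        w = np.exp(f)     # negative Hessian of the log-likelihood (diagonal)
        grad = y - w      # gradient of the log-likelihood
        b = w * f + grad
        # Newton step (K^{-1} + W)^{-1} b, rewritten so K is never inverted.
        f_new = K @ np.linalg.solve(np.eye(n) + w[:, None] * K, b)
        converged = np.max(np.abs(f_new - f)) < tol
        f = f_new
        if converged:
            break
    w = np.exp(f)
    # Covariance of the Gaussian approximation: (K^{-1} + W)^{-1}.
    cov = K @ np.linalg.inv(np.eye(n) + w[:, None] * K)
    return f, cov

rng = np.random.default_rng(4)
x = np.linspace(0.0, 1.0, 40)
f_true = np.sin(2 * np.pi * x)
y = rng.poisson(np.exp(f_true))

K = sq_exp_kernel(x)
f_hat, cov = laplace_poisson_gp(K, y)
```

At the mode the latent field satisfies `f = K (y - exp(f))`, which gives a quick convergence check.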

## 2 Spatial point processes

A particular sub-case combining point processes with spatial statistics, now with its own notebook.

## 3 Tools

All recommendations made to me and passed on here are offered unreviewed and unendorsed.

### 3.1 R

Spatial statistics in R is diverse enough to need its own notebook. Check there.

### 3.2 Python

Spatial statistics in Python is diverse enough to need its own notebook. Check there.

### 3.3 QGIS

QGIS is a free and open-source cross-platform desktop geographic information system (GIS) application that supports viewing, editing, and analysis of geospatial data.

Notably, it supports embedded Python. It seems to be a GIS first and foremost, in that it places geography first and statistics second.

## 4 References

Abrahamsen. 1997.
Anselin. 1995. Geographical Analysis.
Anselin, Cohen, Cook, et al. 2000.
Baddeley, Adrian, Rubak, and Turner. 2016. Spatial Point Patterns: Methodology and Applications with R. Chapman & Hall/CRC Interdisciplinary Statistics Series.
Baddeley, A., Turner, Møller, et al. 2005. Journal of the Royal Statistical Society: Series B (Statistical Methodology).
Banerjee, Gelfand, Finley, et al. 2008. Journal of the Royal Statistical Society: Series B (Statistical Methodology).
Besag. 1974. Journal of the Royal Statistical Society. Series B (Methodological).
———. 1986. Journal of the Royal Statistical Society. Series B (Methodological).
Brémaud, Massoulié, and Ridolfi. 2005. Advances in Applied Probability.
Cressie. 1990. Mathematical Geology.
———. 2015. Statistics for Spatial Data.
Cressie, and Johannesson. 2008. Journal of the Royal Statistical Society: Series B (Statistical Methodology).
Cressie, Sainsbury-Dale, and Zammit-Mangion. 2022. Annual Review of Statistics and Its Application.
Cressie, Shi, and Kang. 2010. Journal of Computational and Graphical Statistics.
Cressie, and Wikle. 2011. Statistics for Spatio-Temporal Data. Wiley Series in Probability and Statistics 2.0.
Davison, and Ortiz. 2019. arXiv:1910.14139 [Cs].
Diggle, and Ribeiro. 2007. Model-Based Geostatistics. Springer Series in Statistics.
Donoho, and Johnstone. 1994. Biometrika.
Finley, Banerjee, and Carlin. 2007. Journal of Statistical Software.
Finley, Banerjee, and Gelfand. 2015. Journal of Statistical Software.
Fuentes. 2006. Journal of Statistical Planning and Inference.
Haran. 2011. In Handbook of Markov Chain Monte Carlo.
Huang, and Ogata. 1999. Journal of Computational and Graphical Statistics.
Kang, and Cressie. 2011. Journal of the American Statistical Association.
Kang, Liu, and Cressie. 2009. Computational Statistics & Data Analysis.
Katzfuss, and Hammerling. 2017. Statistics and Computing.
Lindgren, and Rue. 2015. Journal of Statistical Software.
Liu, Ray, and Hooker. 2014. arXiv:1411.4681 [Math, Stat].
Lovelace, Nowosad, and Münchow. 2019. Geocomputation with R.
Mackay. 1995. Network: Computation in Neural Systems.
Mardia, and Marshall. 1984. Biometrika.
Mohler. 2013. The Annals of Applied Statistics.
Møller, and Torrisi. 2007. Statistics & Probability Letters.
Nguyen, Cressie, and Braverman. 2012. Journal of the American Statistical Association.
Nguyen, Katzfuss, Cressie, et al. 2014. Technometrics.
Nowak, and Litvinenko. 2013. Mathematical Geosciences.
Nychka, Bandyopadhyay, Hammerling, et al. 2015. Journal of Computational and Graphical Statistics.
Patterson, Levin, Staver, et al. 2020. SIAM Journal on Applied Dynamical Systems.
Pewsey, and García-Portugués. 2020. arXiv:2005.06889 [Stat].
Pollard. 2004. “Hammersley-Clifford Theorem for Markov Random Fields.”
Possolo. 1986. Department of Statistics Preprints, University of Washington, Seattle.
Rey, and Anselin. 2010. In Handbook of Applied Spatial Analysis.
Richardson, and Domingos. 2006. Machine Learning.
Ripley, B. D. 1977. Journal of the Royal Statistical Society. Series B (Methodological).
———. 1981. Spatial Statistics.
Ripley, Brian D. 1988. Statistical inference for spatial processes.
Rosenberg, and Anderson. 2011. Methods in Ecology and Evolution.
Saichev, and Sornette. 2006. The European Physical Journal B.
Saparin, Gowin, Kurths, et al. 1998. Physical Review E.
Sidén. 2020. Scalable Bayesian Spatial Analysis with Gaussian Markov Random Fields. Linköping Studies in Statistics.
Stein, Michael L. 2005. Journal of the American Statistical Association.
Stein, Michael L. 2008. Journal of the Korean Statistical Society.
Stein, Michael L., Chi, and Welty. 2004. Journal of the Royal Statistical Society: Series B (Statistical Methodology).
Sun, and Stein. 2016. Journal of Computational and Graphical Statistics.
Whittle. 1954. “On Stationary Processes in the Plane.” Biometrika.
Zammit-Mangion, and Cressie. 2021. Journal of Statistical Software.
Zammit-Mangion, Ng, Vu, et al. 2021. Journal of the American Statistical Association.
Zammit-Mangion, and Rougier. 2019.