Spatial processes and statistics thereof



Statistics on fields with index sets of more than one dimension of support and, frequently, an implicit 2-norm. Sometimes they are also time-indexed. This is especially so for processes on a continuous index set with continuous state and undirected interaction. Sometimes these live on fancy manifolds, although often you can get away with plain old Euclidean space, unless you are doing spatial statistics over the entire planet, which turns out to be curved. Lattice models are frequently considered spatial statistics, but more arbitrary graph structures usually get filed under undirected graphical models/random fields. Spatial point processes get their own notebook. Often we mean some kind of Gaussian process regression to handle spatial statistics, although the use of these tools in the ML and spatial literatures is weirdly disjoint. There are many other random fields we might also wish to infer that relate to spatial index sets, and these can be taxonomised as I notice their existence. There are lots of interesting problems with statistics on such fields. Consider the illustrative problem of declustering.
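To make the "the planet is curved" caveat concrete, here is a minimal sketch comparing a great-circle (haversine) distance with a naive flat-map distance that treats degrees as planar coordinates. The function names and the spherical-Earth radius are my own assumptions for illustration; a real analysis would more likely use a proper geodesy library and an ellipsoidal model.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius; the Earth is treated as a sphere


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def flat_map_km(lat1, lon1, lat2, lon2):
    """Naive Euclidean distance that pretends degrees are planar coordinates."""
    km_per_degree = 2 * math.pi * EARTH_RADIUS_KM / 360
    return km_per_degree * math.hypot(lat2 - lat1, lon2 - lon1)


# One degree of longitude at the equator versus at 60 degrees north.
equator = (great_circle_km(0, 0, 0, 1), flat_map_km(0, 0, 0, 1))
north = (great_circle_km(60, 0, 60, 1), flat_map_km(60, 0, 60, 1))
print("equator:", equator)
print("60N:    ", north)
```

At the equator the two distances agree; at 60 degrees north the flat-map distance is roughly twice the great-circle one, which is the kind of distortion that would leak into any distance-based covariance function.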

I’m curious about how spatial statistics generalise to high-dimensional fields such as fitness landscapes, loss functions, and embedding of network processes in space…

Kriging

The spatial statistics name for Gaussian process regression. Many complications arise in the spatial context, as seen in, e.g., Bayesian inverse problems. Various cunning tricks are needed to make spatial GPs practical. A recent one well adapted to this context is Fixed Rank Kriging; Zammit-Mangion and Cressie (2021) summarise a few approaches:
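For orientation, here is a minimal sketch of simple kriging, i.e. zero-mean Gaussian process regression, on scattered 2-D locations. The squared-exponential covariance, its hyperparameters and the synthetic data are all assumptions chosen for illustration; nothing here is taken from a particular package.

```python
import numpy as np


def sq_exp_cov(xa, xb, variance, length_scale):
    """Squared-exponential covariance between point sets of shape (n, d) and (m, d)."""
    sq_dist = ((xa[:, None, :] - xb[None, :, :]) ** 2).sum(axis=-1)
    return variance * np.exp(-0.5 * sq_dist / length_scale**2)


def simple_krige(x_obs, y_obs, x_new, variance=1.0, length_scale=0.3, noise_var=1e-2):
    """Posterior mean and pointwise variance of a zero-mean GP at x_new."""
    k_oo = sq_exp_cov(x_obs, x_obs, variance, length_scale) + noise_var * np.eye(len(x_obs))
    k_no = sq_exp_cov(x_new, x_obs, variance, length_scale)
    chol = np.linalg.cholesky(k_oo)
    # alpha = k_oo^{-1} y via two triangular solves
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_obs))
    v = np.linalg.solve(chol, k_no.T)
    mean = k_no @ alpha
    var = variance - (v**2).sum(axis=0)  # prior variance minus explained variance
    return mean, var


rng = np.random.default_rng(0)
x_obs = rng.uniform(0, 1, size=(30, 2))
y_obs = np.sin(4 * x_obs[:, 0]) * np.cos(3 * x_obs[:, 1]) + 0.1 * rng.standard_normal(30)
x_new = np.array([[0.5, 0.5], [5.0, 5.0]])  # one interior point, one far from the data
mean, var = simple_krige(x_obs, y_obs, x_new)
print(mean, var)
```

The far-away prediction point reverts to the prior (mean near zero, variance near the prior variance), while the interior point borrows strength from its neighbours. The Cholesky factorisation costs O(n^3) in the number of observations, which is the bottleneck the tricks discussed here are meant to avoid.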

Fixed rank kriging (FRK) is a spatial/spatio-temporal modeling and prediction framework that is scaleable, works well with large datasets, and can deal easily with data that have different spatial supports. FRK hinges on the use of a spatial random effects (SRE) model, in which a spatially correlated mean-zero random process is decomposed using a linear combination of spatial basis functions with random coefficients plus a term that captures the random process’ fine-scale variation. Dimensionality reduction through a relatively small number of basis functions ensures computationally efficient prediction, while the reconstructed spatial process is, in general, non-stationary. The SRE model has a spatial covariance function that is always nonnegative-definite and, because any (possibly non-orthogonal) basis functions can be used, it can be constructed so as to approximate standard families of covariance functions (Kang and Cressie 2011). For a detailed treatment of FRK, see Cressie and Johannesson (2008); Cressie, Shi, and Kang (2010); Nychka et al. (2015).[…]
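A minimal sketch of the low-rank idea behind the SRE model, under assumptions of my own: Gaussian bump basis functions on a 4 x 4 grid, an identity prior covariance K for the coefficients, and fine-scale variation treated as white noise with known variance. None of this is the FRK package's API. The point is only that the posterior mean of the coefficients needs an r x r solve rather than an n x n one.

```python
import numpy as np


def gaussian_basis(locs, centres, scale):
    """Evaluate r Gaussian bump basis functions at n locations; returns (n, r)."""
    sq_dist = ((locs[:, None, :] - centres[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dist / scale**2)


def sre_coefficient_mean(S_obs, z, K, fine_var, noise_var):
    """Posterior mean of basis coefficients eta ~ N(0, K), using only r x r solves."""
    d_inv = 1.0 / (fine_var + noise_var)  # D = (fine_var + noise_var) * I
    post_prec = np.linalg.inv(K) + d_inv * (S_obs.T @ S_obs)
    return np.linalg.solve(post_prec, d_inv * (S_obs.T @ z))


rng = np.random.default_rng(1)
n, fine_var, noise_var = 200, 0.05, 0.05
locs = rng.uniform(0, 1, size=(n, 2))
grid = np.linspace(0, 1, 4)
centres = np.array([(a, b) for a in grid for b in grid])  # r = 16 basis functions
S_obs = gaussian_basis(locs, centres, scale=0.25)
K = np.eye(len(centres))  # assumed prior covariance of the coefficients

# Simulate data from the model: smooth part plus fine-scale and measurement noise.
eta = rng.standard_normal(len(centres))
z = S_obs @ eta + np.sqrt(fine_var + noise_var) * rng.standard_normal(n)

eta_hat = sre_coefficient_mean(S_obs, z, K, fine_var, noise_var)

# Dense n x n reference computation, feasible here only because n is small.
cov_z = S_obs @ K @ S_obs.T + (fine_var + noise_var) * np.eye(n)
eta_dense = K @ S_obs.T @ np.linalg.solve(cov_z, z)

# Predict the smooth part of the process at new locations.
new_locs = np.array([[0.25, 0.25], [0.75, 0.5]])
y_hat = gaussian_basis(new_locs, centres, scale=0.25) @ eta_hat
print(y_hat)
```

The two routes to the coefficient mean agree by the Woodbury identity, but the low-rank one scales linearly in the number of observations once the basis matrix is built.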

A few variants of FRK have been developed to date, and the one that comes closest to the present software is LatticeKrig (Nychka et al. 2015). LatticeKrig implements what we call a LatticeKrig (LTK) model, which is made up of Wendland basis functions (that have compact support) decomposing a spatially correlated process. LatticeKrig models use a Markov assumption to construct a precision matrix to describe the dependence between the coefficients of these basis functions. This, in turn, results in efficient computations and the potential use of a large number (> 10,000) of basis functions. LatticeKrig models do not cater for what we term fine-scale-process variation and, instead, the finest scale of the process is limited to the finest resolution of the basis functions used. The package INLA (Lindgren and Rue 2015) is a general-purpose package for model fitting and prediction. One advantage of INLA is that it contains functionality for fitting Gaussian processes that have covariance functions from the Matérn class (see Lindgren and Rue (2015), for details on the software interface) by approximating a stochastic partial differential equation (SPDE) using a Gaussian Markov random field (GMRF). Specifically, the process is decomposed using basis functions that are triangular ‘tent’ functions, and the coefficients of these basis functions are normally distributed with a sparse precision matrix. Thus, these models, which we term SPDE-GMRF models, share many of the features of LatticeKrig models. A key advantage of INLA over LatticeKrig is that once the spatial or spatio-temporal model is constructed, one has access to all the approximate-inference machinery and likelihood models available within the package.
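A toy 1-D illustration of the "tent functions plus sparse precision" recipe. This is not the INLA or LatticeKrig construction: the mesh, the `kappa2` parameter and the particular tridiagonal precision (a scaled identity plus a first-order graph Laplacian) are assumptions picked to show the structure, not a faithful SPDE discretisation.

```python
import numpy as np


def tent_basis(x, nodes):
    """Piecewise-linear 'tent' functions on a regular 1-D mesh; returns (len(x), len(nodes))."""
    h = nodes[1] - nodes[0]
    return np.clip(1.0 - np.abs(x[:, None] - nodes[None, :]) / h, 0.0, None)


def markov_precision(m, kappa2=0.5):
    """Tridiagonal precision matrix: kappa2 * I plus a first-order graph Laplacian."""
    Q = np.zeros((m, m))
    idx = np.arange(m)
    Q[idx, idx] = kappa2 + 2.0
    Q[idx[:-1], idx[1:]] = -1.0
    Q[idx[1:], idx[:-1]] = -1.0
    Q[0, 0] = Q[-1, -1] = kappa2 + 1.0  # boundary nodes have a single neighbour
    return Q


nodes = np.linspace(0.0, 1.0, 21)
Q = markov_precision(len(nodes))
A = tent_basis(np.array([0.1, 0.125, 0.9]), nodes)

cov_coeffs = np.linalg.inv(Q)       # dense, even though Q itself is sparse
cov_process = A @ cov_coeffs @ A.T  # implied covariance of the process at three points

print("nonzeros in Q:", np.count_nonzero(Q), "of", Q.size)
print(cov_process)
```

Only neighbouring coefficients interact in the precision matrix, yet the implied covariance is dense and decays with distance. In practice one would store the precision in a sparse format and use a sparse Cholesky factorisation rather than the dense inverse taken here for display.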

Kang and Cressie (2011) develop Bayesian FRK; they keep the spatial basis functions fixed and put a prior distribution on K. The predictive-process approach of Banerjee et al. (2008) can also be seen as a type of Bayesian FRK, where the basis functions are constructed from the postulated covariance function of the spatial random effects and hence depend on parameters (see Katzfuss and Hammerling (2017), for an equivalence argument). An R package that implements predictive processes is spBayes (Finley, Banerjee, and Gelfand 2015; Finley, Banerjee, and Carlin 2007). It allows for multivariate spatial or spatio-temporal processes, and Bayesian inference is carried out using Markov chain Monte Carlo (MCMC), thus allowing for a variety of likelihood models.

Spatial processes are unlikely to be truly Gaussian; we often justify approximating them with a Gaussian process by appeal to a Laplace approximation.
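One common reading of that remark, sketched under my own assumptions: counts observed at sites along a transect are modelled as Poisson with a latent GP log-intensity, and the non-Gaussian posterior over the latent field is replaced by a Gaussian centred at its mode, with covariance given by the inverse negative Hessian there. The exponential covariance, the count data and the plain undamped Newton iteration are all illustrative choices.

```python
import numpy as np


def laplace_poisson_gp(counts, K, max_iter=100, tol=1e-10):
    """Gaussian approximation N(mode, cov) to p(f | counts),
    where counts_i ~ Poisson(exp(f_i)) and f ~ N(0, K)."""
    K_inv = np.linalg.inv(K)
    f = np.zeros(len(counts))
    for _ in range(max_iter):
        rate = np.exp(f)
        grad = counts - rate - K_inv @ f   # gradient of the log posterior
        neg_hess = np.diag(rate) + K_inv   # negative Hessian, positive definite
        step = np.linalg.solve(neg_hess, grad)
        f = f + step                       # undamped Newton step
        if np.max(np.abs(step)) < tol:
            break
    cov = np.linalg.inv(np.diag(np.exp(f)) + K_inv)
    return f, cov


x = np.linspace(0.0, 1.0, 12)
K = np.exp(-np.abs(x[:, None] - x[None, :]) / 0.3)  # assumed exponential covariance
counts = np.array([0, 1, 0, 2, 3, 5, 4, 6, 3, 2, 1, 0], dtype=float)
mode, cov = laplace_poisson_gp(counts, K)
print(mode)
print(np.sqrt(np.diag(cov)))
```

The resulting Gaussian can then be pushed through the usual kriging machinery. Whether it is a good approximation depends on how close to quadratic the log posterior is near the mode, which for low counts is not guaranteed.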

Spatial point processes

A particular sub-case combining point processes with spatial statistics, now with its own notebook.

Tools

All recommendations made to me and passed on here are offered unreviewed and unendorsed.

R

Spatial statistics in R is diverse enough to need its own notebook. Check there.

QGIS

QGIS is a free and open-source cross-platform desktop geographic information system (GIS) application that supports viewing, editing, and analysis of geospatial data.

Notably, it supports embedded Python. It is very much a GIS, in that it places geography first and statistics second.

PySAL

PySAL. Python. This seems to be a rich ecosystem; it is kind of dual to QGIS, in that it seems to put statistical analyses first and geography second. It has a lot of moving parts and is made of many libraries. Personally, I am curious about their spatial Gibbs sampler.

References

Abrahamsen, Petter. 1997. “A Review of Gaussian Random Fields and Correlation Functions.”
Anselin, Luc. 1995. “Local Indicators of Spatial Association - LISA.” Geographical Analysis 27 (2): 93–115.
Anselin, Luc, Jacqueline Cohen, David Cook, Wilpen Gorr, and George Tita. 2000. “Spatial Analyses of Crime.”
Baddeley, Adrian, Ege Rubak, and Rolf Turner. 2016. Spatial Point Patterns: Methodology and Applications with R. Chapman & Hall/CRC Interdisciplinary Statistics Series. Boca Raton; London; New York: CRC Press, Taylor & Francis Group.
Baddeley, A., R. Turner, J. Møller, and M. Hazelton. 2005. “Residual Analysis for Spatial Point Processes (with Discussion).” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 67 (5): 617–66.
Banerjee, Sudipto, Alan E. Gelfand, Andrew O. Finley, and Huiyan Sang. 2008. “Gaussian Predictive Process Models for Large Spatial Data Sets.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 70 (4): 825–48.
Besag, Julian. 1974. “Spatial Interaction and the Statistical Analysis of Lattice Systems.” Journal of the Royal Statistical Society. Series B (Methodological) 36 (2): 192–236.
Besag, Julian. 1986. “On the Statistical Analysis of Dirty Pictures.” Journal of the Royal Statistical Society. Series B (Methodological) 48 (3): 259–302.
Bolin, David. 2016. Models and Methods for Random Fields in Spatial Statistics with Computational Efficiency from Markov Properties.
Brémaud, Pierre, Laurent Massoulié, and Andrea Ridolfi. 2005. “Power Spectra of Random Spike Fields and Related Processes.” Advances in Applied Probability 37 (4): 1116–46.
Cressie, Noel. 1990. “The Origins of Kriging.” Mathematical Geology 22 (3): 239–52.
Cressie, Noel. 2015. Statistics for Spatial Data. John Wiley & Sons.
Cressie, Noel, and Gardar Johannesson. 2008. “Fixed Rank Kriging for Very Large Spatial Data Sets.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 70 (1): 209–26.
Cressie, Noel, Tao Shi, and Emily L. Kang. 2010. “Fixed Rank Filtering for Spatio-Temporal Data.” Journal of Computational and Graphical Statistics 19 (3): 724–45.
Cressie, Noel, and Christopher K. Wikle. 2011. Statistics for Spatio-Temporal Data. Wiley Series in Probability and Statistics 2.0. John Wiley and Sons.
Davison, Andrew J., and Joseph Ortiz. 2019. “FutureMapping 2: Gaussian Belief Propagation for Spatial AI.” arXiv:1910.14139 [Cs], October.
Diggle, Peter, and Paulo J. Ribeiro. 2007. Model-Based Geostatistics. Springer Series in Statistics. New York, NY: Springer.
Donoho, David L., and Iain M. Johnstone. 1994. “Ideal Spatial Adaptation by Wavelet Shrinkage.” Biometrika 81 (3): 425–55.
Finley, Andrew O., Sudipto Banerjee, and Bradley P. Carlin. 2007. “spBayes: An R Package for Univariate and Multivariate Hierarchical Point-Referenced Spatial Models.” Journal of Statistical Software 19 (April): 1–24.
Finley, Andrew O., Sudipto Banerjee, and Alan E. Gelfand. 2015. “spBayes for Large Univariate and Multivariate Point-Referenced Spatio-Temporal Data Models.” Journal of Statistical Software 63 (February): 1–28.
Fuentes, Montserrat. 2006. “Testing for Separability of Spatial–Temporal Covariance Functions.” Journal of Statistical Planning and Inference 136 (2): 447–66.
Haran, Murali. 2011. “Gaussian Random Field Models for Spatial Data.” In Handbook of Markov Chain Monte Carlo, edited by Steve Brooks, Andrew Gelman, Galin Jones, and Xiao-Li Meng. Vol. 20116022. Chapman and Hall/CRC.
Huang, Fuchun, and Yosihiko Ogata. 1999. “Improvements of the Maximum Pseudo-Likelihood Estimators in Various Spatial Statistical Models.” Journal of Computational and Graphical Statistics 8 (3): 510–30.
Kang, Emily L., and Noel Cressie. 2011. “Bayesian Inference for the Spatial Random Effects Model.” Journal of the American Statistical Association 106 (495): 972–83.
Kang, Emily L., Desheng Liu, and Noel Cressie. 2009. “Statistical Analysis of Small-Area Data Based on Independence, Spatial, Non-Hierarchical, and Hierarchical Models.” Computational Statistics & Data Analysis 53 (8): 3016–32.
Katzfuss, Matthias, and Dorit Hammerling. 2017. “Parallel Inference for Massive Distributed Spatial Data Using Low-Rank Models.” Statistics and Computing 27 (2): 363–75.
Lindgren, Finn, and Håvard Rue. 2015. “Bayesian Spatial Modelling with R-INLA.” Journal of Statistical Software 63 (i19): 1–25.
Liu, Chong, Surajit Ray, and Giles Hooker. 2014. “Functional Principal Components Analysis of Spatially Correlated Data.” arXiv:1411.4681 [Math, Stat], November.
Lovelace, Robin, Jakub Nowosad, and Jannes Münchow. 2019. Geocomputation with R. Boca Raton: Taylor & Francis.
Mackay, David J. C. 1995. “Probable Networks and Plausible Predictions - a Review of Practical Bayesian Methods for Supervised Neural Networks.” Network: Computation in Neural Systems 6 (3): 469–505.
Mardia, K. V., and R. J. Marshall. 1984. “Maximum Likelihood Estimation of Models for Residual Covariance in Spatial Regression.” Biometrika 71 (1): 135–46.
Mohler, George. 2013. “Modeling and Estimation of Multi-Source Clustering in Crime and Security Data.” The Annals of Applied Statistics 7 (3): 1525–39.
Møller, Jesper, and Giovanni Luca Torrisi. 2007. “The Pair Correlation Function of Spatial Hawkes Processes.” Statistics & Probability Letters 77 (10): 995–1003.
Nguyen, Hai, Noel Cressie, and Amy Braverman. 2012. “Spatial Statistical Data Fusion for Remote Sensing Applications.” Journal of the American Statistical Association 107 (499): 1004–18.
Nguyen, Hai, Matthias Katzfuss, Noel Cressie, and Amy Braverman. 2014. “Spatio-Temporal Data Fusion for Very Large Remote Sensing Datasets.” Technometrics 56 (2): 174–85.
Nowak, W., and A. Litvinenko. 2013. “Kriging and Spatial Design Accelerated by Orders of Magnitude: Combining Low-Rank Covariance Approximations with FFT-Techniques.” Mathematical Geosciences 45 (4): 411–35.
Nychka, Douglas, Soutir Bandyopadhyay, Dorit Hammerling, Finn Lindgren, and Stephan Sain. 2015. “A Multiresolution Gaussian Process Model for the Analysis of Large Spatial Datasets.” Journal of Computational and Graphical Statistics 24 (2): 579–99.
Patterson, Denis D., Simon A. Levin, A. Carla Staver, and Jonathan D. Touboul. 2020. “Probabilistic Foundations of Spatial Mean-Field Models in Ecology and Applications.” SIAM Journal on Applied Dynamical Systems 19 (4): 2682–2719.
Pewsey, Arthur, and Eduardo García-Portugués. 2020. “Recent Advances in Directional Statistics.” arXiv:2005.06889 [Stat], September.
Pollard, Dave. 2004. “Hammersley-Clifford Theorem for Markov Random Fields.”
Possolo, Antonio. 1986. “Estimation of Binary Markov Random Fields.” Department of Statistics Preprints, University of Washington, Seattle.
Rey, Sergio J., and Luc Anselin. 2010. “PySAL: A Python Library of Spatial Analytical Methods.” In Handbook of Applied Spatial Analysis, 175–93. Springer.
Richardson, Matthew, and Pedro Domingos. 2006. “Markov Logic Networks.” Machine Learning 62 (1-2): 107–36.
Ripley, B. D. 1977. “Modelling Spatial Patterns.” Journal of the Royal Statistical Society. Series B (Methodological) 39 (2): 172–212.
Ripley, B. D. 1981. Spatial Statistics. Wiley.
Ripley, Brian D. 1988. Statistical Inference for Spatial Processes. Cambridge [England]; New York: Cambridge University Press.
Rosenberg, Michael S., and Corey Devin Anderson. 2011. “PASSaGE: Pattern Analysis, Spatial Statistics and Geographic Exegesis. Version 2: PASSaGE.” Methods in Ecology and Evolution 2 (3): 229–32.
Saichev, A., and D. Sornette. 2006. “Power Law Distribution of Seismic Rates: Theory and Data.” The European Physical Journal B 49 (3): 377–401.
Saparin, Peter I, Wolfgang Gowin, Jürgen Kurths, and Dieter Felsenberg. 1998. “Quantification of Cancellous Bone Structure Using Symbolic Dynamics and Measures of Complexity.” Physical Review E 58 (5): 6449–59.
Sidén, Per. 2020. Scalable Bayesian Spatial Analysis with Gaussian Markov Random Fields. Vol. 15. Linköping Studies in Statistics. Linköping: Linköping University Electronic Press.
Stein, Michael L. 2005. “Space-Time Covariance Functions.” Journal of the American Statistical Association 100 (469): 310–21.
Stein, Michael L. 2008. “A Modeling Approach for Large Spatial Datasets.” Journal of the Korean Statistical Society 37 (1): 3–10.
Stein, Michael L., Zhiyi Chi, and Leah J. Welty. 2004. “Approximating Likelihoods for Large Spatial Data Sets.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 66 (2): 275–96.
Sun, Ying, and Michael L. Stein. 2016. “Statistically and Computationally Efficient Estimating Equations for Large Spatial Datasets.” Journal of Computational and Graphical Statistics 25 (1): 187–208.
Whittle, P. 1954. “On Stationary Processes in the Plane.” Biometrika 41 (3/4): 434–49.
Zammit-Mangion, Andrew, and Noel Cressie. 2021. “FRK: An R Package for Spatial and Spatio-Temporal Prediction with Large Datasets.” Journal of Statistical Software 98 (May): 1–48.
Zammit-Mangion, Andrew, Tin Lok James Ng, Quan Vu, and Maurizio Filippone. 2021. “Deep Compositional Spatial Models.” Journal of the American Statistical Association 0 (0): 1–22.
Zammit-Mangion, Andrew, and Jonathan Rougier. 2019. “Multi-Scale Process Modelling and Distributed Computation for Spatial Data,” July.
