Spatial data in R



On using R as a geographic information system.

I am indebted to Ross Darnell and Petra Kuhnert for invaluable tips that got my understanding started.

Theory of spatial statistics

Not R-specific? See spatial statistics or spatiotemporal statistics.

How to

Robin Lovelace, Jakub Nowosad and Jannes Muenchow’s free textbook, Geocomputation with R, seems to be the most up-to-date resource. Slightly more recently updated but with less promotion, Edzer Pebesma and Roger Bivand’s Spatial Data Science seems like it might be good. Manuel Gimond’s lecture notes are good: Intro to GIS and Spatial Analysis.

Geographical encodings

For theory, see spatial statistics. Technical details go here.

There are some standards to be aware of when it comes to encoding geographical coordinates. The classic R format for spatial data is sp, an R package whose S4 classes (SpatialPoints, SpatialPolygonsDataFrame and friends) hold geometry alongside an attribute table in something similar to, but not totally like, a dataframe. Much spatial code in R uses sp-style access.
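As a rough illustration of what sp-style access looks like, here is a minimal sketch using the meuse example data that ships with sp. It assumes the sp package is installed; the choice of accessors shown is mine, not a complete tour.

```r
library(sp)

# meuse starts life as a plain data.frame with x and y columns
data(meuse, package = "sp")

# Naming the coordinate columns promotes it to a SpatialPointsDataFrame
coordinates(meuse) <- c("x", "y")

class(meuse)               # an S4 class defined by sp
head(meuse@data)           # the attribute table lives in the @data slot
head(coordinates(meuse))   # coordinates are held apart from the attributes
bbox(meuse)                # bounding box of the points
```

The slot access (`@data`) is what makes this similar to, but not totally like, a dataframe.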

More recently, sf has become the hotness. This is several things. Firstly, there is a standard (Simple Features / Simple Feature Access) which has nothing to do with R. It has a formal ISO standard number (ISO 19125) and reams of documentation about how computers can handle spatial geometry and informatics. This standard now underlies ESRI ArcGIS, GDAL and the GeoJSON standard.

sf is also an R package which plugs various technology into all those systems via the Simple Features standard, and aims at superseding sp in the long term. Its data storage is more dataframe-like: an sf object is a data frame with an extra geometry column.
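To see how dataframe-like sf objects are, here is a small sketch using the North Carolina counties shapefile bundled with the sf package. It assumes sf is installed; the column names NAME and AREA come from that bundled file.

```r
library(sf)

# Shapefile of North Carolina counties that ships with sf
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

class(nc)                           # "sf" "data.frame"

# Ordinary data frame indexing works; the geometry column tags along
nc_small <- nc[1:3, c("NAME", "AREA")]
names(nc_small)

# Geometry can be pulled out or dropped explicitly
st_geometry(nc_small)
st_drop_geometry(nc_small)
```

Because the geometry column is "sticky", subsetting columns does not lose the spatial information unless you ask for that with `st_drop_geometry()`.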

For now, you need to know both standards exist and convert between them accordingly.

Conversion is apparently simple, according to the sf manual:

as_Spatial() allows to convert sf and sfc to Spatial*DataFrame and Spatial* for sp compatibility. You can also use as(x, "Spatial") To transform sp objects to sf and sfc with as(x, "sf").

Did you follow that? Pitfalls of working with each of them are summarised at Spatial data and the tidyverse: pitfalls to avoid.
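Unpacked, the manual's advice amounts to a round trip like the following sketch. It assumes both sf and sp are installed and reuses the North Carolina shapefile bundled with sf; the object names are mine.

```r
library(sf)
library(sp)

nc_sf <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# sf -> sp
nc_sp <- as_Spatial(nc_sf)      # equivalently: as(nc_sf, "Spatial")
class(nc_sp)

# sp -> sf
nc_back <- st_as_sf(nc_sp)      # equivalently: as(nc_sp, "sf")
class(nc_back)
```

In practice this means doing the work in whichever representation a given package expects and converting at the boundaries.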

More info is at

Maps

R will happily plot a map if you give it some mapping polygons. However, often you want something with more of the affordances of modern interactive web maps. Fancy map tiles with streets and terrain and such are available from map providers such as Google Maps or Stamen. But how to use them in R? Many options.

tmap

tmap (“thematic maps”) is one option (manual). It aims to be ggplot-like. It supports a leaflet backend, as does the leaflet library, below, although I think through separate codebases.
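A minimal sketch of the ggplot-like layering, assuming tmap and sf are installed and using the North Carolina shapefile bundled with sf. Argument names for styling differ between tmap versions, so this sticks to defaults.

```r
library(sf)
library(tmap)

nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# "plot" gives a static map; "view" switches to the interactive leaflet backend
tmap_mode("plot")

# Layers are added with +, in the manner of ggplot2
m <- tm_shape(nc) + tm_polygons()
m
```

Switching to `tmap_mode("view")` and printing `m` again should render the same layers on an interactive web map.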

ggmap

ggmap (see quickstart) gets some of the bits and pieces from Google Maps via mapsapi. RgoogleMaps is another alternative, I think. See Laura Ellis’s excellent tutorial and Kahle and Wickham’s tutorial.

leafletR

leafletR interfaces with leaflet.js. It is also useful independently because it has a GeoJSON export function, toGeoJSON, which is small, light and fast via the ogre service. For complicated or big projects it falls back to rgdal.

plotly maps

Plotly does not specialise in maps, but if your needs skew more towards the data and less towards the geography, its mapping facilities are adequate, and it has native R integration. See, for example, the R bubble maps example.
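As a sketch of the bubble-map idea, the following uses the quakes dataset that ships with base R (earthquake locations near Fiji) rather than the data from the Plotly example. It assumes the plotly package is installed.

```r
library(plotly)

# quakes has columns lat, long, depth, mag and stations
p <- plot_geo(quakes, lat = ~lat, lon = ~long) %>%
  add_markers(
    size = ~mag,      # bubble size tracks magnitude
    color = ~depth,   # colour tracks depth
    text = ~paste("Magnitude", mag)
  )
p
```

The result is an htmlwidget, so it can be viewed interactively or embedded in an R Markdown page.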

geojson output

Or, don’t generate the JavaScript map in R. I can also process data in R then export it as GeoJSON for use in a JavaScript app/website.

Here are the commands for that, firstly old-skool GDAL style:

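library("rgdal")  # attaches sp, which provides the meuse data and coordinates()
data(meuse)
# Promote the data frame to a SpatialPointsDataFrame using its x and y columns
coordinates(meuse) = c("x", "y")
# The "path" directory must already exist
writeOGR(meuse, dsn="path/test_geojson.GeoJSON", layer="meuse", driver="GeoJSON")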

then hipper sf style:

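library(sf)
# Reload meuse as a plain data frame (the block above turned it into an sp object)
data(meuse, package = "sp")
meuse_sf = st_as_sf(meuse, coords = c("x", "y"), crs = 28992, agr = "constant")
# Write the sf object, not the original data frame
st_write(meuse_sf, "meuse.geojson")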
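One wrinkle for web use: the meuse coordinates are in a projected Dutch system (EPSG:28992), whereas JavaScript map libraries generally expect longitude/latitude in WGS84. A hedged sketch of reprojecting before export and reading the file back as a check follows. It assumes sf and sp are installed; the temporary output path is my own choice.

```r
library(sf)

data(meuse, package = "sp")
meuse_sf <- st_as_sf(meuse, coords = c("x", "y"), crs = 28992)

# Reproject to lon/lat WGS84 (EPSG:4326) for web maps
meuse_wgs84 <- st_transform(meuse_sf, 4326)

# Write to a temporary file, removing any stale copy first
out <- file.path(tempdir(), "meuse_wgs84.geojson")
unlink(out)
st_write(meuse_wgs84, out, quiet = TRUE)

# Read it back to confirm the round trip
check <- st_read(out, quiet = TRUE)
nrow(check)
st_is_longlat(check)
```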

sugarbag: hip hexagon visualisation

Monash NUMBATs explain some tweaks: Hexmaps with sugarbag make it easier to see the electoral map. Here are some manual pages for that:

The examples are ggplot2-backed, which is nice for looking at but not nice for browser interaction. I wonder how the polygons can be extracted?

luscious 3d renderings

rayshader is an open source package for producing 2D and 3D data visualizations in R. rayshader uses elevation data in a base R matrix and a combination of raytracing, spherical texture mapping, overlays, and ambient occlusion to generate beautiful topographic 2D and 3D maps. In addition to maps, rayshader also allows the user to translate ggplot2 objects into beautiful 3D data visualizations.

(Thanks D.calf-e for showing me this.)
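For a quick taste, here is a minimal 2D sketch using the volcano elevation matrix that ships with base R. It assumes rayshader is installed; the texture and zscale values are arbitrary choices of mine, not recommendations.

```r
library(rayshader)

# volcano is a base R elevation matrix (Maunga Whau, Auckland)
shaded <- sphere_shade(volcano, texture = "desert")

# Darken the hillshade with raytraced shadows
shadows <- ray_shade(volcano, zscale = 10)
shaded <- add_shadow(shaded, shadows, max_darken = 0.5)

plot_map(shaded)
```

The same hillshade array can be handed to rayshader's 3D plotting functions, which additionally need a working rgl installation.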

GDAL library dependencies

rgdal is an interface to the classic geodata infrastructure GDAL. (It was retired from CRAN in October 2023; sf and terra are the maintained routes to GDAL from R.)

rgdal is notoriously fiddly to install and has many system dependencies, but it can do everything, I think. On an Ubuntu system, I had to install devtools and force the tippy-top latest versions of the packages to install from source:

install.packages("rgeos", repos="http://R-Forge.R-project.org", type="source")
install.packages("rgdal", repos="http://R-Forge.R-project.org", type="source")
install.packages("devtools")
devtools::install_github("r-spatial/sf")

Incoming

References

Baddeley, Adrian, Ege Rubak, and Rolf Turner. 2016. Spatial Point Patterns: Methodology and Applications with R. Chapman & Hall/CRC Interdisciplinary Statistics Series. Boca Raton; London; New York: CRC Press, Taylor & Francis Group.
Bivand, Roger, Edzer J. Pebesma, and Virgilio Gómez-Rubio. 2013. Applied Spatial Data Analysis with R. Second edition. Use R! New York: Springer.
Finley, Andrew O., Sudipto Banerjee, and Bradley P. Carlin. 2007. “spBayes: An R Package for Univariate and Multivariate Hierarchical Point-Referenced Spatial Models.” Journal of Statistical Software 19 (April): 1–24.
Finley, Andrew O., Sudipto Banerjee, and Alan E. Gelfand. 2015. “spBayes for Large Univariate and Multivariate Point-Referenced Spatio-Temporal Data Models.” Journal of Statistical Software 63 (February): 1–28.
Lindgren, Finn, and Håvard Rue. 2015. “Bayesian Spatial Modelling with R-INLA.” Journal of Statistical Software 63 (i19): 1–25.
Lovelace, Robin, Jakub Nowosad, and Jannes Münchow. 2019. Geocomputation with R. Boca Raton: Taylor & Francis.
Scheider, Simon, Benedikt Gräler, Edzer Pebesma, and Christoph Stasch. 2016. “Modeling Spatiotemporal Information Generation.” International Journal of Geographical Information Science 30 (10): 1980–2008.
Stasch, Christoph, Simon Scheider, Edzer Pebesma, and Werner Kuhn. 2014. “Meaningful Spatial Prediction and Aggregation.” Environmental Modelling & Software 51 (January): 149–65.
Zammit-Mangion, Andrew, Michael Bertolacci, Jenny Fisher, Ann Stavert, Matthew L. Rigby, Yi Cao, and Noel Cressie. 2021. “WOMBAT v1.0: A fully Bayesian global flux-inversion framework.” Geoscientific Model Development Discussions, July, 1–51.
Zammit-Mangion, Andrew, and Noel Cressie. 2021. “FRK: An R Package for Spatial and Spatio-Temporal Prediction with Large Datasets.” Journal of Statistical Software 98 (May): 1–48.
Zammit-Mangion, Andrew, and Jonathan Rougier. 2019. “Multi-Scale Process Modelling and Distributed Computation for Spatial Data,” July.
