(Geo)spatial data sets

In which I complain about paying a nominal fee for giant rocket robots that scan the earth from space

Satellite images, geological tomography, climate data records, and miscellaneous useful data points about our globe.

Maps and satellite photos

Here is a review of satellite image sources. I have only checked out a handful of these. If you just want eye candy, NASA Visible Earth is a good one. I’m fond of Landsat maps, many of which can be found through Earth Explorer. All these resources blur into one after a while, with similarly confusing interfaces, unexpected UI glitches, and apparently random pricing structures revealed belatedly.


openEO is an open API for talking to big Earth observation cloud back-ends from R, Python, JavaScript and other clients. In the project's words:

Earth Observation data are becoming too large to be downloaded locally for analysis. Also, the way they are organised (as tiles, or granules: files containing the imagery for a small part of the Earth and a single observation date) makes it unnecessarily complicated to analyse them. The solution to this is to store these data in the cloud, on compute back-ends, process them there, and browse the results or download resulting figures or numbers. But how do we do that?

openEO develops an open application programming interface (API) that connects clients like R, Python and JavaScript to big Earth observation cloud back-ends in a simple and unified way.
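To make "process them there" a little more concrete, here is a minimal sketch of the kind of JSON job description an openEO client sends to a back-end, built with nothing but the standard library. The process names `load_collection` and `save_result` are, as far as I know, part of the openEO process specification; the collection id, bounding box, dates and output format are placeholders I made up for illustration, and a real back-end will have its own collection names.

```python
import json

# Hypothetical request: the collection id, bounding box and dates below are
# placeholders for illustration, not taken from any particular back-end.
process_graph = {
    "load": {
        "process_id": "load_collection",
        "arguments": {
            "id": "EXAMPLE_COLLECTION",  # assumption: back-end specific name
            "spatial_extent": {
                "west": 148.9, "south": -35.5,
                "east": 149.3, "north": -35.1,
            },
            "temporal_extent": ["2020-01-01", "2020-02-01"],
        },
    },
    "save": {
        "process_id": "save_result",
        # Refer to the output of the "load" node rather than to a local file.
        "arguments": {"data": {"from_node": "load"}, "format": "GTiff"},
        "result": True,  # marks this node as the final result of the graph
    },
}

# The graph is plain JSON; only this small description travels to the
# back-end, while the imagery itself stays in the cloud.
payload = json.dumps({"process": {"process_graph": process_graph}}, indent=2)
print(payload)
```

In practice you would let one of the client libraries build and submit this for you; the point is only that the thing you ship around is a few hundred bytes of JSON rather than the tiles.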

Google Earth Engine (earthengine.google.com) provides lots of imagery with an eye to discoverability and UX.

The public data archive includes more than thirty years of historical imagery and scientific datasets, updated and expanded daily. It contains over twenty petabytes of geospatial data instantly available for analysis.

See also Australia-specific stuff.


CHIRPS: Rainfall Estimates from Rain Gauge and Satellite Observations

Pangeo is an umbrella organisation providing many geospatial data tools, including a catalogue of hydrological, oceanographic and similar datasets.

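# Requires the intake package and network access.
from intake import open_catalog

# Open the top-level Pangeo catalogue, which points to the sub-catalogues.
cat = open_catalog("https://raw.githubusercontent.com/pangeo-data/pangeo-datastore/master/intake-catalogs/master.yaml")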

Open Data Cube is a Python library for working with satellite images and other large-scale raster data.

The ExtremeWeather dataset (Racah et al. 2017) includes, for each year, an array of shape (1460, 16, 768, 1152), containing

  • 1460 example images (4 per day, 365 days in the year)
  • 16 channels in each image corresponding to various weather-related quantities
  • each channel is 768 x 1152 corresponding to one measurement per 25 square km on earth
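As a sanity check on those numbers, here is a small back-of-envelope sketch using only the standard library. Two things in it are my assumptions rather than facts from the dataset description: that values are stored as 4-byte floats, and that the 768 x 1152 grid is a uniform (latitude, longitude) grid covering the whole globe.

```python
from math import prod

# Per-year array shape as described above:
# (time steps, channels, grid rows, grid columns).
shape = (1460, 16, 768, 1152)
time_steps, channels, n_lat, n_lon = shape

# 4 snapshots per day over a 365-day year.
assert time_steps == 4 * 365

# Assumption: each value is a 4-byte float (float32).
BYTES_PER_VALUE = 4
n_values = prod(shape)
size_gb = n_values * BYTES_PER_VALUE / 1e9

# Assumption: rows are latitude, columns are longitude, and the grid spans
# the full globe uniformly.
lat_step_deg = 180 / n_lat
lon_step_deg = 360 / n_lon

print(f"values per year: {n_values:,}")
print(f"size per year:   {size_gb:.1f} GB (if float32)")
print(f"grid spacing:    {lat_step_deg} deg lat x {lon_step_deg} deg lon")
```

Under the float32 assumption, that is roughly 83 GB per year, which is a decent illustration of why nobody wants to download this stuff to a laptop.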



“This webpage provides an interactive and searchable catalogue of public benchmark datasets for earth observation with the aim to support researchers in the fields of geoscience, remote sensing, and ML.”


Camps-Valls, Gustau, Manuel Campos-Taberner, Álvaro Moreno-Martínez, Sophia Walther, Grégory Duveiller, Alessandro Cescatti, Miguel D. Mahecha, et al. 2021. “A Unified Vegetation Index for Quantifying the Terrestrial Biosphere.” Science Advances 7 (9): eabc7447.
Pomerleau, Dean A. 1989. “ALVINN: An Autonomous Land Vehicle in a Neural Network.” In Advances in Neural Information Processing Systems, edited by D. Touretzky. Vol. 1. Morgan-Kaufmann.
Racah, Evan, Christopher Beckham, Tegan Maharaj, Samira Ebrahimi Kahou, Mr. Prabhat, and Chris Pal. 2017. “ExtremeWeather: A Large-Scale Climate Dataset for Semi-Supervised Detection, Localization, and Understanding of Extreme Weather Events.” In Advances in Neural Information Processing Systems, edited by I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett. Vol. 30. Curran Associates, Inc.
Roberts, Dale, John Wilford, and Omar Ghattas. 2019. “Exposed Soil and Mineral Map of the Australian Continent Revealing the Land at Its Barest.” Nature Communications 10 (1): 5297.
