GUIs for numerical array data

March 4, 2015 — May 19, 2022

Often I want my data not just as lists of numbers but as something easier for my monkey brain to interpret.

For now this page covers only HDF5 viewers. Viewers for relational databases full of data are elsewhere. I’ve also split off the tools that specialize in exploratory analytics and plotting under data dashboards.

Figure 1

1 HDF5 viewers

The big-data storage format HDF5 is conceptually a special, weird database for arrays of numbers. In principle it should be easy to make a nice viewer for these files. In practice all the major players are annoying for various reasons, and I usually write Python scripts to visualise the data I need.
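Before plotting anything, I usually want to know what is in the file. Here is a minimal sketch of that using h5py's `visititems`. The helper name `summarise_hdf5` and the demo file are my own inventions for illustration, not part of any viewer listed here.

```python
import os
import tempfile

import h5py
import numpy as np


def summarise_hdf5(path):
    """Return (name, shape, dtype) for every dataset in the file."""
    found = []

    def visit(name, obj):
        # visititems calls this for every group and dataset; keep datasets only.
        if isinstance(obj, h5py.Dataset):
            found.append((name, obj.shape, str(obj.dtype)))

    with h5py.File(path, "r") as f:
        f.visititems(visit)
    return found


# Hypothetical demo file, written to a temp directory so the sketch runs on its own.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.h5")
with h5py.File(demo_path, "w") as f:
    f.create_dataset("test/a", data=np.zeros((20, 8, 8), dtype="float32"))

summary = summarise_hdf5(demo_path)
for name, shape, dtype in summary:
    print(name, shape, dtype)
```

Only metadata is touched here, so this should stay cheap even for files far too large to load.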

1.1 H5web

silx-kit/h5web: Web-based HDF5 file viewer is a JavaScript viewer app that seems to be the lightest option and claims to be standalone. The demo looks great. For the life of me, I cannot work out how to install and run it.

1.2 HDFView

HDF® View is some kind of Java viewer that you can download from the HDF Group after registering. They do not have a normal source repository, and the entire project has a faintly depressing feeling of clunkiness, but I am sure it is fine, maybe. Manual: hdfview. Notable flaw: the “open” button is broken by default, in that it will attempt to load the entire file into memory, which defeats the purpose that most people have for using HDF5. However, there is an “open as” button which allows subsetting.
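For comparison, the same subsetting idea in a script: h5py datasets are lazy handles, and slicing one reads only the requested region from disk. This is a minimal sketch; the demo file, the dataset path `test/a`, and the slice bounds are assumptions chosen for illustration.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical demo file standing in for something too big to load whole.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.h5")
with h5py.File(demo_path, "w") as f:
    f.create_dataset("test/a", data=np.arange(20 * 8 * 8).reshape(20, 8, 8))

with h5py.File(demo_path, "r") as f:
    dset = f["test/a"]           # a handle only; no array data read yet
    full_shape = dset.shape      # metadata lookup, still no data read
    block = dset[0:4, ::2, ::2]  # reads just this strided sub-block

print(full_shape, block.shape)
```

The result `block` is an ordinary NumPy array, so it remains usable after the file is closed.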

1.3 Panoply

As seen at spatial dataviz, NASA GISS: Panoply 3 is a simple viewer for certain popular scientific data formats, netCDF, HDF and GRIB Data. It is simple but has arbitrary quirks, e.g. there are some combinations of axes that you cannot view simultaneously.

1.4 vitables

vitables is a Python HDF5 GUI. I have not used it because installation was a pain last time I tried, but that was a long time ago and development has continued since then. Maybe it is OK?

pip install ViTables

1.5 jupyterlab-hdf5

jupyterlab/jupyterlab-hdf5:

Open and explore HDF5 files in JupyterLab. Can handle very large (TB) sized files, and datasets of any dimensionality.

Based on, I think, H5web. Sounds great, but it does not support recent JupyterLab versions, and I have experience-based reasons to regard JupyterLab-as-a-GUI as a way of adding extra failure points to an app which would be better without Jupyter.

1.6 plain python script

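import h5py
import matplotlib.pyplot as plt

columns = 5
rows = 4
fsize = 6  # figure height in inches

# Open read-only; the context manager closes the file when we are done.
with h5py.File('myfile.h5', 'r') as data_f:
    arr = data_f['test']['a']  # lazy handle; nothing is read yet

    fig = plt.figure(figsize=(fsize * columns / rows, fsize))

    # Show the first rows*columns slices along the first axis as images,
    # stopping early if the dataset holds fewer slices than grid cells.
    for i in range(min(columns * rows, arr.shape[0])):
        img = arr[i]  # reads only slice i from disk
        ax = fig.add_subplot(rows, columns, i + 1)
        ax.imshow(img)
        ax.set_axis_off()

fig.tight_layout(pad=1)
plt.show()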