I’m visualising data in python because it is the lingua franca of my team. I’d like to use it for real-time and interactive or publication-quality, but I won’t be inconsolable if I cannot achieve both simultaneously.
Visualisation is not an especially strong suit of python; the strong suit is hodgepodge, decoupage, bricolage, and, uh, potpourri. Therefore my solution is to cobble something together, or better, to use someone else’s cobbling.
Matplotlib is the default option. Well documented, ubiquitous, reliable. Built upon a foundation of design choices that have aged poorly. Basic plots are easy; nuanced plots are a maze of margin tweaking, overlapping incompatible helper libraries, confusing naming and API collision. A classic example of a tool which is easier to use by copy-pasta from stackoverflow than by working it out yourself. Complicated enough that I made a new notebook for it. See matplotlib.
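For the record, the "basic plots are easy" case looks something like the following. This is a minimal sketch, assuming numpy and matplotlib are installed; the data, the file name `trig.pdf` and the temporary output directory are made up for illustration.

```python
import tempfile
from pathlib import Path

import matplotlib

matplotlib.use("Agg")  # headless backend: render straight to files, no GUI
import matplotlib.pyplot as plt
import numpy as np

# Illustrative data only.
x = np.linspace(0, 2 * np.pi, 200)

fig, ax = plt.subplots(figsize=(4, 3))
ax.plot(x, np.sin(x), label="sin(x)")
ax.plot(x, np.cos(x), label="cos(x)")
ax.set_xlabel("x")
ax.set_ylabel("value")
ax.legend()
fig.tight_layout()  # the first of many margin tweaks

# Vector output (PDF) is what journals usually want.
out_dir = Path(tempfile.mkdtemp())
pdf_path = out_dir / "trig.pdf"
fig.savefig(pdf_path)
plt.close(fig)
```

Swapping the suffix to `.png` or `.svg` changes the output format; everything beyond this point is where the maze begins.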
A suite of visualisation tools and guides called HoloViz includes a lot of plotting infrastructure. Fresh, enthusiastic.
HoloViz tools build on the many excellent visualization tools available in the scientific python ecosystem, allowing you to access their power conveniently and efficiently. The core tools make use of Bokeh’s interactive plotting, Matplotlib’s publication-quality output, and Plotly’s interactive 3D visualizations. Panel lets you combine any of these visualizations with output from nearly any other Python plotting library, including specific support for seaborn, altair, vega, plotnine, graphviz, ggplot2, plus anything that can generate HTML, PNG, or SVG.
HoloViz tools and examples generally work with any Python standard data types (lists, dictionaries, etc.), plus Pandas or Dask DataFrames and NumPy, Xarray, or Dask arrays, including remote data from the Intake data catalog library. They also use Dask and Numba to speed up computations along with algorithms and functions from SciPy.
HoloViz tools are designed for general-purpose use, but also support some domain-specific datatypes like graphs from NetworkX and geographic data from GeoPandas and Cartopy and Iris.
Panel can be used with yt for volumetric and physics data and SymPy or LaTeX for visualizing equations.
HoloViz tools provide extensive support for Jupyter notebooks, as well as for standalone web servers and exporting as static files.
Is it good? Some think so, notably Sophia Yang, who wrote some intros, e.g.
HoloViz allows users to build Python visualization and interactive dashboard with super easy and flexible Python code. It provides the flexibility to choose among several API backends, including bokeh, matplotlib, and plotly, so you can choose different backends based on your preferences. Plus, it’s 100% open source!
Unlike the other python viz and dashboarding options, HoloViz is very serious about supporting every reasonable context in which you might want to use a Python viz or app tool:
- a Jupyter notebook,
- a Python file,
- a batch job generating PDFs or SVGs or PNGs or GIFs,
- as part of an automated report,
- as a standalone server,
- as a standalone .html file on a website.…
Each of the alternative technologies supports a few of those cases well but lets all the rest slide. HoloViz minimizes the friction and cost of switching between all of these contexts, because that’s the reality of any scientist or analyst—as soon as you publish it, people want changes! Once you have your Dash app; that’s all you have, but once you have a Panel app, you can go back to Jupyter the next day and start right where you left off.
- Panel builds interactive dashboards and apps. It’s like R Shiny, but more powerful. I can’t say enough how much I love Panel.
- hvPlot is easier than any other plotting libraries in my experience, especially if you like to plot Pandas DataFrames. With one line of code, hvPlot will provide you an interactive plot with all the nice built-in functionalities you want.
- HoloViews is a great tool for data exploration and data mining through visualization.
- GeoViews plots geographic data.
- Datashader handles big data visualization. Using Numba (Python compiler) and Dask (distributed computing), Datashader creates meaningful visualizations of large datasets very quickly. I absolutely love Datashader and love the beautiful plots it generates.
- Param creates declarative user-configurable objects.
- Colorcet creates colormaps.
Plotly was originally a mostly browser-based visualisation tool, but its native python support (source) is supposed to be quite good and quite general these days, since you can embed browser tech in other things easily. It supports high-resolution print-quality graphics, vector rendering and so on. Certainly the Plotly library is hipper than matplotlib and seems to incorporate the input of some graphic designers from the internet, which matplotlib seems to do rarely, because it is old and/or confusing and/or unlikely to pop up as a highlight in your web portfolio, since its main target is scientific journals.
bokeh does “big-data” and streaming-based browser graphing for python. Its website probably looks the nicest out of everything I’ve mentioned, which says something important about priorities. However, its print output seems to be bad; this is a web-oriented tool.
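A minimal sketch of the web-oriented workflow, assuming numpy and bokeh are installed. The data and titles are invented for illustration.

```python
import numpy as np
from bokeh.embed import file_html
from bokeh.plotting import figure
from bokeh.resources import CDN

# Illustrative data only.
x = np.linspace(0, 2 * np.pi, 200)

p = figure(title="Bokeh sketch", x_axis_label="x", y_axis_label="sin(x)")
p.line(x, np.sin(x), line_width=2)

# Produce a standalone HTML document that pulls BokehJS from the CDN.
html = file_html(p, resources=CDN, title="Bokeh sketch")
```

The output is a web page by construction, which is consistent with the print story being the weak part.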
Misc browser options
Here are some hacks:
superset is Airbnb’s python+browser interactive data exploration tool; filed under dashboards.
The mpld3 package is extremely easy to use: you can simply take any script generating a matplotlib plot, run it through one of mpld3’s convenience routines, and embed the result in a web page.
2d only, AFAICT.
GR is a universal framework for cross-platform visualization applications. It offers developers a compact, portable and consistent graphics library for their programs. Applications range from publication quality 2D graphs to the representation of complex 3D scenes. […]
GR is essentially based on an implementation of a Graphical Kernel System (GKS) and OpenGL. […] GR is characterized by its high interoperability and can be used with modern web technologies and mobile devices. The GR framework is especially suitable for real-time environments.
It will also function as a matplotlib backend.
Ugly and brutalist in its graph presentation, but it works fine.
A flexible tool for creating, organizing, and sharing visualizations of live, rich data. Supports Torch and Numpy.
It pumps graphs to a visualisation server enabling some kind of shared visualisation of a thing of interest. You want more of a pitch?
- Visdom aims to facilitate visualization of (remote) data with an emphasis on supporting scientific experimentation.
- Broadcast visualizations of plots, images, and text for yourself and your collaborators.
- Organize your visualization space programmatically or through the UI to create dashboards for live data, inspect results of experiments, or debug experimental code.
It sounds like one could e.g. build an SGD diagnostic convergence diagram using this as an alternative to Tensorboard.
Plotting networks in particular
- INRIA’s Tulip has fans
- Visualizing a NetworkX graph in the IPython notebook with d3.js
VisPy is OpenGL-backed data visualisation, focussing on science (ooh!). It also offers a matplotlib compatibility layer. Here are some howtos:
It seems to require more writing of OpenGL shaders than one would like to draw a line graph.
However, there are less messy-looking tools in the ecosystem: napari is a multidimensional image viewer leveraging VisPy.
On a similar tip, although looking more basic and more bitrotten, is vtk - if I understand correctly, VTK is the engine used by Mayavi? Better maintained and possibly still vtk-based is Paraview, which supports pluggable backends.
Not exactly graphing libraries
Disney (!) has a game library, Panda3D, that seems to do all the fun things.
even more bareback, more-or-less-directly calling into openGL, but seriously, I’m a statistician, not a coder. I could also hand-pulp hemp to make my own graph paper to draw my visualisations, drawn in home-made iron gall ink, but I would find it equally hard to argue that it was an efficient prioritisation.
Bayes in particular
ArviZ is a Python package for exploratory analysis of Bayesian models. Includes functions for posterior analysis, data storage, sample diagnostics, model checking, and comparison.
General image reading and writing
- Imageio is a workhorse python image system.