Matplotlib

A way to draw things in python, which is better than no way to draw things in python




The classic python plotting tool is matplotlib. It can’t do all those modern hipster graphs without hard labour, it is awful at animations and interactions, and it is fugly by default. Still, it works OK out of the box. There are libraries which use matplotlib as a backend and build more elaborate systems on top, but these have not had much longevity so far, so I find myself falling back to plain old matplotlib. It is an acceptable default with lots of weird edge cases when you try to be clever, but it gets the job 80% done.

Note some confusing terminology: an Axes object, which is constructed by an add_subplot command, contains two Axis objects, but it is much more than a list of such objects; it is the fundamental object upon which a graph is drawn.
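To make the Axes/Axis distinction concrete, here is a minimal sketch. The headless Agg backend and the toy data are my assumptions, chosen only so that the snippet runs anywhere.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no display is needed
import matplotlib.pyplot as plt
from matplotlib.axes import Axes
from matplotlib.axis import XAxis, YAxis

fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)  # returns an Axes: the whole plotting area

# The Axes owns two Axis objects, which look after ticks, labels and scales.
print(isinstance(ax, Axes))
print(isinstance(ax.xaxis, XAxis), isinstance(ax.yaxis, YAxis))

ax.plot([0, 1, 2], [0, 1, 4])   # drawing happens on the Axes...
ax.xaxis.set_label_text("x")    # ...while each Axis handles its own label
ax.yaxis.set_label_text("y")
```

In everyday use you mostly talk to the Axes (ax.plot, ax.set_xlabel) and only reach for ax.xaxis or ax.yaxis when fiddling with ticks and formatters.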

But don’t listen to me describe it. Observe this lovely diagram which explains all.

[Figure: Taxonomy of a matplotlib plot. I’ve lost the source for this image, sorry.]

Read Jakevdp’s manual for some pedagogic advice.

Pro tips

Showing images

Traditionally annoying. There are colorbars everywhere and aspect ratios are horrible and getting multiple images to plot is vexing etc.

There are helpers for this in modern matplotlib. The keyword to look for is mpl_toolkits.axes_grid1. See also the tutorial.
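As a rough illustration of what those helpers buy you, here is a sketch using ImageGrid from mpl_toolkits.axes_grid1 to lay out four images with one shared colorbar. The random data, grid shape and padding values are placeholders of mine, not recommendations.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
import numpy as np
from mpl_toolkits.axes_grid1 import ImageGrid

rng = np.random.default_rng(0)
images = rng.random((4, 16, 16))  # placeholder data: four 16x16 images

fig = plt.figure(figsize=(5, 5))
grid = ImageGrid(
    fig, 111,
    nrows_ncols=(2, 2),     # 2x2 grid of equally sized image axes
    axes_pad=0.1,           # gap between images, in inches
    cbar_mode="single",     # one colorbar shared by all images
    cbar_location="right",
    cbar_pad=0.1,
)
for ax, img in zip(grid.axes_all, images):
    # A shared colour scale keeps the single colorbar meaningful.
    im = ax.imshow(img, vmin=0.0, vmax=1.0)
    ax.set_axis_off()

fig.colorbar(im, cax=grid.cbar_axes[0])

buf = io.BytesIO()  # render to memory; use a filename to write to disk
fig.savefig(buf, format="png", dpi=100)
```

The grid takes care of aspect ratios and of sizing the colorbar to match the images, which is the tedious part to get right by hand.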

Alternatively, use proplot.

Wrappers with better APIs

Matplotlib will always be horrible to use because the designers were not prophets, so some of their early API design guesses were bad, but we are stuck with them for compatibility. There are various attempts to abstract away matplotlib's horrible API behind nicer ones.

Proplot

Why proplot?

Matplotlib is an extremely versatile plotting package used by scientists and engineers far and wide. However, matplotlib can be cumbersome or repetitive for users who…

  • Make highly complex figures with many subplots.

  • Want to finely tune their annotations and aesthetics.

  • Need to make new figures nearly every day.

Proplot’s core mission is to provide a smoother plotting experience for matplotlib’s most demanding users. We accomplish this by expanding upon matplotlib’s object-oriented interface. Proplot makes changes that would be hard to justify or difficult to incorporate into matplotlib itself, owing to differing design choices and backwards compatibility considerations.

This page enumerates these changes and explains how they address the limitations of matplotlib’s default interface. To start using these features, see the usage introduction and the user guide.

seaborn-ng

The Next-generation seaborn interface attempts to achieve a pythonic equivalent to ggplot, at least somewhat:

as seaborn has become more powerful, one has to write increasing amounts of matplotlib code to recreate what it is doing.

So the goal is to expose seaborn’s core features — integration with pandas, automatic mapping between data and graphics, statistical transformations — within an interface that is more compositional, extensible, and comprehensive.

One will note that the result looks a bit (a lot?) like ggplot. That’s not unintentional, but the goal is also not to “port ggplot2 to Python”. (If that’s what you’re looking for, check out the very nice plotnine package). There is an immense amount of wisdom in the grammar of graphics and in its particular implementation as ggplot2. But I think that, as languages, R and Python are just too different for idioms from one to feel natural when translated literally into the other. So while I have taken much inspiration from ggplot, I’ve also made plenty of choices differently, for better or for worse.

Note that, as exciting as this sounds, the project is 100% vaporware at this stage, with no sign of a public release or any commitment to any kind of process:

I do plan to issue a series of alpha/beta releases so that people can play around with it and give feedback, but it’s not at that point yet.

Possibly one can follow along at mwaskom/seaborn/nextgen/main.

plotnine

Plotnine implements a best-effort clone of R’s ggplot2 library for matplotlib. I believe plotnine supersedes the abandoned(?) ggplot.py by yhat (ggplot source, plotnine source).

Matplotlib styling

The default matplotlib stylesheet aspires to look like 80s spreadsheet defaults, but if you are not a retrofuturist, you will want to change the stylesheet. Some of the built-in stylesheets are OK.
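A minimal sketch of switching stylesheets, assuming the built-in "ggplot" style is present (it ships with matplotlib as far as I know). plt.style.context applies a style temporarily, while plt.style.use changes it for the rest of the session.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt

# List a few of the stylesheets that ship with this matplotlib install.
print(sorted(plt.style.available)[:5])

default_face = plt.rcParams["axes.facecolor"]

# Apply a style only inside the with-block.
with plt.style.context("ggplot"):
    styled_face = plt.rcParams["axes.facecolor"]
    fig, ax = plt.subplots()
    ax.plot([0, 1, 2], [1, 3, 2])

# Outside the block the previous settings are restored.
# To change the style globally instead: plt.style.use("ggplot")
```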

Here is an ugly gallery of sometimes-beautiful graph styles. And here is an ugly gallery of sometimes-beautiful colour maps.

Seaborn is another vaunted extension, which I would describe as an “Edward Tufterizer”. It extends matplotlib with a modern appearance and some missing plot types.

TUEplots

TUEplots is a light-weight matplotlib extension that adapts your figure sizes to formats more suitable for scientific publications. It produces configurations that are compatible with matplotlib’s rcParams, and provides fonts, figure sizes, font sizes, color schemes, and more, for a number of publication formats.

A cute hack to justify matplotlib’s existence: xkcd graphs.
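For the curious, a sketch of the xkcd mode. plt.xkcd() can be used as a context manager so the wobbly look stays confined to one figure. The data and title are made up, and if the hand-drawn fonts are not installed matplotlib falls back to another font with a warning.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 100)

with plt.xkcd():
    # Inside the block, lines get a hand-drawn "sketch" filter.
    sketch_params = plt.rcParams["path.sketch"]
    fig, ax = plt.subplots()
    ax.plot(x, np.sin(x))
    ax.set_title("enthusiasm over time")
    buf = io.BytesIO()  # render to memory; use a filename to write to disk
    fig.savefig(buf, format="png")
```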

Margins on saved graphics are too large

plt.savefig("image.png", dpi=300, bbox_inches='tight', pad_inches=0)

Axis labels

Suppressing?

import matplotlib.pyplot as plt

ax = plt.gca()
# Hide ticks and labels on both axes; ax.axes is just ax, so go direct.
ax.xaxis.set_visible(False)
ax.yaxis.set_visible(False)

Glue

Glue is an interactive, exploratory matplotlib GUI toolkit/app. They have solved a lot of python GUI problems, bless them, and have tried to make everything more-or-less interactive.

Glue is designed with "data-hacking" workflows in mind, and can be used in different ways. For instance, you can simply make use of the graphical Glue application as is, and never type a line of code. However, you can also interact with Glue via Python in different ways:

  • Using the IPython terminal built-in to the Glue application
  • Sending data in the form of NumPy arrays or Pandas DataFrames to Glue for exploration from a Python or IPython session.
  • Customizing/hacking your Glue setup using config.py files, including automatically loading and cleaning data before starting Glue, writing custom functions to parse files in your favorite file format, writing custom functions to link datasets, or creating your own data viewers.

Glue thus blurs the boundary between GUI-centric and code-centric data exploration. In addition, it is also possible to develop your own plugin packages for Glue that you can distribute to users separately, and you can also make use of the Glue framework in your own application to provide data linking capabilities.

Image montage

I have an array of images in arr. How can I plot them on a nice simple plot? I need to do this all the time. If I have skimage installed I can use the montage function. I do not always have that installed, though. Here is a snippet to do it by hand:

import matplotlib.pyplot as plt

# arr is assumed to be laid out as (batch, height, width, n_images),
# as implied by the indexing below; adjust to your own layout.
columns = 5
rows = 3
fsize = 6
fig = plt.figure(figsize=(fsize * columns / rows, fsize))

for i in range(columns * rows):
    img = arr[1, :, :, i]  # i-th image of batch element 1
    ax = fig.add_subplot(rows, columns, i + 1)  # subplot indices start at 1
    ax.imshow(img)
    ax.set_axis_off()
fig.tight_layout(pad=1)
plt.show()
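Another option, sketched here under my own assumptions, is to tile the images into one big array with plain NumPy and call imshow once. The montage helper below is hypothetical (it is not the skimage function) and it assumes a stack of greyscale images with shape (n, height, width).

```python
import numpy as np


def montage(images, columns):
    """Tile a stack of images, shape (n, height, width), into one 2-D array."""
    images = np.asarray(images)
    n, height, width = images.shape
    rows = -(-n // columns)  # ceiling division
    # Pad with blank (zero) images so the stack fills the grid exactly.
    padded = np.zeros((rows * columns, height, width), dtype=images.dtype)
    padded[:n] = images
    return (
        padded.reshape(rows, columns, height, width)
        .swapaxes(1, 2)  # -> (rows, height, columns, width)
        .reshape(rows * height, columns * width)
    )


# Six tiny 2x3 "images", each filled with its own index.
stack = np.arange(6)[:, None, None] * np.ones((6, 2, 3))
tiled = montage(stack, columns=3)
print(tiled.shape)
```

The result can be passed straight to ax.imshow(tiled), which avoids managing one subplot per image.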

Alternatively, use proplot.

Math-friendly fonts

tikzplotlib

Agustinus Kristiadi, in The Last Mile of Creating Publication-Ready Plots, introduces texworld/tikzplotlib, which is a tikz plotting backend. Why do we want this? For one, it can match fonts to the parent document.

Basic font stuff

Someone made the idiosyncratic choice that the default font is sans serif, even for mathematical text. You can change this by setting serif fonts for mathtext as well.

from matplotlib import rc
rc(
  'font',
  family='serif',
  serif=['Palatino']
)
rc(
  'mathtext',
  fontset='cm'
)

Supported math fonts are reputedly

  • dejavusans (the horrible default)
  • dejavuserif (beware of odd greek letters)
  • cm (“Computer Modern”. Classic, dated.)
  • stix (modern serif, looks OK)
  • stixsans (sounds like sans serif to me)
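To compare these side by side, here is a small sketch that renders the same formula in each font set. It uses the per-text math_fontfamily argument, which I believe needs matplotlib 3.4 or newer; on older versions you would set rcParams['mathtext.fontset'] instead. The formula is an arbitrary example.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt

fontsets = ["dejavusans", "dejavuserif", "cm", "stix", "stixsans"]

fig, axes = plt.subplots(len(fontsets), 1, figsize=(4, 5))
for ax, name in zip(axes, fontsets):
    # Same formula each time; only the math font set changes.
    ax.text(
        0.5, 0.5,
        rf"{name}: $\hat{{w}} = \alpha \sum_i x_i^2$",
        ha="center", va="center",
        math_fontfamily=name,
    )
    ax.set_axis_off()

buf = io.BytesIO()  # render to memory; use a filename to write to disk
fig.savefig(buf, format="png", dpi=100)
```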

Alternatively I can render graph labels with TeX which leads to some weird spacing but allows me to match fonts better. It is also fragile and character set issues are terrible. Are these problems eased if I use XeLaTeX/LuaLaTeX?

I am indebted to my colleague Christian Walder for suggesting this as a reliable initialisation procedure for matplotlib plotting.

import matplotlib
import sys, os

# GTK GTKAgg GTKCairo MacOSX Qt4Agg TkAgg WX WXAgg CocoaAgg
# GTK3Cairo GTK3Agg WebAgg agg cairo emf gdk pdf pgf ps svg template

is_mac = sys.platform == 'darwin'
if is_mac:
    _matplotlib_backend = 'MacOSX'
else:
    _matplotlib_backend = 'pdf'

matplotlib.rcParams['svg.fonttype'] = 'none'
matplotlib.rcParams['backend'] = _matplotlib_backend
matplotlib.rcParams['mathtext.fontset'] = 'stix'
matplotlib.rcParams['font.family'] = 'Times New Roman'

matplotlib.use(_matplotlib_backend)
import matplotlib.pyplot as plt
import matplotlib as mpl

plt.switch_backend(_matplotlib_backend)
# print(matplotlib.pyplot.get_backend())
try:
    import cairocffi as cairo
except ImportError:
    # cairocffi is optional; carry on without it
    pass

_latex_preamble = [
    r'\usepackage{amsmath,bm}',
    r'\newcommand\what{\hat{\bm{w}}}',
    r'\newcommand\tr{^\top}',
    r'\newcommand\dt[1]{\left|#1\right|}',]

_latex_path = '/Library/TeX/texbin/'


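def use_latex_mpl(
        latex_path=_latex_path,
        latex_preamble=_latex_preamble):
    mpl.rcParams['text.usetex'] = True
    # Recent matplotlib versions expect the preamble as a single string,
    # not a list of lines, so join the list before assigning it.
    mpl.rcParams['text.latex.preamble'] = '\n'.join(latex_preamble)
    if latex_path is not None:
        # Append the TeX binaries to PATH so matplotlib can find latex.
        os.environ['PATH'] = os.pathsep.join([os.environ['PATH'], latex_path])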

Yellowbrick

Yellowbrick is a matplotlib specialisation for hyperparameter optimisation.

Yellowbrick extends the Scikit-Learn API to make model selection and hyperparameter tuning easier. Under the hood, it’s using Matplotlib.

