Plotting stuff in Julia

Julia has an assortment of adequate, if unexciting, plotting options. The Julia wiki includes worked examples using various engines.

A feature of many Julia toolkits is that they produce SVG graphics by default, which easily become gigantic for even medium-large data sets and start flogging your CPU/RAM real hard. This is ok for printing, or for small data. For exploratory data analysis I disable SVG or “interactive” JS stuff in favour of some rasterised format like PNG.

Plots.jl

The default option/family-of-options. Plots.jl wraps many other plotting packages as backends, although notably not Gadfly. The author explains this is because Gadfly does not support 3d or interactivity. Since I want neither of those things in general, especially for publication, this suggests that his priorities and mine do not line up. I have had enough grief trying to persuade various mathematics departments that this whole “internet” thing is anything other than a PDF download engine; I don’t need to compromise my print output for animation, of all the useless fripperies. We tell you, this “multimedia” thing is a fad, and the punters will be going back to quill and vellum soon enough.

Anyway, one can absolutely shoehorn Plots.jl into print-quality CMYK plots as well as web stuff, so please disregard my grumping.
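For what it's worth, here is a minimal sketch of getting both print and web output from the same plot. It assumes the GR backend; the file names and figure size are placeholders I made up, and any CMYK conversion your publisher demands would be a separate step that this does not cover.

```julia
using Plots
gr()  # assumption: GR backend; other backends may support different formats

x = 0:0.1:2π
p = plot(
    x, sin.(x);
    label = "sin(x)",
    xlabel = "x",
    ylabel = "y",
    size = (400, 300),  # placeholder figure size
)

savefig(p, "figure.pdf")  # vector output, for print
savefig(p, "figure.png")  # raster output, for the web or a notebook
```

savefig picks the output format from the file extension, so the same plot object can serve both purposes.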

The Plots documentation is not fantastic, since it’s notionally simply wrapping some other plot libraries, and should defer to them. Except of course they all have their own terminology and APIs and what-have-you, so the whole system is a confusing Kafka-esque maze of buck-passing. You can more-or-less work it out by perusing the attributes documentation, then checking specific backend examples, e.g. GR.

The whole interface is functional, although inflexible and inelegant in comparison with the state-of-the-art. They have adopted the MATLAB/Pyplot style of procedural plotting which was great for the 80s but has been substantially improved by modern Grammar-of-Graphics style approaches as seen in ggplot, which is still the killer app for R IMO. However most of us learn to do without that so this is not a showstopper. Also, there is a ggplot-style alternative listed below. Or you could export your data into R for the final plots.
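To make “procedural” concrete, here is a small sketch of that style in Plots.jl: create a plot, then mutate it with a series of bang-suffixed calls. The data and labels are made up for illustration.

```julia
using Plots

x = 0:0.1:2π

plt = plot(x, sin.(x); label = "sin")  # create the plot
plot!(x, cos.(x); label = "cos")       # add a second series to the current plot
title!("Two curves")                   # then adjust it, one call at a time
xlabel!("x")
ylabel!("y")
```

A grammar-of-graphics library would instead have you describe the mapping from data columns to aesthetics in one declarative call, as in the Gadfly examples further down.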

Plots has a rich extensions ecosystem. PlotRecipes and StatPlots (the latter since renamed StatsPlots) use the “Recipes” system defined in RecipesBase to provide macro-based, data-specific plotting tools. For example, StatPlots gives Plots sensible DataFrame support.
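As a rough idea of what a recipe looks like, here is a sketch of a user recipe for a made-up type. The Trace struct and its fields are hypothetical, invented for this example; only the @recipe macro and the --> default-setting operator come from RecipesBase.

```julia
using RecipesBase

# Hypothetical container type, invented for this example.
struct Trace
    t::Vector{Float64}
    y::Vector{Float64}
end

# Tell any recipe-aware plotting package how to draw a Trace.
# `-->` sets a default that the caller can still override.
@recipe function f(tr::Trace)
    seriestype --> :scatter
    xguide --> "t"
    yguide --> "y"
    tr.t, tr.y  # the data that actually gets plotted
end

# With Plots loaded, this should then work:
# using Plots
# plot(Trace([0.0, 1.0, 2.0], [1.0, 4.0, 9.0]))
```

The appeal is that the package defining Trace only needs to depend on the lightweight RecipesBase, not on Plots itself.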

Table-like data structures are supported thanks to the macro @df, which allows passing columns as symbols.

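using StatsPlots  # this package was called StatPlots when this was written
using DataFrames, IndexedTables
gr(size=(400,300))
df = DataFrame(a = 1:10, b = 10 .* rand(10), c = 10 .* rand(10))
# @df lets you refer to columns of df by symbol
@df df plot(:a, [:b :c], colour = [:red :blue])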

Now, some backends.

GR

My default Plots.jl backend is GR. This wraps GR.jl which in turn wraps GR, a cross-platform visualisation framework:

GR is a universal framework for cross-platform visualization applications. It offers developers a compact, portable and consistent graphics library for their programs. Applications range from publication quality 2D graphs to the representation of complex 3D scenes.

GR is essentially based on an implementation of a Graphical Kernel System (GKS) and OpenGL. The GR framework is especially suitable for real-time environments.

Anyway, here is one important tip: if you aren’t rendering graphs for publication output but for, say, exploratory data analysis, switch from SVG to PNG, because SVG files get very large for images with lots of detail.

ENV["GKS_ENCODING"] = "UTF-8"
using Plots: gr, default
gr()
default(fmt=:png)

Note that while I was there I set the character encoding, because otherwise GR has weak character support. There are other workarounds, such as inbuilt Greek characters and LaTeX:

using LaTeXStrings, Plots
plot(
    rand(20),
    lab=L"a = \beta = c",
    title=L"before\;f(\theta)\;and\;after")

If you want to suppress LaTeX rendering, ensure that your label string does not both start and end with $. I do this by padding with a trailing space.

If anything is flaky on first execution, you might need to check the GR installation instructions, which include such steps (on Ubuntu) as:

apt install libxt6 libxrender1 libxext6 \
    libgl1-mesa-glx libqt5widgets5

The observant will notice that this requires root access on the machine. I’m sure there must be a workaround but I can’t be arsed discovering it right now.

Also, every time the version of GR/GR.jl increments there is a loooong recompilation process, which seems to be single-threaded and takes many minutes on my fancy 8-core machine. So be aware that it is not fast in every sense.

For all its smoothness and speed when it is up and running, GR Plots are not IMO all that beautiful and it is not clear how to make them beautiful, since beauty is hidden down at the toolkit layer. There is some deep metaphor here.

Plotly

Plots.jl also targets JavaScript online backends via Plotly, which is neat although I have no use for it at the moment. As mentioned previously, in my department this “online” nonsense is about as popular as communicating data through modulated flatulence, on the basis that the two are approximately equivalent in terms of their contributions toward our performance metrics.

InspectDR

InspectDR functions as a Plots.jl backend, but I think it is designed more to be run separately. It produces interaction-oriented plots with an inclination toward signal processing.

Gadfly

The aspirational ggplot clone is Gadfly. It’s elegant, pretty, well integrated into the statistics DataFrame ecosystem, but missing some stuff, and has weird gotchas.

I had to switch Gadfly from SVG to PNG or a similar rasterised format, as presaged, to avoid bloated Jupyter notebooks. First I needed to install the Fontconfig and Cairo bonus libraries:

using Pkg  # needed on Julia 0.7 and later
Pkg.add("Gadfly")
Pkg.add("Cairo")
Pkg.add("Fontconfig")

Now to force PNG output:

draw(
    PNG(150, 150),
    plot(data,
        x=:gamma,
        y=:p,
        Geom.point, Geom.line)
)
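If you would rather have the PNG as a file on disk, something like the following sketch should do it. The data frame here is random filler that I invented to match the column names above, and the file name and dimensions are placeholders.

```julia
using DataFrames, Gadfly
import Cairo, Fontconfig  # loading these enables Gadfly's PNG backend

# Made-up data with the same column names as the example above.
data = DataFrame(gamma = collect(0.1:0.1:1.0), p = rand(10))

fig = plot(data, x = :gamma, y = :p, Geom.point, Geom.line)

# Render to a raster file rather than embedding SVG in the notebook.
draw(PNG("gamma_vs_p.png", 6inch, 4inch), fig)
```

Giving the size in physical units (inch, cm) rather than bare numbers keeps the intended dimensions explicit.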

Even though it feels most natural to me, I am not currently using Gadfly. The reason is that Gadfly seemed unfortunately slow on initial startup. This is survivable, and there are workarounds. But even after startup it seemed slow on many basic tasks. When I histogrammed a million data points, it took 30 seconds for each plot of that histogram (i.e. slower than Python’s pyplot). It saps the utility of an elegant graph grammar if it’s not responsive to your adjustments. I wonder if this can be improved?

Gadfly is based on a declarative vector graphics system called Compose.jl, which might be independently useful.

Makie

Makie is an OpenGL-backed visualisation library so without investigating further I would presume it does great on screen-quality 3d possibly at the expense of print-quality 2d. Haven’t tried it, since learning the Plots.jl and Gadfly.jl APIs has filled up my brain, but the gallery is pretty.

Edited highlights of its justification for existing:

Makie is a high level plotting interface for GLVisualize, with a focus on interactivity and speed.

It can also be seen as a prototype for a new design of Plots.jl, since it will implement a very similar interface and incorporate a lot of the ideas.

A fresh start instead of the already available GLVisualize backend for Plots.jl was needed for the following reasons:

  1. Plots.jl was written to create static plots without any interaction. This is deeply reflected in the internal design and makes it hard to integrate the high performance interaction possibilities from GLVisualize.
  2. Plots.jl has many high level plotting packages as a backend [and] there is no straight interface a backend needs to implement, which lead to a lot of duplicated work for the lower level backends and a lot of inconsistent behavior since the code isn’t shared between backends.
  3. …There should be a finite set of “atomic” drawing operations (which can’t be decomposed further) which a backend needs to implement and the rest should be implemented via recipes using those atomic operations.
  4. Backend loading is done in Plots.jl via evaling the backend code. This has negative consequences:

    1. Backend code can’t be precompiled leading to longer load times
    2. Backend dependencies are not in the Plots.jl REQUIRE file…

That sounds promising for the future, eh?

Vega

Vega.jl binds to a JavaScript ggplot-alike, Vega. There is also a “higher level” binding, VegaLite.jl.

It’s unclear to me how either of these works with print graphics, and they are both lagging behind the latest version of the underlying Vega library, so I’m wary of them.

Others

Winston.jl takes a brutalist, simple approach but seems to work.