Julia, the programming language
The hippest way to get your IEEE754 on. Hngh.
April 1, 2015 — July 19, 2022
Assumed audience:
From level 0 (julia-curious) up to level 2 (how do you overload broadcasting?)
Julia: A JIT-compiled language with emphasis on the affordances for high performance scientific computation. Which is to say, it is made for people who want to make new algorithms.
I use julia enough that I have made many notes about it, splitting off new notebooks covering installation, processing arrays, tensors, matrices, debugging, profiling and accelerating, IDEs and workflows, APIs, FFIs and IO, autodiff, plotting, machine learning, and UIs.
1 Why julia
tl;dr Not a magic bullet, a handy arrow for the quiver.
Some of Julia’s community makes ambitious claims about Julia being the fastest and bestest thing ever.
Unsurprisingly, Julia is no panacea. It is well designed for numerical computation. In my non-rigorous experiments it seems to do better than other scripting+compilation options, such as cython, on certain tasks. In particular, if you have dynamically defined code in the inner loop it does well: say, you are doing a Monte Carlo simulation, but don’t know the user’s desired density ahead of time. This is more or less what you’d expect from doing compilation as late as possible rather than shipping a compiled library, but it (at least for my use case) involves less messing around with compilation tool-chains, platform libraries, ABIs, makefiles etc.
Julia has its own idiosyncratic frictions. (I would like to supplement that previous image with a learning curve.) The community process can be problematic (see also giving up on Julia). Library support is patchy, with less mindshare than python. It doesn’t run on iOS. In fact it uses lots of memory, so maybe not ideal for embedded controllers. Although my colleague Rowan assures me he runs serious Julia code on Raspberry Pi all the time, so maybe I need to tighten up my algorithms.
That said, the idea of a science-users-first JIT language is timely, and Julia is that. Python, for example, has clunky legacy issues in the numeric code and a patchy API, and is ill-designed for JIT-compilation, despite various projects that attempt to facilitate it (although jax is probably as good as it can be). MATLAB is expensive, and nasty for non-numerics, not to mention old and crufty, and code seems to break between MATLAB versions at least as often as it does between Julia versions. Lua has a few good science libraries and could likely have filled a similar niche, but for reasons that are perhaps sociological as much as technical it doesn’t have the hipness or critical mass of Julia. Super hipsters think that julia is not radical enough and like DEX instead, which is to julia as julia is to everything else.
2 Documentation
2.1 Intros
In order of increasing depth
- Julia by example is all you need to get going, if you have other programming language experience.
- Bogumił Kamiński, The Julia Express
- Boyd and Vandenberghe’s Julia Companion to their Introduction to Applied Linear Algebra is a solid introduction to both linear algebra and Julia, focussing especially on least-squares problems.
- Yoni Nazarathy and Hayden Klok, Statistics with Julia: Fundamentals for Data Science, Machine Learning and Artificial Intelligence (preprint) (Nazarathy and Klok 2021) is a good introduction to statistics, with a focus on the Julia language.
- Introducing Julia has the unfortunate leaden committee prose style of most wikibooks, but for sure will get you educated.
- Official documentation is fine but pedagogically arse-backwards, as official docs tend to be.
Here’s something that wasn’t obvious to me: What are Symbols?
And here is a neat heuristic:
A Mental Model for Julia: Talking to a Scientist
- When you’re talking, everything looks general. However, you really mean very specific details determined by context.
- You can quickly dig deep into a subject, assuming many rules, theories, and terminology.
- Nothing is hidden: if you ever want to hear about every little detail, you can ask.
- They will get mad (and throw errors at you) if you begin to be loose with the specific details.
—Chris Rackauckas, Intro to Julia.
2.2 Staying current
Julia is a fast-moving target. One way to catch the hype in the confusingly rapid evolution of the package ecosystem is to check out the package hotness in julia observer.
3 Typing and dispatch
3.1 Keyword arguments
Keyword arguments exist but do not participate in method dispatch. Keyword arguments are second-class citizens and might make things slow or stupid if you need to specialise your code based on them. So… design your functions around that. This is usually OK with careful thought, but small lapses lead to many irritating helper functions to handle default arguments. There is much idiomatic style to learn for this.
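A minimal sketch of the helper-function idiom that results. The names rescale and _rescale are hypothetical, made up for illustration; the point is only that the keyword is forwarded to a positional argument, where dispatch can see it.

```julia
# Public entry point: the keyword `by` cannot be dispatched on directly...
rescale(x; by=1) = _rescale(x, by)

# ...so a (hypothetical) helper takes it as a positional argument instead,
# and ordinary multiple dispatch picks the method.
_rescale(x, by::Number) = x * by      # scale by a constant
_rescale(x, by::Function) = by(x)     # apply an arbitrary transformation

println(rescale(3))            # default keyword value
println(rescale(3; by=2))      # dispatches to the Number method
println(rescale(9; by=sqrt))   # dispatches to the Function method
```

The cost is one extra function per keyword-dependent behaviour, which is the irritation mentioned above.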
3.2 Traits and dispatch
Dispatching is not always obvious.
I tend to forget an important keyword: Value Types, which are what allow one to choose a method based on the value, rather than the type, of a thing. Their usage was not obvious (to me) from the manual but explained beautifully by Tim Holy. There are dangers.
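A small sketch of value types, using Base’s Val. The corners function and the shape names are hypothetical examples, not from any library.

```julia
# Each value gets its own type, Val{:triangle}, Val{:square}, ...,
# so methods can be selected by value.
corners(::Val{:triangle}) = 3
corners(::Val{:square}) = 4

# Convenience method: lift a run-time Symbol into the type domain.
# This is where the danger lies: if `shape` is not known at compile time,
# the call below is resolved by (slow) dynamic dispatch.
corners(shape::Symbol) = corners(Val(shape))

println(corners(Val(:triangle)))
println(corners(:square))
```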
Chris Rackauckas explains Holy Traits, which are a julia-idiomatic duck typing method, explained in greater depth by Mauro3 as part of his discontinued Traits library. See also Lyndon White.
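A minimal sketch of the Holy Traits pattern as I understand it. All the type and function names here (Loudness, Dog, Cat, volume) are hypothetical; the pattern itself is the same one Base uses for things like IndexStyle.

```julia
# 1. The trait: an abstract type with one singleton subtype per trait value.
abstract type Loudness end
struct Loud <: Loudness end
struct Quiet <: Loudness end

# 2. Some unrelated types that share no common supertype.
struct Dog end
struct Cat end

# 3. The trait function maps a type to a trait value; Quiet is the default.
Loudness(::Type) = Quiet()
Loudness(::Type{Dog}) = Loud()

# 4. The public function re-dispatches on the trait value.
volume(x::T) where {T} = volume(Loudness(T), x)
volume(::Loud, x) = "loud"
volume(::Quiet, x) = "quiet"

println(volume(Dog()))
println(volume(Cat()))
```

Since the trait function depends only on the type, the compiler can usually resolve the second dispatch at compile time.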
🏗
4 Pretty printing, formatting
4.1 Strings: care and feeding
There are many formatting libraries, because everyone seems to dislike the built-in option, which eschews functions such as sensible user-defined formatting of floats.
Many alternatives seem to be based on Formatting.jl, which apes python string interpolation, and this is IMO a reasonable baseline.
There is a friendly-ish fork, Format.jl, which has a slightly different philosophy and provides an alternative string syntax, StringLiterals.
Mustache.jl also gained traction, being a generic templating syntax; it is easy to roll your own formatting from this.
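For comparison, here is the baseline as I assume it to be: the Printf standard library, which offers C-style format strings and not much else. The variable names are just for illustration.

```julia
using Printf  # standard library, no installation needed

x = 12345.6789

fixed = @sprintf("%.2f", x)        # fixed-point, two decimal places
scientific = @sprintf("%.3e", x)   # scientific notation, three decimal places
padded = @sprintf("%10.1f|", x)    # right-aligned in a field of width 10

println(fixed)
println(scientific)
println(padded)
```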
4.2 Rich display
Julia has an elaborate display system for types, as illustrated by Simon Dernisch and Henry Shurkus.
tl;dr To display an object x, invoke display(x). Say you defined MyType. To ensure MyType displays sanely, define a Base.show method for it.
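A minimal sketch of such definitions, assuming a hypothetical MyType with two fields. The two-argument show is the compact form used by print and inside containers; the three-argument form with a MIME type is what display uses at the REPL.

```julia
# Hypothetical type, for illustration only.
struct MyType
    name::String
    values::Vector{Float64}
end

# Compact one-line form: used by print, string interpolation and containers.
Base.show(io::IO, x::MyType) =
    print(io, "MyType(", x.name, ", ", length(x.values), " values)")

# Richer multi-line form: used by display in a plain-text setting.
function Base.show(io::IO, ::MIME"text/plain", x::MyType)
    println(io, "MyType named ", x.name)
    print(io, "  values: ", join(x.values, ", "))
end

x = MyType("demo", [1.0, 2.5])
display(x)   # multi-line form
println(x)   # compact form
```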
Latexify (manual) marks up certain Julia objects nicely for markdown or TeX display.
PrettyTables aims to output ASCII tables, and happens to support LaTeX, HTML and various Markdown flavours.
TexTables.jl is a specialised table renderer for scientific table output with an easy interface. It seems a little less active/popular.
More specialised, RegressionTables does statistical tables with baked-in analysis and a side-order of formatting. That feels too tightly coupled to me. Its launch announcement is a tour of the statistical ecosystem.
Matti Pastell’s useful reproducible document rendering system, Weave.jl, supports basic table displaying for LaTeX/HTML, although they recommend handballing it to Latexify.
Note that actually automatically generating table output is still more tedious than you’d like; for example, we should be using scientific notation. This generally requires writing custom formatters. Leandro Martínez walks us through that, or one can use ft_latex_sn, the built-in PrettyTables helper.
5 Approximating and interpolating functions
ApproxFun.jl does Chebyshev and Fourier approximations of given functions. This is not, at least primarily, a tool for data analysis, but for solving eigenfunction problems and suchlike using computational Hilbert space methods for functions which are otherwise difficult. In particular, we must be able to evaluate the target functions at arbitrary points to construct the interpolant, rather than at, say, provided sample points.
The useful companion package FastTransforms converts between different basis function representations.
Interpolations.jl does arbitrary-order spline interpolations of mathematical functions, but also data. This enables some clever tricks, e.g. approximate random sampling of tricky distributions.
6 Units
People sometimes assert that Julia can handle physical units in a syntactically pleasant fashion, but rarely go on to show any evidence. This capability sounds useful to me, but it’s not clear how to access it; the documentation assumes I already know. From what I can tell, I first need to actually install Unitful, which includes some useful units, and in particular Quantity types.
Then I can use units in the following fashion:
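A minimal sketch of that usage, assuming Unitful has been installed via the package manager. The sprint-timing numbers and variable names are my own illustration; u"…", uconvert, ustrip and dimension are the Unitful functions I am relying on.

```julia
using Unitful  # third-party package; assumes it is already installed

distance = 100.0u"m"
duration = 9.58u"s"

# Arithmetic carries the units along: this is a Quantity in metres per second.
speed = distance / duration
println(speed)

# Convert to another unit of the same dimension.
println(uconvert(u"km/hr", speed))

# Strip the units off to get a bare number in a chosen unit.
println(ustrip(u"km/hr", speed))

# Dimensionally inconsistent arithmetic is an error rather than a silent bug.
try
    distance + duration
catch err
    println("refused to add a length to a time: ", typeof(err))
end
```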
That is pleasant enough I suppose?
To know how this works, and also how I can invent my own units, I read Erik Engheim or Lyndon White who explain it in depth.
See SampledSignals for a concrete example of how to do things with units, such as its method definitions that handle time-to-index conversion.
7 Modules and imports
It was not obvious to me how, when creating a package that contains submodules, to import from the root module.
Suppose the package is called MyPackage, and MySubmodule is inside it.
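A sketch of one way to do it, using relative module paths. The function names root_helper and greet are hypothetical. Inside MySubmodule a single leading dot refers to MySubmodule itself and two dots refer to the enclosing module, so ..MyPackage is the root module as seen from the submodule.

```julia
module MyPackage

# Defined before the submodule so the name exists when it is imported.
root_helper() = "from root"

module MySubmodule

# `..` climbs to the enclosing module; import one name from the root module.
using ..MyPackage: root_helper

greet() = "submodule got: " * root_helper()

end # module MySubmodule

end # module MyPackage

println(MyPackage.MySubmodule.greet())
```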
8 Useful macros
9 Destructuring assignment
a.k.a. splatting. Built in. Blingin’ options are provided by the macro package Destructure.jl.
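A quick tour of the built-in forms. The variable names are hypothetical; note that, as far as I know, the slurping form needs Julia 1.6 or later and the property form needs Julia 1.7 or later.

```julia
# Plain tuple destructuring.
a, b = (1, 2)

# Slurping the remainder into a collection (Julia 1.6+).
head, tail... = [10, 20, 30]

# Destructuring by property name (Julia 1.7+).
point = (x = 1.0, y = 2.0)
(; x, y) = point

# Splatting goes the other way: a collection is spread into call arguments.
add3(p, q, r) = p + q + r
total = add3([1, 2, 3]...)

println((a, b, head, tail, x, y, total))
```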
10 Writing macros and reflecting
10.1 Intermediate representation
I constantly search for this, so how ’bout I link it? The manual pages I need are Reflection and Metaprogramming. The latter has the parsed and interpolated representation documentation as used in macros. For inspecting my code to see what the language made of it I use @code_lowered, @code_warntype, @code_llvm depending on how much I wish to trade comprehensibility for precision.
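A sketch of those three macros applied to a deliberately type-unstable toy function (unstable is a hypothetical name). In a script the macros need InteractiveUtils loaded; the REPL does that automatically.

```julia
using InteractiveUtils  # provides the @code_* macros outside the REPL

# Hypothetical example: returns an Int or a Float64 depending on the input.
unstable(x) = x > 0 ? 1 : 1.0

# Most readable: the lowered form, after desugaring but before type inference.
println(@code_lowered unstable(2))

# Adds inferred types; the Union return type shows up as a potential problem.
@code_warntype unstable(2)

# Least readable, most precise: the LLVM IR for this argument type.
@code_llvm unstable(2)
```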
10.2 What is this module?
Took me a while to work out that the current module is called @__MODULE__.
So if I want to look up a function by symbol, for example, I can pass @__MODULE__ and the symbol to getfield.
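A minimal sketch of such a lookup, using getfield on the current module. The functions double and triple are hypothetical stand-ins.

```julia
# Hypothetical functions defined in the current module.
double(x) = 2x
triple(x) = 3x

# Pick a function by a symbol that is only known at run time.
function_name = :triple
f = getfield(@__MODULE__, function_name)

println(f(5))
```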
11 Gotchas, tips
Chris Rackauckas mentions 7 Julia gotchas.
Here are some more.
11.1 Implementing standard interfaces for custom types
The type system is logical, although it’s not obvious if you are used to classical OOP. (Not a criticism.)
You want to implement a standard interface on your type so you can, e.g. iterate over it, which commonly looks like this:
for element in iterable
    # body
end
or equivalently
iter_result = iterate(iterable)
while iter_result !== nothing
    (element, state) = iter_result
    # body
    iter_result = iterate(iterable, state)
end
Here is an example of that: A range iterator which yields every nth element up to some number of elements could look like
julia> struct EveryNth
           n::Int
           start::Int
           length::Int
       end
julia> function Base.iterate(iter::EveryNth, state=(iter.start, 0))
           element, count = state
           if count >= iter.length
               return nothing
           end
           return (element, (element + iter.n, count + 1))
       end
julia> Base.length(iter::EveryNth) = iter.length
julia> Base.eltype(iter::EveryNth) = Int
(If you are lucky you might be able to inherit from AbstractArray.)
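A sketch of that lucky case, assuming random access is cheap for the type in question. EveryNthVector is a hypothetical variant of the iterator above, declared as a subtype of AbstractVector; once size and getindex are defined, iteration, collect, sum, end-indexing and friends come for free.

```julia
# Hypothetical array-flavoured variant of the range iterator.
struct EveryNthVector <: AbstractVector{Int}
    n::Int
    start::Int
    length::Int
end

# The minimal read-only array interface: size plus scalar indexing.
Base.size(v::EveryNthVector) = (v.length,)
Base.IndexStyle(::Type{EveryNthVector}) = IndexLinear()
function Base.getindex(v::EveryNthVector, i::Int)
    @boundscheck checkbounds(v, i)
    return v.start + (i - 1) * v.n
end

v = EveryNthVector(2, 0, 5)
println(collect(v))   # iteration was never defined explicitly
println(v[end])       # neither was lastindex
println(sum(v))
```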
It’s weird for me that this requires injecting your methods into another namespace; in this case, Base. That might feel gross, and it does lead to some surprising behaviour and some mind-bending namespace resolution rules for methods. Extending functions owned by other modules is everywhere in Julia. When a package does it for types it does not own either (called, in Julia argot, “type piracy”), importing that package can magically change the behaviour of another; this is clearly marked when you write a package, but not when you use the package. Anyway it works fine and I can’t imagine how to handle the multiple dispatch thing better, so deal with it.
11.2 Array slicing may copy
You are using a chunk of an existing array and don’t want to copy? Consider using views for slices, they say, which means not using slice notation but rather the view function, or the @views macro. Both of these are ugly in different ways so I cross my fingers and hope the compiler can optimise away some of this nonsense.
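A small sketch of the difference, with hypothetical variable names. Slice notation on the right-hand side allocates a new array, whereas view and @views return a SubArray that shares memory with the parent.

```julia
A = collect(1:10)

copied = A[2:4]          # slice notation: allocates a fresh array
viewed = view(A, 2:4)    # view function: a window onto A's memory
lazy = @views A[5:6]     # @views turns slice notation into views

A[2] = 100               # mutate the parent array

println(copied[1])       # the copy did not notice the change
println(viewed[1])       # the view did
println(typeof(lazy) <: SubArray)
```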
11.3 Custom containers are scruffy
If you need container types, the idiomatic way to do this is using parametric types and parametric methods and so-called orthogonal design.
The rule is: let the compiler work out the argument types in function definitions, but you should choose the types in variable definitions (i.e. when you are calling said functions).
(Thanks to Chris Rackauckas for clarifying this point for me.)
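A minimal sketch of how I read that rule. Bag, add! and total are hypothetical names: the functions say nothing specific about element types, and the concrete type is chosen once, where the container is created.

```julia
# A parametric container: T is left open in the definition.
struct Bag{T}
    items::Vector{T}
end

# Convenience constructor for an empty bag of a chosen element type.
Bag{T}() where {T} = Bag{T}(T[])

# Function definitions: keep argument types loose; the compiler
# specialises these for whatever Bag{T} it is actually called with.
add!(bag::Bag, item) = (push!(bag.items, item); bag)
total(bag::Bag) = sum(bag.items)

# Variable definitions: this is where the concrete type gets chosen.
floats = Bag{Float64}()
add!(floats, 1.5)
add!(floats, 2)          # push! converts the Int to Float64

println(total(floats))
println(eltype(floats.items))
```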
11.4 Containers of containers need parameterisation
It’s hard to work out what went wrong when you hear Type definition: expected UnionAll, got TypeVar.
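As far as I can tell, one common way to provoke an error of this kind is to apply a type parameter as if it were a parametric type. The sketch below shows that failing pattern in a comment, and a fix that parameterises on the element type and the full container type together. Ragged is a hypothetical name, and the diagnosis is my assumption rather than the only possible cause.

```julia
# Fails at definition time, because V is a type variable, not something
# that can itself be given parameters:
#
#     struct Broken{V<:AbstractVector}
#         rows::Vector{V{Float64}}
#     end

# Works: name the element type T and constrain the container type with it.
struct Ragged{T, V<:AbstractVector{T}}
    rows::Vector{V}
end

# Spell out both parameters where the variable is defined.
r = Ragged{Float64, Vector{Float64}}([[1.0, 2.0], [3.0]])

println(typeof(r))
println(sum(sum, r.rows))
```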
12 Parallel Julia
The ClusterManager system provides an OK abstraction for orchestrating a bunch of simultaneously active workers.
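A minimal sketch using the Distributed standard library on a single machine, where addprocs with just a worker count uses the default local cluster manager. The worker count and the slow_square function are arbitrary choices for illustration.

```julia
using Distributed  # standard library

# Start two worker processes on this machine (the default local manager).
addprocs(2)

# Definitions must exist on every worker that will run them.
@everywhere slow_square(x) = (sleep(0.1); x^2)

# Farm the calls out to whichever workers are free.
results = pmap(slow_square, 1:8)
println(results)

# Shut the workers down again.
rmprocs(workers())
```

Other managers, e.g. for SSH hosts or batch schedulers, plug into the same addprocs call.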