Julia ML etc

Stats/ML and also DSP in Julia.

Statistics, probability and data analysis

Hayden Klok and Yoni Nazarathy are writing a free Julia Statistics textbook which seems a thorough introduction to statistics as well as to Julia, albeit statistics in a classical frame that won’t be fashionable with either learning-theory or Bayesian types.

A good starting point for doing stuff is JuliaStats, an organisation which produces many statistics megapackages for kernel density estimates, generalised linear models, loess etc. Install them all using StatsKit:

using StatsKit
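As a taste of what the JuliaStats packages look like in use, here is a minimal sketch of an ordinary least squares fit with GLM.jl, one of the packages in that family. The toy data and the column names x and y are made up for illustration.

```julia
using DataFrames
using GLM

# Hypothetical toy data: the line y = 1 + 2x with a small deterministic wobble.
x = collect(1.0:10.0)
wobble = [isodd(i) ? -0.1 : 0.1 for i in 1:10]
df = DataFrame(x = x, y = 1 .+ 2 .* x .+ wobble)

# Fit y ~ intercept + slope * x by ordinary least squares.
ols = lm(@formula(y ~ x), df)

# coef returns the estimates in the order (intercept, slope).
intercept, slope = coef(ols)
```

Printing ols at the REPL shows the usual coefficient table with standard errors and confidence intervals.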

Less well known but handy is F. Bagge Carlson’s TotalLeastSquares, which does neat errors-in-variables models. See Bagge Carlson, F., "Machine Learning and System Identification for Estimation in Physical Systems" (PhD Thesis 2018).
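To show what an errors-in-variables fit involves, here is a minimal sketch of the textbook total least squares estimator computed from an SVD. This is not the TotalLeastSquares package's own API; the function name tls_svd and the toy data are hypothetical.

```julia
using LinearAlgebra

# Classic total least squares for A*x ≈ y, allowing errors in both A and y.
# The solution is read off the right singular vector of [A y] that belongs
# to the smallest singular value.
function tls_svd(A::AbstractMatrix, y::AbstractVector)
    n = size(A, 2)
    V = svd([A y]).V
    v = V[:, end]              # singular values are sorted in decreasing order
    return -v[1:n] ./ v[n + 1]
end

# Noise-free check: the estimate should recover the true coefficients.
A = [1.0 0.0; 0.0 1.0; 1.0 1.0; 1.0 -1.0]
x_true = [2.0, -1.0]
y = A * x_true
x_hat = tls_svd(A, y)
```

With noisy A and y the same function returns the estimate that minimises the perturbation to the whole augmented matrix, rather than to y alone as ordinary least squares does.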

Data frames

The workhorse data structure of statistics.

Data frames are provided by DataFrames.jl.

There are some older ones you might encounter, such as DataTables.jl, which are subtly incompatible in tedious ways; these days we can ignore them. Legacy compatibility is provided by IterableTables.jl, which translates where needed between these and many other useful data sources.

One can access DataFrames (and DataTables and SQL databases and streaming data sources) using Query.jl. DataFramesMeta has also been recommended.

You can load a lot of the R standard datasets using RDatasets.

using RDatasets
iris = dataset("datasets", "iris")
neuro = dataset("boot", "neuro")

DataFrames taste better with InvertedIndices.

They can get tidyverse-like behaviour via the Pipe package.
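To make the data frame idioms above concrete, here is a small sketch using a made-up table (the column names and values are hypothetical stand-ins for a real dataset such as iris). It shows split-apply-combine and the Not selector from InvertedIndices, which DataFrames re-exports.

```julia
using DataFrames
using Statistics

# Hypothetical toy table standing in for a real dataset such as iris.
df = DataFrame(
    species = ["setosa", "setosa", "virginica", "virginica"],
    sepal_length = [5.1, 4.9, 6.3, 5.8],
    sepal_width = [3.5, 3.0, 3.3, 2.7],
)

# Split-apply-combine: mean sepal length per species.
by_species = combine(groupby(df, :species), :sepal_length => mean => :mean_length)

# Not(...) comes from InvertedIndices (re-exported by DataFrames) and selects
# every column except the named one.
measurements = df[:, Not(:species)]
```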

Frequentist statistics

Lasso and other sparse regressions are available in Lasso.jl, which reimplements the lasso algorithm in pure Julia, and GLMNet.jl, which wraps the classic Friedman Fortran code for the same. There is also (functionality unattested) an orthogonal matching pursuit one called OMP.jl, but that algorithm is simple enough to bang out oneself in an afternoon, so no stress if it doesn’t work. Incremental/online versions of (presumably exponential family) statistics are in OnlineStats.

MixedModels.jl is a Julia package providing capabilities for fitting and examining linear and generalized linear mixed-effect models. It is similar in scope to the lme4 package for R.
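On the claim above that orthogonal matching pursuit is simple enough to write oneself: here is a minimal sketch in plain Julia, using only the standard library. It is not the OMP.jl implementation; the function name omp and the toy problem are hypothetical.

```julia
using LinearAlgebra

# Greedy orthogonal matching pursuit: approximate y ≈ A*x with at most k
# nonzero entries in x.
function omp(A::AbstractMatrix, y::AbstractVector, k::Integer)
    n = size(A, 2)
    x = zeros(float(eltype(A)), n)
    support = Int[]
    residual = float.(y)
    colnorms = [norm(view(A, :, j)) for j in 1:n]
    for _ in 1:k
        # Pick the column most correlated with the current residual.
        scores = abs.(A' * residual) ./ colnorms
        scores[support] .= -Inf          # never reselect a column
        push!(support, argmax(scores))
        # Refit by least squares on the selected columns only.
        coef = A[:, support] \ y
        residual = y - A[:, support] * coef
        x .= 0
        x[support] = coef
        # Stop early once the residual is numerically zero.
        norm(residual) <= 1e-10 * norm(y) && break
    end
    return x
end

# Toy problem: a 2-sparse signal observed through an identity dictionary.
A = Matrix{Float64}(I, 5, 5)
x_true = [0.0, 3.0, 0.0, -2.0, 0.0]
y = A * x_true
x_hat = omp(A, y, 2)
```

A real dictionary would have more columns than rows and correlated atoms, in which case recovery is only guaranteed under incoherence conditions.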

Probabilistic programming

Probabilistic programming! Bayesian inference considered broadly! Several options under the probabilistic programming page are based on Julia, specifically Turing.jl, Mamba.jl, Gen, DynamicHMC, Klara.jl, and probably others.

Machine learning

Let’s put the automatic differentiation, the optimizers and the samplers together to do differentiable learning!

The deep learning toolkits have shorter feature lists than the lengthy ones of those fancy Python/C++ libraries (e.g. mobile app building and cuDNN-backed optimisations are all less present in Julia libraries). But maybe the elegance/performance of Julia makes some of those features irrelevant? I for one don’t care about most of those because I’m a researcher, not a deployer.

Having said that, TensorFlow.jl gets all the features, because it invokes C++ TensorFlow. Surely one misses the benefit of Julia this way, since there are two different array-processing infrastructures to pass data between, and a different approach to JIT versus pre-compiled execution. Or no?

Flux.jl sounds like a reimplementation of Tensorflow-style differentiable programming inside Julia, which strikes me as the right way to do this to benefit from the end-to-end-optimised design philosophy of Julia.

Flux is a library for machine learning. It comes “batteries-included” with many useful tools built in, but also lets you use the full power of the Julia language where you need it. The whole stack is implemented in clean Julia code (right down to the GPU kernels) and any part can be tweaked to your liking.

It’s missing some features of Tensorflow, but includes compensatory surprising/unique feature combinations. GPU support supposes that CuArrays can represent all the operations I need and will perform them optimally, and that I don’t need any fancy DNN-specific GPU optimizations. I suspect this requires careful footwork to function. This is dubious. For example, CuArrays do not support all the FFT operations I want, such as the Discrete Cosine Transform. However, maybe it is usually enough.

Its end-to-end Julia philosophy supports neat tricks such as DiffEqFlux (see below), which makes Neural ODEs sort-of simple to create.
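For a flavour of the Flux workflow discussed above, here is a minimal training-loop sketch. It assumes a recent Flux release (roughly 0.14 or later) with the explicit-gradient API (Flux.setup, Flux.update!); older versions used a different, implicit-parameter style. The toy regression problem is made up.

```julia
using Flux

# Hypothetical toy regression problem: learn y = x1 + x2 from 64 fixed points.
x = Float32.(reshape(range(-1, 1; length = 128), 2, 64))
y = sum(x; dims = 1)

# A small multilayer perceptron and a mean squared error loss.
model = Chain(Dense(2 => 8, tanh), Dense(8 => 1))
loss(m) = Flux.mse(m(x), y)

opt_state = Flux.setup(Adam(0.01), model)
initial_loss = loss(model)
for _ in 1:200
    # gradient returns a tuple with one entry per argument of loss
    grads = Flux.gradient(loss, model)
    Flux.update!(opt_state, model, grads[1])
end
final_loss = loss(model)
```

The model is an ordinary Julia struct, so the gradient is taken with respect to the model itself rather than a separate parameter collection.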

Knet.jl is another deep learning library that claims to show the ease of implementing deep learning frameworks in Julia.

Alternatively, Mocha.jl is a belt-and-braces deep learning thing, with a library of pre-defined layers. It is deprecated and unmaintained.

If one were aiming to do that, why not do something left-field like use the dynamical systems approach to deep learning? This neat trick was popularised by Haber and Ruthotto et al, who have released some of their models as Meganet.jl. I’m curious to see how they work. (Development seems to have paused.)

There are various Gaussian Process options.

MLJ is a scikit-learn-like pipeline for data analysis in Julia which standardises model composition, automates some of the training etc. It has various adaptors for other ML systems via MLJModels.

See also

* FluxTraining.jl
* FluxML/FastAI.jl: Port of FastAI V2 API to Julia

Matrix Factorisation

NMF.jl contains reference implementations of non-negative matrix factorisation.

Laplacians.jl by Dan Spielman et al is a matrix factorisation toolkit especially for Laplacian (graph adjacency) matrices.

Differentiating, optimisation


JuMP supports many types of optimisation, including over non-continuous domains, and is part of the JuliaOpt family of confusingly diverse optimizers, which invoke various sub-families of optimizers. The famous NLopt solvers comprise one such class, and they can additionally be invoked separately.
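As an illustration of optimisation over a non-continuous domain, here is a minimal JuMP sketch of a tiny mixed-integer linear program. The choice of HiGHS as the backing solver is an assumption for the example; any LP/MIP solver that JuMP supports would do, and the problem itself is made up.

```julia
using JuMP
using HiGHS  # assumed solver choice; any LP/MIP solver supported by JuMP would do

model = Model(HiGHS.Optimizer)
set_silent(model)

# A small mixed-integer program: maximise 3x + 2y under two linear constraints.
@variable(model, x >= 0, Int)
@variable(model, 0 <= y <= 3)
@constraint(model, x + y <= 4)
@constraint(model, x + 3y <= 6)
@objective(model, Max, 3x + 2y)

optimize!(model)
best_x, best_y = value(x), value(y)
best_objective = objective_value(model)
```

Swapping the solver is a one-line change to the Model(...) call, which is most of the point of JuMP.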

Unlike NLopt and the JuMP family, Optim.jl (part of JuliaNLSolvers, a different family entirely) solves optimisation problems purely inside Julia. It has nieces and nephews such as LsqFit for Levenberg-Marquardt non-linear least squares fits. Optim.jl can invoke ForwardDiff for you to obtain derivatives automatically. It assumes mostly unconstrained problems.
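A minimal sketch of an unconstrained Optim.jl solve follows, using the Rosenbrock function as a stand-in objective. It assumes the Optim 1.x convention where the keyword autodiff = :forward requests ForwardDiff gradients; other versions may spell that option differently.

```julia
using Optim

# Rosenbrock function, a standard test problem with its minimum at (1, 1).
rosenbrock(x) = (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2

# autodiff = :forward asks Optim to obtain gradients from ForwardDiff
# instead of finite differences (assumed Optim 1.x keyword).
result = optimize(rosenbrock, [-1.2, 1.0], LBFGS(); autodiff = :forward)
x_min = Optim.minimizer(result)
```

Optim.minimum(result) gives the objective value at the solution, and printing result summarises convergence.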

Krylov.jl is a collection of Krylov-type iterative methods for large linear systems and least-squares problems.
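Here is a minimal sketch of a Krylov.jl solve, using conjugate gradients on a small sparse symmetric positive definite system. The matrix is a made-up stand-in for a genuinely large problem.

```julia
using Krylov
using LinearAlgebra
using SparseArrays

# A small symmetric positive definite tridiagonal system standing in for a
# large sparse one.
n = 50
A = spdiagm(-1 => fill(-1.0, n - 1), 0 => fill(2.0, n), 1 => fill(-1.0, n - 1))
b = ones(n)

# Conjugate gradients only needs matrix-vector products with A.
x, stats = cg(A, b)
relative_residual = norm(A * x - b) / norm(b)
```

The returned stats object records whether the solver reached its tolerance and how many iterations it took.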


Julia is a hotbed of autodiff for technical and community reasons. Such a hotbed that it’s worth discussing in the autodiff notebook.

Closely related, projects like ModelingToolkit.jl blur the lines between equations and coding, and allow easy definition of differentiable or probabilistic programming.


Chris Rackauckas is a veritable wizard with this stuff; read his blog.

Here is a tour of fun tricks with stochastic PDEs. There is a lot of tooling for this; DiffEqOperators … does something. DiffEqFlux (EZ neural ODEs) works with Flux and claims to make Neural ODEs simple. The implementation of these things in Python, for the award-winning NeurIPS paper that made them famous, was a nightmare. +1 for Julia here. The neural SDE section is mostly Julia; go check that out.

There are many PDE options. Gridap seems fresh.

Signal processing

DSP.jl has been split off from core and now needs to be installed separately. Also DirectConvolutions has sensible convolution code.
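A minimal sketch of two common DSP.jl operations follows: a direct convolution and a Butterworth low-pass filter. The signals and the cutoff are made up, and the cutoff of 0.2 is taken to be normalised so that 1 corresponds to the Nyquist frequency.

```julia
using DSP

# Linear convolution of a short signal with a two-tap moving-sum kernel.
signal = [1.0, 2.0, 3.0]
kernel = [1.0, 1.0]
smoothed = conv(signal, kernel)   # output length is 3 + 2 - 1 = 4

# A 1 Hz sine contaminated by a 40 Hz component, sampled at 100 Hz.
t = 0:0.01:1
noisy = sin.(2π .* t) .+ 0.2 .* sin.(2π .* 40 .* t)

# Fourth-order Butterworth low-pass with normalised cutoff 0.2.
lowpass = digitalfilter(Lowpass(0.2), Butterworth(4))
filtered = filt(lowpass, noisy)
```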

FFTs are provided by AbstractFFTs, which in principle wraps many FFT implementations. I don’t know if there is a GPU implementation yet, but there for sure is the classic CPU implementation provided by FFTW.jl, which uses FFTW internally.
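A minimal sketch of the CPU path through FFTW.jl: transform a pure cosine, locate its spectral peak, and invert the transform. The test signal is made up.

```julia
using FFTW

# A pure cosine with 5 cycles over 64 samples.
n = 64
t = 0:n-1
x = cos.(2π * 5 .* t ./ n)

X = fft(x)
# With 1-based indexing, bin k + 1 holds frequency k, so the peak in the
# first half of the spectrum sits at index 6.
peak_bin = argmax(abs.(X[1:n ÷ 2]))
# The inverse transform recovers the signal up to floating-point error.
x_roundtrip = real.(ifft(X))
```

For real-valued input, rfft returns only the non-redundant half of the spectrum and is usually the better choice.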

As for how to use these things, Numerical tours of data sciences has a Julia edition with lots of signal processing content.

JuliaAudio processes audio. They recommend PortAudio.jl as a real time soundcard interface, which looks sorta simple. See rkat’s example of how this works. There are useful abstractions like SampledSignals to load audio and keep the data and sample rate bundled together. Although, as SampledSignals maintainer Spencer Russell points out, AxisArrays might be the right data structure for sampled signals, and you could use SampledSignals purely for IO, and ignore its data structures thereafter.

Images.jl processes images.


Low discrepancy and other QMC stuff. Mostly I want low discrepancy sequences. There are two options with near identical interfaces; I’m not sure of the differences.

Sobol.jl claims to have been performance profiled:

] add Sobol
using Sobol
s = SobolSeq(2)
# Then draw the next point of the sequence (a 2-element vector)
x = next!(s)
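As a sketch of why one wants such sequences, here is a quasi-Monte Carlo estimate of a simple integral using the SobolSeq and next! calls shown above. The integrand and the number of points are made up for illustration.

```julia
using Sobol

# Quasi-Monte Carlo estimate of the integral of x*y over the unit square,
# whose exact value is 1/4.
s = SobolSeq(2)
n = 1024
estimate = sum(prod(next!(s)) for _ in 1:n) / n
```

Replacing next!(s) with rand(2) gives the plain Monte Carlo estimate for comparison; the low discrepancy version typically converges noticeably faster for smooth integrands.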


] add https://github.com/PieterjanRobbe/QMC.jl
using QMC
lat = LatSeq(2)