This is apparently what we call Bayesian inference these days. When we say Bayesian *programming*, we might mean a simple hierarchical model, but we want to emphasise the hope that we might even succeed in doing inference for very complicated models indeed, possibly ones without tractable likelihoods of any kind, maybe even Turing-complete ones. *Hope* in this context means something like “we provide the programming primitives to express, in principle, the awful crazy likelihood structure of your complicated problem, although you are on your own in demonstrating any kind of concentration or convergence for your estimates of its posterior in the light of data.”

Mostly these tools are based on Markov chain Monte Carlo (MCMC) sampling, which turns out to be a startlingly general way to grind out the necessary calculations. There are other ways, such as classic conjugate priors, variational methods, or reparameterisation flows, and many hybrids thereof.
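
To make that concrete, here is a minimal sketch of the random-walk Metropolis algorithm, the kind of loop these frameworks industrialise. The target density and all the names here are invented for illustration, not taken from any particular library.

```
# Minimal random-walk Metropolis sketch; everything here is illustrative.
import numpy as np

def log_density(theta):
    # Unnormalised log posterior; a standard Gaussian, for illustration.
    return -0.5 * np.sum(theta ** 2)

def metropolis(log_density, init, n_samples=5000, step=0.5, seed=0):
    rng = np.random.default_rng(seed)
    theta = np.asarray(init, dtype=float)
    samples = []
    for _ in range(n_samples):
        proposal = theta + step * rng.standard_normal(theta.shape)
        # Accept with probability min(1, p(proposal) / p(theta)).
        if np.log(rng.uniform()) < log_density(proposal) - log_density(theta):
            theta = proposal
        samples.append(theta.copy())
    return np.stack(samples)

samples = metropolis(log_density, init=np.zeros(2))
print(samples.mean(axis=0), samples.std(axis=0))  # ≈ [0, 0] and ≈ [1, 1]
```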

See George Ho of PyMC3/PyMC4 for an in-depth introduction to what might be desirable for solving these problems in practice.

A probabilistic programming framework needs to provide six things:

- A language or API for users to specify a model
- A library of probability distributions and transformations to build the posterior density
- At least one inference algorithm, which either draws samples from the posterior (in the case of Markov Chain Monte Carlo, MCMC) or computes some approximation of it (in the case of variational inference, VI)
- At least one optimizer, which can compute the mode of the posterior density
- An autodifferentiation library to compute gradients required by the inference algorithm and optimizer (see the sketch after this list)
- A suite of diagnostics to monitor and analyze the quality of inference
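
To illustrate the autodifferentiation point: the gradients an HMC sampler or optimizer consumes are typically obtained by running autodiff over the log posterior density. A minimal sketch with JAX; the toy conjugate-Gaussian model is invented for illustration.

```
# Sketch of the autodiff requirement: gradient of a log posterior.
# Toy model: N(0, 1) prior on mu, N(mu, 1) likelihood.
import jax
import jax.numpy as jnp

y = jnp.array([1.2, 0.7, 2.1])

def log_posterior(mu):
    log_prior = -0.5 * mu ** 2
    log_lik = -0.5 * jnp.sum((y - mu) ** 2)
    return log_prior + log_lik

grad_fn = jax.grad(log_posterior)
print(grad_fn(0.0))  # the quantity an HMC step or optimizer would consume
```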

See also Colin Carroll’s overview of several trendy frameworks. It seems fresh (post-2019?) and includes more than I do here.

## Stan

Stan is the inference toolbox for broad classes of Bayesian models and the *de facto* reference point.
If your problem *can* be handled by Stan, it is a highly recommended option.
It is usually seen in concert with brms, which makes it easier to use for various standard regression models.
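
For a taste of the workflow, here is a hedged sketch of driving Stan from Python via the cmdstanpy interface (one of several); the toy Gaussian model is mine, not from the Stan docs.

```
# Hedged sketch: compile and sample a toy Stan model via cmdstanpy.
# Assumes cmdstanpy and a CmdStan installation are available.
from cmdstanpy import CmdStanModel

stan_code = """
data {
  int<lower=0> N;
  vector[N] y;
}
parameters {
  real mu;
  real<lower=0> sigma;
}
model {
  mu ~ normal(0, 10);
  sigma ~ cauchy(0, 5);
  y ~ normal(mu, sigma);
}
"""
with open("gaussian.stan", "w") as f:
    f.write(stan_code)

model = CmdStanModel(stan_file="gaussian.stan")  # compiles the model
fit = model.sample(data={"N": 3, "y": [1.2, 0.7, 2.1]})  # NUTS by default
print(fit.summary())
```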

See the Stan notebook.

## Edward/Edward2

From Blei’s lab, Edward leverages trendy deep learning machinery, i.e. TensorFlow, for variational Bayes and such.

This is now baked into TensorFlow as a probabilistic programming interface.

## TensorFlow Probability

The TensorFlow entrant: low-level and messy. It is used in Edward2, above, but is presumably more basic.
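
For flavour, a hedged sketch of that low-level style, building a joint distribution by hand with `tfd.JointDistributionSequential`; the toy model is invented for illustration.

```
# Hedged sketch of TFP's low-level style: assemble a joint distribution,
# then sample it and evaluate log-densities yourself. Toy model invented.
import tensorflow_probability as tfp

tfd = tfp.distributions

joint = tfd.JointDistributionSequential([
    tfd.Normal(loc=0., scale=1.),             # prior on mu
    lambda mu: tfd.Normal(loc=mu, scale=1.),  # likelihood given mu
])

mu, y = joint.sample()        # forward-sample the generative model
lp = joint.log_prob([mu, y])  # joint log-density, e.g. to feed an MCMC kernel
print(lp)
```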

## Pyro

pytorch + bayes = pyro. For the rationale, see the pyro launch announcement:

We believe the critical ideas to solve AI will come from a joint effort among a worldwide community of people pursuing diverse approaches. By open sourcing Pyro, we hope to encourage the scientific world to collaborate on making AI tools more flexible, open, and easy-to-use. We expect the current (alpha!) version of Pyro will be of most interest to probabilistic modelers who want to leverage large data sets and deep networks, PyTorch users who want easy-to-use Bayesian computation, and data scientists ready to explore the ragged edge of new technology.
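
For flavour, a hedged sketch of the Pyro idiom: a model is an ordinary Python function containing `pyro.sample` statements, and a NUTS kernel does the heavy lifting. The toy model is invented for illustration.

```
# Hedged sketch of Pyro: model as a Python function, inference via NUTS.
import torch
import pyro
import pyro.distributions as dist
from pyro.infer import MCMC, NUTS

data = torch.tensor([1.2, 0.7, 2.1])

def model(y):
    mu = pyro.sample("mu", dist.Normal(0., 10.))
    with pyro.plate("data", len(y)):
        pyro.sample("obs", dist.Normal(mu, 1.), obs=y)

mcmc = MCMC(NUTS(model), num_samples=500, warmup_steps=200)
mcmc.run(data)
print(mcmc.get_samples()["mu"].mean())
```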

## Numpyro

NumPyro uses JAX for autodiff and reputedly comes from the creators of Pyro. It reportedly lacks the second-order derivatives that some HMC variants use, but looks sleek.
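
The API closely mirrors Pyro’s, swapping torch tensors for JAX arrays and threading an explicit random key; a hedged sketch with an invented toy model:

```
# Hedged sketch of NumPyro: same idiom as Pyro, on JAX.
import jax.numpy as jnp
from jax import random
import numpyro
import numpyro.distributions as dist
from numpyro.infer import MCMC, NUTS

y = jnp.array([1.2, 0.7, 2.1])

def model(y):
    mu = numpyro.sample("mu", dist.Normal(0., 10.))
    numpyro.sample("obs", dist.Normal(mu, 1.), obs=y)

mcmc = MCMC(NUTS(model), num_warmup=200, num_samples=500)
mcmc.run(random.PRNGKey(0), y)  # JAX requires an explicit PRNG key
print(mcmc.get_samples()["mu"].mean())
```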

## pyprob

`pyprob` (Le, Baydin, and Wood 2017) is a PyTorch-based library for probabilistic programming and inference compilation. The main focus of this library is on coupling existing simulation codebases with probabilistic inference with minimal intervention.

The main advantage of pyprob, compared with other probabilistic programming languages like Pyro, is a fully automatic amortized inference procedure based on importance sampling. pyprob only requires a generative model to be specified. In particular, pyprob allows for efficient inference using inference compilation, which trains a recurrent neural network as a proposal network.

In Pyro such an inference network requires the user to explicitly define the control flow of the network, because Pyro runs the inference network and generative model sequentially. In pyprob, however, the generative model and inference network run concurrently, so the control flow of the model is directly used to train the inference network. This alleviates the need for manually defining its control flow.
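
A sketch reconstructed from memory of the pyprob README; treat the class and method names as assumptions, since the API may have drifted between versions.

```
# Hedged sketch of pyprob: the model is plain Python control flow; observed
# values are bound at inference time. API details may vary by version.
import pyprob
from pyprob import Model
from pyprob.distributions import Normal

class GaussianUnknownMean(Model):
    def forward(self):
        mu = pyprob.sample(Normal(0, 5))           # latent variable
        pyprob.observe(Normal(mu, 2), name="obs")  # likelihood
        return mu

model = GaussianUnknownMean()
posterior = model.posterior_distribution(
    num_traces=1000,
    inference_engine=pyprob.InferenceEngine.IMPORTANCE_SAMPLING,
    observe={"obs": 8.0},
)
print(posterior.mean)
```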

The flagship application seems to be etalumis (Baydin et al. 2019), a probabilistic programming framework with an emphasis, AFAICT, on Bayesian inverse problems.

## Mamba.jl

Mamba is an open platform for the implementation and application of MCMC methods to perform Bayesian analysis in julia. The package provides a framework for (1) specification of hierarchical models through stated relationships between data, parameters, and statistical distributions; (2) block-updating of parameters with samplers provided, defined by the user, or available from other packages; (3) execution of sampling schemes; and (4) posterior inference. It is intended to give users access to all levels of the design and implementation of MCMC simulators to particularly aid in the development of new methods.

Several software options are available for MCMC sampling of Bayesian models. Individuals who are primarily interested in data analysis, unconcerned with the details of MCMC, and have models that can be fit in JAGS, Stan, or OpenBUGS are encouraged to use those programs.

Mamba is intended for individuals who wish to have access to lower-level MCMC tools, are knowledgeable of MCMC methodologies, and have experience, or wish to gain experience, with their application. The package also provides stand-alone convergence diagnostics and posterior inference tools, which are essential for the analysis of MCMC output regardless of the software used to generate it.

## Turing.jl

`Turing.jl` is a Julia library for (universal) probabilistic programming. Current features include:

- Universal probabilistic programming with an intuitive modelling interface
- Hamiltonian Monte Carlo (HMC) sampling for differentiable posterior distributions
- Particle MCMC sampling for complex posterior distributions involving discrete variables and stochastic control flows
- Gibbs sampling that combines particle MCMC and HMC

It is one of many Julia options, and includes a flashy MCMC component, `AdvancedHMC.jl`.

## Gen

`Gen` simplifies the use of probabilistic modeling and inference, by providing modeling languages in which users express models, and high-level programming constructs that automate aspects of inference. Like some probabilistic programming research languages, Gen includes universal modeling languages that can represent any model, including models with stochastic structure, discrete and continuous random variables, and simulators. However, Gen is distinguished by the flexibility that it affords to users for customizing their inference algorithm.

Gen’s flexible modeling and inference programming capabilities unify symbolic, neural, probabilistic, and simulation-based approaches to modeling and inference, including causal modeling, symbolic programming, deep learning, hierarchical Bayesian modeling, graphics and physics engines, and planning and reinforcement learning.

There is an impressive talk demonstrating how you would interactively clean data using it.

## Miscellaneous julia options

`DynamicHMC.jl` does Hamiltonian/NUTS sampling in a raw likelihood setting. Possibly it is a competitor of `Klara.jl`, the JuliaStats MCMC.

Miletus is a financial product and term-structure modeling package available for quant stuff in Julia as part of the paid package offerings in finance, although it looks like it is also freely available?

## PyMC3/PyMC4

PyMC3 is Python + Theano. PyMC4 will depend upon TensorFlow.

See Chris Fonnesbeck’s example in python.

Thomas Wiecki’s *Bayesian Deep Learning* shows how to do some variants with PyMC3.
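
For flavour, a hedged sketch of the PyMC3 idiom: the model is declared inside a context manager, then `pm.sample` runs NUTS. The toy model is invented for illustration.

```
# Hedged sketch of PyMC3: declare a model in a context block, then sample.
import numpy as np
import pymc3 as pm

y = np.array([1.2, 0.7, 2.1])

with pm.Model():
    mu = pm.Normal("mu", mu=0.0, sigma=10.0)        # prior
    pm.Normal("obs", mu=mu, sigma=1.0, observed=y)  # likelihood
    trace = pm.sample(500, tune=200)                # NUTS by default

print(trace["mu"].mean())
```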

## Greta

greta models are written right in R, so there’s no need to learn another language like BUGS or Stan.

greta uses Google TensorFlow.

I wonder *how* it uses Google Tensorflow.

## Soss.jl

Soss is a library for probabilistic programming.

Let’s jump right in with a simple linear model:

```
using Soss

m = @model X begin
    β ~ Normal() |> iid(size(X, 2))
    y ~ For(eachrow(X)) do x
        Normal(x' * β, 1)
    end
end;
```

In Soss, models are first-class and function-like, and “applying” a model to its arguments gives a joint distribution.

Just a few of the things we can do in Soss:

- Sample from the (forward) model
- Condition a joint distribution on a subset of parameters
- Have arbitrary Julia values (yes, even other models) as inputs or outputs of a model
- Build a new model for the predictive distribution, by assigning parameters to particular values

## Inferpy

InferPy seems to be a higher-level competitor to Edward2?

## Zhusuan

ZhuSuan is a python probabilistic programming library for Bayesian deep learning, which conjoins the complementary advantages of Bayesian methods and deep learning. ZhuSuan is built upon Tensorflow. Unlike existing deep learning libraries, which are mainly designed for deterministic neural networks and supervised tasks, ZhuSuan provides deep learning style primitives and algorithms for building probabilistic models and applying Bayesian inference. The supported inference algorithms include:

- Variational inference with programmable variational posteriors, various objectives and advanced gradient estimators (SGVB, REINFORCE, VIMCO, etc.).
- Importance sampling for learning and evaluating models, with programmable proposals.
- Hamiltonian Monte Carlo (HMC) with parallel chains, and optional automatic parameter tuning.

## Church/Anglican

Church is a general-purpose Turing-complete Monte Carlo lisp-derivative, which is unbearably slow but does some reputedly cute tricks with modeling human problem-solving, and other likelihood-free methods, according to creators Noah Goodman and Joshua Tenenbaum.

See also Anglican, which is the same but different, being built in Clojure, and hence also able to leverage ClojureScript in the browser.

## WebPPL

WebPPL is a successor to Church designed as a teaching language for probabilistic reasoning in the browser. If you like Javascript ML.

## BAT

See also BAT, the Bayesian Analysis Toolkit, which does sophisticated Bayes modelling, although AFAICT it uses a fairly basic Metropolis-Hastings sampler?

## References

Cusumano-Towner, Marco F., Feras A. Saad, Alexander K. Lew, and Vikash K. Mansinghka. 2019. “Gen: A General-Purpose Probabilistic Programming System with Programmable Inference.” In *Proceedings of the 40th ACM SIGPLAN Conference on Programming Language Design and Implementation*, 221–36. PLDI 2019. New York, NY, USA: ACM. https://doi.org/10.1145/3314221.3314642.

*Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation*, 571–85. PLDI 2018. New York, NY, USA: ACM. https://doi.org/10.1145/3192366.3192399.

Cusumano-Towner, Marco F., and Vikash K. Mansinghka. 2018. “A Design Proposal for Gen: Probabilistic Programming with Fast Custom Inference via Code Generation.” In *Proceedings of the 2nd ACM SIGPLAN International Workshop on Machine Learning and Programming Languages*, 52–57. MAPL 2018. New York, NY, USA: ACM. https://doi.org/10.1145/3211346.3211350.

Gelman, Andrew, Daniel Lee, and Jiqiang Guo. 2015. “Stan: A Probabilistic Programming Language for Bayesian Inference and Optimization.” *Journal of Educational and Behavioral Statistics* 40 (5): 530–43. https://doi.org/10.3102/1076998615606113.

Carpenter, Bob, Andrew Gelman, Matthew D. Hoffman, Daniel Lee, Ben Goodrich, Michael Betancourt, Marcus Brubaker, Jiqiang Guo, Peter Li, and Allen Riddell. 2017. “Stan: A Probabilistic Programming Language.” *Journal of Statistical Software* 76 (1). https://doi.org/10.18637/jss.v076.i01.

*Proceedings of the ACM on Programming Languages* 3 (January): 1–30. https://doi.org/10.1145/3290348.

Le, Tuan Anh, Atılım Güneş Baydin, and Frank Wood. 2017. “Inference Compilation and Universal Probabilistic Programming.” In *Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS)*, 54:1338–48. Proceedings of Machine Learning Research. Fort Lauderdale, FL, USA: PMLR. http://arxiv.org/abs/1610.09900.

Salvatier, John, Thomas V. Wiecki, and Christopher Fonnesbeck. 2016. “Probabilistic Programming in Python Using PyMC3.” *PeerJ Computer Science* 2 (April): e55. https://doi.org/10.7717/peerj-cs.55.

Tran, Dustin, Matthew D. Hoffman, Rif A. Saurous, Eugene Brevdo, Kevin Murphy, and David M. Blei. 2017. “Deep Probabilistic Programming.” In *ICLR*. http://arxiv.org/abs/1701.03757.