# Simulation-based inference

December 24, 2014 — November 23, 2023

Warning: This is chaos right now; I’m consolidating notebooks.

Suppose we have access to a simulator of a system of interest, and if we knew the “right” inputs we could get behaviour from it which matched some observations we have made of a related phenomenon in the world. Suppose further that the simulator is pretty messy, so we do not have access to the likelihood. Can we still do statistics, e.g. inferring the parameters of the simulator which would give rise to the observations we have made?

Oh my, what a variety of ways we can try.

There are various families of methods here; some try to work purely in samples; others try to approximate the likelihood. I am not sure how all the methods relate to one another. But let us mention some.

Cranmer, Brehmer, and Louppe (2020) attempt to develop a taxonomy. They certainly make likelihood-free methods sound popular in machine learning for physics.

## 1 Neural likelihood estimation

As summarised in Cranmer, Brehmer, and Louppe (2020), this family incorporates

- Neural Posterior Estimation (amortized NPE and sequential SNPE) (Deistler, Goncalves, and Macke 2022; Glöckler, Deistler, and Macke 2022; Greenberg, Nonnenmacher, and Macke 2019; Papamakarios and Murray 2016),
- Neural Likelihood Estimation ((S)NLE) (Boelts et al. 2022; Lueckmann et al. 2017; Papamakarios, Sterratt, and Murray 2019), and
- Neural Ratio Estimation ((S)NRE) (Delaunoy et al. 2022; Durkan, Murray, and Papamakarios 2020; Hermans, Begy, and Louppe 2020; Miller, Weniger, and Forré 2022) (see also density ratio).

See the Mackelab sbi page for several implementations:

Goal: Algorithmically identify mechanistic models which are consistent with data. Each of the methods above needs three inputs: a candidate mechanistic model, prior knowledge or constraints on model parameters, and observational data (or summary statistics thereof).

The methods then proceed by

- sampling parameters from the prior followed by simulating synthetic data from these parameters,
- learning the (probabilistic) association between data (or data features) and underlying parameters, i.e. to learn statistical inference from simulated data. The way in which this association is learned differs between the above methods, but all use deep neural networks.
- This learned neural network is then applied to empirical data to derive the full space of parameters consistent with the data and the prior, i.e. the posterior distribution. High posterior probability is assigned to parameters which are consistent with both the data and the prior, low probability to inconsistent parameters. While SNPE directly learns the posterior distribution, SNLE and SNRE need an extra MCMC sampling step to construct a posterior.
- If needed, an initial estimate of the posterior can be used to adaptively generate additional informative simulations.
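The steps above can be sketched on a toy problem without any deep learning machinery. The following is a minimal illustration, not the sbi package's API: the simulator, the standard normal prior and the function names are all hypothetical, and the "network" is swapped for a linear-Gaussian conditional density \(q(\theta \mid x) = \mathcal{N}(a x + b, s^2)\) fitted by maximum likelihood on simulated pairs, which is the same objective that posterior estimation uses with a richer density estimator.

```python
import math
import random


def simulator(theta, rng, noise_sd=0.5):
    """Hypothetical stand-in for a black-box simulator: x = theta + noise."""
    return theta + rng.gauss(0.0, noise_sd)


def fit_linear_gaussian_posterior(thetas, xs):
    """Fit q(theta | x) = N(a*x + b, s2) by maximum likelihood.

    For a Gaussian with a linear mean, maximising the average log-density
    of the simulated (theta, x) pairs reduces to least squares.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_t = sum(thetas) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxt = sum((x - mean_x) * (t - mean_t) for x, t in zip(xs, thetas))
    a = sxt / sxx
    b = mean_t - a * mean_x
    s2 = sum((t - (a * x + b)) ** 2 for x, t in zip(xs, thetas)) / n
    return a, b, s2


rng = random.Random(0)

# Step 1: sample parameters from the prior (assumed N(0, 1)) and simulate.
thetas = [rng.gauss(0.0, 1.0) for _ in range(20000)]
xs = [simulator(t, rng) for t in thetas]

# Step 2: learn the association between simulated data and parameters.
a, b, s2 = fit_linear_gaussian_posterior(thetas, xs)

# Step 3: apply the learned conditional density to an observation.
x_obs = 1.0
post_mean = a * x_obs + b
post_sd = math.sqrt(s2)
posterior_samples = [rng.gauss(post_mean, post_sd) for _ in range(1000)]

print(f"approximate posterior: N({post_mean:.3f}, {post_sd:.3f}^2)")
```

For this conjugate toy model the exact posterior is \(\mathcal{N}(0.8\,x, 0.2)\), so the fitted \(a\) and \(s^2\) can be checked against 0.8 and 0.2. Because the fit is amortized over \(x\), a new observation needs no further simulation.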

Code here: mackelab/sbi: Simulation-based inference in PyTorch

Compare to contrastive learning.

## 2 Indirect inference

A.k.a. the *auxiliary method*.

In the (older?) frequentist framing, you can get through an undergraduate program in statistics without simulation-based inference arising. However, I am pretty sure it is required for economists and ecologists.

Quoting Cosma:

[…] your model is too complicated for you to appeal to any of the usual estimation methods of statistics. […] there is no way to even calculate the likelihood of a given data set \(x_1, x_2, \ldots, x_t \equiv x_1^t\) under parameters \(\theta\) in closed form, which would rule out even numerical likelihood maximization, to say nothing of Bayesian methods […] Yet you can simulate; it seems like there should be some way of saying whether the simulations look like the data. This is where indirect inference comes in […] Introduce a new model, called the “auxiliary model”, which is mis-specified and typically not even generative, but is easily fit to the data, and to the data alone. (By that last I mean that you don’t have to impute values for latent variables, etc., etc., even though you might know those variables exist and are causally important.) The auxiliary model has its own parameter vector \(\beta\), with an estimator \(\hat{\beta}\). These parameters describe aspects of the distribution of observables, and the idea of indirect inference is that we can estimate the generative parameters \(\theta\) by trying to match those aspects of observations, by trying to match the auxiliary parameters.
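A minimal sketch of that recipe, under assumptions of my own choosing rather than anything in the quoted text: the generative model is an MA(1) process \(y_t = e_t + \theta e_{t-1}\), the auxiliary model is an AR(1) regression whose coefficient plays the role of \(\hat{\beta}\), and \(\theta\) is chosen by grid search to match the auxiliary estimate on simulated data to the one on observed data. All function names are hypothetical.

```python
import random


def simulate_ma1(theta, shocks):
    """Generative model: y_t = e_t + theta * e_{t-1}, driven by given shocks."""
    return [shocks[t] + theta * shocks[t - 1] for t in range(1, len(shocks))]


def fit_auxiliary(y):
    """Auxiliary model: AR(1) coefficient by least squares.

    Mis-specified for MA(1) data, but trivially easy to fit.
    """
    num = sum(y[t] * y[t - 1] for t in range(1, len(y)))
    den = sum(v * v for v in y[:-1])
    return num / den


def indirect_inference(y_obs, grid, n_sim, seed=1):
    """Pick the theta whose simulated auxiliary estimate best matches the data."""
    beta_obs = fit_auxiliary(y_obs)
    rng = random.Random(seed)
    # Common random numbers: reuse the same shocks for every candidate theta
    # so the loss varies smoothly with theta rather than with simulation noise.
    shocks = [rng.gauss(0.0, 1.0) for _ in range(n_sim + 1)]

    def loss(theta):
        beta_sim = fit_auxiliary(simulate_ma1(theta, shocks))
        return (beta_sim - beta_obs) ** 2

    return min(grid, key=loss)


rng = random.Random(0)
true_theta = 0.5
obs_shocks = [rng.gauss(0.0, 1.0) for _ in range(5001)]
y_obs = simulate_ma1(true_theta, obs_shocks)

# Restrict to 0 <= theta < 1: theta and 1/theta imply the same lag-1
# autocorrelation, so the auxiliary parameter cannot tell them apart.
grid = [i / 50 for i in range(48)]
theta_hat = indirect_inference(y_obs, grid, n_sim=10000)
print(f"true theta = {true_theta}, indirect estimate = {theta_hat}")
```

The grid search is only for transparency; with a vector-valued \(\beta\) one would typically minimise a weighted quadratic distance with a numerical optimiser instead.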

Aaron King’s lab at UMichigan stamped its mark on a lot of this research. One wonders whether the optimal summary statistic can be learned from the data. Apparently yes?

I gather the pomp R package does some simulation-based inference, but I have not checked in for a while so there might be broader and/or fresher options.

## 3 Scoring rules

See scoring rules (Gneiting and Raftery 2007; Pacchiardi and Dutta 2022). NB, these are calibration scores, not Fisher scores.

### 3.1 Energy distances

I thought I knew what this was but I think not. The fact that there are so many grandiose publications here (Gneiting and Raftery 2007; Székely and Rizzo 2013, 2017) leads me to suspect there is more going on than the obvious? TBC.
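For reference, the basic definition as I understand it from Székely and Rizzo (2013), for distributions \(F\) and \(G\) with finite first moments, with \(X, X' \sim F\) and \(Y, Y' \sim G\) all independent:

\[
D^2(F, G) = 2\,\mathbb{E}\|X - Y\| - \mathbb{E}\|X - X'\| - \mathbb{E}\|Y - Y'\| \geq 0,
\]

with equality if and only if \(F = G\). Every term is an expectation of a pairwise distance, so each can be estimated by averaging over samples, which is what makes it usable when all we can do is simulate.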

### 3.2 MMD

A particularly convenient discrepancy to use for simulation-based problems is the MMD, because it can be evaluated without reference to a density. See Maximum Mean Discrepancy.
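To make "evaluated without reference to a density" concrete, here is a small sketch of the standard unbiased estimator of the squared MMD from two samples. The Gaussian kernel, the bandwidth of 1 and the toy "observed" and "simulated" samples are assumptions made for illustration.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel on scalars; the bandwidth is an assumed default."""
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))


def mmd2_unbiased(xs, ys, kernel=gaussian_kernel):
    """Unbiased estimate of squared MMD between the samples xs and ys.

    Only kernel evaluations on sample pairs are needed, never a density.
    """
    n, m = len(xs), len(ys)
    k_xx = sum(kernel(xs[i], xs[j]) for i in range(n) for j in range(n) if i != j)
    k_yy = sum(kernel(ys[i], ys[j]) for i in range(m) for j in range(m) if i != j)
    k_xy = sum(kernel(x, y) for x in xs for y in ys)
    return k_xx / (n * (n - 1)) + k_yy / (m * (m - 1)) - 2.0 * k_xy / (n * m)


rng = random.Random(0)
observed = [rng.gauss(0.0, 1.0) for _ in range(200)]
sim_good = [rng.gauss(0.0, 1.0) for _ in range(200)]  # simulator at a good parameter
sim_bad = [rng.gauss(2.0, 1.0) for _ in range(200)]   # simulator at a poor parameter

mmd_good = mmd2_unbiased(observed, sim_good)
mmd_bad = mmd2_unbiased(observed, sim_bad)
print(f"MMD^2 (good parameter): {mmd_good:.4f}")
print(f"MMD^2 (poor parameter): {mmd_bad:.4f}")
```

The estimate should sit near zero when the simulated and observed samples share a distribution and grow as they diverge, so it can serve as the discrepancy to minimise over simulator parameters. Being unbiased, it can come out slightly negative.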

## 4 Approximate Bayesian Computation

## 5 Incoming

## 6 References

*Entropy*.

*Proceedings of the National Academy of Sciences*.

*arXiv:1702.05390 [Physics, Stat]*.

*eLife*.

*Proceedings of the National Academy of Sciences*.

*The Annals of Applied Statistics*.

*Journal of The Royal Society Interface*.

*Ecology*.

*Journal of Statistical Software*.

*Proceedings of the National Academy of Sciences*.

*arXiv:2102.07850 [Cs, Stat]*.

*Biometrika*.

*Proceedings of the National Academy of Sciences*.

*The Econometrics Journal*.

*Biometrika*.

*Computer Physics Communications*.

*arXiv:2202.04744 [Cs, Stat]*.

*Bayesian Analysis*.

*Journal of Econometrics*, The interface between econometrics and economic theory.

*arXiv:2103.02407 [Stat]*.

*Proceedings of the 37th International Conference on Machine Learning*. ICML’20.

*Statistical Science*.

*arXiv:1902.03175 [Cs, Stat]*.

*arXiv:1501.01265 [Stat]*.

*Journal of Econometrics*.

*Econometric Theory*.

*Journal of the American Statistical Association*.

*Journal of the American Statistical Association*.

*eLife*.

*Journal of Econometrics*.

*Journal of Applied Econometrics*.

*Proceedings of the 36th International Conference on Machine Learning*.

*Bayesian Analysis*.

*The Journal of Machine Learning Research*.

*Journal of The Royal Society Interface*.

*arXiv:1903.04057 [Cs, Stat]*.

*The Annals of Statistics*.

*Proceedings of the National Academy of Sciences*.

*Statistical Science*.

*Ecological Monographs*.

*Symposium on Advances in Approximate Bayesian Inference*.

*AISTATS*.

*Proceedings of the 31st International Conference on Neural Information Processing Systems*. NIPS’17.

*Proceedings of the 32nd International Conference on Neural Information Processing Systems*. NIPS’18.

*arXiv:2104.07359 [Math, Stat]*.

*Mathematical Methods of Statistics 19*.

*Statistics and Computing*.

*arXiv:2104.03889 [Stat]*.

*Advances in Neural Information Processing Systems 29*.

*Journal of Machine Learning Research*.

*Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics*.

*Biometrika*.

*arXiv:2011.08644 [Stat]*.

*Handbook of Approximate Bayesian Computation*.

*Journal of Applied Econometrics*.

*The New Palgrave Dictionary of Economics*.

*arXiv:1808.00973 [Hep-Ph, Physics:physics, Stat]*.

*Journal of Statistical Planning and Inference*.

*Annual Review of Statistics and Its Application*.

*Nature*.

*Computational Statistics*.