# Bayesian nonparametric statistics

Updating more dimensions than datapoints

May 30, 2016 — April 7, 2022

## 1 Useful stochastic processes

Dirichlet priors, other measure priors, Gaussian Process regression, reparameterisations etc. 🏗

## 2 Posterior updates in infinite dimensions

For now, this is just a bookmark for the general measure-theoretic notation that unifies, in principle, the various Bayesian nonparametric methods. A textbook on the general theory is Schervish (2012). Chapter 1 of Matthews (2017) is a compact introduction.

Particular applications are outlined in Matthews (2017) (GP regression) and Stuart (2010) (inverse problems).

A brief introduction to the kind of measure-theoretic notation we need in infinite-dimensional Hilbert space settings is in Alexanderian (2021), which gives Bayes’ formula as \[ \frac{d \mu_{\text{post}}^{y}}{d \mu_{\text{pr}}} \propto \pi_{\text{like}}(\boldsymbol{y} \mid m), \] where \(m\) is the unknown parameter, \(\boldsymbol{y}\) is the observed data, and the left-hand side is the Radon-Nikodym derivative of the posterior measure \(\mu_{\text{post}}^{y}\) with respect to the prior measure \(\mu_{\text{pr}}\).

They observe:

> Note that in the finite-dimensional setting the abstract form of the Bayes’ formula above can be reduced to the familiar form of Bayes’ formula in terms of PDFs. Specifically, working in finite-dimensions, with \(\mu_{\mathrm{pr}}\) and \(\mu_{\mathrm{post}}^{y}\) that are absolutely continuous with respect to the Lebesgue measure \(\lambda\), the prior and posterior measures admit Lebesgue densities \(\pi_{\mathrm{pr}}\) and \(\pi_{\mathrm{post}}\), respectively. Then, we note \[ \pi_{\mathrm{post}}(m \mid \boldsymbol{y})=\frac{d \mu_{\mathrm{post}}^{y}}{d \lambda}(m)=\frac{d \mu_{\mathrm{post}}^{y}}{d \mu_{\mathrm{pr}}}(m) \frac{d \mu_{\mathrm{pr}}}{d \lambda}(m) \propto \pi_{\mathrm{like}}(\boldsymbol{y} \mid m) \pi_{\mathrm{pr}}(m) \]
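
One practical reading of the Radon-Nikodym form is that posterior expectations are prior expectations reweighted by the likelihood, with the missing proportionality constant recovered by normalising the weights. Here is a minimal sketch of that idea as self-normalised importance sampling, checked against a conjugate Gaussian model where the answer is known in closed form. The model (standard normal prior on a scalar \(m\), observation \(y = m + \text{noise}\)), the values of `y` and `sigma`, and the helper name `posterior_expectation` are all my own illustrative assumptions, not anything taken from Alexanderian (2021).

```python
import numpy as np


def posterior_expectation(f, prior_samples, log_like):
    """Estimate E_post[f(m)] by reweighting draws from the prior.

    The normalised weights play the role of the Radon-Nikodym derivative
    d(mu_post)/d(mu_pr) evaluated at each prior draw.
    """
    log_w = log_like(prior_samples)
    log_w = log_w - np.max(log_w)   # stabilise the exponentials
    w = np.exp(log_w)
    w = w / np.sum(w)               # normalising fixes the "proportional to"
    return np.sum(w * f(prior_samples))


# Assumed toy model: m ~ N(0, 1), y | m ~ N(m, sigma^2).
rng = np.random.default_rng(0)
y, sigma = 1.0, 0.5
prior_samples = rng.standard_normal(200_000)


def log_like(m):
    # Gaussian log-likelihood up to an additive constant.
    return -0.5 * ((y - m) / sigma) ** 2


post_mean = posterior_expectation(lambda m: m, prior_samples, log_like)

# Conjugate closed form for this model: E[m | y] = y / (1 + sigma^2).
exact_mean = y / (1.0 + sigma**2)
print(post_mean, exact_mean)
```

Nothing in `posterior_expectation` refers to a Lebesgue density for the prior; it only needs prior draws and a likelihood, which is why the same recipe still makes sense, at least formally, when the prior lives on a function space. In practice the weights degenerate quickly as the data become informative, so this is an illustration of the formula rather than a recommended algorithm.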

## 3 Bayesian consistency

Consistency turns out to be potentially tricky for functional models. I am not an expert on consistency, but see Cox (1993) for some warnings about what can go wrong, and Florens and Simoni (2016) and Knapik, van der Vaart, and van Zanten (2011) for some remedies. **tl;dr** posterior credible intervals arising from over-tight priors may never cover the frequentist estimate. Further reading on this is in some classic references (Diaconis and Freedman 1986; Freedman 1999; Kleijn and van der Vaart 2006).
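
To make the coverage warning concrete, here is a toy simulation in a Gaussian sequence model, loosely in the spirit of the setting studied by Knapik, van der Vaart, and van Zanten (2011) but not a reproduction of their results. Everything specific is my assumption for illustration: observations \(y_i = \theta_{0,i} + n^{-1/2}\varepsilon_i\), a "true" sequence \(\theta_{0,i} = i^{-1/2-\beta}\), independent priors \(\theta_i \sim \mathcal{N}(0, i^{-1-2\alpha})\), truncation to finitely many coordinates, and the function name `credible_ball_covers`. The posterior is conjugate coordinate-wise, so we can check directly whether a 95% credible ball around the posterior mean contains the truth.

```python
import numpy as np


def credible_ball_covers(alpha, beta=0.5, n=10_000, n_coords=2_000,
                         n_post_draws=2_000, level=0.95, seed=0):
    """Return True if the credible ball contains the assumed truth.

    alpha sets the prior smoothness; beta sets the smoothness of the truth.
    """
    rng = np.random.default_rng(seed)
    i = np.arange(1, n_coords + 1, dtype=float)
    theta0 = i ** (-0.5 - beta)        # assumed true coefficients
    tau2 = i ** (-1.0 - 2.0 * alpha)   # prior variances

    # One noisy observation per coordinate, noise variance 1/n.
    y = theta0 + rng.standard_normal(n_coords) / np.sqrt(n)

    # Conjugate Gaussian update, coordinate by coordinate.
    shrink = n * tau2 / (1.0 + n * tau2)
    post_mean = shrink * y
    post_var = tau2 / (1.0 + n * tau2)

    # Monte Carlo estimate of the squared radius of the credible ball
    # centred at the posterior mean with posterior mass `level`.
    draws = rng.standard_normal((n_post_draws, n_coords)) * np.sqrt(post_var)
    radius2 = np.quantile(np.sum(draws**2, axis=1), level)

    return bool(np.sum((theta0 - post_mean) ** 2) <= radius2)


rough_prior_covers = credible_ball_covers(alpha=0.25)   # prior rougher than truth
smooth_prior_covers = credible_ball_covers(alpha=2.0)   # prior much smoother than truth
print(rough_prior_covers, smooth_prior_covers)
```

With these assumed settings, the over-smooth prior shrinks the high-frequency coefficients hard and reports a small posterior spread, so the squared bias dwarfs the credible radius and the ball misses the truth. The rougher prior keeps enough posterior spread to cover it. This is a single simulated dataset under hand-picked parameters, so treat it as intuition, not as evidence about rates.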

## 4 Incoming

## 5 References

*arXiv:2005.12998 [Math]*.

*Bernoulli*.

*SIAM Journal on Scientific Computing*.

*Inverse Problems & Imaging*.

*The Annals of Statistics*.

*The Annals of Statistics*.

*Econometric Theory*.

*The Annals of Statistics*.

*The Annals of Statistics*.

*The Annals of Statistics*.

*Communications for Statistical Applications and Methods*.

*Bayesian Analysis*.

*SIAM Journal on Scientific Computing*.

*Annual Review of Statistics and Its Application*.

*Theory of Statistics*. Springer Series in Statistics.

*Acta Numerica*.

*arXiv:1310.4489 [Math, Stat]*.

*ACM Computing Surveys*.

*Proceedings of the 22nd International Conference on Neural Information Processing Systems*. NIPS’09.