# Bayesian nonparametric statistics

Updating more dimensions than datapoints

May 30, 2016 — April 7, 2022

Tags: Bayes, functional analysis, Gaussian, generative, how do science, Monte Carlo, nonparametric, statistics, stochastic processes

## 1 Useful stochastic processes

Dirichlet priors, other measure priors, Gaussian Process regression, reparameterisations etc. 🏗
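Pending a proper write-up, here is a minimal numpy sketch of Gaussian-process regression as a Bayesian posterior update — the finite-dimensional marginal of an infinite-dimensional Gaussian prior over functions. The squared-exponential kernel and the lengthscale, signal-variance, and noise-variance values are arbitrary illustrative choices, not from any cited source.

```python
import numpy as np

def rbf_kernel(xa, xb, lengthscale=1.0, variance=1.0):
    """Squared-exponential covariance between two sets of 1-d inputs."""
    sq = (xa[:, None] - xb[None, :]) ** 2
    return variance * np.exp(-0.5 * sq / lengthscale**2)

def gp_posterior(x_train, y_train, x_test, noise_var=1e-2):
    """Posterior mean and covariance of a zero-mean GP at x_test,
    conditioned on noisy observations y_train at x_train."""
    K = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    Ks = rbf_kernel(x_train, x_test)
    Kss = rbf_kernel(x_test, x_test)
    alpha = np.linalg.solve(K, y_train)
    mean = Ks.T @ alpha
    cov = Kss - Ks.T @ np.linalg.solve(K, Ks)
    return mean, cov

# Toy data: three noisy-ish observations of sin(x).
x = np.array([-1.0, 0.0, 1.0])
y = np.sin(x)
xs = np.array([0.5])
m, C = gp_posterior(x, y, xs)  # posterior mean near sin(0.5), small variance
```

The point of the sketch is that the "infinite-dimensional" update never needs to be instantiated: only the kernel evaluated at finitely many points enters the computation.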

## 2 Posterior updates in infinite dimensions

For now, this is just a bookmark to the general measure theoretic notation that unifies, in principle, the various Bayesian nonparametric methods. A textbook on general theory is Schervish (2012). Chapter 1 of Matthews (2017) is a compact introduction.

Particular applications are outlined in Matthews (2017) (GP regression) and Stuart (2010) (inverse problems).

A brief introduction to the kind of measure-theoretic notation we need in the infinite-dimensional Hilbert space setting is in Alexanderian (2021), which gives Bayes’ formula as $\frac{d \mu_{\text{post}}^{y}}{d \mu_{\text{pr}}} \propto \pi_{\text{like}}(\boldsymbol{y} \mid m),$ where the left-hand side is the Radon-Nikodym derivative of $\mu_{\text{post}}^{y}$ with respect to $\mu_{\text{pr}}$.
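One concrete reading of that formula: the posterior is the prior reweighted by the likelihood, so posterior expectations can be approximated by self-normalised importance sampling from the prior, with the unnormalised Radon-Nikodym derivative as the weight. A minimal sketch, using a hypothetical scalar conjugate model (prior $N(0,1)$, one observation with unit Gaussian noise) chosen purely so the exact answer is known:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical model: prior mu_pr = N(0, 1) over a scalar parameter m,
# likelihood pi_like(y | m) = N(y; m, 1) for a single observation y = 1.0.
y = 1.0
m_prior = rng.normal(0.0, 1.0, size=100_000)  # draws from mu_pr
log_like = -0.5 * (y - m_prior) ** 2          # log pi_like up to a constant
w = np.exp(log_like - log_like.max())         # unnormalised RN weights
w /= w.sum()                                  # self-normalise

# Approximates E[m | y]; the exact conjugate answer here is y/2 = 0.5.
post_mean = np.sum(w * m_prior)
```

Nothing in the weighting step required a Lebesgue density for the prior, which is why this picture survives the passage to infinite dimensions, where no such density exists.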

They observe:

> Note that in the finite-dimensional setting the abstract form of the Bayes’ formula above can be reduced to the familiar form of Bayes’ formula in terms of PDFs. Specifically, working in finite dimensions, with $\mu_{\mathrm{pr}}$ and $\mu_{\mathrm{post}}^{y}$ that are absolutely continuous with respect to the Lebesgue measure $\lambda$, the prior and posterior measures admit Lebesgue densities $\pi_{\mathrm{pr}}$ and $\pi_{\mathrm{post}}$, respectively. Then, we note $\pi_{\mathrm{post}}(m \mid \boldsymbol{y})=\frac{d \mu_{\mathrm{post}}^{y}}{d \lambda}(m)=\frac{d \mu_{\mathrm{post}}^{y}}{d \mu_{\mathrm{pr}}}(m) \frac{d \mu_{\mathrm{pr}}}{d \lambda}(m) \propto \pi_{\mathrm{like}}(\boldsymbol{y} \mid m) \pi_{\mathrm{pr}}(m)$
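That finite-dimensional reduction is easy to verify numerically: multiplying a Lebesgue prior density by the likelihood on a grid and renormalising recovers the closed-form conjugate posterior. The model below (prior $N(0,1)$, observation $y=1$ with unit noise, hence posterior $N(0.5, 0.5)$) is an illustrative assumption of mine, not taken from the quoted text.

```python
import numpy as np

# Hypothetical conjugate check: prior pi_pr = N(0, 1), likelihood
# pi_like(y | m) = N(y; m, 1) with y = 1.0, so the posterior is N(0.5, 0.5).
y = 1.0
m = np.linspace(-5, 5, 2001)
dm = m[1] - m[0]

prior = np.exp(-0.5 * m**2)          # unnormalised Lebesgue density of mu_pr
like = np.exp(-0.5 * (y - m) ** 2)   # likelihood as a function of m
post = prior * like                  # pi_like * pi_pr, per the formula above
post /= post.sum() * dm              # normalise numerically on the grid

# Closed-form conjugate posterior density N(0.5, 0.5) for comparison.
exact = np.exp(-0.5 * (m - 0.5) ** 2 / 0.5) / np.sqrt(2 * np.pi * 0.5)
```

The grid-based `post` and the analytic `exact` agree to within discretisation error, which is the content of the displayed identity.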

## 3 Bayesian consistency

Consistency turns out to be potentially tricky for functional models. I am not an expert on consistency, but see Cox (1993) for some warnings about what can go wrong, and Florens and Simoni (2016) and Knapik, van der Vaart, and van Zanten (2011) for some remedies. tl;dr: posterior credible intervals arising from over-tight priors may never cover the frequentist estimate. The classic warnings on Bayesian inconsistency are Diaconis and Freedman (1986) and Freedman (1999).
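The coverage failure is easy to simulate in a deliberately broken toy model: a Gaussian location model with an over-tight, mis-centred prior yields 95% credible intervals that essentially never contain the true parameter, no matter how the data come out. All numbers below are hypothetical and chosen to make the pathology stark.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy coverage experiment: the true mean is 2.0, but the prior N(0, 0.1^2)
# is both over-tight and centred in the wrong place.
true_m, prior_var, noise_var, n = 2.0, 0.1**2, 1.0, 20
trials, covered = 500, 0
for _ in range(trials):
    y = rng.normal(true_m, np.sqrt(noise_var), size=n)
    # Conjugate Gaussian posterior for the mean (prior mean zero):
    post_var = 1.0 / (1.0 / prior_var + n / noise_var)
    post_mean = post_var * (y.sum() / noise_var)
    lo = post_mean - 1.96 * np.sqrt(post_var)
    hi = post_mean + 1.96 * np.sqrt(post_var)
    covered += (lo <= true_m <= hi)

coverage = covered / trials  # far below the nominal 95%
```

In parametric models, more data eventually overwhelms a bad prior; the worry flagged by the consistency literature is that in nonparametric models the prior's influence need not wash out in this way.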

## 4 References

Alexanderian. 2021. arXiv:2005.12998 [Math].
Broderick, Wilson, and Jordan. 2018. Bernoulli.
Bui-Thanh, Ghattas, Martin, et al. 2013. SIAM Journal on Scientific Computing.
Bui-Thanh, and Nguyen. 2016. Inverse Problems & Imaging.
Cox. 1993. The Annals of Statistics.
Diaconis, and Freedman. 1986. The Annals of Statistics.
Florens, and Simoni. 2016. Econometric Theory.
Freedman. 1999. The Annals of Statistics.
Kleijn, and van der Vaart. 2006. The Annals of Statistics.
Knapik, van der Vaart, and van Zanten. 2011. The Annals of Statistics.
Knoblauch, Jewson, and Damoulas. 2019.
Lee, Kwon, and Kim. 2022.
MacEachern. 2016. Communications for Statistical Applications and Methods.
Matthews. 2017.
Nguyen, Trungtin, Forbes, Arbel, et al. 2023.
Nguyen, Tin D., Huggins, Masoero, et al. 2023. Bayesian Analysis.
Orbanz. 2009.
———. 2011.
Orbanz, and Teh. n.d. “Bayesian Nonparametric Models.”
Petra, Martin, Stadler, et al. 2014. SIAM Journal on Scientific Computing.
Rousseau. 2016. Annual Review of Statistics and Its Application.
Schervish. 2012. Theory of Statistics. Springer Series in Statistics.
Stuart. 2010. Acta Numerica.
Szabó, van der Vaart, and van Zanten. 2013. arXiv:1310.4489 [Math, Stat].
Xuan, Lu, and Zhang. 2020. ACM Computing Surveys.
Zhou, Chen, Paisley, et al. 2009. In Proceedings of the 22nd International Conference on Neural Information Processing Systems. NIPS’09.