Bayesian nonparametric statistics

Updating more dimensions than datapoints



It is hard to explain what happens to the posterior in this case.

Useful stochastic processes

A map of popular processes used in Bayesian nonparametrics from Xuan, Lu, and Zhang (2020)

Dirichlet priors, other measure priors, Gaussian process regression, reparameterisations, etc. 🏗

Posterior updates in infinite dimensions

For now, this is just a bookmark to the general measure-theoretic notation that unifies, in principle, the various Bayesian nonparametric methods. A textbook on the general theory is Schervish (2012). Chapter 1 of Matthews (2017) is a compact introduction.

Particular applications are outlined in Matthews (2017) (GP regression) and Stuart (2010) (inverse problems).

A brief introduction to the kind of measure-theoretic notation we need in infinite-dimensional Hilbert space settings is in Alexanderian (2021), which gives Bayes’ formula as \[ \frac{d \mu_{\text {post }}^{y}}{d \mu_{\text {pr }}} \propto \pi_{\text {like }}(\boldsymbol{y} \mid m), \] where the left-hand side is the Radon-Nikodym derivative of \(\mu_{\text {post }}^{y}\) with respect to \(\mu_{\text {pr }}\).

They observe:

Note that in the finite-dimensional setting the abstract form of the Bayes’ formula above can be reduced to the familiar form of Bayes’ formula in terms of PDFs. Specifically, working in finite-dimensions, with \(\mu_{\mathrm{pr}}\) and \(\mu_{\mathrm{post}}^{y}\) that are absolutely continuous with respect to the Lebesgue measure \(\lambda\), the prior and posterior measures admit Lebesgue densities \(\pi_{\mathrm{pr}}\) and \(\pi_{\text {post }}\), respectively. Then, we note \[ \pi_{\mathrm{post}}(m \mid \boldsymbol{y})=\frac{d \mu_{\mathrm{post}}^{y}}{d \lambda}(m)=\frac{d \mu_{\mathrm{post}}^{y}}{d \mu_{\mathrm{pr}}}(m) \frac{d \mu_{\mathrm{pr}}}{d \lambda}(m) \propto \pi_{\mathrm{like}}(\boldsymbol{y} \mid m) \pi_{\mathrm{pr}}(m) \]
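To make the finite-dimensional reduction concrete, here is a minimal numerical sketch. It treats the likelihood as the (unnormalised) Radon-Nikodym derivative of the posterior with respect to the prior: draw from the prior, weight each draw by the likelihood, and posterior expectations become weighted prior expectations. The toy problem (scalar \(m\), a standard normal prior, one Gaussian observation, and the particular numbers) is an assumption chosen so that the conjugate closed form is available as a check; none of it comes from the cited papers.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy problem: scalar parameter m with prior N(0, 1),
# and one observation y = m + noise, noise ~ N(0, sigma^2).
prior_mean, prior_sd = 0.0, 1.0
sigma = 0.5
y = 1.2


def likelihood(y, m, sigma):
    """Gaussian likelihood pi_like(y | m), up to a constant factor."""
    return np.exp(-0.5 * ((y - m) / sigma) ** 2)


# Draws from the prior measure mu_pr.
m_samples = rng.normal(prior_mean, prior_sd, size=200_000)

# d mu_post / d mu_pr is proportional to the likelihood, so normalised
# likelihood values act as weights that turn prior draws into a
# weighted approximation of the posterior.
weights = likelihood(y, m_samples, sigma)
weights /= weights.sum()

post_mean_mc = np.sum(weights * m_samples)
post_var_mc = np.sum(weights * (m_samples - post_mean_mc) ** 2)

# Conjugate Gaussian closed form, i.e. the familiar density version
# pi_post(m | y) proportional to pi_like(y | m) * pi_pr(m).
post_var = 1.0 / (1.0 / prior_sd**2 + 1.0 / sigma**2)
post_mean = post_var * (prior_mean / prior_sd**2 + y / sigma**2)

print(f"reweighted prior draws: mean={post_mean_mc:.3f}, var={post_var_mc:.3f}")
print(f"closed form:            mean={post_mean:.3f}, var={post_var:.3f}")
```

The reweighting step never evaluates a Lebesgue density for the posterior, which is the feature that carries over to function spaces where no Lebesgue measure exists. In practice plain reweighting of prior draws degrades quickly as the data become informative, so this is an illustration of the formula rather than a recommended algorithm.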

Bayesian consistency

Consistency turns out to be potentially tricky for functional models. I am not an expert on consistency, but see Cox (1993) for some warnings about what can go wrong, and Florens and Simoni (2016) and Knapik, van der Vaart, and van Zanten (2011) for some remedies. tl;dr: posterior credible intervals arising from over-tight priors may never cover the true parameter in the frequentist sense, however much data arrives. Further reading on this is in some classic references: Diaconis and Freedman (1986), Freedman (1999), and Kleijn and van der Vaart (2006).
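A small simulation can illustrate the coverage problem. The sketch below uses a truncated Gaussian sequence model, \(Y_i = \theta_i + n^{-1/2} Z_i\), with independent priors \(\theta_i \sim \mathcal{N}(0, i^{-1-2\alpha})\), loosely in the spirit of the Gaussian-prior setting of Knapik, van der Vaart, and van Zanten (2011). The specific truth \(\theta_{0,i} = i^{-1/2-\beta}\), the decay rates, the sample size and the truncation level are all assumptions chosen for illustration. It estimates how often an \(\ell_2\) credible ball centred at the posterior mean contains the truth over repeated data sets.

```python
import numpy as np

rng = np.random.default_rng(1)


def credible_ball_coverage(alpha, beta=0.5, n=10_000, n_coef=1_000,
                           n_rep=200, n_post=2_000, level=0.95):
    """Monte Carlo estimate of frequentist coverage of an l2 credible ball.

    All default values are illustrative assumptions, not recommendations.
    """
    i = np.arange(1, n_coef + 1, dtype=float)
    theta0 = i ** (-0.5 - beta)          # hypothetical true coefficients
    lam = i ** (-1.0 - 2.0 * alpha)      # prior variances
    shrink = n * lam / (1.0 + n * lam)   # posterior mean is shrink * Y
    post_var = lam / (1.0 + n * lam)     # posterior variances (data-free)

    # Squared radius of the credible ball: the `level` quantile of the
    # squared distance between posterior draws and the posterior mean.
    sq_dist = (rng.standard_normal((n_post, n_coef)) ** 2) @ post_var
    radius2 = np.quantile(sq_dist, level)

    # Repeated data sets generated from the fixed truth theta0.
    y = theta0 + rng.standard_normal((n_rep, n_coef)) / np.sqrt(n)
    err2 = np.sum((shrink * y - theta0) ** 2, axis=1)

    # Fraction of data sets whose credible ball contains the truth.
    return float(np.mean(err2 <= radius2))


# Prior much smoother than the truth (over-tight) versus a rougher prior.
coverage_oversmooth = credible_ball_coverage(alpha=2.0)
coverage_rough = credible_ball_coverage(alpha=0.25)

print(f"over-smooth prior coverage: {coverage_oversmooth:.2f}")
print(f"rough prior coverage:       {coverage_rough:.2f}")
```

Under these assumed settings the over-smooth prior shrinks the high-frequency coefficients so hard that the squared bias dwarfs the posterior spread, and the nominal 95% ball essentially never contains the truth, whereas the rougher prior gives conservative balls. This is a sketch of the phenomenon, not a substitute for the theorems in the references.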

References

Alexanderian, Alen. 2021. “Optimal Experimental Design for Infinite-Dimensional Bayesian Inverse Problems Governed by PDEs: A Review.” arXiv:2005.12998 [Math], January.
Bui-Thanh, Tan, Omar Ghattas, James Martin, and Georg Stadler. 2013. “A Computational Framework for Infinite-Dimensional Bayesian Inverse Problems Part I: The Linearized Case, with Application to Global Seismic Inversion.” SIAM Journal on Scientific Computing 35 (6): A2494–2523.
Bui-Thanh, Tan, and Quoc P. Nguyen. 2016. “FEM-Based Discretization-Invariant MCMC Methods for PDE-Constrained Bayesian Inverse Problems.” Inverse Problems & Imaging 10 (4): 943.
Cox, Dennis D. 1993. “An Analysis of Bayesian Inference for Nonparametric Regression.” The Annals of Statistics 21 (2): 903–23.
Diaconis, Persi, and David Freedman. 1986. “On the Consistency of Bayes Estimates.” The Annals of Statistics 14 (1): 1–26.
Florens, Jean-Pierre, and Anna Simoni. 2016. “Regularizing Priors for Linear Inverse Problems.” Econometric Theory 32 (1): 71–121.
Freedman, David. 1999. “Wald Lecture: On the Bernstein-von Mises Theorem with Infinite-Dimensional Parameters.” The Annals of Statistics 27 (4): 1119–41.
Kleijn, B. J. K., and A. W. van der Vaart. 2006. “Misspecification in Infinite-Dimensional Bayesian Statistics.” The Annals of Statistics 34 (2): 837–77.
Knapik, B. T., A. W. van der Vaart, and J. H. van Zanten. 2011. “Bayesian Inverse Problems with Gaussian Priors.” The Annals of Statistics 39 (5).
Lee, Hyun Keun, Chulan Kwon, and Yong Woon Kim. 2022. “Statistical Inference as Green’s Functions.” arXiv.
MacEachern, Steven N. 2016. “Nonparametric Bayesian Methods: A Gentle Introduction and Overview.” Communications for Statistical Applications and Methods 23 (6): 445–66.
Matthews, Alexander Graeme de Garis. 2017. “Scalable Gaussian Process Inference Using Variational Methods.” Thesis, University of Cambridge.
Petra, Noemi, James Martin, Georg Stadler, and Omar Ghattas. 2014. “A Computational Framework for Infinite-Dimensional Bayesian Inverse Problems, Part II: Stochastic Newton MCMC with Application to Ice Sheet Flow Inverse Problems.” SIAM Journal on Scientific Computing 36 (4): A1525–55.
Rousseau, Judith. 2016. “On the Frequentist Properties of Bayesian Nonparametric Methods.” Annual Review of Statistics and Its Application 3 (1): 211–31.
Schervish, Mark J. 2012. Theory of Statistics. Springer Series in Statistics. New York, NY: Springer Science & Business Media.
Stuart, A. M. 2010. “Inverse Problems: A Bayesian Perspective.” Acta Numerica 19: 451–559.
Szabó, Botond, Aad van der Vaart, and Harry van Zanten. 2013. “Frequentist Coverage of Adaptive Nonparametric Bayesian Credible Sets.” arXiv:1310.4489 [Math, Stat], October.
Xuan, Junyu, Jie Lu, and Guangquan Zhang. 2020. “A Survey on Bayesian Nonparametric Learning.” ACM Computing Surveys 52 (1): 1–36.
Zhou, Mingyuan, Haojun Chen, John Paisley, Lu Ren, Guillermo Sapiro, and Lawrence Carin. 2009. “Non-Parametric Bayesian Dictionary Learning for Sparse Image Representations.” In Proceedings of the 22nd International Conference on Neural Information Processing Systems, 22:2295–2303. NIPS’09. Red Hook, NY, USA: Curran Associates Inc.
