Bayesian nonparametric statistics

Updating more dimensions than datapoints

May 30, 2016 — April 7, 2022

Figure 1: It is hard to explain what happens to the posterior in this case

1 Useful stochastic processes

Figure 2: A map of popular processes used in Bayesian nonparametrics from Xuan, Lu, and Zhang (2020)

Dirichlet priors, other measure priors, Gaussian Process regression, reparameterisations etc. 🏗

2 Posterior updates in infinite dimensions

For now, this is just a bookmark to the general measure-theoretic notation that unifies, in principle, the various Bayesian nonparametric methods. A textbook on the general theory is Schervish (2012). Chapter 1 of Matthews (2017) is a compact introduction.

Particular applications are outlined in Matthews (2017) (GP regression) and Stuart (2010) (inverse problems).

A brief introduction to the kind of measure-theoretic notation we need in infinite-dimensional Hilbert space settings is in Alexanderian (2021), which gives Bayes’ formula as \[ \frac{d \mu_{\mathrm{post}}^{y}}{d \mu_{\mathrm{pr}}} \propto \pi_{\mathrm{like}}(\boldsymbol{y} \mid m), \] where the left-hand side is the Radon-Nikodym derivative of the posterior measure \(\mu_{\mathrm{post}}^{y}\) with respect to the prior measure \(\mu_{\mathrm{pr}}\), \(m\) is the unknown parameter, and \(\boldsymbol{y}\) is the observed data.

They observe:

Note that in the finite-dimensional setting the abstract form of the Bayes’ formula above can be reduced to the familiar form of Bayes’ formula in terms of PDFs. Specifically, working in finite-dimensions, with \(\mu_{\mathrm{pr}}\) and \(\mu_{\mathrm{post}}^{y}\) that are absolutely continuous with respect to the Lebesgue measure \(\lambda\), the prior and posterior measures admit Lebesgue densities \(\pi_{\mathrm{pr}}\) and \(\pi_{\mathrm{post}}\), respectively. Then, we note \[ \pi_{\mathrm{post}}(m \mid \boldsymbol{y})=\frac{d \mu_{\mathrm{post}}^{y}}{d \lambda}(m)=\frac{d \mu_{\mathrm{post}}^{y}}{d \mu_{\mathrm{pr}}}(m) \frac{d \mu_{\mathrm{pr}}}{d \lambda}(m) \propto \pi_{\mathrm{like}}(\boldsymbol{y} \mid m) \pi_{\mathrm{pr}}(m) \]
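To make the Radon-Nikodym form concrete, here is a minimal finite-dimensional sketch. It is my own toy example, not taken from Alexanderian (2021). Because the derivative of the posterior with respect to the prior is proportional to the likelihood, we can approximate posterior expectations by sampling from the prior and reweighting each sample by its likelihood (self-normalised importance sampling). I assume a scalar parameter with a standard Gaussian prior and i.i.d. Gaussian observation noise, so there is a closed-form conjugate posterior to check against. All names (`log_likelihood`, `sigma`, `m_true` and so on) are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy model: m ~ N(0, 1) a priori, y_j = m + sigma * noise_j.
sigma = 0.5
m_true = 0.7
y = m_true + sigma * rng.normal(size=5)


def log_likelihood(m, y, sigma):
    """Gaussian log-likelihood of data y for each candidate m, up to a constant."""
    resid = y[None, :] - m[:, None]  # shape (num_samples, num_observations)
    return -0.5 * np.sum(resid**2, axis=1) / sigma**2


# Draw from the prior measure; the unnormalised weights are the likelihood,
# i.e. the Radon-Nikodym derivative d(posterior)/d(prior) up to a constant.
m_samples = rng.normal(size=200_000)
log_w = log_likelihood(m_samples, y, sigma)
log_w -= log_w.max()  # stabilise the exponential
w = np.exp(log_w)
w /= w.sum()  # self-normalise

post_mean_is = np.sum(w * m_samples)
post_var_is = np.sum(w * (m_samples - post_mean_is) ** 2)

# Closed-form conjugate posterior for comparison.
post_prec = 1.0 + len(y) / sigma**2
post_mean_exact = (y.sum() / sigma**2) / post_prec
post_var_exact = 1.0 / post_prec

print("importance sampling:", post_mean_is, post_var_is)
print("closed form:        ", post_mean_exact, post_var_exact)
```

Note that nothing in the reweighting step needs a Lebesgue density for the prior; we only need to be able to sample from it and to evaluate the likelihood. That is the feature that carries over, at least formally, to the infinite-dimensional setting.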

3 Bayesian consistency

Consistency turns out to be potentially tricky for functional models. I am not an expert on consistency, but see Cox (1993) for some warnings about what can go wrong, and Florens and Simoni (2016) and Knapik, van der Vaart, and van Zanten (2011) for some remedies. tl;dr: posterior credible intervals arising from over-tight priors may never cover the frequentist estimate. Further reading on this is in some classic references (Diaconis and Freedman 1986; Freedman 1999; Kleijn and van der Vaart 2006).
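A toy illustration of the coverage problem, loosely in the spirit of the Gaussian sequence setting but not a reproduction of any result in the papers above. I assume we observe coefficients \(y_i = \theta_i + \varepsilon_i/\sqrt{n}\) with independent priors \(\theta_i \sim N(0, i^{-1-2\alpha})\), so each coordinate has a conjugate Gaussian posterior. The truth \(\theta_i = i^{-1}\), the noise level, and the two values of \(\alpha\) are arbitrary choices for the sketch. We count how many coordinate-wise 95% credible intervals contain the true coefficient.

```python
import numpy as np

rng = np.random.default_rng(1)

n = 1000                      # noise standard deviation is 1 / sqrt(n)
idx = np.arange(1, 201)       # coefficient indices i = 1, ..., 200
theta_true = idx ** -1.0      # assumed true coefficients

y = theta_true + rng.normal(size=idx.size) / np.sqrt(n)


def coverage(alpha):
    """Fraction of coordinates whose 95% credible interval contains the truth."""
    prior_var = idx ** (-1.0 - 2.0 * alpha)
    shrink = n * prior_var / (1.0 + n * prior_var)
    post_mean = shrink * y
    post_var = prior_var / (1.0 + n * prior_var)
    half_width = 1.96 * np.sqrt(post_var)
    return np.mean(np.abs(theta_true - post_mean) <= half_width)


# Prior standard deviation decays at the same rate as the truth.
wide_prior_coverage = coverage(alpha=0.5)
# Prior standard deviation decays much faster than the truth (over-tight).
tight_prior_coverage = coverage(alpha=2.0)

print("coverage, wide prior: ", wide_prior_coverage)
print("coverage, tight prior:", tight_prior_coverage)
```

With the over-tight prior, the posterior for the higher coefficients is essentially the prior, which is concentrated far more tightly around zero than the truth is, so those intervals miss. Increasing `n` moves the index at which this starts but does not remove it.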

4 Incoming

5 References

Alexanderian. 2021. “Optimal Experimental Design for Infinite-Dimensional Bayesian Inverse Problems Governed by PDEs: A Review.” arXiv:2005.12998 [Math].
Broderick, Wilson, and Jordan. 2018. “Posteriors, Conjugacy, and Exponential Families for Completely Random Measures.” Bernoulli.
Bui-Thanh, Ghattas, Martin, et al. 2013. “A Computational Framework for Infinite-Dimensional Bayesian Inverse Problems Part I: The Linearized Case, with Application to Global Seismic Inversion.” SIAM Journal on Scientific Computing.
Bui-Thanh, and Nguyen. 2016. “FEM-Based Discretization-Invariant MCMC Methods for PDE-Constrained Bayesian Inverse Problems.” Inverse Problems & Imaging.
Cox. 1993. “An Analysis of Bayesian Inference for Nonparametric Regression.” The Annals of Statistics.
Diaconis, and Freedman. 1986. “On the Consistency of Bayes Estimates.” The Annals of Statistics.
Florens, and Simoni. 2016. “Regularizing Priors for Linear Inverse Problems.” Econometric Theory.
Freedman. 1999. “Wald Lecture: On the Bernstein-von Mises Theorem with Infinite-Dimensional Parameters.” The Annals of Statistics.
Kleijn, and van der Vaart. 2006. “Misspecification in Infinite-Dimensional Bayesian Statistics.” The Annals of Statistics.
Knapik, van der Vaart, and van Zanten. 2011. “Bayesian Inverse Problems with Gaussian Priors.” The Annals of Statistics.
Knoblauch, Jewson, and Damoulas. 2019. “Generalized Variational Inference: Three Arguments for Deriving New Posteriors.”
Lee, Kwon, and Kim. 2022. “Statistical Inference as Green’s Functions.”
MacEachern. 2016. “Nonparametric Bayesian Methods: A Gentle Introduction and Overview.” Communications for Statistical Applications and Methods.
Matthews. 2017. “Scalable Gaussian Process Inference Using Variational Methods.”
Nguyen, Trungtin, Forbes, Arbel, et al. 2023. “Bayesian Nonparametric Mixture of Experts for Inverse Problems.”
Nguyen, Tin D., Huggins, Masoero, et al. 2023. “Independent Finite Approximations for Bayesian Nonparametric Inference.” Bayesian Analysis.
Orbanz. 2009. “Functional Conjugacy in Parametric Bayesian Models.”
Orbanz. 2011. “Conjugate Projective Limits.”
Orbanz, and Teh. n.d. “Bayesian Nonparametric Models.”
Petra, Martin, Stadler, et al. 2014. “A Computational Framework for Infinite-Dimensional Bayesian Inverse Problems, Part II: Stochastic Newton MCMC with Application to Ice Sheet Flow Inverse Problems.” SIAM Journal on Scientific Computing.
Rousseau. 2016. “On the Frequentist Properties of Bayesian Nonparametric Methods.” Annual Review of Statistics and Its Application.
Schervish. 2012. Theory of Statistics. Springer Series in Statistics.
Stuart. 2010. “Inverse Problems: A Bayesian Perspective.” Acta Numerica.
Szabó, van der Vaart, and van Zanten. 2013. “Frequentist Coverage of Adaptive Nonparametric Bayesian Credible Sets.” arXiv:1310.4489 [Math, Stat].
Xuan, Lu, and Zhang. 2020. “A Survey on Bayesian Nonparametric Learning.” ACM Computing Surveys.
Zhou, Chen, Paisley, et al. 2009. “Non-Parametric Bayesian Dictionary Learning for Sparse Image Representations.” In Proceedings of the 22nd International Conference on Neural Information Processing Systems. NIPS’09.