Bayesian nonparametric statistics

Updating more dimensions than datapoints

2016-05-30 — 2025-08-14

Wherein infinite-dimensional parameter spaces are invoked and posterior updates are expressed via Radon–Nikodym derivatives, Gaussian processes and Dirichlet measure priors are employed, and consistency pitfalls for functional models are noted.

Bayes
functional analysis
Gaussian
generative
how do science
Monte Carlo
nonparametric
statistics
stochastic processes
Figure 1: It is hard to explain what happens to the posterior in this case

“Nonparametric” Bayes refers to Bayesian inference in which the parameter is infinite-dimensional, e.g. an unknown function or measure (I don’t like the term, since such models have parameters in abundance, just infinitely many of them). There’s a connection to predictive Bayes that I’d like to understand better.

1 Useful stochastic processes

Figure 2: A map of popular processes used in Bayesian nonparametrics from Xuan, Lu, and Zhang (2020)

Dirichlet processes and other random-measure priors, Gaussian process regression, reparameterizations, etc. 🏗
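As a concrete instance of a random-measure prior, here is a minimal sketch of the stick-breaking construction of a Dirichlet process draw, truncated to finitely many atoms. The truncation level, the concentration \(\alpha=2\), and the standard-normal base measure are all illustrative choices, and `stick_breaking_dp` is my own helper name:

```python
import numpy as np

rng = np.random.default_rng(0)

def stick_breaking_dp(alpha, base_sampler, n_atoms, rng):
    """Truncated draw from a Dirichlet process DP(alpha, G0).

    Stick-breaking: v_k ~ Beta(1, alpha), and the k-th atom gets weight
    w_k = v_k * prod_{j<k} (1 - v_j), i.e. a v_k-sized bite of the
    remaining stick. Atom locations are i.i.d. draws from G0.
    """
    v = rng.beta(1.0, alpha, size=n_atoms)
    remaining = np.concatenate([[1.0], np.cumprod(1.0 - v)[:-1]])
    weights = v * remaining
    atoms = base_sampler(n_atoms)
    return atoms, weights

# Base measure G0 = standard normal; alpha controls how clumpy the weights are.
atoms, weights = stick_breaking_dp(
    alpha=2.0, base_sampler=lambda n: rng.normal(size=n), n_atoms=1000, rng=rng
)
print(weights.sum())  # close to 1 when the truncation is generous
```

The resulting discrete measure \(\sum_k w_k \delta_{\theta_k}\) is (up to truncation error) a single sample from the prior over probability measures.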

2 Posterior updates in infinite dimensions

For now, this is just a bookmark to the general measure-theoretic notation that, in principle, unifies the various Bayesian nonparametric methods. A textbook on the general theory is Schervish (2012). Chapter 1 of Matthews (2017) gives a compact introduction.

Particular applications are outlined in Matthews (2017) (see Gaussian process regression) and Stuart (2010) (see inverse problems).

A brief introduction to the measure-theoretic notation we need in infinite-dimensional Hilbert space settings is in Alexanderian (2021), which gives Bayes’ formula as \[ \frac{d \mu_{\text {post }}^{y}}{d \mu_{\text {pr }}} \propto \pi_{\text {like }}(\boldsymbol{y} \mid m), \] where the left-hand side is the Radon–Nikodym derivative of \(\mu_{\text {post }}^{y}\) with respect to \(\mu_{\text {pr }}\).
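One way to make that abstract formula operational: since the posterior is the prior reweighted by the likelihood, posterior expectations can be approximated by self-normalized importance sampling with prior draws as the proposal. A toy conjugate-Gaussian sketch (the model and all numbers are illustrative, not from the source):

```python
import numpy as np

rng = np.random.default_rng(1)

# dmu_post/dmu_pr ∝ pi_like(y | m): reweight prior draws by the likelihood.
# Toy model: m ~ N(0, 1) prior, y | m ~ N(m, 1), observed y = 1.
y, n_draws = 1.0, 200_000
m = rng.normal(size=n_draws)            # draws from the prior mu_pr
log_like = -0.5 * (y - m) ** 2          # log pi_like(y | m), up to a constant
w = np.exp(log_like - log_like.max())
w /= w.sum()                            # self-normalized importance weights
post_mean = np.sum(w * m)
print(post_mean)  # conjugacy gives the exact posterior mean y/2 = 0.5
```

The unknown normalizing constant cancels in the self-normalization, which is exactly why the Radon–Nikodym formula only needs to hold up to proportionality.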

They observe:

Note that in the finite-dimensional setting the abstract form of the Bayes’ formula above can be reduced to the familiar form of Bayes’ formula in terms of PDFs. Specifically, working in finite dimensions, with \(\mu_{\mathrm{pr}}\) and \(\mu_{\mathrm{post}}^{y}\) that are absolutely continuous with respect to the Lebesgue measure \(\lambda\), the prior and posterior measures admit Lebesgue densities \(\pi_{\mathrm{pr}}\) and \(\pi_{\text {post }}\), respectively. Then, we note \[ \pi_{\mathrm{post}}(m \mid \boldsymbol{y})=\frac{d \mu_{\mathrm{post}}^{y}}{d \lambda}(m)=\frac{d \mu_{\mathrm{post}}^{y}}{d \mu_{\mathrm{pr}}}(m) \frac{d \mu_{\mathrm{pr}}}{d \lambda}(m) \propto \pi_{\mathrm{like}}(\boldsymbol{y} \mid m) \pi_{\mathrm{pr}}(m) \]

3 Bayesian consistency

Consistency turns out to be tricky for functional models. I’m not an expert on consistency, but see Cox (1993) for warnings about what can go wrong, and Florens and Simoni (2016) and Knapik, van der Vaart, and van Zanten (2011) for some remedies. tl;dr: posterior credible sets arising from over-tight priors can have frequentist coverage tending to zero, i.e. they may essentially never contain the true parameter, no matter how much data arrives. Further reading on this is in some classic references: (Diaconis and Freedman 1986; Freedman 1999; Kleijn and van der Vaart 2006).
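A finite-dimensional cartoon of the over-tight-prior failure, to fix intuition (the infinite-dimensional pathologies in the references above are subtler, and the numbers here are illustrative): with prior \(N(0, \tau^2)\) on a Gaussian mean and \(\tau\) far too small, the 95% posterior credible interval concentrates near 0 and essentially never covers the true mean.

```python
import numpy as np

rng = np.random.default_rng(2)

# True mean theta = 1, but the prior N(0, tau2) with tau2 = 1e-4 insists
# the mean is very close to 0. Count how often the 95% credible interval
# from the conjugate Gaussian update actually covers theta.
theta, n, tau2, reps = 1.0, 50, 1e-4, 500
covered = 0
for _ in range(reps):
    y = rng.normal(theta, 1.0, size=n)
    post_var = 1.0 / (1.0 / tau2 + n)        # conjugate update, known unit noise
    post_mean = post_var * y.sum()
    half = 1.96 * np.sqrt(post_var)
    covered += (post_mean - half <= theta <= post_mean + half)
print(covered / reps)  # far below the nominal 0.95
```

The credible interval is perfectly honest about the posterior, which is concentrated near 0; it is the prior that was wrong, and in infinite dimensions it is much harder to tell when a prior is “too tight” in this sense.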

4 Incoming

5 References

Alexanderian. 2021. “Optimal Experimental Design for Infinite-Dimensional Bayesian Inverse Problems Governed by PDEs: A Review.” arXiv:2005.12998 [Math].
Broderick, Wilson, and Jordan. 2018. “Posteriors, Conjugacy, and Exponential Families for Completely Random Measures.” Bernoulli.
Bui-Thanh, Ghattas, Martin, et al. 2013. “A Computational Framework for Infinite-Dimensional Bayesian Inverse Problems Part I: The Linearized Case, with Application to Global Seismic Inversion.” SIAM Journal on Scientific Computing.
Bui-Thanh, and Nguyen. 2016. “FEM-Based Discretization-Invariant MCMC Methods for PDE-Constrained Bayesian Inverse Problems.” Inverse Problems & Imaging.
Cox. 1993. “An Analysis of Bayesian Inference for Nonparametric Regression.” The Annals of Statistics.
Crane. 2016. “The Ubiquitous Ewens Sampling Formula.” Statistical Science.
Diaconis, and Freedman. 1986. “On the Consistency of Bayes Estimates.” The Annals of Statistics.
Ewens. 1972. “The Sampling Theory of Selectively Neutral Alleles.” Theoretical Population Biology.
Florens, and Simoni. 2016. “Regularizing Priors for Linear Inverse Problems.” Econometric Theory.
Freedman. 1999. “Wald Lecture: On the Bernstein-von Mises Theorem with Infinite-Dimensional Parameters.” The Annals of Statistics.
Kleijn, and van der Vaart. 2006. “Misspecification in Infinite-Dimensional Bayesian Statistics.” The Annals of Statistics.
Knapik, van der Vaart, and van Zanten. 2011. “Bayesian Inverse Problems with Gaussian Priors.” The Annals of Statistics.
Knoblauch, Jewson, and Damoulas. 2019. “Generalized Variational Inference: Three Arguments for Deriving New Posteriors.”
Lee, Hyun Keun, Kwon, and Kim. 2022. “Statistical Inference as Green’s Functions.”
Lee, Hyungi, Yun, Nam, et al. 2023. “Martingale Posterior Neural Processes.”
MacEachern. 2016. “Nonparametric Bayesian Methods: A Gentle Introduction and Overview.” Communications for Statistical Applications and Methods.
Matthews. 2017. “Scalable Gaussian Process Inference Using Variational Methods.”
Nguyen, Trungtin, Forbes, Arbel, et al. 2023. “Bayesian Nonparametric Mixture of Experts for Inverse Problems.”
Nguyen, Tin D., Huggins, Masoero, et al. 2023. “Independent Finite Approximations for Bayesian Nonparametric Inference.” Bayesian Analysis.
Orbanz. 2009. “Functional Conjugacy in Parametric Bayesian Models.”
———. 2011. “Conjugate Projective Limits.”
Orbanz, and Teh. n.d. “Bayesian Nonparametric Models.”
Petra, Martin, Stadler, et al. 2014. “A Computational Framework for Infinite-Dimensional Bayesian Inverse Problems, Part II: Stochastic Newton MCMC with Application to Ice Sheet Flow Inverse Problems.” SIAM Journal on Scientific Computing.
Rousseau. 2016. “On the Frequentist Properties of Bayesian Nonparametric Methods.” Annual Review of Statistics and Its Application.
Schervish. 2012. Theory of Statistics. Springer Series in Statistics.
Stuart. 2010. “Inverse Problems: A Bayesian Perspective.” Acta Numerica.
Szabó, van der Vaart, and van Zanten. 2013. “Frequentist Coverage of Adaptive Nonparametric Bayesian Credible Sets.” arXiv:1310.4489 [Math, Stat].
Tavaré. 2021. “The Magical Ewens Sampling Formula.” Bulletin of the London Mathematical Society.
Xuan, Lu, and Zhang. 2020. “A Survey on Bayesian Nonparametric Learning.” ACM Computing Surveys.
Zhou, Chen, Paisley, et al. 2009. “Non-Parametric Bayesian Dictionary Learning for Sparse Image Representations.” In Proceedings of the 22nd International Conference on Neural Information Processing Systems. NIPS’09.