Bayesian nonparametric statistics

Updating more dimensions than datapoints

2016-05-30 — 2022-04-07

Figure 1: It is hard to explain what happens to the posterior in this case

1 Useful stochastic processes

Figure 2: A map of popular processes used in Bayesian nonparametrics from Xuan, Lu, and Zhang (2020)

Candidates include Dirichlet process priors, other priors over measures, Gaussian process regression, reparameterisations and so on. 🏗 There is a close connection to Bayes predictives.
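To make one of these priors concrete, here is a minimal sketch of drawing a random probability measure from a Dirichlet process by the stick-breaking construction, truncated at a finite number of atoms. The function names, the truncation level, the concentration value and the standard normal base measure are all illustrative assumptions, and the truncation makes this an approximation to the process rather than an exact draw.

```python
import numpy as np


def stick_breaking_weights(concentration, n_atoms, rng):
    """Truncated stick-breaking weights for a Dirichlet process (illustrative helper)."""
    # Each beta draw is the fraction broken off whatever stick remains.
    betas = rng.beta(1.0, concentration, size=n_atoms)
    # Length of stick remaining before each break: 1, (1-b1), (1-b1)(1-b2), ...
    remaining = np.concatenate([[1.0], np.cumprod(1.0 - betas[:-1])])
    return betas * remaining


def sample_dp(concentration, n_atoms, rng):
    """One truncated draw from DP(concentration, N(0, 1)); the base measure is an assumption."""
    weights = stick_breaking_weights(concentration, n_atoms, rng)
    atoms = rng.standard_normal(n_atoms)  # atom locations drawn i.i.d. from the base measure
    return atoms, weights


rng = np.random.default_rng(0)
atoms, weights = sample_dp(concentration=2.0, n_atoms=200, rng=rng)
print("mass captured by truncation:", weights.sum())
print("largest five weights:", np.sort(weights)[-5:])
```

The draw is a discrete measure: a weighted collection of point masses at `atoms`. The mass not captured by the truncation shrinks geometrically in the number of atoms, which is why a few hundred atoms suffice for a small concentration parameter.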

2 Posterior updates in infinite dimensions

For now, this is just a bookmark to the general measure-theoretic notation that unifies, in principle, the various Bayesian nonparametric methods. A textbook on general theory is Schervish (2012). Chapter 1 of Matthews (2017) is a compact introduction.

Particular applications are outlined in Matthews (2017) (GP regression) and Stuart (2010) (inverse problems).

A brief introduction to the measure-theoretic notation we need in the infinite-dimensional Hilbert space setting is in Alexanderian (2021), which gives Bayes’ formula as $\frac{d μ_{post}^{y}}{d μ_{pr}} \propto π_{like} (y ∣ m),$ where the left-hand side is the Radon-Nikodym derivative of the posterior measure $μ_{post}^{y}$ with respect to the prior measure $μ_{pr}$ , $m$ is the unknown parameter and $y$ is the observed data.

Alexanderian observes:

Note that in the finite-dimensional setting the abstract form of the Bayes’ formula above can be reduced to the familiar form of Bayes’ formula in terms of PDFs. Specifically, working in finite dimensions, with $μ_{pr}$ and $μ_{post}^{y}$ that are absolutely continuous with respect to the Lebesgue measure $λ$ , the prior and posterior measures admit Lebesgue densities $π_{pr}$ and $π_{post}$ , respectively. Then, we note $π_{post} (m ∣ y) = \frac{d μ_{post}^{y}}{d λ} (m) = \frac{d μ_{post}^{y}}{d μ_{pr}} (m) \frac{d μ_{pr}}{d λ} (m) \propto π_{like} (y ∣ m) π_{pr} (m)$
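One way to read the Radon-Nikodym form of Bayes’ formula is as a reweighting rule: draws from the prior, weighted in proportion to the likelihood, approximate the posterior, and no Lebesgue density on the parameter space is needed. The sketch below illustrates this with self-normalised importance weights. The prior (discretised Brownian motion on $[0, 1]$), the observation model (a noisy reading of the endpoint $m(1)$) and all numerical values are toy assumptions chosen so that the answer can be checked against a conjugate Gaussian calculation. They are not taken from Alexanderian (2021).

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy prior mu_pr: Brownian motion on [0, 1], discretised on a grid.
n_paths, n_steps = 100_000, 50
dt = 1.0 / n_steps
increments = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
paths = np.cumsum(increments, axis=1)  # paths[:, k] is the path value at time (k + 1) * dt

# Assumed observation model: y = m(1) + noise, with noise ~ N(0, sigma^2).
y, sigma = 1.0, 0.5
log_like = -0.5 * ((y - paths[:, -1]) / sigma) ** 2

# Self-normalised weights approximate d mu_post / d mu_pr evaluated at each prior draw.
weights = np.exp(log_like - log_like.max())
weights /= weights.sum()

# Posterior means of the path at t = 1 and t = 0.5, estimated by reweighting.
post_mean_end = float(weights @ paths[:, -1])
post_mean_mid = float(weights @ paths[:, n_steps // 2 - 1])

# Conjugate Gaussian answers for comparison: Var m(1) = 1 and Cov(m(0.5), m(1)) = 0.5.
exact_end = y / (1.0 + sigma**2)
exact_mid = 0.5 * exact_end

print("E[m(1)   | y]: reweighted", post_mean_end, "exact", exact_end)
print("E[m(0.5) | y]: reweighted", post_mean_mid, "exact", exact_mid)
```

The weights depend on each path only through the likelihood, so the same recipe applies however finely the path is discretised. In practice plain prior reweighting degrades quickly as the data become informative, which is one motivation for the MCMC methods in the inverse-problems references.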

3 Bayesian consistency

Consistency turns out to be potentially tricky for functional models. I am not an expert on consistency, but see Cox (1993) for some warnings about what can go wrong, and Florens and Simoni (2016) and Knapik, van der Vaart, and van Zanten (2011) for some remedies. tl;dr: posterior credible intervals arising from over-tight priors may never cover the frequentist estimate. Further reading is in some classic references (Diaconis and Freedman 1986; Freedman 1999; Kleijn and van der Vaart 2006).
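The coverage failure can be seen in a toy simulation. The sketch below uses a truncated Gaussian sequence model with a Gaussian prior whose variances decay polynomially, and estimates how often a 95% posterior credible ball contains the true coefficient sequence. Everything here is an illustrative assumption: the function name `credible_ball_coverage`, the truth $θ_i = 1/i$, the noise level, the truncation and the two decay rates. It is in the spirit of the cited papers but does not reproduce the constructions in any of them.

```python
import numpy as np


def credible_ball_coverage(alpha, n=10_000, n_coef=2_000, n_rep=200, level=0.95, seed=0):
    """Frequentist coverage of an L2 posterior credible ball (toy sequence model)."""
    rng = np.random.default_rng(seed)
    i = np.arange(1, n_coef + 1, dtype=float)
    theta_true = 1.0 / i                    # assumed "rough" truth
    prior_var = i ** (-1.0 - 2.0 * alpha)   # larger alpha means a tighter, smoother prior

    # Conjugate update for y_i = theta_i + eps_i / sqrt(n), with eps_i ~ N(0, 1).
    shrink = n * prior_var / (1.0 + n * prior_var)
    post_var = prior_var / (1.0 + n * prior_var)

    # Squared radius of the credible ball around the posterior mean, by Monte Carlo.
    # The posterior variance does not depend on the data, so one radius serves all replicates.
    sq_dist_draws = rng.standard_normal((2_000, n_coef)) ** 2 @ post_var
    radius_sq = np.quantile(sq_dist_draws, level)

    # Repeat the experiment and count how often the truth lands inside the ball.
    y = theta_true + rng.standard_normal((n_rep, n_coef)) / np.sqrt(n)
    post_mean = shrink * y
    err_sq = ((post_mean - theta_true) ** 2).sum(axis=1)
    return float((err_sq <= radius_sq).mean())


coverage_matched = credible_ball_coverage(alpha=0.5)  # prior scale i^-1 matches the truth
coverage_tight = credible_ball_coverage(alpha=3.0)    # prior far too smooth for the truth
print("coverage, matched prior:", coverage_matched)
print("coverage, over-tight prior:", coverage_tight)
```

With the over-tight prior the posterior shrinks most coefficients to zero while reporting a very small posterior spread, so the squared bias dwarfs the credible radius and the ball essentially never contains the truth. Collecting more data does not repair this on its own, because the prior keeps shrinking the coefficients where the truth still has mass.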

4 Incoming

5 References

Alexanderian. 2021. “Optimal Experimental Design for Infinite-Dimensional Bayesian Inverse Problems Governed by PDEs: A Review.” arXiv:2005.12998 [Math].

Broderick, Wilson, and Jordan. 2018. “Posteriors, Conjugacy, and Exponential Families for Completely Random Measures.” Bernoulli.

Bui-Thanh, Ghattas, Martin, et al. 2013. “A Computational Framework for Infinite-Dimensional Bayesian Inverse Problems Part I: The Linearized Case, with Application to Global Seismic Inversion.” SIAM Journal on Scientific Computing.

Bui-Thanh, and Nguyen. 2016. “FEM-Based Discretization-Invariant MCMC Methods for PDE-Constrained Bayesian Inverse Problems.” Inverse Problems & Imaging.

Cox. 1993. “An Analysis of Bayesian Inference for Nonparametric Regression.” The Annals of Statistics.

Crane. 2016. “The Ubiquitous Ewens Sampling Formula.” Statistical Science.

Diaconis, and Freedman. 1986. “On the Consistency of Bayes Estimates.” The Annals of Statistics.

Ewens. 1972. “The Sampling Theory of Selectively Neutral Alleles.” Theoretical Population Biology.

Florens, and Simoni. 2016. “Regularizing Priors for Linear Inverse Problems.” Econometric Theory.

Freedman. 1999. “Wald Lecture: On the Bernstein-von Mises Theorem with Infinite-Dimensional Parameters.” The Annals of Statistics.

Kleijn, and van der Vaart. 2006. “Misspecification in Infinite-Dimensional Bayesian Statistics.” The Annals of Statistics.

Knapik, van der Vaart, and van Zanten. 2011. “Bayesian Inverse Problems with Gaussian Priors.” The Annals of Statistics.

Knoblauch, Jewson, and Damoulas. 2019. “Generalized Variational Inference: Three Arguments for Deriving New Posteriors.”

Lee, Hyun Keun, Kwon, and Kim. 2022. “Statistical Inference as Green’s Functions.”

Lee, Hyungi, Yun, Nam, et al. 2023. “Martingale Posterior Neural Processes.”

MacEachern. 2016. “Nonparametric Bayesian Methods: A Gentle Introduction and Overview.” Communications for Statistical Applications and Methods.

Matthews. 2017. “Scalable Gaussian Process Inference Using Variational Methods.”

Nguyen, Trungtin, Forbes, Arbel, et al. 2023. “Bayesian Nonparametric Mixture of Experts for Inverse Problems.”

Nguyen, Tin D., Huggins, Masoero, et al. 2023. “Independent Finite Approximations for Bayesian Nonparametric Inference.” Bayesian Analysis.

Orbanz. 2009. “Functional Conjugacy in Parametric Bayesian Models.”

———. 2011. “Conjugate Projective Limits.”

Orbanz, and Teh. n.d. “Bayesian Nonparametric Models.”

Petra, Martin, Stadler, et al. 2014. “A Computational Framework for Infinite-Dimensional Bayesian Inverse Problems, Part II: Stochastic Newton MCMC with Application to Ice Sheet Flow Inverse Problems.” SIAM Journal on Scientific Computing.

Rousseau. 2016. “On the Frequentist Properties of Bayesian Nonparametric Methods.” Annual Review of Statistics and Its Application.

Schervish. 2012. Theory of Statistics. Springer Series in Statistics.

Stuart. 2010. “Inverse Problems: A Bayesian Perspective.” Acta Numerica.

Szabó, van der Vaart, and van Zanten. 2013. “Frequentist Coverage of Adaptive Nonparametric Bayesian Credible Sets.” arXiv:1310.4489 [Math, Stat].

Tavaré. 2021. “The Magical Ewens Sampling Formula.” Bulletin of the London Mathematical Society.

Xuan, Lu, and Zhang. 2020. “A Survey on Bayesian Nonparametric Learning.” ACM Computing Surveys.

Zhou, Chen, Paisley, et al. 2009. “Non-Parametric Bayesian Dictionary Learning for Sparse Image Representations.” In Proceedings of the 22nd International Conference on Neural Information Processing Systems. NIPS’09.