## Useful stochastic processes

Dirichlet priors, other measure priors, Gaussian process regression, reparameterisations, etc.
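As a concrete instance of the first item on that list, here is a minimal sketch (my own toy example, not from any of the cited sources) of the truncated stick-breaking construction of Dirichlet process weights; the truncation level and concentration parameter are arbitrary illustrative choices.

```python
import numpy as np

def stick_breaking(alpha, n_atoms, rng):
    """Truncated stick-breaking construction of Dirichlet process weights.

    Draw Beta(1, alpha) stick proportions and multiply each by the
    stick mass remaining after the previous breaks.
    """
    betas = rng.beta(1.0, alpha, size=n_atoms)
    remaining = np.concatenate([[1.0], np.cumprod(1.0 - betas[:-1])])
    return betas * remaining

rng = np.random.default_rng(0)
weights = stick_breaking(alpha=2.0, n_atoms=50, rng=rng)
atoms = rng.normal(size=50)  # atom locations drawn from a N(0, 1) base measure
print(weights.sum())         # close to 1 for a moderate truncation level
```

The truncation is the usual practical compromise: the leftover stick mass decays geometrically in the number of atoms, so a modest truncation already captures almost all of the probability.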

## Posterior updates in infinite dimensions

For now, this is just a bookmark to the general measure-theoretic notation that unifies, in principle, the various Bayesian nonparametric methods. A textbook on the general theory is Schervish (2012). Chapter 1 of Matthews (2017) is a compact introduction.

Particular applications are outlined in Matthews (2017) (GP regression) and Stuart (2010) (inverse problems).

A brief introduction to the kind of measure-theoretic notation we need in infinite-dimensional Hilbert space settings is in Alexanderian (2021), which gives Bayes' formula as \[ \frac{d \mu_{\text {post }}^{y}}{d \mu_{\text {pr }}} \propto \pi_{\text {like }}(\boldsymbol{y} \mid m), \] where the left-hand side is the Radon-Nikodym derivative of \(\mu_{\text {post }}^{y}\) with respect to \(\mu_{\text {pr }}\).

They observe:

> Note that in the finite-dimensional setting the abstract form of the Bayes' formula above can be reduced to the familiar form of Bayes' formula in terms of PDFs. Specifically, working in finite dimensions, with \(\mu_{\mathrm{pr}}\) and \(\mu_{\mathrm{post}}^{y}\) that are absolutely continuous with respect to the Lebesgue measure \(\lambda\), the prior and posterior measures admit Lebesgue densities \(\pi_{\mathrm{pr}}\) and \(\pi_{\text {post }}\), respectively. Then, we note \[ \pi_{\mathrm{post}}(m \mid \boldsymbol{y})=\frac{d \mu_{\mathrm{post}}^{y}}{d \lambda}(m)=\frac{d \mu_{\mathrm{post}}^{y}}{d \mu_{\mathrm{pr}}}(m) \frac{d \mu_{\mathrm{pr}}}{d \lambda}(m) \propto \pi_{\mathrm{like}}(\boldsymbol{y} \mid m) \pi_{\mathrm{pr}}(m) \]
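The finite-dimensional reduction can be checked numerically. Here is a sketch (my own toy example, with arbitrary parameter values) in a conjugate Gaussian model: normalising the product \(\pi_{\mathrm{like}}(\boldsymbol{y} \mid m)\,\pi_{\mathrm{pr}}(m)\) on a grid recovers the closed-form conjugate posterior pointwise.

```python
import numpy as np
from scipy.stats import norm

# Conjugate Gaussian model: m ~ N(0, tau^2) prior, y | m ~ N(m, sigma^2).
tau, sigma, y = 2.0, 1.0, 1.5

# Closed-form posterior from the standard conjugate update.
post_var = 1.0 / (1.0 / tau**2 + 1.0 / sigma**2)
post_mean = post_var * y / sigma**2

# Unnormalised density pi_like(y | m) * pi_pr(m), normalised on a fine grid.
m = np.linspace(-6.0, 6.0, 2001)
dm = m[1] - m[0]
unnorm = norm.pdf(y, loc=m, scale=sigma) * norm.pdf(m, scale=tau)
grid_post = unnorm / (unnorm.sum() * dm)

# The two agree pointwise, as the measure-theoretic identity predicts.
exact = norm.pdf(m, loc=post_mean, scale=np.sqrt(post_var))
print(np.max(np.abs(grid_post - exact)))  # small; only quadrature error remains
```

In infinite dimensions no Lebesgue measure exists, which is exactly why the grid-and-normalise trick above has no direct analogue there and the Radon-Nikodym form is the one that generalises.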

## Bayesian consistency

Consistency turns out to be potentially tricky for functional models.
I am not an expert on consistency, but see Cox (1993) for some warnings about what can go wrong, and Florens and Simoni (2016); Knapik, van der Vaart, and van Zanten (2011) for some remedies.
**tl;dr** posterior credible intervals arising from over-tight priors may never cover the frequentist estimate.
Further reading on this is in some classic references:
Diaconis and Freedman (1986); Freedman (1999); Kleijn and van der Vaart (2006).
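The tl;dr can be illustrated in one conjugate computation. This is a minimal sketch of my own (a toy Gaussian model, not an example from the cited papers): an over-tight prior pins the posterior near the prior mean, so the 95% credible interval fails to cover the true parameter even with plenty of data.

```python
import numpy as np
from scipy.stats import norm

# Truth is m = 3, but the prior N(0, 0.1^2) is far too confident about 0.
rng = np.random.default_rng(1)
true_m, sigma, n = 3.0, 1.0, 100
y = rng.normal(true_m, sigma, size=n)

tau = 0.1  # over-tight prior standard deviation
post_var = 1.0 / (1.0 / tau**2 + n / sigma**2)
post_mean = post_var * y.sum() / sigma**2

lo, hi = norm.interval(0.95, loc=post_mean, scale=np.sqrt(post_var))
print((lo, hi), lo <= true_m <= hi)  # the credible interval misses m = 3
```

With these numbers the prior contributes as much precision as all 100 observations combined, so the posterior mean sits roughly halfway between 0 and the sample mean; in genuinely infinite-dimensional models this failure mode can persist no matter how much data arrives, which is the point of the warnings above.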

## References

*arXiv:2005.12998 [Math]*, January.

*SIAM Journal on Scientific Computing* 35 (6): A2494–2523.

*Inverse Problems & Imaging* 10 (4): 943.

*The Annals of Statistics* 21 (2): 903–23.

*The Annals of Statistics* 14 (1): 1–26.

*Econometric Theory* 32 (1): 71–121.

*The Annals of Statistics* 27 (4): 1119–41.

*The Annals of Statistics* 34 (2): 837–77.

*The Annals of Statistics* 39 (5).

*Communications for Statistical Applications and Methods* 23 (6): 445–66.

*SIAM Journal on Scientific Computing* 36 (4): A1525–55.

*Annual Review of Statistics and Its Application* 3 (1): 211–31.

*Theory of Statistics*. Springer Series in Statistics. New York, NY: Springer Science & Business Media.

*Acta Numerica* 19: 451–559.

*arXiv:1310.4489 [Math, Stat]*, October.

*ACM Computing Surveys* 52 (1): 1–36.

*Proceedings of the 22nd International Conference on Neural Information Processing Systems*, 22:2295–2303. NIPS'09. Red Hook, NY, USA: Curran Associates Inc.
