Gibbs posteriors

Bayes-like inference without probability models

September 26, 2024

Bayes
estimator distribution
functional analysis
Markov processes
Monte Carlo
neural nets
optimization
probabilistic algorithms
probability
SDEs
stochastic processes

Nothing to do with a Gibbs sampler, although the update below does take the form of a Gibbs (Boltzmann) distribution over the parameter.

Syring (2018):

Bayesian inference is, by far, the most well-known statistical method for updating beliefs about a population feature of interest in light of new data. Current beliefs, characterized by a probability distribution called a prior, are updated by combining with data, which is modeled as a random draw from another probability distribution. The Bayesian framework, therefore, depends heavily on the choices of model distributions for prior and data, and it is the latter that is of particular concern in this dissertation. Often, as will be shown in various examples, it is particularly difficult to make a good choice of data model: a bad choice may lead to misspecification and inconsistency of the posterior distribution, or may introduce nuisance parameters, increasing computational burden and complicating the choice of prior. Some particular statistical problems that may give Bayesians pause are classification and quantile regression. In these two problems a mathematical function called a loss function serves as the natural connection between the data and the population feature. Statistical inference based on loss functions can avoid having to specify a probability model for the data and parameter, which may be incorrect. Bayes’ Theorem cannot reconcile a posterior update using anything other than a probability model for data, so alternative methods are needed, besides Bayes, in order to take advantage of loss functions in these types of problems.

Gibbs posteriors, like Bayes posteriors, incorporate prior information and new data via an updating formula. However, the Gibbs posterior does not require modeling the data with a probability model as in Bayes; rather, data and parameter may be linked by a more general function, like the loss functions mentioned above. The Gibbs approach offers many potential benefits including robustness when the data distribution is not known and a natural avoidance of nuisance parameters, but Gibbs posteriors are not common throughout statistics literature. In an effort to raise awareness of Gibbs posteriors, this dissertation both develops new theoretical foundations and presents numerous examples highlighting the usefulness of Gibbs posteriors in statistical applications.

Two new asymptotic results for Gibbs posteriors are contributed. The main conclusion of the first result is that Gibbs posteriors have similar asymptotic behavior to a class of statistical estimators called M-estimators in a wide range of problems. The main advantage of the Gibbs posterior, then, is its ability to incorporate prior information.
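In rough notation (my own shorthand, not a verbatim transcription from the cited papers), the update the abstract describes replaces the log-likelihood with a loss $\ell(\theta; x)$. Given data $x_1, \dots, x_n$, a prior $\pi$, and a learning rate $\omega > 0$, the Gibbs posterior is

$$
\pi_n(\theta) \propto \exp\{-\omega\, n R_n(\theta)\}\, \pi(\theta),
\qquad
R_n(\theta) = \frac{1}{n}\sum_{i=1}^{n} \ell(\theta; x_i).
$$

Taking $\ell$ to be the negative log-likelihood and $\omega = 1$ recovers the ordinary Bayes posterior.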

There is a compact and clear explanation in Martin and Syring (2022).
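For concreteness, here is a minimal sketch of Gibbs posterior inference on a population median via the check loss, sampled with plain random-walk Metropolis. The learning rate, prior, and proposal scale are arbitrary illustrative choices, not the calibrated settings of Martin and Syring (2022).

```python
# Minimal sketch: Gibbs posterior for a population quantile via the check loss.
# Learning rate omega, prior, and proposal scale are illustrative, uncalibrated choices.
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_t(df=3, size=200)   # heavy-tailed data; no likelihood is specified
tau = 0.5                            # target quantile (the median)
omega = 1.0                          # learning rate; needs calibration in practice

def check_loss(theta, x, tau):
    """Pinball / check loss whose risk minimiser is the tau-quantile."""
    u = x - theta
    return np.mean(np.where(u >= 0, tau * u, (tau - 1) * u))

def log_gibbs_posterior(theta):
    log_prior = -0.5 * (theta / 10.0) ** 2   # vague N(0, 10^2) prior
    return -omega * len(x) * check_loss(theta, x, tau) + log_prior

# Random-walk Metropolis over theta.
theta, samples = 0.0, []
for _ in range(20_000):
    prop = theta + 0.2 * rng.standard_normal()
    if np.log(rng.uniform()) < log_gibbs_posterior(prop) - log_gibbs_posterior(theta):
        theta = prop
    samples.append(theta)

# Posterior mean should sit near the empirical median.
print(np.mean(samples[5_000:]), np.quantile(x, tau))
```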

Question: Is this the same as Bissiri, Holmes, and Walker (2016)? Replacing the likelihood with a loss function certainly seems to be a property the two frameworks share.

cf inference without KL.

1 References

Baek, Aquino, and Mukherjee. 2023. “Generalized Bayes Approach to Inverse Problems with Model Misspecification.” Inverse Problems.
Bhattacharya, and Martin. 2022. “Gibbs Posterior Inference on Multivariate Quantiles.” Journal of Statistical Planning and Inference.
Bissiri, Holmes, and Walker. 2016. “A General Framework for Updating Belief Distributions.” Journal of the Royal Statistical Society: Series B (Statistical Methodology).
Bochkina. 2023. “Bernstein–von Mises Theorem and Misspecified Models: A Review.” In Foundations of Modern Statistics. Springer Proceedings in Mathematics & Statistics.
Catoni. 2007. “PAC-Bayesian Supervised Classification: The Thermodynamics of Statistical Learning.” IMS Lecture Notes Monograph Series.
Dellaporta, Knoblauch, Damoulas, et al. 2022. “Robust Bayesian Inference for Simulator-Based Models via the MMD Posterior Bootstrap.” arXiv:2202.04744 [Cs, Stat].
Grendár, and Judge. 2012. “Not All Empirical Divergence Minimizing Statistical Methods Are Created Equal?” AIP Conference Proceedings.
Grünwald. 2023. “The e-Posterior.” Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences.
Knoblauch, Jewson, and Damoulas. 2019. “Generalized Variational Inference: Three Arguments for Deriving New Posteriors.”
———. 2022. “An Optimization-Centric View on Bayes’ Rule: Reviewing and Generalizing Variational Inference.” Journal of Machine Learning Research.
Martin, and Syring. 2022. “Direct Gibbs Posterior Inference on Risk Minimizers: Construction, Concentration, and Calibration.”
Masegosa. 2020. “Learning Under Model Misspecification: Applications to Variational and Ensemble Methods.” In Proceedings of the 34th International Conference on Neural Information Processing Systems. NIPS’20.
Matsubara, Knoblauch, Briol, et al. 2022. “Robust Generalised Bayesian Inference for Intractable Likelihoods.” Journal of the Royal Statistical Society Series B: Statistical Methodology.
Schmon, Cannon, and Knoblauch. 2021. “Generalized Posteriors in Approximate Bayesian Computation.” arXiv:2011.08644 [Stat].
Syring. 2018. “Gibbs Posterior Distributions: New Theory and Applications.”
Walker. 2013. “Bayesian Inference with Misspecified Models.” Journal of Statistical Planning and Inference.
Wang, Yixin, and Blei. 2019. “Variational Bayes Under Model Misspecification.” In Advances in Neural Information Processing Systems.
Wang, Zhe, and Martin. 2021. “Gibbs Posterior Inference on a Lévy Density Under Discrete Sampling.”
Watson, and Holmes. 2016. “Approximate Models and Robust Decisions.” Statistical Science.