Frequentist consistency of Bayesian methods

TFW two flawed methods for understanding the world at least agree with each other



You want to use some tasty tool, such as a hierarchical model, without anyone getting cross at you for apostasy because you did it in the wrong discipline? Why not use whatever estimator works, and then show that it works on both frequentist and Bayesian grounds?

Shalizi’s overview

There is a basic result, due to Doob, which essentially says that the Bayesian learner is consistent, except on a set of data of prior probability zero. That is, the Bayesian is subjectively certain they will converge on the truth. This is not as reassuring as one might wish, and showing Bayesian consistency under the true distribution is harder. In fact, it usually involves assumptions under which non-Bayes procedures will also converge. […]

Concentration of the posterior around the truth is only a preliminary. One would also want to know that, say, the posterior mean converges, or even better that the predictive distribution converges. For many finite-dimensional problems, what’s called the “Bernstein-von Mises theorem” basically says that the posterior mean and the maximum likelihood estimate converge, so if one works the other will too. This breaks down for infinite-dimensional problems.
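To make the finite-dimensional claim in the quote concrete, here is a minimal sketch. The Bernoulli model, the conjugate Beta(2, 2) prior, and the function name `mle_and_posterior_mean` are my own illustrative choices, not anything from Shalizi. It only shows the posterior mean and the maximum likelihood estimate drawing together as the sample grows. It does not demonstrate the full theorem, which concerns the shape of the whole posterior.

```python
import random


def mle_and_posterior_mean(n, theta_true=0.3, a=2.0, b=2.0, seed=0):
    """Simulate n Bernoulli(theta_true) draws; return (MLE, posterior mean).

    The prior is Beta(a, b), so the posterior is Beta(a + s, b + n - s),
    where s is the number of successes. All defaults are illustrative.
    """
    rng = random.Random(seed)
    s = sum(rng.random() < theta_true for _ in range(n))
    mle = s / n
    posterior_mean = (a + s) / (a + b + n)
    return mle, posterior_mean


for n in (10, 100, 1_000, 10_000, 100_000):
    mle, post = mle_and_posterior_mean(n)
    # The gap |post - mle| is at most max(a, b) / (a + b + n), so it shrinks like 1/n.
    print(f"n={n:>6}  mle={mle:.4f}  posterior mean={post:.4f}  gap={abs(mle - post):.2e}")
```

In this toy model the prior's influence washes out at rate 1/n, while the sampling noise in either estimator shrinks only at about 1/√n, so whichever estimator is consistent drags the other along with it. That is the easy, well-specified, one-parameter case. Nothing here speaks to the infinite-dimensional problems where the quote says the agreement breaks down.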

(Bernardo and de Valencia 2006), in the context of “Objective Bayes”, argues that frequentist methods remain necessary for evaluating and calibrating Bayesian procedures.

Bayesian Statistics is typically taught, if at all, after a prior exposure to frequentist statistics. It is argued that it may be appropriate to reverse this procedure. Indeed, the emergence of powerful objective Bayesian methods (where the result, as in frequentist statistics, only depends on the assumed model and the observed data), provides a new unifying perspective on most established methods, and may be used in situations (e.g. hierarchical structures) where frequentist methods cannot. On the other hand, frequentist procedures provide mechanisms to evaluate and calibrate any procedure. Hence, it may be the right time to consider an integrated approach to mathematical statistics, where objective Bayesian methods are first used to provide the building elements, and frequentist methods are then used to provide the necessary evaluation.

Nonparametric

Bayesian nonparametrics sound like they might avoid the problem of failing to include the true model, but they can also fail in weird ways.

Variational

I am not sure how this works. But it is important (Wang and Blei 2017).

References

Aaronson, Scott. 2005. “The Complexity of Agreement.” In Proceedings of the Thirty-Seventh Annual ACM Symposium on Theory of Computing, 634. ACM Press.
Advani, Madhu, and Surya Ganguli. 2016. “An Equivalence Between High Dimensional Bayes Optimal Inference and M-Estimation.” In Advances In Neural Information Processing Systems.
Aumann, Robert J. 1976. “Agreeing to Disagree.” The Annals of Statistics 4 (6): 1236–39.
Bayarri, M. J., and J. O. Berger. 2004. “The Interplay of Bayesian and Frequentist Analysis.” Statistical Science 19 (1): 58–80.
Bernardo, Jose M, and Universitat de Valencia. 2006. “A Bayesian Mathematical Statistics Primer,” 6.
Cox, Dennis D. 1993. “An Analysis of Bayesian Inference for Nonparametric Regression.” The Annals of Statistics 21 (2): 903–23.
Diaconis, Persi, and David Freedman. 1986. “On the Consistency of Bayes Estimates.” The Annals of Statistics 14 (1): 1–26.
Doob, J. L. 1949. “Application of the Theory of Martingales.” In Le Calcul Des Probabilités Et Ses Applications, 23–27. Colloques Internationaux Du Centre National de La Recherche Scientifique, No. 13. Centre National de la Recherche Scientifique, Paris.
Efron, Bradley. 2012. “Bayesian Inference and the Parametric Bootstrap.” The Annals of Applied Statistics 6 (4): 1971–97.
Efron, Bradley. 2015. “Frequentist Accuracy of Bayesian Estimates.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 77 (3): 617–46.
Florens, Jean-Pierre, and Anna Simoni. 2016. “Regularizing Priors for Linear Inverse Problems.” Econometric Theory 32 (1): 71–121.
Fong, Edwin, and Chris Holmes. 2019. “On the Marginal Likelihood and Cross-Validation.” arXiv:1905.08737 [Stat], May.
Freedman, David. 1999. “Wald Lecture: On the Bernstein-von Mises Theorem with Infinite-Dimensional Parameters.” The Annals of Statistics 27 (4): 1119–41.
Gelman, Andrew. 2008. “Rejoinder.” Bayesian Analysis 3 (3).
Gelman, Andrew, Aleks Jakulin, Maria Grazia Pittau, and Yu-Sung Su. 2008. “A Weakly Informative Default Prior Distribution for Logistic and Other Regression Models.” The Annals of Applied Statistics 2 (4): 1360–83.
Kleijn, B. J. K. 2021. “Frequentist Validity of Bayesian Limits.” The Annals of Statistics 49 (1): 182–202.
Kleijn, B. J. K., and A. W. van der Vaart. 2006. “Misspecification in Infinite-Dimensional Bayesian Statistics.” The Annals of Statistics 34 (2): 837–77.
Knapik, B. T., A. W. van der Vaart, and J. H. van Zanten. 2011. “Bayesian Inverse Problems with Gaussian Priors.” The Annals of Statistics 39 (5).
Lee, Hyun Keun, Chulan Kwon, and Yong Woon Kim. 2022. “Statistical Inference as Green’s Functions.” arXiv.
Lele, S. R., B. Dennis, and F. Lutscher. 2007. “Data Cloning: Easy Maximum Likelihood Estimation for Complex Ecological Models Using Bayesian Markov Chain Monte Carlo Methods.” Ecology Letters 10 (7): 551.
Lele, Subhash R., Khurram Nadeem, and Byron Schmuland. 2010. “Estimability and Likelihood Inference for Generalized Linear Mixed Models Using Data Cloning.” Journal of the American Statistical Association 105 (492): 1617–25.
Nickl, Richard. 2014. “Discussion of: ‘Frequentist Coverage of Adaptive Nonparametric Bayesian Credible Sets’.” arXiv:1410.7600 [Math, Stat], October.
Norton, Robert M. 1984. “The Double Exponential Distribution: Using Calculus to Find a Maximum Likelihood Estimator.” The American Statistician 38 (2): 135–36.
Rousseau, Judith. 2016. “On the Frequentist Properties of Bayesian Nonparametric Methods.” Annual Review of Statistics and Its Application 3 (1): 211–31.
Shalizi, Cosma Rohilla. 2009. “Dynamics of Bayesian Updating with Dependent Data and Misspecified Models.” Electronic Journal of Statistics 3: 1039–74.
Sims, C. 2010. “Understanding Non-Bayesians.” Unpublished Chapter, Department of Economics, Princeton University.
Szabó, Botond, Aad van der Vaart, and Harry van Zanten. 2013. “Frequentist Coverage of Adaptive Nonparametric Bayesian Credible Sets.” arXiv:1310.4489 [Math, Stat], October.
Tibshirani, Robert. 1996. “Regression Shrinkage and Selection via the Lasso.” Journal of the Royal Statistical Society. Series B (Methodological) 58 (1): 267–88.
Valpine, Perry de. 2011. “Frequentist Analysis of Hierarchical Models for Population Dynamics and Demographic Data.” Journal of Ornithology 152 (2): 393–408.
Wang, Yixin, and David M. Blei. 2017. “Frequentist Consistency of Variational Bayes.” arXiv:1705.03439 [Cs, Math, Stat], May.
Wasserman, Larry. 2011. “Frasian Inference.” Statistical Science 26 (3): 322–25.
