Bayesian model selection by model evidence maximisation

A.k.a. type II maximum likelihood, marginal maximum likelihood, empirical Bayes; closely related to the Bayesian Occam’s razor



See Bayesian model selection for alternative approaches to model selection within the Bayesian framework. If we are not committed to being Bayesian, we might instead consider minimum description length, which is arguably more general.

TBC
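Until this section is fleshed out, here is a minimal sketch of what evidence maximisation looks like in the classic tractable case: Bayesian linear regression with a Gaussian prior and Gaussian noise (the setting of Bishop’s evidence approximation). Marginalising out the weights gives a closed-form log marginal likelihood, and type II maximum likelihood chooses the prior precision `alpha` by maximising it. All names (`log_evidence`, `best_alpha`, the grid of `alphas`) are illustrative, not from any particular library.

```python
import numpy as np

def log_evidence(X, y, alpha, beta):
    """Log marginal likelihood (model evidence) for Bayesian linear
    regression y = X w + noise, with prior w ~ N(0, alpha^{-1} I) and
    i.i.d. Gaussian noise of precision beta. Marginalising w gives
    y ~ N(0, beta^{-1} I + alpha^{-1} X X^T), evaluated here directly."""
    n = len(y)
    C = np.eye(n) / beta + X @ X.T / alpha
    _, logdet = np.linalg.slogdet(C)
    return -0.5 * (n * np.log(2 * np.pi) + logdet + y @ np.linalg.solve(C, y))

# Synthetic data from a known linear model (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + 0.1 * rng.normal(size=50)

# Type II maximum likelihood: pick the prior precision alpha that
# maximises the evidence. A grid search keeps the sketch simple;
# in practice one would use gradient-based optimisation of the
# closed form or an EM-style update.
alphas = np.logspace(-3, 3, 61)
best_alpha = max(alphas, key=lambda a: log_evidence(X, y, a, beta=100.0))
```

The same objective is what GP packages maximise when they “fit hyperparameters”, and the determinant term is where the Occam penalty lives: more flexible models inflate `|C|` unless the extra flexibility actually helps explain `y`.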

References

Bishop, Christopher M. 2006. Pattern Recognition and Machine Learning. Information Science and Statistics. New York: Springer.
Filippone, Maurizio, and Raphael Engler. 2015. “Enabling Scalable Stochastic Gradient-Based Inference for Gaussian Processes by Employing the Unbiased LInear System SolvEr (ULISSE).” In Proceedings of the 32nd International Conference on Machine Learning, 1015–24. PMLR.
Fong, Edwin, and Chris Holmes. 2019. “On the Marginal Likelihood and Cross-Validation.” arXiv:1905.08737 [Stat], May.
Grünwald, Peter D. 2007. The Minimum Description Length Principle. Cambridge, Mass.: MIT Press.
Hansen, Mark H., and Bin Yu. 2001. “Model Selection and the Principle of Minimum Description Length.” Journal of the American Statistical Association 96 (454): 746–74.
Immer, Alexander, Matthias Bauer, Vincent Fortuin, Gunnar Rätsch, and Khan Mohammad Emtiyaz. 2021. “Scalable Marginal Likelihood Estimation for Model Selection in Deep Learning.” In Proceedings of the 38th International Conference on Machine Learning, 4563–73. PMLR.
Jamil, Tahira, and Cajo J. F. ter Braak. 2012. “Selection Properties of Type II Maximum Likelihood (Empirical Bayes) in Linear Models with Individual Variance Components for Predictors.” Pattern Recognition Letters 33 (9): 1205–12.
Kadane, Joseph B., and Nicole A. Lazar. 2004. “Methods and Criteria for Model Selection.” Journal of the American Statistical Association 99 (465): 279–90.
Lotfi, Sanae, Pavel Izmailov, Gregory Benton, Micah Goldblum, and Andrew Gordon Wilson. 2022. “Bayesian Model Selection, the Marginal Likelihood, and Generalization.” In.
MacKay, David J. C. 1992. “A Practical Bayesian Framework for Backpropagation Networks.” Neural Computation 4 (3): 448–72.
MacKay, David J. C. 1992. “Bayesian Interpolation.” Neural Computation 4 (3): 415–47.
MacKay, David J. C. 1999. “Comparison of Approximate Methods for Handling Hyperparameters.” Neural Computation 11 (5): 1035–68.
Murphy, Kevin P. 2012. Machine Learning: A Probabilistic Perspective. 1st edition. Adaptive Computation and Machine Learning Series. Cambridge, MA: MIT Press.
———. 2022. Probabilistic Machine Learning: An Introduction. MIT Press.
Piironen, Juho, and Aki Vehtari. 2017. “Comparison of Bayesian Predictive Methods for Model Selection.” Statistics and Computing 27 (3): 711–35.
Quiñonero-Candela, Joaquin, and Carl Edward Rasmussen. 2005. “A Unifying View of Sparse Approximate Gaussian Process Regression.” Journal of Machine Learning Research 6 (Dec): 1939–59.
Rasmussen, Carl Edward, and Christopher K. I. Williams. 2006. Gaussian Processes for Machine Learning. Adaptive Computation and Machine Learning. Cambridge, Mass.: MIT Press.
Tran, Ba-Hien, Simone Rossi, Dimitrios Milios, Pietro Michiardi, Edwin V. Bonilla, and Maurizio Filippone. 2021. “Model Selection for Bayesian Autoencoders.” In Advances in Neural Information Processing Systems, 34:19730–42. Curran Associates, Inc.
Varin, Cristiano. 2008. “On Composite Marginal Likelihoods.” Advances in Statistical Analysis 92 (1): 1–28.
Vehtari, Aki, and Janne Ojanen. 2012. “A Survey of Bayesian Predictive Methods for Model Assessment, Selection and Comparison.” Statistics Surveys 6: 142–228.
Wiel, Mark A. van de, Dennis E. Te Beest, and Magnus M. Münch. 2019. “Learning from a Lot: Empirical Bayes for High-Dimensional Model-Based Prediction.” Scandinavian Journal of Statistics, Theory and Applications 46 (1): 2–25.
