# Bayesian model selection by model evidence maximisation

Type II maximum likelihood, marginal maximum likelihood, Bayes Occam’s razor

August 20, 2017 — December 22, 2022

See Bayes model selection for alternative approaches to Bayesian model selection. If we are not committed to being Bayesian, we might instead consider minimum description length, which is possibly more general.

TBC

## 1 Incoming

## 2 References

Bishop. 2006. *Pattern Recognition and Machine Learning*. Information Science and Statistics.

Cawley, and Talbot. 2005. “A Simple Trick for Constructing Bayesian Formulations of Sparse Kernel Learning Methods.” In *Proceedings. 2005 IEEE International Joint Conference on Neural Networks, 2005.*

Filippone, and Engler. 2015. “Enabling Scalable Stochastic Gradient-Based Inference for Gaussian Processes by Employing the Unbiased LInear System SolvEr (ULISSE).” In *Proceedings of the 32nd International Conference on Machine Learning*.

Fong, and Holmes. 2019. “On the Marginal Likelihood and Cross-Validation.” *arXiv:1905.08737 [Stat]*.

Grünwald. 2007. *The Minimum Description Length Principle*.

Hansen, and Yu. 2001. “Model Selection and the Principle of Minimum Description Length.” *Journal of the American Statistical Association*.

Immer, Bauer, Fortuin, et al. 2021. “Scalable Marginal Likelihood Estimation for Model Selection in Deep Learning.” In *Proceedings of the 38th International Conference on Machine Learning*.

Jamil, and ter Braak. 2012. “Selection Properties of Type II Maximum Likelihood (Empirical Bayes) in Linear Models with Individual Variance Components for Predictors.” *Pattern Recognition Letters*.

Kadane, and Lazar. 2004. “Methods and Criteria for Model Selection.” *Journal of the American Statistical Association*.

Lotfi, Izmailov, Benton, et al. 2022. “Bayesian Model Selection, the Marginal Likelihood, and Generalization.”

MacKay. 1992. “A Practical Bayesian Framework for Backpropagation Networks.” *Neural Computation*.

———. 1992. “Bayesian Interpolation.” *Neural Computation*.

———. 1999. “Comparison of Approximate Methods for Handling Hyperparameters.” *Neural Computation*.

Murphy. 2012. *Machine Learning: A Probabilistic Perspective*. Adaptive Computation and Machine Learning Series.

———. 2022. *Probabilistic Machine Learning: An Introduction*. Adaptive Computation and Machine Learning Series.

Piironen, and Vehtari. 2017. “Comparison of Bayesian Predictive Methods for Model Selection.” *Statistics and Computing*.

Quiñonero-Candela, and Rasmussen. 2005. “A Unifying View of Sparse Approximate Gaussian Process Regression.” *Journal of Machine Learning Research*.

Rasmussen, and Williams. 2006. *Gaussian Processes for Machine Learning*. Adaptive Computation and Machine Learning.

Tran, Rossi, Milios, et al. 2021. “Model Selection for Bayesian Autoencoders.” In *Advances in Neural Information Processing Systems*.

van de Wiel, Te Beest, and Münch. 2019. “Learning from a Lot: Empirical Bayes for High-dimensional Model-based Prediction.” *Scandinavian Journal of Statistics, Theory and Applications*.

Varin. 2008. “On Composite Marginal Likelihoods.” *Advances in Statistical Analysis*.

Vehtari, and Ojanen. 2012. “A Survey of Bayesian Predictive Methods for Model Assessment, Selection and Comparison.” *Statistics Surveys*.