Bishop, Christopher M. 2006. Pattern Recognition and Machine Learning. Information Science and Statistics. New York: Springer.
Cawley, G.C., and N.L.C. Talbot. 2005. “A Simple Trick for Constructing Bayesian Formulations of Sparse Kernel Learning Methods.” In Proceedings of the 2005 IEEE International Joint Conference on Neural Networks, 3:1425–30. Montreal, QC, Canada: IEEE.
Filippone, Maurizio, and Raphael Engler. 2015. “Enabling Scalable Stochastic Gradient-Based Inference for Gaussian Processes by Employing the Unbiased LInear System SolvEr (ULISSE).” In Proceedings of the 32nd International Conference on Machine Learning, 1015–24. PMLR.
Fong, Edwin, and Chris Holmes. 2019. “On the Marginal Likelihood and Cross-Validation.” arXiv:1905.08737 [Stat], May.
Grünwald, Peter D. 2007. The Minimum Description Length Principle. Cambridge, Mass.: MIT Press.
Hansen, Mark H., and Bin Yu. 2001. “Model Selection and the Principle of Minimum Description Length.” Journal of the American Statistical Association 96 (454): 746–74.
Immer, Alexander, Matthias Bauer, Vincent Fortuin, Gunnar Rätsch, and Mohammad Emtiyaz Khan. 2021. “Scalable Marginal Likelihood Estimation for Model Selection in Deep Learning.” In Proceedings of the 38th International Conference on Machine Learning, 4563–73. PMLR.
Jamil, Tahira, and Cajo J. F. ter Braak. 2012. “Selection Properties of Type II Maximum Likelihood (Empirical Bayes) in Linear Models with Individual Variance Components for Predictors.” Pattern Recognition Letters 33 (9): 1205–12.
Kadane, Joseph B., and Nicole A. Lazar. 2004. “Methods and Criteria for Model Selection.” Journal of the American Statistical Association 99 (465): 279–90.
Lotfi, Sanae, Pavel Izmailov, Gregory Benton, Micah Goldblum, and Andrew Gordon Wilson. 2022. “Bayesian Model Selection, the Marginal Likelihood, and Generalization.” In.
MacKay, David J. C. 1992. “A Practical Bayesian Framework for Backpropagation Networks.” Neural Computation 4 (3): 448–72.
MacKay, David J. C. 1992. “Bayesian Interpolation.” Neural Computation 4 (3): 415–47.
MacKay, David J. C. 1999. “Comparison of Approximate Methods for Handling Hyperparameters.” Neural Computation 11 (5): 1035–68.
Murphy, Kevin P. 2012. Machine Learning: A Probabilistic Perspective. 1st ed. Adaptive Computation and Machine Learning Series. Cambridge, MA: MIT Press.
———. 2022. Probabilistic Machine Learning: An Introduction. Adaptive Computation and Machine Learning Series. Cambridge, Massachusetts: The MIT Press.
Piironen, Juho, and Aki Vehtari. 2017. “Comparison of Bayesian Predictive Methods for Model Selection.” Statistics and Computing 27 (3): 711–35.
Quiñonero-Candela, Joaquin, and Carl Edward Rasmussen. 2005. “A Unifying View of Sparse Approximate Gaussian Process Regression.” Journal of Machine Learning Research 6 (Dec): 1939–59.
Rasmussen, Carl Edward, and Christopher K. I. Williams. 2006. Gaussian Processes for Machine Learning. Adaptive Computation and Machine Learning. Cambridge, Mass.: MIT Press.
Tran, Ba-Hien, Simone Rossi, Dimitrios Milios, Pietro Michiardi, Edwin V. Bonilla, and Maurizio Filippone. 2021. “Model Selection for Bayesian Autoencoders.” In Advances in Neural Information Processing Systems, 34:19730–42. Curran Associates, Inc.
Varin, Cristiano. 2008. “On Composite Marginal Likelihoods.” Advances in Statistical Analysis 92 (1): 1–28.
Vehtari, Aki, and Janne Ojanen. 2012. “A Survey of Bayesian Predictive Methods for Model Assessment, Selection and Comparison.” Statistics Surveys 6: 142–228.
Wiel, Mark A. van de, Dennis E. Te Beest, and Magnus M. Münch. 2019. “Learning from a Lot: Empirical Bayes for High‐dimensional Model‐based Prediction.” Scandinavian Journal of Statistics, Theory and Applications 46 (1): 2–25.