# Bayesian model selection

August 20, 2017 — July 22, 2019

Tags: Bayes, information, model selection, statistics

Frequentist model selection is not the only kind, but I know less about the Bayesian kind. What is model selection in a Bayesian context? Surely no model ever ends up with exactly zero posterior probability? In my introductory Bayesian classes I learned that one simply keeps all the models and weights them by posterior probability when making predictions (Bayesian model averaging). But sometimes we wish to discard some models. When does this work, and when does it not? Typically it seems to be done by comparing the models' marginal likelihoods, a.k.a. evidence.
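
To make the two options concrete, here is a minimal sketch of my own, not taken from any reference. It assumes two hypothetical models for a run of coin flips: a fair coin, and a coin whose bias has a uniform Beta(1, 1) prior. The function names and the data (7 heads in 10 flips) are made up for illustration. The code computes each model's marginal likelihood, converts those into posterior model probabilities, and then predicts the next flip by averaging over both models.

```python
from math import exp, lgamma, log


def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def log_evidence_fair(k, n):
    """Log marginal likelihood of one sequence with k heads in n flips, theta = 0.5."""
    return n * log(0.5)


def log_evidence_beta(k, n, a=1.0, b=1.0):
    """Same, but with theta ~ Beta(a, b) integrated out analytically."""
    return log_beta(a + k, b + n - k) - log_beta(a, b)


def posterior_model_probs(log_evidences, prior_probs):
    """Normalise evidence times prior, working in logs for numerical stability."""
    log_post = [le + log(p) for le, p in zip(log_evidences, prior_probs)]
    top = max(log_post)
    weights = [exp(lp - top) for lp in log_post]
    total = sum(weights)
    return [w / total for w in weights]


k, n = 7, 10  # illustrative data: 7 heads in 10 flips
log_ev = [log_evidence_fair(k, n), log_evidence_beta(k, n)]
probs = posterior_model_probs(log_ev, prior_probs=[0.5, 0.5])

# Each model's predictive probability that the next flip is heads.
pred_fair = 0.5
pred_beta = (1.0 + k) / (2.0 + n)  # posterior mean of theta under Beta(1, 1)

# Bayesian model averaging: weight each prediction by its model's posterior probability.
p_next_head = probs[0] * pred_fair + probs[1] * pred_beta

print("posterior model probabilities:", probs)
print("model-averaged P(next flip is heads):", p_next_head)
```

Neither model gets zero posterior probability here. Selection, in this picture, means deciding that one model's weight is small enough to ignore.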

🏗

## 1 Sparsity

Interesting special case: Bayesian sparsity.

## 2 Cross-validation and Bayes

There is a relation between cross-validation and Bayesian evidence, a.k.a. marginal likelihood.
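
One elementary way to see a connection (a sketch of my own, and surely not the whole story) is the chain rule: $\log p(y_{1:n} \mid M) = \sum_{i=1}^{n} \log p(y_i \mid y_{1:i-1}, M)$. The log evidence is thus a sum of one-step-ahead log predictive scores, each evaluated on a data point the model has not yet conditioned on, which is what a cross-validation score measures. The snippet below checks this identity numerically for an assumed Beta-Bernoulli model. The function names and the data are hypothetical.

```python
from math import lgamma, log


def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def log_evidence(ys, a=1.0, b=1.0):
    """Closed-form log marginal likelihood of a 0/1 sequence under theta ~ Beta(a, b)."""
    k, n = sum(ys), len(ys)
    return log_beta(a + k, b + n - k) - log_beta(a, b)


def prequential_log_score(ys, a=1.0, b=1.0):
    """Sum of log posterior predictive probabilities, each conditioned on earlier data only."""
    total = 0.0
    heads = 0
    for seen, y in enumerate(ys):
        p_head = (a + heads) / (a + b + seen)  # predictive P(y = 1 | first `seen` points)
        total += log(p_head if y == 1 else 1.0 - p_head)
        heads += y
    return total


ys = [1, 0, 1, 1, 0, 1, 1, 1, 0, 1]  # illustrative data
print(log_evidence(ys), prequential_log_score(ys))
```

The two numbers agree up to floating-point error, and for this exchangeable model the order in which the points are held out does not matter.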

## 4 Incoming

John Mount on applied variable selection:

> We have also always felt a bit exposed in this, as feature selection seems unjustified in standard explanations of regression. One feels that if a coefficient were meant to be zero, the fitting procedure would have set it to zero. Under this misapprehension, stepping in and removing some variables feels unjustified.
>
> Regardless of intuition or feelings, it is a fair question: is variable selection a natural justifiable part of modeling? Or is it something that is already done (therefore redundant). Or is it something that is not done for important reasons (such as avoiding damaging bias)?
>
> In this note we will show that feature selection is in fact an obvious justified step when using a sufficiently sophisticated model of regression. This note is long, as it defines so many tiny elementary steps. However this note ends with a big point: variable selection is justified. It naturally appears in the right variation of Bayesian Regression. You should select variables, using your preferred methodology. And you shouldn’t feel bad about selecting variables.
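
To illustrate how variable selection can fall out of a Bayesian regression, here is a minimal sketch. It is not John Mount's construction. I assume a linear model with known noise variance `sigma2`, an independent Gaussian prior with variance `tau2` on each included coefficient, and a uniform prior over subsets of variables. Under those assumptions each subset's marginal likelihood is a zero-mean Gaussian density in closed form, so for a handful of variables we can enumerate every subset and read off posterior inclusion probabilities. The data are synthetic and all names are hypothetical.

```python
import itertools

import numpy as np


def log_evidence(X, y, subset, sigma2=1.0, tau2=1.0):
    """Log marginal likelihood of y using only the columns of X listed in `subset`.

    With beta_S ~ N(0, tau2 * I) and noise ~ N(0, sigma2 * I), integrating out
    beta_S gives y ~ N(0, sigma2 * I + tau2 * X_S X_S^T).
    """
    n = len(y)
    Xs = X[:, np.array(subset, dtype=int)]  # shape (n, 0) for the empty subset
    cov = sigma2 * np.eye(n) + tau2 * (Xs @ Xs.T)
    _, logdet = np.linalg.slogdet(cov)
    quad = y @ np.linalg.solve(cov, y)
    return -0.5 * (n * np.log(2.0 * np.pi) + logdet + quad)


def subset_posterior(X, y, **kwargs):
    """Posterior probability of every subset of columns, under a uniform subset prior."""
    p = X.shape[1]
    subsets = [s for r in range(p + 1) for s in itertools.combinations(range(p), r)]
    log_ev = np.array([log_evidence(X, y, s, **kwargs) for s in subsets])
    weights = np.exp(log_ev - log_ev.max())  # subtract the max for numerical stability
    return subsets, weights / weights.sum()


def inclusion_probabilities(subsets, probs, p):
    """Posterior probability that each variable appears in the model."""
    return np.array(
        [sum(pr for s, pr in zip(subsets, probs) if j in s) for j in range(p)]
    )


# Synthetic data: only variables 0 and 2 actually influence y.
rng = np.random.default_rng(0)
n, p = 50, 4
X = rng.normal(size=(n, p))
beta_true = np.array([2.0, 0.0, -1.5, 0.0])
y = X @ beta_true + rng.normal(size=n)

subsets, probs = subset_posterior(X, y)
inclusion = inclusion_probabilities(subsets, probs, p)
print("most probable subset:", subsets[int(np.argmax(probs))])
print("posterior inclusion probabilities:", inclusion)
```

Exhaustive enumeration costs 2^p evidence evaluations, so this only works for a few variables. Larger problems need stochastic search over subsets or continuous shrinkage priors instead.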
