Frequentist model selection is not the only type, but I know less about Bayesian model selection. What is model selection in a Bayesian context? Surely you don't ever get models with exactly zero posterior probability? In my intro Bayesian classes I learned that one simply keeps all the models, weighted by their posterior probability, when making predictions. But sometimes we wish to get rid of some models. When does this work, and when not? Typically this seems to be done by comparing the models' marginal evidence.
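
To make the "keep all models, weighted" idea concrete, here is a minimal sketch for a toy coin-flip problem. The two candidate models (a fair coin versus a coin with a uniform Beta(1, 1) prior on its bias), the data, and all function names are assumptions chosen for illustration. It computes each model's marginal evidence, turns those into posterior model probabilities under equal prior weights, and forms the model-averaged prediction.

```python
import math


def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def log_evidence_fixed(k, n, theta=0.5):
    """Log marginal likelihood of a flip sequence with k heads in n flips
    under a model with no free parameter (bias fixed at theta)."""
    return k * math.log(theta) + (n - k) * math.log(1 - theta)


def log_evidence_beta(k, n, a=1.0, b=1.0):
    """Log marginal likelihood of the same sequence under a Beta(a, b)
    prior on the bias, integrated out analytically."""
    return log_beta(a + k, b + n - k) - log_beta(a, b)


def posterior_model_probs(log_evidences, log_priors=None):
    """Normalise evidence times prior into posterior model probabilities."""
    if log_priors is None:
        log_priors = [0.0] * len(log_evidences)  # equal prior weights
    log_joint = [le + lp for le, lp in zip(log_evidences, log_priors)]
    m = max(log_joint)  # subtract the max for numerical stability
    unnorm = [math.exp(v - m) for v in log_joint]
    total = sum(unnorm)
    return [u / total for u in unnorm]


# Hypothetical data: 7 heads in 10 flips.
k, n = 7, 10
log_ev = [log_evidence_fixed(k, n), log_evidence_beta(k, n)]
weights = posterior_model_probs(log_ev)

# Each model's predictive probability that the next flip is heads.
pred_fixed = 0.5
pred_beta = (1.0 + k) / (2.0 + n)

# Bayesian model averaging: weight predictions by posterior model probability.
bma_pred = weights[0] * pred_fixed + weights[1] * pred_beta
print(weights, bma_pred)
```

With these data the weights come out at roughly 0.56 for the fair coin and 0.44 for the Beta model, so neither model is eliminated. "Selecting" one would mean rounding the larger weight up to 1, which is a separate decision made on top of the posterior.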

## Sparsity

Interesting special case: Bayesian sparsity.

## Cross-validation and Bayes

There is a relation between cross-validation and Bayes evidence, a.k.a. marginal likelihood - see (Claeskens and Hjort 2008; Fong and Holmes 2019).
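
One elementary way to see such a relation is the chain rule: the log marginal likelihood of a data set equals the sum of the log one-step-ahead posterior predictive probabilities, where each point is predicted from the points before it. The sketch below checks this numerically for a Beta-Bernoulli model. It illustrates only the chain-rule identity, not the full results of the cited papers, and the model, data, and function names are assumptions for illustration.

```python
import math


def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def log_evidence(ys, a=1.0, b=1.0):
    """Closed-form log marginal likelihood of a binary sequence
    under a Beta(a, b) prior on the success probability."""
    k, n = sum(ys), len(ys)
    return log_beta(a + k, b + n - k) - log_beta(a, b)


def sequential_log_score(ys, a=1.0, b=1.0):
    """Sum of log posterior predictive probabilities, predicting each
    observation from the ones that came before it."""
    total = 0.0
    successes = 0
    for i, y in enumerate(ys):
        # Posterior predictive probability of a success after i observations.
        p_success = (a + successes) / (a + b + i)
        total += math.log(p_success if y == 1 else 1.0 - p_success)
        successes += y
    return total


# Hypothetical binary data.
ys = [1, 0, 1, 1, 0, 1, 1, 1, 0, 1]
print(log_evidence(ys), sequential_log_score(ys))
```

The two numbers agree up to floating-point error, and the sequential score is the same for any ordering of the data because the observations are exchangeable under this model. In that sense the evidence already behaves like a held-out predictive score in which the training set grows from empty to all but one point.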

## Evidence/marginal likelihood/type II maximum likelihood
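
As a reminder of what the three names in this heading refer to, here are the standard definitions. The notation is an assumption for illustration: data $y$, model $M$ with parameters $\theta$, and prior hyperparameters $\eta$. The evidence, or marginal likelihood, of a model is the likelihood averaged over the prior:

$$
p(y \mid M) = \int p(y \mid \theta, M)\, p(\theta \mid M)\, \mathrm{d}\theta .
$$

Type II maximum likelihood applies the same integral to a family of priors indexed by $\eta$ and then maximises over that index instead of integrating it out:

$$
\hat{\eta} = \operatorname*{arg\,max}_{\eta}\, p(y \mid \eta)
= \operatorname*{arg\,max}_{\eta} \int p(y \mid \theta)\, p(\theta \mid \eta)\, \mathrm{d}\theta .
$$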

## Incoming

John Mount on applied variable selection:

> We have also always felt a bit exposed in this, as feature selection seems unjustified in standard explanations of regression. One feels that if a coefficient were meant to be zero, the fitting procedure would have set it to zero. Under this misapprehension, stepping in and removing some variables feels unjustified.
>
> Regardless of intuition or feelings, it is a fair question: is variable selection a natural justifiable part of modeling? Or is it something that is already done (therefore redundant). Or is it something that is not done for important reasons (such as avoiding damaging bias)?
>
> In this note we will show that feature selection is in fact an obvious justified step when using a sufficiently sophisticated model of regression. This note is long, as it defines so many tiny elementary steps. However this note ends with a big point: variable selection is justified. It naturally appears in the right variation of Bayesian Regression. You should select variables, using your preferred methodology. And you shouldn't feel bad about selecting variables.

## References

*Biometrika* 103 (4): 955–69.

*Journal of the American Statistical Association* 107 (500): 1610–24.

*Journal of Statistical Computation and Simulation* 90 (14): 2499–2523.

*Biometrika* 97 (2): 465–80.

*Model Selection*. Vol. 38. IMS Lecture Notes - Monograph Series. Beachwood, OH: Institute of Mathematical Statistics.

*Model Selection and Model Averaging*. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge ; New York: Cambridge University Press.

*The Annals of Applied Statistics* 6 (4): 1971–97.

*Proceedings of the 32nd International Conference on Machine Learning*, 1015–24. PMLR.

*arXiv:1905.08737 [Stat]*, May.

*Sociological Methodology* 25: 165–73.

*Statistica Sinica* 7 (2): 339–73.

*Royal Society Open Science* 9 (2): 211823.

*The Annals of Statistics* 33 (2): 730–73.

*Journal of the American Statistical Association* 99 (465): 279–90.

*Journal of the Royal Statistical Society. Series B (Methodological)* 57 (1): 247–62.

*arXiv:1611.01241 [Stat]*, November.

*Frontiers in Applied Mathematics and Statistics* 3.

*Network: Computation in Neural Systems* 6 (3): 469–505.

*Neural Computation* 11 (5): 1035–68.

*Journal of the American Statistical Association* 89 (428): 1535–46.

*Bayesian Analysis*, January.

*arXiv:2109.03204 [Math, Stat]*, September.

*arXiv:1710.09146 [Math, Stat]*, October.

*Statistics and Computing* 27 (3): 711–35.

*Journal of the Royal Statistical Society: Series B (Statistical Methodology)* 74 (2): 287–311.

*Sociological Methodology* 25: 111–63.

*Journal of the American Statistical Association* 113 (521): 431–44.

*Journal of the Korean Statistical Society* 37 (1): 3–10.

*Statistics Surveys* 6: 142–228.

*arXiv:1509.09169 [Stat]*, May.
