Bayes neural nets via subsetting weights



Bayes NNs where only some weights are random and the rest are fixed point estimates. This raises various difficulties: how do you update a fixed parameter?

Is this even principled? Sharma et al. (2022) argue that it can be: partially stochastic networks can match fully stochastic ones, in theory and in practice.

How to update a deterministic parameter?

From the perspective of Bayesian inference, parameters we treat as deterministic have zero prior variance, so conditioning on data should leave them untouched. And yet we do update them, by SGD. What does that mean? How can we make it statistically well-posed?
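One common pragmatic answer is to optimise both kinds of parameter on the same objective: a variational bound in which the stochastic weights get a posterior and the deterministic ones are point-estimated. A minimal sketch of that recipe, on a toy linear model (all names and hyperparameters here are illustrative, not from any of the cited papers): `w` gets a mean-field Gaussian variational posterior updated by reparameterised stochastic gradients of the negative ELBO, while `b` is updated by plain SGD on the very same objective.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: y = 2 x + noise.
x = rng.normal(size=200)
y = 2.0 * x + 0.1 * rng.normal(size=200)

# Partially stochastic model y ~ N(w x + b, noise_var):
# w gets a Gaussian variational posterior N(mu, sigma^2);
# b stays a deterministic parameter, trained by plain SGD.
mu, log_sigma, b = 0.0, 0.0, 0.0
prior_var, noise_var = 1.0, 0.01
lr, n_steps, n_mc = 1e-5, 3000, 8

for _ in range(n_steps):
    sigma = np.exp(log_sigma)
    eps = rng.normal(size=n_mc)
    w = mu + sigma * eps                                # reparameterised samples
    resid = y[None, :] - w[:, None] * x[None, :] - b    # shape (n_mc, n)
    # Monte Carlo gradients of the expected negative log likelihood.
    d_nll_dw = -(resid * x[None, :]).sum(axis=1) / noise_var  # per sample
    d_nll_db = -resid.sum(axis=1).mean() / noise_var
    # Gradients of KL( N(mu, sigma^2) || N(0, prior_var) ).
    d_kl_dmu = mu / prior_var
    d_kl_dlogsig = sigma**2 / prior_var - 1.0
    # Variational update for the stochastic parameter...
    mu -= lr * (d_nll_dw.mean() + d_kl_dmu)
    log_sigma -= lr * ((d_nll_dw * eps).mean() * sigma + d_kl_dlogsig)
    # ...and ordinary SGD for the deterministic one, same objective.
    b -= lr * d_nll_db

print(mu, np.exp(log_sigma), b)
```

One way to read this: the deterministic parameters are being MAP-estimated (here with a flat prior) inside a variational objective, which is roughly the framing the subnetwork-inference papers above make precise.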

Last layer

The most famous special case: keep every layer but the last deterministic, and be Bayesian only about the final linear layer. See Bayes last layer.
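Why this case is so popular: if the earlier layers are frozen, the last layer is just Bayesian linear regression on the learned features, so the posterior is available in closed form. A sketch under that assumption, with a hypothetical `phi` standing in for the frozen feature extractor and a conjugate Gaussian prior/likelihood:

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in for the frozen network: phi(x) plays the role of the
# features computed by the deterministic layers (hypothetical).
def phi(x):
    return np.stack([np.ones_like(x), x, x**2], axis=1)

x = rng.normal(size=100)
w_true = np.array([0.5, -1.0, 2.0])
y = phi(x) @ w_true + 0.1 * rng.normal(size=100)

# Conjugate Bayesian linear regression on the last layer only:
# prior w ~ N(0, alpha^-1 I), likelihood y | w ~ N(Phi w, beta^-1 I).
alpha, beta = 1.0, 100.0
Phi = phi(x)
A = alpha * np.eye(Phi.shape[1]) + beta * Phi.T @ Phi  # posterior precision
w_mean = beta * np.linalg.solve(A, Phi.T @ y)          # posterior mean
w_cov = np.linalg.inv(A)                               # posterior covariance

# Predictive distribution at a new input: the variance combines
# observation noise with last-layer parameter uncertainty.
p = phi(np.array([0.3]))
pred_mean = p @ w_mean
pred_var = 1.0 / beta + np.einsum("ij,jk,ik->i", p, w_cov, p)
print(pred_mean, pred_var)
```

The predictive variance is never smaller than the observation noise `1/beta`, and it grows where the features `phi` are far from the training data, which is exactly the behaviour people want from the Bayesian last layer.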

Probabilistic weight tying

Possibly the same idea? Rafael Oliveira has referred me to Roth and Pernkopf (2020) for some ideas about this.

References

Daxberger, Erik, Eric Nalisnick, James U. Allingham, Javier Antorán, and José Miguel Hernández-Lobato. 2021. “Bayesian Deep Learning via Subnetwork Inference.” In Proceedings of the 38th International Conference on Machine Learning, 2510–21. PMLR.
Daxberger, Erik, Eric Nalisnick, James Urquhart Allingham, Javier Antorán, and José Miguel Hernández-Lobato. 2020. “Expressive yet Tractable Bayesian Deep Learning via Subnetwork Inference.”
Dusenberry, Michael, Ghassen Jerfel, Yeming Wen, Yian Ma, Jasper Snoek, Katherine Heller, Balaji Lakshminarayanan, and Dustin Tran. 2020. β€œEfficient and Scalable Bayesian Neural Nets with Rank-1 Factors.” In Proceedings of the 37th International Conference on Machine Learning, 2782–92. PMLR.
Izmailov, Pavel, Wesley J. Maddox, Polina Kirichenko, Timur Garipov, Dmitry Vetrov, and Andrew Gordon Wilson. 2020. β€œSubspace Inference for Bayesian Deep Learning.” In Proceedings of The 35th Uncertainty in Artificial Intelligence Conference, 1169–79. PMLR.
Ke, Xiongwen, and Yanan Fan. 2022. β€œOn the Optimization and Pruning for Bayesian Deep Learning.” arXiv.
Kowal, Daniel R. 2022. β€œBayesian Subset Selection and Variable Importance for Interpretable Prediction and Classification.” arXiv.
Roth, Wolfgang, and Franz Pernkopf. 2020. β€œBayesian Neural Networks with Weight Sharing Using Dirichlet Processes.” IEEE Transactions on Pattern Analysis and Machine Intelligence 42 (1): 246–52.
Sharma, Mrinank, Sebastian Farquhar, Eric Nalisnick, and Tom Rainforth. 2022. β€œDo Bayesian Neural Networks Need To Be Fully Stochastic?” arXiv.
Tran, Ba-Hien, Simone Rossi, Dimitrios Milios, and Maurizio Filippone. 2022. β€œAll You Need Is a Good Functional Prior for Bayesian Deep Learning.” Journal of Machine Learning Research 23 (74): 1–56.
Tran, M.-N., N. Nguyen, D. Nott, and R. Kohn. 2019. β€œBayesian Deep Net GLM and GLMM.” Journal of Computational and Graphical Statistics 29 (ja): 1–40.
