Bayes NNs where only some weights are random and the others are fixed. This raises various difficulties: how do you update a fixed parameter?
Is this even principled?
Try Sharma et al. (2022).
How to update a deterministic parameter?
From the perspective of Bayes inference, parameters we do not update have zero prior variance. And yet we do update them by SGD. What does that mean? How can we make that statistically well-posed?
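One common reading, offered as a sketch rather than as the answer to the question above: split the parameters into stochastic weights $w$ with prior $p(w)$ and deterministic weights $\theta$, and treat $\theta$ as hyperparameters of the likelihood. The symbols $w$, $\theta$, $q$ and $\mathcal{D}$ are notation assumed here, not taken from the text. SGD on $\theta$ can then be seen as (approximate) type-II maximum likelihood, that is, ascent on a lower bound of the log evidence:

$$
\log p(\mathcal{D}\mid\theta)
= \log \int p(\mathcal{D}\mid w,\theta)\,p(w)\,\mathrm{d}w
\;\ge\; \mathbb{E}_{q(w)}\!\left[\log p(\mathcal{D}\mid w,\theta)\right]
- \mathrm{KL}\!\left(q(w)\,\|\,p(w)\right).
$$

Under this reading $\theta$ carries no prior at all rather than a zero-variance prior. Adding a penalty $\log p(\theta)$ to the objective would turn the update into a MAP estimate of $\theta$ instead.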
Last layer
The most famous partially stochastic design: only the final layer's weights are random. See Bayes last layer.
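To make the idea concrete, here is a minimal sketch under strong simplifying assumptions: the deterministic "body" is a single tanh unit with fixed weights, the last layer is one scalar weight with a Gaussian prior, and the noise is Gaussian with known precision. The names `feature`, `fit_last_layer`, `predict`, `alpha` and `beta` are hypothetical and chosen for illustration; they do not come from any particular library or paper.

```python
import math
import random


def feature(x, a=1.5, b=-0.3):
    """Deterministic body: one tanh unit whose weights a, b are held fixed."""
    return math.tanh(a * x + b)


def fit_last_layer(xs, ys, alpha=1.0, beta=25.0):
    """Conjugate Gaussian posterior over the scalar last-layer weight w.

    Assumed model: y = w * feature(x) + noise,
    with w ~ N(0, 1/alpha) and noise ~ N(0, 1/beta).
    Returns the posterior mean and posterior precision of w.
    """
    phis = [feature(x) for x in xs]
    precision = alpha + beta * sum(p * p for p in phis)
    mean = beta * sum(p * y for p, y in zip(phis, ys)) / precision
    return mean, precision


def predict(x, mean, precision, beta=25.0):
    """Predictive mean and variance at x (noise variance plus weight uncertainty)."""
    phi = feature(x)
    return mean * phi, 1.0 / beta + phi * phi / precision


# Synthetic data generated from the assumed model (noise std 0.2, so beta = 25).
random.seed(0)
true_w = 2.0
xs = [random.uniform(-2.0, 2.0) for _ in range(50)]
ys = [true_w * feature(x) + random.gauss(0.0, 0.2) for x in xs]

mean, precision = fit_last_layer(xs, ys)
pred_mean, pred_var = predict(0.5, mean, precision)
print(mean, precision, pred_mean, pred_var)
```

Only the last-layer weight gets a posterior here; the fixed weights `a` and `b` would in practice be fitted by SGD, which is exactly the step whose statistical status is in question above.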
Probabilistic weight tying
Possibly the same idea? Rafael Oliveira has referred me to Roth and Pernkopf (2020) for some ideas about this.
References
Daxberger, Erik, Eric Nalisnick, James U. Allingham, Javier Antoran, and Jose Miguel Hernandez-Lobato. 2021. "Bayesian Deep Learning via Subnetwork Inference." In Proceedings of the 38th International Conference on Machine Learning, 2510–21. PMLR.
Daxberger, Erik, Eric Nalisnick, James Urquhart Allingham, Javier Antoran, and Jose Miguel Hernandez-Lobato. 2020. "Expressive yet Tractable Bayesian Deep Learning via Subnetwork Inference." In.
Dusenberry, Michael, Ghassen Jerfel, Yeming Wen, Yian Ma, Jasper Snoek, Katherine Heller, Balaji Lakshminarayanan, and Dustin Tran. 2020. "Efficient and Scalable Bayesian Neural Nets with Rank-1 Factors." In Proceedings of the 37th International Conference on Machine Learning, 2782–92. PMLR.
Izmailov, Pavel, Wesley J. Maddox, Polina Kirichenko, Timur Garipov, Dmitry Vetrov, and Andrew Gordon Wilson. 2020. "Subspace Inference for Bayesian Deep Learning." In Proceedings of The 35th Uncertainty in Artificial Intelligence Conference, 1169–79. PMLR.
Ke, Xiongwen, and Yanan Fan. 2022. "On the Optimization and Pruning for Bayesian Deep Learning." arXiv.
Kowal, Daniel R. 2022. "Bayesian Subset Selection and Variable Importance for Interpretable Prediction and Classification." arXiv.
Roth, Wolfgang, and Franz Pernkopf. 2020. "Bayesian Neural Networks with Weight Sharing Using Dirichlet Processes." IEEE Transactions on Pattern Analysis and Machine Intelligence 42 (1): 246–52.
Sharma, Mrinank, Sebastian Farquhar, Eric Nalisnick, and Tom Rainforth. 2022. "Do Bayesian Neural Networks Need To Be Fully Stochastic?" arXiv.
Tran, Ba-Hien, Simone Rossi, Dimitrios Milios, and Maurizio Filippone. 2022. "All You Need Is a Good Functional Prior for Bayesian Deep Learning." Journal of Machine Learning Research 23 (74): 1–56.
Tran, M.-N., N. Nguyen, D. Nott, and R. Kohn. 2019. "Bayesian Deep Net GLM and GLMM." Journal of Computational and Graphical Statistics 29 (ja): 1–40.