Bayes neural nets via subsetting weights

Bayes NNs where only some weights are random and the others are fixed. This raises various difficulties: how do you update a fixed parameter?

Is this even principled?

Try Sharma et al. (2022).

How to update a deterministic parameter?

From the perspective of Bayes inference, parameters we do not update have zero prior variance. And yet we do update them by SGD. What does that mean? How can we make that statistically well-posed?
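To make the puzzle concrete, here is a sketch of one possible reading, not a claim about what any of the cited papers propose. Split the weights into a stochastic part $\theta_s$ and a deterministic part $\theta_d$ (notation introduced here for illustration). A fixed parameter corresponds to a point-mass prior at some value $\theta_d^0$, and a point-mass prior gives a point-mass posterior whatever the data $\mathcal{D}$ say:

$$
p(\theta_d) = \delta(\theta_d - \theta_d^0)
\quad\Rightarrow\quad
p(\theta_d \mid \mathcal{D}) = \delta(\theta_d - \theta_d^0).
$$

So when SGD moves $\theta_d^0$, it is not doing posterior updating; the deterministic weights behave more like hyperparameters. Under an empirical Bayes (type-II maximum likelihood) interpretation we would choose them by maximizing the marginal likelihood:

$$
\hat{\theta}_d = \arg\max_{\theta_d} \log \int p(\mathcal{D} \mid \theta_s, \theta_d)\, p(\theta_s)\, \mathrm{d}\theta_s .
$$

Whether SGD as used in practice approximates this objective is an assumption, not something established here.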

Last layer

The most famous case: only the final layer's weights are random. See Bayes last layer.
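A minimal sketch of the last-layer idea, assuming a regression model with a Gaussian prior on the final-layer weights and Gaussian observation noise, so that the posterior is available in closed form. The hidden layer here is a fixed random `tanh` feature map standing in for a trained deterministic network body; the function names, `prior_var`, `noise_var` and the toy data are all hypothetical choices for illustration.

```python
import numpy as np


def features(x, W, b):
    """Deterministic hidden layer: these weights are fixed, not random."""
    return np.tanh(x @ W + b)


def last_layer_posterior(Phi, y, prior_var, noise_var):
    """Gaussian posterior over last-layer weights (conjugate linear regression)."""
    d = Phi.shape[1]
    precision = Phi.T @ Phi / noise_var + np.eye(d) / prior_var
    cov = np.linalg.inv(precision)
    mean = cov @ Phi.T @ y / noise_var
    return mean, cov


def predict(Phi_new, mean, cov, noise_var):
    """Predictive mean and variance for each row of Phi_new."""
    pred_mean = Phi_new @ mean
    # Row-wise quadratic form phi^T cov phi, plus observation noise.
    pred_var = np.einsum("ij,jk,ik->i", Phi_new, cov, Phi_new) + noise_var
    return pred_mean, pred_var


rng = np.random.default_rng(0)
n_hidden, noise_var, prior_var = 20, 0.01, 1.0

# Fixed (deterministic) hidden-layer weights; assumed given, e.g. by training.
W = rng.normal(size=(1, n_hidden))
b = rng.normal(size=n_hidden)

# Toy 1-d regression data with noise standard deviation sqrt(noise_var).
x = rng.uniform(-2, 2, size=(50, 1))
y = np.sin(2 * x[:, 0]) + np.sqrt(noise_var) * rng.normal(size=50)

Phi = features(x, W, b)
mean, cov = last_layer_posterior(Phi, y, prior_var, noise_var)

x_new = np.array([[0.0], [5.0]])
pred_mean, pred_var = predict(features(x_new, W, b), mean, cov, noise_var)
```

All the uncertainty in `pred_var` comes from the last layer and the noise term; the hidden weights `W` and `b` contribute none, which is exactly the zero-prior-variance situation discussed above.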

Probabilistic weight tying

Possibly the same idea? Rafael Oliveira has referred me to Roth and Pernkopf (2020) for some ideas about this.


