Gradients and message-passing
Cleaving reality at the joint, then summing it at the marginal
November 25, 2014 — January 12, 2023
Bayes-by-backprop meets variational message-passing meets the chain rule.
1 Automatic differentiation as message-passing
This is a well-known informal bit of lore in the field, but apparently not well-documented?
The first reference I can find is Eaton (2022), which is amazingly late.
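To make the analogy concrete, here is a minimal sketch of my own (not taken from Eaton or any of the papers cited here): reverse-mode AD over a tiny expression graph, written as a backward sweep of messages. Each node's adjoint is the sum of the messages arriving from its children, which is the same accumulate-then-sum pattern a sum-product node uses before marginalising.

```python
# Toy reverse-mode AD as message passing (illustrative only; Node/add/mul/exp
# are ad hoc constructions for this sketch, not any real library's API).
import math
from collections import defaultdict

class Node:
    def __init__(self, value, parents=(), local_grads=()):
        # local_grads[i] = d(self.value) / d(parents[i].value),
        # evaluated at the forward-pass values.
        self.value = value
        self.parents = parents
        self.local_grads = local_grads

def add(a, b):
    return Node(a.value + b.value, (a, b), (1.0, 1.0))

def mul(a, b):
    return Node(a.value * b.value, (a, b), (b.value, a.value))

def exp(a):
    return Node(math.exp(a.value), (a,), (math.exp(a.value),))

def backward(output):
    """Propagate adjoint messages from the output node back to every node."""
    adjoint = defaultdict(float)
    adjoint[id(output)] = 1.0           # seed: d(output)/d(output) = 1
    # Topological order (parents before children) via depth-first search.
    order, seen = [], set()
    def visit(node):
        if id(node) in seen:
            return
        seen.add(id(node))
        for p in node.parents:
            visit(p)
        order.append(node)
    visit(output)
    for node in reversed(order):
        msg_in = adjoint[id(node)]      # sum of messages from node's children
        for parent, g in zip(node.parents, node.local_grads):
            adjoint[id(parent)] += g * msg_in   # message sent to each parent
    return adjoint

# f(x, y) = exp(x * y) + x, so df/dx = y*exp(x*y) + 1 and df/dy = x*exp(x*y).
x, y = Node(1.5), Node(-0.5)
f = add(exp(mul(x, y)), x)
adj = backward(f)
print(adj[id(x)], -0.5 * math.exp(-0.75) + 1)   # should agree
print(adj[id(y)], 1.5 * math.exp(-0.75))        # should agree
```

Note that x has fan-out two in this graph, so its gradient is literally the sum of two incoming messages, which is the "summing at the marginal" part of the analogy.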
TBC
2 Stochastic variational message passing
Akbayrak (2023):
Stochastic approximation methods for variational inference have recently gained popularity in the probabilistic programming community since these methods are amenable to automation and allow online, scalable, and universal approximate Bayesian inference. Unfortunately, common Probabilistic Programming Libraries (PPLs) with stochastic approximation engines lack the efficiency of message passing-based inference algorithms with deterministic update rules such as Belief Propagation (BP) and Variational Message Passing (VMP). Still, Stochastic Variational Inference (SVI) and Conjugate-Computation Variational Inference (CVI) provide principled methods to integrate fast deterministic inference techniques with broadly applicable stochastic approximate inference. Unfortunately, implementation of SVI and CVI necessitates manually driven variational update rules, which do not yet exist in most PPLs. In this chapter, for the exponential family of distributions, we cast SVI and CVI explicitly in a message passing-based inference context. We also demonstrate how to go beyond the exponential family of distributions by using raw stochastic gradient descent for the minimization of the free energy. We provide an implementation for SVI and CVI in ForneyLab, which is an automated message passing-based probabilistic programming package in the open source Julia language. Through a number of experiments, we demonstrate how SVI and CVI extend the automated inference capabilities of message passing-based probabilistic programming.
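To ground the "raw stochastic gradient descent for the minimization of the free energy" route that the abstract mentions, here is a minimal sketch of my own (not from Akbayrak's thesis or ForneyLab): reparameterised SGD on the free energy of a Gaussian approximation for a toy conjugate model, where the exact posterior is available to check against. The model, observation value, and step sizes are all assumptions for the demo.

```python
# Toy model (assumed for this sketch):  z ~ N(0, 1),  x | z ~ N(z, sigma2).
# Variational family: q(z) = N(m, s^2).  Minimise the free energy
# F = E_q[log q(z) - log p(z, x)] by SGD with the reparameterisation trick.
import numpy as np

rng = np.random.default_rng(0)
sigma2 = 0.5**2        # observation noise variance (assumed)
x_obs = 1.3            # a single observation (assumed)

def dlogp_dz(z):
    """Gradient of the log joint log p(z, x_obs) with respect to z."""
    return -z + (x_obs - z) / sigma2

# Variational parameters: mean m and log-std rho, with s = exp(rho).
m, rho = 0.0, 0.0
lr, n_steps, n_mc = 0.01, 5000, 16

for _ in range(n_steps):
    s = np.exp(rho)
    eps = rng.standard_normal(n_mc)
    z = m + s * eps                       # reparameterised samples from q
    g = dlogp_dz(z)
    # F = -log s - E_q[log p(z, x)] + const; entropy handled analytically,
    # the expectation estimated by Monte Carlo.
    grad_m = -g.mean()
    grad_rho = -1.0 - s * (g * eps).mean()
    m -= lr * grad_m
    rho -= lr * grad_rho

post_prec = 1.0 + 1.0 / sigma2            # exact conjugate posterior
post_mean = (x_obs / sigma2) / post_prec
print("SGD estimate: m=%.3f s=%.3f" % (m, np.exp(rho)))
print("exact:     mean=%.3f std=%.3f" % (post_mean, post_prec**-0.5))
```

The point of SVI/CVI, as I read the abstract, is that where the model is conjugate one should not need this kind of raw gradient loop at all: the deterministic VMP message is available in closed form, and the stochastic machinery is reserved for the non-conjugate pieces.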