Quasi-gradients of discrete parameters
December 20, 2022 — March 19, 2024
calculus
classification
probabilistic algorithms
optimization
probability
statistics
Notes on taking gradients through functions that appear to have none because their arguments are discrete. TBC.
Related: Gumbel-max, Pólya-Gamma…
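As a concrete anchor for the Gumbel-max idea mentioned above, here is a minimal sketch in plain Python. It is not taken from any of the references below. The function names (`gumbel_max_sample`, `gumbel_softmax_sample`) and the `temperature` parameter are illustrative assumptions. The first function draws an exact categorical sample by adding Gumbel noise to the logits and taking the argmax. The second replaces the argmax with a softmax, which is one common way to obtain a continuous surrogate that a gradient could be taken through.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def gumbel_noise(rng):
    """Standard Gumbel(0, 1) draw via the inverse CDF, -log(-log(U))."""
    u = max(rng.random(), 1e-12)  # guard against log(0)
    return -math.log(-math.log(u))


def gumbel_max_sample(logits, rng):
    """Discrete sample: index of the largest Gumbel-perturbed logit.

    The returned index is distributed as Categorical(softmax(logits)).
    """
    perturbed = [l + gumbel_noise(rng) for l in logits]
    return max(range(len(logits)), key=perturbed.__getitem__)


def gumbel_softmax_sample(logits, temperature, rng):
    """Relaxed sample (assumed illustrative form): softmax instead of argmax.

    Smaller temperatures push the output closer to a one-hot vector.
    """
    perturbed = [(l + gumbel_noise(rng)) / temperature for l in logits]
    return softmax(perturbed)


if __name__ == "__main__":
    rng = random.Random(0)
    probs = [0.2, 0.3, 0.5]
    logits = [math.log(p) for p in probs]
    n = 20000
    counts = [0] * len(probs)
    for _ in range(n):
        counts[gumbel_max_sample(logits, rng)] += 1
    print("target probabilities:", probs)
    print("empirical frequencies:", [c / n for c in counts])
    print("one relaxed sample:", gumbel_softmax_sample(logits, 0.5, rng))
```

The hard sample is piecewise constant in the logits, so its derivative is zero almost everywhere; the relaxed sample is smooth in the logits, at the cost of no longer being an exact categorical draw.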
1 References
Arya, Schauer, Schäfer, et al. 2022. “Automatic Differentiation of Programs with Discrete Randomness.” In.
Grathwohl, Swersky, Hashemi, et al. 2021. “Oops I Took A Gradient: Scalable Sampling for Discrete Distributions.”
Prillo and Eisenschlos. 2020. “SoftSort: A Continuous Relaxation for the Argsort Operator.”