# Neural denoising diffusion models

Denoising diffusion probabilistic models (DDPMs), score-based generative models, generative diffusion processes, neural energy models…

November 11, 2021 — April 22, 2024

approximation
Bayes
generative
Monte Carlo
neural nets
optimization
probabilistic algorithms
probability
statistics

Placeholder.

AFAICS: generative models that use score matching to learn the data distribution and Langevin-type MCMC to sample from it. Various tricks are needed to make this work via successive denoising steps, and there is an interpretation in terms of diffusion SDEs. I am vaguely aware that this oversimplifies a rich and interesting history of convergence of many useful techniques, but I have not invested enough time to claim actual expertise.

## 1 Training: score matching

Modern score matching seems to originate in Hyvärinen (2005). See score matching or McAllester (2023) for an introduction to the general idea.
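As a toy illustration of the denoising flavour (Vincent 2011), here is a one-dimensional sketch with all values assumed: perturb data with Gaussian noise and regress a score model onto $$-\varepsilon/\sigma$$. For Gaussian data and a linear score model $$s(z)=az$$, the least-squares fit is closed-form and can be checked against the analytic score of the noised marginal.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma, n = 0.5, 200_000

x = rng.standard_normal(n)        # "data" ~ N(0, 1)
eps = rng.standard_normal(n)
x_noisy = x + sigma * eps         # perturbed data ~ N(0, 1 + sigma^2)

# Denoising score matching: regress the score model onto -eps/sigma.
# With a linear model s(z) = a*z the least-squares fit is closed-form.
target = -eps / sigma
a_hat = (x_noisy @ target) / (x_noisy @ x_noisy)

# The true score of N(0, 1 + sigma^2) is -z / (1 + sigma^2).
a_true = -1.0 / (1.0 + sigma**2)
print(a_hat, a_true)              # agree up to Monte Carlo error
```

The punchline of the denoising trick is that the regression target $$-\varepsilon/\sigma$$ never requires evaluating the intractable score of the data distribution.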

## 5 Conditioning

There are lots of ways we might try to condition diffusions, differing sometimes only in emphasis.

### 5.1 Generic conditioning

Rozet and Louppe (2023a) summarises:

> With score-based generative models, we can generate samples from the unconditional distribution $$p(x(0)) \approx p(x)$$. To solve inverse problems, however, we need to sample from the posterior distribution $$p(x \mid y)$$. This could be accomplished by training a conditional score network $$s_\phi(x(t), t \mid y)$$ to approximate the posterior score $$\nabla_{x(t)} \log p(x(t) \mid y)$$ and plugging it into the reverse SDE (4). However, this would require data pairs $$(x, y)$$ during training and one would need to retrain a new score network each time the observation process $$p(y \mid x)$$ changes. Instead, many have observed that the posterior score can be decomposed into two terms thanks to Bayes’ rule
>
> $$\nabla_{x(t)} \log p(x(t) \mid y) = \nabla_{x(t)} \log p(x(t)) + \nabla_{x(t)} \log p(y \mid x(t)).$$
>
> Since the prior score $$\nabla_{x(t)} \log p(x(t))$$ can be approximated with a single score network, the remaining task is to estimate the likelihood score $$\nabla_{x(t)} \log p(y \mid x(t))$$. Assuming a differentiable measurement function $$\mathcal{A}$$ and a Gaussian observation process $$p(y \mid x) = \mathcal{N}\left(y \mid \mathcal{A}(x), \Sigma_y\right)$$, Chung et al. (2023) propose the approximation
>
> $$p(y \mid x(t)) = \int p(y \mid x)\, p(x \mid x(t))\, \mathrm{d}x \approx \mathcal{N}\left(y \mid \mathcal{A}(\hat{x}(x(t))), \Sigma_y\right),$$
>
> where the mean $$\hat{x}(x(t)) = \mathbb{E}_{p(x \mid x(t))}[x]$$ is given by Tweedie’s formula
>
> $$\begin{aligned} \mathbb{E}_{p(x \mid x(t))}[x] &= \frac{x(t) + \sigma(t)^2 \nabla_{x(t)} \log p(x(t))}{\mu(t)} \\ &\approx \frac{x(t) + \sigma(t)^2 s_\phi(x(t), t)}{\mu(t)}. \end{aligned}$$
>
> As the log-likelihood of a multivariate Gaussian is known analytically and $$s_\phi(x(t), t)$$ is differentiable, we can compute the likelihood score $$\nabla_{x(t)} \log p(y \mid x(t))$$ with this approximation in zero-shot, that is, without training any other network than $$s_\phi(x(t), t)$$.
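To make the moving parts concrete, here is a scalar linear-Gaussian toy (all numbers assumed, $$\mathcal{A}$$ taken to be the identity) where the prior score, the Tweedie denoiser, and a Chung et al. (2023)-style likelihood-score approximation are all available in closed form, alongside the exact likelihood score for comparison:

```python
import numpy as np

# Scalar linear-Gaussian toy: prior x ~ N(0, 1), noising x(t) = mu*x + sigma*eps,
# observation y = x + Gaussian noise. All numbers are assumed, just to
# exercise the formulas.
mu, sigma = 0.8, 0.3       # noising schedule evaluated at some fixed t
sigma_y = 0.5              # observation noise std
x_t, y = 0.7, 1.2          # current noisy state and the observation

var_t = mu**2 + sigma**2                  # marginal variance of x(t)
prior_score = -x_t / var_t                # exact score of p(x(t)) = N(0, var_t)

# Tweedie's formula: E[x | x(t)] = (x(t) + sigma^2 * prior score) / mu
x_hat = (x_t + sigma**2 * prior_score) / mu
dxhat_dxt = mu / var_t                    # d x_hat / d x(t) (linear here)

# Approximate likelihood score: grad of log N(y | x_hat(x(t)), sigma_y^2)
# with respect to x(t), via the chain rule.
lik_score_approx = (y - x_hat) / sigma_y**2 * dxhat_dxt

# Exact likelihood score, available here because p(x | x(t)) is Gaussian
# with mean x_hat and variance sigma^2 / var_t (the term the approximation drops).
v = sigma**2 / var_t
lik_score_exact = (y - x_hat) / (sigma_y**2 + v) * dxhat_dxt

# Bayes decomposition of the posterior score:
posterior_score = prior_score + lik_score_approx
print(lik_score_approx, lik_score_exact, posterior_score)
```

The gap between the approximate and exact likelihood scores comes from replacing $$p(x \mid x(t))$$ by a point mass at $$\hat{x}(x(t))$$; it shrinks as the denoising variance $$\sigma(t)^2/(\mu(t)^2+\sigma(t)^2)$$ becomes small relative to $$\Sigma_y$$.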

### 5.2 Inpainting

If we want coherence with some chunk of an existing image, we call that inpainting.
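A common recipe is the "replacement" trick (as in RePaint, Lugmayr et al. 2022): at every denoising step, overwrite the known region with a suitably noised copy of the observation, so only the missing region is actually generated. A runnable toy, with the prior, noise schedule, and step sizes all assumed: a correlated 2-D Gaussian prior whose noised score is analytic, sampled by annealed Langevin dynamics while one "pixel" is clamped.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy prior: a correlated 2-D Gaussian, so the score of the perturbed
# marginal x + sigma*eps is known in closed form.
Sigma = np.array([[1.0, 0.9], [0.9, 1.0]])

def score(x, sigma):
    """Score of N(0, Sigma + sigma^2 I), vectorised over rows of x."""
    return -x @ np.linalg.inv(Sigma + sigma**2 * np.eye(2))

x0_obs = 1.5                           # the "known pixel": coordinate 0
sigmas = np.geomspace(3.0, 0.02, 40)   # annealed noise levels
n = 4000
x = 3.0 * rng.standard_normal((n, 2))

for sigma in sigmas:
    step = 0.2 * sigma**2
    for _ in range(30):
        x = (x + step * score(x, sigma)
               + np.sqrt(2 * step) * rng.standard_normal((n, 2)))
        # Replacement trick: clamp the known coordinate to a freshly
        # noised copy of the observation at the current noise level.
        x[:, 0] = x0_obs + sigma * rng.standard_normal(n)

x[:, 0] = x0_obs
# Exact conditional: x1 | x0 = 1.5 is N(0.9 * 1.5, 1 - 0.9**2) = N(1.35, 0.19).
print(x[:, 1].mean(), x[:, 1].var())
```

The generated coordinate ends up approximately distributed as the true conditional; in real inpainting the analytic score is replaced by a trained score network and the clamped coordinate by the observed pixels.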

### 5.3 Super-resolution

Coherence again, but where the known pixels form a sparse regular subset of the target, as when upsampling a low-resolution image.

### 5.4 Reconstruction/inversion

Reconstruction from perturbed and partial observations, i.e. general inverse problems.

## 7 Diffusion on weird spaces

Generic: Okhotin et al. (2023).

Li et al. (2024)

Baker Lab

## 8 Shapes

Diffusion-SDF: Conditional Generative Modeling of Signed Distance Functions – Princeton Computing Imaging Lab

## 9 Incoming

Suggestive connection to thermodynamics (Sohl-Dickstein et al. 2015).

## 10 References

Adam, Coogan, Malkin, et al. 2022.
Ajay, Du, Gupta, et al. 2023. In.
Albergo, Boffi, and Vanden-Eijnden. 2023.
Albergo, Goldstein, Boffi, et al. 2023.
Albergo, and Vanden-Eijnden. 2023. In.
Anderson. 1982. Stochastic Processes and Their Applications.
Bastek, Sun, and Kochmann. 2024.
Choi, Kim, Jeong, et al. 2021. In.
Chou, Bahat, and Heide. 2023.
Chung, Kim, Mccann, et al. 2023. In.
Dhariwal, and Nichol. 2021. arXiv:2105.05233 [Cs, Stat].
Dockhorn, Vahdat, and Kreis. 2022. In.
Dutordoir, Saul, Ghahramani, et al. 2022.
Efron. 2011. Journal of the American Statistical Association.
Graikos, Malkin, Jojic, et al. 2022. Advances in Neural Information Processing Systems.
Grechka, Couairon, and Cord. 2024. Computer Vision and Image Understanding.
Guo, Liu, Wang, et al. 2024. Nature Reviews Bioengineering.
Haitsiukevich, Poyraz, Marttinen, et al. 2024.
Han, Zheng, and Zhou. 2022.
Heng, De Bortoli, Doucet, et al. 2022.
Ho, Jain, and Abbeel. 2020. arXiv:2006.11239 [Cs, Stat].
Hoogeboom, Gritsenko, Bastings, et al. 2021. arXiv:2110.02037 [Cs, Stat].
Hyvärinen. 2005. The Journal of Machine Learning Research.
Jalal, Arvinte, Daras, et al. 2021. In Advances in Neural Information Processing Systems.
Jo, Lee, and Hwang. 2022. In Proceedings of the 39th International Conference on Machine Learning.
Jolicoeur-Martineau, Piché-Taillefer, Mitliagkas, et al. 2022. In.
Kawar, Elad, Ermon, et al. 2022. Advances in Neural Information Processing Systems.
Kawar, Vaksman, and Elad. 2021. In.
Kim, and Ye. 2021. In.
Lipman, Chen, Ben-Hamu, et al. 2023.
Liu, Ziming, Luo, Xu, et al. 2023.
Liu, Anji, Niepert, and Broeck. 2023.
Liu, Chang, Zhuo, Cheng, et al. 2019. In Proceedings of the 36th International Conference on Machine Learning.
Li, Yu, He, et al. 2024. Proceedings of the AAAI Conference on Artificial Intelligence.
Lugmayr, Danelljan, Romero, et al. 2022. In 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
McAllester. 2023.
Nair, Mei, and Patel. 2023. In.
Nichol, and Dhariwal. 2021. In Proceedings of the 38th International Conference on Machine Learning.
Okhotin, Molchanov, Vladimir, et al. 2023. Advances in Neural Information Processing Systems.
Pang, Mao, He, et al. 2024. IEEE Access.
Pascual, Bhattacharya, Yeh, et al. 2022.
Peng, Qiu, Wynne, et al. 2024. Medical Physics.
Preechakul, Chatthee, Wizadwongsa, et al. 2022. In.
Radford, Kim, Hallacy, et al. 2021.
Rozet, and Louppe. 2023a.
Sharrock, Simons, Liu, et al. 2022.
Shim, Kang, and Joo. 2023. In 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
Sohl-Dickstein, Weiss, Maheswaranathan, et al. 2015. arXiv:1503.03585 [Cond-Mat, q-Bio, Stat].
Song, Yang, Durkan, Murray, et al. 2021. In Advances in Neural Information Processing Systems.
Song, Yang, and Ermon. 2020a. In Advances In Neural Information Processing Systems.
———. 2020b. In Advances In Neural Information Processing Systems.
Song, Yang, Garg, Shi, et al. 2019.
Song, Jiaming, Meng, and Ermon. 2021. arXiv:2010.02502 [Cs].
Song, Yang, Shen, Xing, et al. 2022. In.
Song, Yang, Sohl-Dickstein, Kingma, et al. 2022. In.
Sui, Ma, Zhang, et al. 2024.
Swersky, Ranzato, Buchman, et al. 2011. “On Autoencoders and Score Matching for Energy Based Models.” In Proceedings of the 28th International Conference on Machine Learning (ICML-11).
Torres, Leung, Lutz, et al. 2022.
Tzen, and Raginsky. 2019a. In Proceedings of the Thirty-Second Conference on Learning Theory.
Vincent. 2011. Neural Computation.
Watson, Juergens, Bennett, et al. 2022.
Wu, Trippe, Naesseth, et al. 2023. In.
Xie, and Li. 2022. In Medical Image Computing and Computer Assisted Intervention – MICCAI 2022.
Xu, Yilun, Liu, Tegmark, et al. 2022. In Advances in Neural Information Processing Systems.
Xu, Yilun, Liu, Tian, et al. 2023. In.
Xu, Mengze, Ma, and Zhu. 2023. IEEE Geoscience and Remote Sensing Letters.
Yang, Zhang, Song, et al. 2023. ACM Computing Surveys.
Zamir, Arora, Khan, et al. 2021.
Zhang, Ji, Zhang, et al. 2023. In Proceedings of the 40th International Conference on Machine Learning. ICML’23.
Zhao, Bai, Zhu, et al. 2023. In.
Zhuang, Abnar, Gu, et al. 2022. In.