Neural denoising diffusion models
Denoising diffusion probabilistic models (DDPMs), score-based generative models, generative diffusion processes, neural energy models…
2021-11-10 — 2025-05-28
Placeholder.
The archetypal neural denoising diffusions use score matching to learn and score diffusion to sample. I won't explain more than that, because this is a super hot area and there are tutorials of sparkling magnificence available, true works of andragogical art. Anything I had time to write would do those works a disservice.
Related names: Denoising Diffusion Probabilistic Models (DDPMs), score-based generative models, generative diffusion processes, neural energy models, denoising diffusion models, diffusion probabilistic models, diffusion models, denoising score matching, denoising score matching with Langevin dynamics, denoising score matching with Langevin sampling.
Suggestive connection to thermodynamics (Sohl-Dickstein et al. 2015) and indeed to the statistical mechanics of learning.
1 Tutorials
- Das, Building Diffusion Model’s theory from ground up for ICLR Blogposts 2024
- Lilian Weng, What are Diffusion Models?
- Yang Song, Generative Modelling by Estimating Gradients of the Data Distribution
- Sander Dieleman, Diffusion models are autoencoders and Perspectives on diffusion
- CVPR tutorial, Denoising Diffusion-based Generative Modelling: Foundations and Applications, with accompanying video
- What’s the score? (Review of latest Score Based Generative Modelling papers.)
- Anil Ananthaswamy, The Physics Principle That Inspired Modern AI Art
- Thoughts on Riemannian metrics and its connection with diffusion/score matching [Part I] | Terra Incognita
- The geometry of data: the missing metric tensor and the Stein score [Part II] | Terra Incognita
2 Sampling
Diffusion models use a learned score function to sample from a distribution. Unlike classic Langevin samplers, they do not sample directly in the data space; instead they construct an artificial diffusion process whose reversal carries pure noise back to data, producing samples in the data space at the end of the process. I refer to these by the inadequate shorthand of “score diffusions”, purely to disambiguate them from the many other uses of the term “diffusion” in my life. In particular, they are diffusion SDEs (although that was not obvious at the birth of this field), but of a very particular type (reversed!) and with a very particular purpose (sampling).
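To make that concrete, here is a minimal sketch of DDPM-style ancestral sampling (Ho et al. 2020). Everything here is illustrative: `eps_model` stands in for any trained noise-prediction network, and the linear β schedule is just the common default.

```python
import numpy as np

rng = np.random.default_rng(0)

# Linear beta schedule, the common DDPM default.
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)  # \bar{alpha}_t

def ddpm_sample(eps_model, shape):
    """Ancestral sampling: start from pure noise and iteratively denoise.

    eps_model(x, t) is assumed to predict the noise eps in
    x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps,
    i.e. the usual epsilon-parameterisation of the score.
    """
    x = rng.standard_normal(shape)  # x_T ~ N(0, I)
    for t in reversed(range(T)):
        eps = eps_model(x, t)
        # Posterior mean of x_{t-1} given x_t
        mean = (x - betas[t] * eps / np.sqrt(1.0 - alpha_bars[t])) / np.sqrt(alphas[t])
        noise = rng.standard_normal(shape) if t > 0 else 0.0
        x = mean + np.sqrt(betas[t]) * noise
    return x

# Toy check: for data ~ N(mu, I) the optimal noise predictor is known in
# closed form, so the sampler should return draws close to N(mu, I).
mu = np.array([3.0, -2.0])
def optimal_eps(x, t):
    return np.sqrt(1.0 - alpha_bars[t]) * (x - np.sqrt(alpha_bars[t]) * mu)

samples = ddpm_sample(optimal_eps, shape=(2000, 2))
print(samples.mean(axis=0), samples.std(axis=0))  # ≈ mu, ≈ 1
```

Continuous-time treatments recast this loop as integrating a reverse-time SDE (or its probability-flow ODE), which is where the “reversed diffusion SDE” framing comes from.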
3 Training: score matching
Modern score matching seems to originate in Hyvärinen (2005), although the original objective requires the trace of the Jacobian of the learned score and so scales poorly to high dimensions. See score matching or McAllester (2023) for an intro to the general idea of learning the score of a distribution even when you cannot evaluate its density. It is wild that this works.
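The trick that makes it tractable is denoising score matching (Vincent 2011): perturb the data with Gaussian noise and regress on the score of the perturbation kernel, which is available in closed form. Here is a minimal sketch on a toy Gaussian where the answer is known exactly; the linear model and all the constants are illustrative assumptions, not anyone's reference implementation.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: N(2, 0.5^2). After adding N(0, sigma^2) noise the smoothed
# density is N(2, 0.5^2 + sigma^2), whose score is -(x - 2)/(0.5^2 + sigma^2),
# so we know exactly what DSM should recover.
x0 = 2.0 + 0.5 * rng.standard_normal(10_000)
sigma = 0.3

# Linear score model s(x) = a*x + b; linear suffices for Gaussian data.
a, b = 0.0, 0.0
lr = 0.1

for step in range(2000):
    eps = rng.standard_normal(x0.shape)
    x_noisy = x0 + sigma * eps
    # DSM regression target: the score of the perturbation kernel,
    # grad log N(x_noisy | x0, sigma^2) = -(x_noisy - x0)/sigma^2 = -eps/sigma
    target = -eps / sigma
    resid = a * x_noisy + b - target
    # Gradient step on the squared-error DSM objective
    a -= lr * 2.0 * np.mean(resid * x_noisy)
    b -= lr * 2.0 * np.mean(resid)

print(a, b)               # ≈ -1/0.34, 2/0.34
print(-1 / 0.34, 2 / 0.34)  # the score of the smoothed density
```

Note that the optimum of this objective is the score of the noise-smoothed data distribution, not of the data itself, which is exactly what a score-diffusion sampler needs at each noise level.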
4 Image generation in particular
5 Conditioning
There are lots of ways we might try to condition diffusions, differing sometimes only in emphasis. See neural denoising diffusion models with conditioning for a detailed discussion of the various approaches.
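For flavour, the mechanics of one popular option, classifier-free guidance (Ho and Salimans 2022), fit in a few lines. `eps_model` is again a hypothetical noise-prediction network, here one trained with the conditioning signal randomly dropped:

```python
def guided_eps(eps_model, x, t, cond, w=3.0):
    """Classifier-free guidance: extrapolate from the unconditional noise
    prediction toward the conditional one with guidance weight w."""
    eps_uncond = eps_model(x, t, None)  # cond=None: unconditional branch
    eps_cond = eps_model(x, t, cond)
    return eps_uncond + w * (eps_cond - eps_uncond)
```

Setting w = 1 recovers the plain conditional model; larger w trades sample diversity for fidelity to the conditioning signal.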
6 Latent
For some ideas about latent representations and their coupling with diffusion models see Multimodal AI.
6.1 Generic
7 Diffusion on weird spaces
See non-Gaussian diffusion for using diffusion models on non-Euclidean spaces.
7.1 Language
See language models for using diffusion models in NLP.
7.2 Solutions satisfying physical constraints
See PDE diffusion models for using diffusion models to generate PDE solutions.
7.3 On PD manifolds
7.4 Proteins
See work from the Baker Lab (Torres et al. 2022; Watson et al. 2022).
7.5 Shapes
8 Heavy-tailed
9 Flow matching
See neural flow matching models for a discussion of flow matching, which is a closely related approach to diffusion models.
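For contrast with the denoising-score-matching loss above, here is a sketch of the conditional flow-matching objective with a linear interpolant (rectified-flow style). The pairing of noise and data batches and the toy “model” are purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)

def cfm_loss(velocity_model, x0, x1):
    """Conditional flow matching with the linear interpolant
    x_t = (1 - t) * x0 + t * x1, whose conditional velocity is x1 - x0.
    The model regresses that velocity instead of a score."""
    t = rng.uniform(size=(x0.shape[0], 1))
    x_t = (1.0 - t) * x0 + t * x1
    pred = velocity_model(x_t, t)
    return np.mean((pred - (x1 - x0)) ** 2)

# Toy usage: independent noise and "data" batches, and a zero model.
x0 = rng.standard_normal((128, 2))                      # source: N(0, I)
x1 = 0.5 * rng.standard_normal((128, 2)) + [3.0, -2.0]  # target batch
print(cfm_loss(lambda x, t: np.zeros_like(x), x0, x1))
```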