Generative music with language+diffusion models


Language and diffusion models are one particular class of generative AI for music. For other approaches, see nn music.

Here we specifically consider music generation with diffusion models, analogous to diffusion-based image synthesis.

(Chen et al. 2020; Goel et al. 2022; Hernandez-Olivan, Hernandez-Olivan, and Beltran 2022; Kreuk, Taigman, et al. 2022; Kreuk, Synnaeve, et al. 2022; Lee and Han 2021; Pascual et al. 2022; von Platen et al. 2022)



Chen, Nanxin, Yu Zhang, Heiga Zen, Ron J. Weiss, Mohammad Norouzi, and William Chan. 2020. “WaveGrad: Estimating Gradients for Waveform Generation.” arXiv.
Copet, Jade, Felix Kreuk, Itai Gat, Tal Remez, David Kant, Gabriel Synnaeve, Yossi Adi, and Alexandre Défossez. 2023. “Simple and Controllable Music Generation.” arXiv.
Goel, Karan, Albert Gu, Chris Donahue, and Christopher Ré. 2022. “It’s Raw! Audio Generation with State-Space Models.” arXiv.
Hernandez-Olivan, Carlos, Javier Hernandez-Olivan, and Jose R. Beltran. 2022. “A Survey on Artificial Intelligence for Music Generation: Agents, Domains and Perspectives.” arXiv.
Kong, Zhifeng, Wei Ping, Jiaji Huang, Kexin Zhao, and Bryan Catanzaro. 2021. “DiffWave: A Versatile Diffusion Model for Audio Synthesis.” arXiv.
Kreuk, Felix, Gabriel Synnaeve, Adam Polyak, Uriel Singer, Alexandre Défossez, Jade Copet, Devi Parikh, Yaniv Taigman, and Yossi Adi. 2022. “AudioGen: Textually Guided Audio Generation.” arXiv.
Kreuk, Felix, Yaniv Taigman, Adam Polyak, Jade Copet, Gabriel Synnaeve, Alexandre Défossez, and Yossi Adi. 2022. “Audio Language Modeling Using Perceptually-Guided Discrete Representations.” arXiv.
Lee, Junhyeok, and Seungu Han. 2021. “NU-Wave: A Diffusion Probabilistic Model for Neural Audio Upsampling.” In Interspeech 2021, 1634–38.
Pascual, Santiago, Gautam Bhattacharya, Chunghsin Yeh, Jordi Pons, and Joan Serrà. 2022. “Full-Band General Audio Synthesis with Score-Based Diffusion.” arXiv.
Platen, Patrick von, Suraj Patil, Anton Lozhkov, Pedro Cuenca, Nathan Lambert, Kashif Rasul, Mishig Davaadorj, and Thomas Wolf. 2022. “Diffusers: State-of-the-Art Diffusion Models.” GitHub.
