Generative music with language+diffusion models



Placeholder. For now see nn_music.

Generative music and audio using diffusion models, much like the diffusion image synthesis page.
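The waveform diffusion models referenced below (e.g. WaveGrad, DiffWave) start from the standard DDPM forward process applied to raw audio samples. The following minimal sketch illustrates that forward (noising) process on a sine tone; the function names and the linear beta schedule are illustrative assumptions, not the API of any of the cited systems.

```python
import numpy as np

def linear_beta_schedule(num_steps: int, beta_min: float = 1e-4, beta_max: float = 0.02):
    """Illustrative linear noise schedule, a common DDPM default."""
    return np.linspace(beta_min, beta_max, num_steps)

def q_sample(x0, t, alpha_bar, rng):
    """Sample x_t ~ q(x_t | x_0) = N(sqrt(alpha_bar_t) * x_0, (1 - alpha_bar_t) * I)."""
    noise = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise, noise

rng = np.random.default_rng(0)
sr, dur = 16000, 0.1
x0 = np.sin(2 * np.pi * 440 * np.arange(int(sr * dur)) / sr)  # a 440 Hz tone

betas = linear_beta_schedule(1000)
alpha_bar = np.cumprod(1.0 - betas)  # cumulative signal-retention factor

x_early, _ = q_sample(x0, 10, alpha_bar, rng)    # mostly signal
x_late, _ = q_sample(x0, 999, alpha_bar, rng)    # essentially pure noise

# As t grows the signal coefficient sqrt(alpha_bar_t) shrinks toward zero,
# so late x_t is dominated by Gaussian noise; a neural network is trained
# to reverse this chain, which is the generative part this page is about.
print(np.sqrt(alpha_bar[10]), np.sqrt(alpha_bar[999]))
```

A trained model then predicts the added noise at each step and runs the chain in reverse to turn Gaussian noise into a waveform.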

References

Chen, Nanxin, Yu Zhang, Heiga Zen, Ron J. Weiss, Mohammad Norouzi, and William Chan. 2020. “WaveGrad: Estimating Gradients for Waveform Generation.” arXiv.
Goel, Karan, Albert Gu, Chris Donahue, and Christopher Ré. 2022. “It’s Raw! Audio Generation with State-Space Models.” arXiv.
Hernandez-Olivan, Carlos, Javier Hernandez-Olivan, and Jose R. Beltran. 2022. “A Survey on Artificial Intelligence for Music Generation: Agents, Domains and Perspectives.” arXiv.
Kong, Zhifeng, Wei Ping, Jiaji Huang, Kexin Zhao, and Bryan Catanzaro. 2021. “DiffWave: A Versatile Diffusion Model for Audio Synthesis.” arXiv.
Kreuk, Felix, Gabriel Synnaeve, Adam Polyak, Uriel Singer, Alexandre Défossez, Jade Copet, Devi Parikh, Yaniv Taigman, and Yossi Adi. 2022. “AudioGen: Textually Guided Audio Generation.” arXiv.
Kreuk, Felix, Yaniv Taigman, Adam Polyak, Jade Copet, Gabriel Synnaeve, Alexandre Défossez, and Yossi Adi. 2022. “Audio Language Modeling Using Perceptually-Guided Discrete Representations.” arXiv.
Lee, Junhyeok, and Seungu Han. 2021. “NU-Wave: A Diffusion Probabilistic Model for Neural Audio Upsampling.” In Interspeech 2021, 1634–38.
Pascual, Santiago, Gautam Bhattacharya, Chunghsin Yeh, Jordi Pons, and Joan Serrà. 2022. “Full-Band General Audio Synthesis with Score-Based Diffusion.” arXiv.
Platen, Patrick von, Suraj Patil, Anton Lozhkov, Pedro Cuenca, Nathan Lambert, Kashif Rasul, Mishig Davaadorj, and Thomas Wolf. 2022. “Diffusers: State-of-the-Art Diffusion Models.” GitHub.
