Generative art with language+diffusion models

September 16, 2022 — October 2, 2024

buzzword
computers are awful
generative art
machine learning
making things
music
neural nets
photon choreography
Figure 1

Generative art using modern diffusion-backed image generators. The name-brand models are DALL-E 2, Stable Diffusion, Midjourney etc., which are diffusion + transformer models.

For audio stuff, see music diffusion.

1 BYO model

Let’s say I want to download or train my own models. That probably puts me in the Hugging Face ecosystem. At the time of writing, I might use the Stable Diffusion or FLUX models. Start from 🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch.
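As a rough sketch of what "start from Diffusers" looks like in practice, the snippet below loads a pretrained pipeline and samples one image. The checkpoint name, step count, and guidance scale are illustrative assumptions, not recommendations from this notebook; the helper names `GenerationConfig`, `call_kwargs`, and `generate` are made up for the example. It assumes `torch` and `diffusers` are installed.

```python
from dataclasses import dataclass


@dataclass
class GenerationConfig:
    """Settings for one text-to-image call (all defaults are illustrative)."""
    prompt: str
    model_id: str = "stabilityai/stable-diffusion-xl-base-1.0"  # assumed checkpoint
    steps: int = 30
    guidance_scale: float = 7.0
    seed: int = 0


def call_kwargs(cfg: GenerationConfig) -> dict:
    """Keyword arguments for the pipeline call, kept separate so they are easy to inspect."""
    return {
        "prompt": cfg.prompt,
        "num_inference_steps": cfg.steps,
        "guidance_scale": cfg.guidance_scale,
    }


def generate(cfg: GenerationConfig):
    """Load the pipeline (downloading weights on first use) and return a PIL image."""
    # Heavy imports live here so the config helpers work without torch installed.
    import torch
    from diffusers import DiffusionPipeline

    device = "cuda" if torch.cuda.is_available() else "cpu"
    dtype = torch.float16 if device == "cuda" else torch.float32
    pipe = DiffusionPipeline.from_pretrained(cfg.model_id, torch_dtype=dtype).to(device)
    # A seeded generator makes the sample repeatable on a fixed setup.
    generator = torch.Generator(device=device).manual_seed(cfg.seed)
    return pipe(generator=generator, **call_kwargs(cfg)).images[0]


# Usage (downloads several GB of weights on first run):
# generate(GenerationConfig(prompt="a lighthouse made of stained glass")).save("out.png")
```

Swapping in a different model should mostly be a matter of changing `model_id`, although some pipelines expect different call arguments.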

1.1 UI

If I use the Hugging Face tooling, building a local UI is easy; it integrates well with gradio.
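A minimal sketch of such a UI, assuming `gradio` is installed. Here `generate_fn` is a hypothetical stand-in for whatever function turns a prompt and a step count into an image, and `make_handler` and `build_ui` are names invented for this example.

```python
from typing import Any, Callable


def make_handler(generate_fn: Callable[[str, int], Any]) -> Callable[[str, float], Any]:
    """Wrap an image generator with the input cleaning a UI callback needs."""
    def handler(prompt: str, steps: float):
        prompt = prompt.strip()
        if not prompt:
            raise ValueError("Prompt must not be empty")
        # Sliders may hand back floats; the generator expects an integer step count.
        return generate_fn(prompt, int(steps))
    return handler


def build_ui(generate_fn: Callable[[str, int], Any]):
    """Build a one-page gradio app around generate_fn (gradio is imported lazily)."""
    import gradio as gr

    return gr.Interface(
        fn=make_handler(generate_fn),
        inputs=[
            gr.Textbox(label="Prompt"),
            gr.Slider(minimum=1, maximum=100, value=30, step=1, label="Steps"),
        ],
        outputs=gr.Image(label="Result"),
        title="Local diffusion playground",
    )


# Usage: build_ui(my_generate).launch()  # my_generate is whatever generator I have
```

Keeping the handler separate from the UI construction means the input handling can be checked without starting a web server.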

Fancier UIs are possible, e.g.

comfyanonymous/ComfyUI: The most powerful and modular diffusion model GUI, API and backend with a graph/nodes interface.
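The "API and backend" part means a running ComfyUI instance can be driven from a script. The sketch below assumes a local server on its usual port (8188), a `/prompt` endpoint that accepts a JSON body of the form `{"prompt": workflow}`, and a workflow exported from the UI in API format; check all of these against the ComfyUI version in use. The function names are made up for illustration.

```python
import json
import urllib.request


def build_queue_request(workflow: dict, host: str = "127.0.0.1",
                        port: int = 8188) -> urllib.request.Request:
    """Build (but do not send) the POST request that queues a workflow."""
    payload = json.dumps({"prompt": workflow}).encode("utf-8")
    return urllib.request.Request(
        f"http://{host}:{port}/prompt",  # assumed endpoint and default port
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def queue_workflow(workflow: dict, **kwargs) -> dict:
    """Send the workflow to a running ComfyUI server and return its JSON reply."""
    with urllib.request.urlopen(build_queue_request(workflow, **kwargs)) as resp:
        return json.loads(resp.read())


# Usage, with a workflow previously exported in API format:
# with open("workflow_api.json") as f:
#     print(queue_workflow(json.load(f)))
```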

Mac local UIs:

Other ones I bookmarked at some stage:

Figure 2

1.2 Model customisation

1.3 Model suppliers

Hugging Face is the heavy-hitter. See also Civitai:

Civitai is a labour of love from a small team. After being inspired daily by the incredible progress of the Stable Diffusion community and the explosion of custom fine-tuned models, textual inversions, and more, we wanted to see if we could create something that would continue to help the community grow and thrive.

After seeing a gap around sharing the custom models that were being made by the community, we decided to try our hand at putting together a tool that would make it easy for anyone to share, find, and review models. While there were existing services like HuggingFace that allowed users to expose their models as repositories, we felt that it was missing a few key features that would really allow it to serve as a home for the growing community and use case:

  • A way for creators to tag models with things that make sense to the SD community

  • A good way for people interested in the model to review and share their creations

  • A simpler upload and download interface (how many of us are really familiar with code repos)

  • An indexed and visual browsing experience of all the models available

  • An API that can be used by SD tools to tap into the growing library of models, embeds, aesthetic gradients, and hyper networks available

Source: About the Project · civitai/civitai Wiki
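The API mentioned in that list can be queried with plain HTTP. The following is a sketch under assumptions: it takes the public endpoint to be `https://civitai.com/api/v1/models`, with `query`, `limit`, and `types` parameters and an `items` list in the response, which should be checked against the current Civitai API reference. The helper names are hypothetical.

```python
import json
import urllib.parse
import urllib.request
from typing import List, Optional, Tuple

API_ROOT = "https://civitai.com/api/v1/models"  # assumed public endpoint


def models_url(query: str, model_type: Optional[str] = None, limit: int = 5) -> str:
    """Build a search URL; model_type might be e.g. 'LORA' or 'Checkpoint'."""
    params = {"query": query, "limit": limit}
    if model_type:
        params["types"] = model_type
    return f"{API_ROOT}?{urllib.parse.urlencode(params)}"


def search_models(query: str, model_type: Optional[str] = None,
                  limit: int = 5) -> List[Tuple[str, str]]:
    """Return (name, type) pairs for matching models; requires network access."""
    with urllib.request.urlopen(models_url(query, model_type, limit)) as resp:
        data = json.load(resp)
    # The 'items', 'name', and 'type' field names are assumptions about the response.
    return [(item["name"], item["type"]) for item in data.get("items", [])]


# Usage: print(search_models("watercolor", model_type="LORA"))
```

The site may reject requests that lack a browser-like `User-Agent` header or an API key, so a real client may need to add those.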

2 Hosted models

Just go to a website, give someone money, and get images back. This trades away privacy for convenience.

2.1 Runway.ml

Runway.ml

a platform for creators of all kinds to use machine learning tools in intuitive ways without any coding experience. Find resources here to start creating with RunwayML quickly.

In particular, it plugs into Blender and Photoshop and allows you to use those programs as a UI for ML-backed algorithms. Nice.

2.2 Midjourney

Midjourney produces high-quality images from text prompts. Addictive in that you can get better at it, which feels like mastering a real skill.

2.3 Nightcafe

NightCafe Creator

Stable Diffusion, DALL-E 2, CLIP-Guided Diffusion, VQGAN+CLIP and Neural Style Transfer are all available on NightCafe.

2.4 Playgroundai

3 Punditry

4 Theory

5 Folk history of Stability

6 Incoming

7 References

Dhariwal, and Nichol. 2021. “Diffusion Models Beat GANs on Image Synthesis.” arXiv:2105.05233 [Cs, Stat].
Dutordoir, Saul, Ghahramani, et al. 2022. “Neural Diffusion Processes.”
Han, Zheng, and Zhou. 2022. “CARD: Classification and Regression Diffusion Models.”
Ho, Jain, and Abbeel. 2020. “Denoising Diffusion Probabilistic Models.” arXiv:2006.11239 [Cs, Stat].
Hoogeboom, Gritsenko, Bastings, et al. 2021. “Autoregressive Diffusion Models.” arXiv:2110.02037 [Cs, Stat].
Nichol, and Dhariwal. 2021. “Improved Denoising Diffusion Probabilistic Models.” In Proceedings of the 38th International Conference on Machine Learning.
Sohl-Dickstein, Weiss, Maheswaranathan, et al. 2015. “Deep Unsupervised Learning Using Nonequilibrium Thermodynamics.” arXiv:1503.03585 [Cond-Mat, q-Bio, Stat].
Song, Yang, and Ermon. 2020a. “Generative Modeling by Estimating Gradients of the Data Distribution.” In Advances In Neural Information Processing Systems.
———. 2020b. “Improved Techniques for Training Score-Based Generative Models.” In Advances In Neural Information Processing Systems.
Song, Jiaming, Meng, and Ermon. 2021. “Denoising Diffusion Implicit Models.” arXiv:2010.02502 [Cs].
von Platen, Patil, Lozhkov, et al. 2022. “Diffusers: State-of-the-Art Diffusion Models.”
Yang, Zhang, Song, et al. 2023. “Diffusion Models: A Comprehensive Survey of Methods and Applications.” ACM Computing Surveys.