Generative art with language+diffusion models
September 16, 2022 — October 2, 2024
Generative art using modern diffusion-backed image generators. The name-brand models are DALL-E 2, Stable Diffusion, Midjourney etc., which are diffusion + transformer models.
For audio stuff, see music diffusion.
1 BYO model
Let’s say I want to download or train my own models. That probably puts me in the Hugging Face ecosystem. At the time of writing, I might use the Stable Diffusion or FLUX models. Start from 🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch.
- FLUX.1 [dev] - a Hugging Face Space by black-forest-labs
- black-forest-labs/FLUX.1-dev · Hugging Face
- black-forest-labs/FLUX.1-schnell · Hugging Face
- stabilityai/stable-diffusion-xl-base-1.0 · Hugging Face
- stabilityai/stable-diffusion-3-medium-diffusers · SD3 WebUI generate by Gradio
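As a rough illustration of what "start from Diffusers" looks like in practice, here is a minimal sketch that loads one of the checkpoints listed above and renders a single image. The `GenConfig` dataclass, the `pick_device` and `generate` helpers, the prompt, and the default step count and guidance scale are all my own illustrative assumptions, not anything prescribed by the library. Half precision on GPU is also an assumption and may need changing on some hardware.

```python
from dataclasses import dataclass


@dataclass
class GenConfig:
    """Hypothetical bundle of settings; names and defaults are illustrative."""
    model_id: str = "stabilityai/stable-diffusion-xl-base-1.0"
    prompt: str = "a lighthouse in a storm, woodcut print"
    steps: int = 30
    guidance_scale: float = 7.0
    seed: int = 0


def pick_device() -> str:
    """Return 'cuda', 'mps' or 'cpu', depending on what PyTorch can see."""
    try:
        import torch
    except ImportError:
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


def generate(cfg: GenConfig):
    """Download (on first use) and run a text-to-image pipeline; returns a PIL image."""
    # Heavy imports live here so the helpers above work without them.
    import torch
    from diffusers import AutoPipelineForText2Image

    device = pick_device()
    # Assumption: fp16 on accelerators, fp32 on CPU.
    dtype = torch.float16 if device != "cpu" else torch.float32
    pipe = AutoPipelineForText2Image.from_pretrained(cfg.model_id, torch_dtype=dtype)
    pipe = pipe.to(device)
    # A seeded generator makes the sampling repeatable.
    generator = torch.Generator("cpu").manual_seed(cfg.seed)
    result = pipe(
        prompt=cfg.prompt,
        num_inference_steps=cfg.steps,
        guidance_scale=cfg.guidance_scale,
        generator=generator,
    )
    return result.images[0]
```

Calling `generate(GenConfig()).save("out.png")` should fetch several gigabytes of weights the first time and then write an image. Gated checkpoints such as FLUX.1 [dev] additionally need a Hugging Face login and licence acceptance before they download.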
1.1 UI
If I use the Hugging Face tooling, building a local UI is easy, since it integrates well with Gradio.
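To make that concrete, here is a minimal sketch of a Gradio front end wrapped around an arbitrary text-to-image function. The names `normalise_prompt` and `build_ui`, the 500-character cap, and the slider range are hypothetical choices for illustration. The `text_to_image` argument is assumed to be any callable that takes a prompt and a step count and returns something `gr.Image` can display, such as a PIL image.

```python
def normalise_prompt(text: str, max_chars: int = 500) -> str:
    """Collapse runs of whitespace and cap the length before prompting the model."""
    cleaned = " ".join(text.split())
    if not cleaned:
        raise ValueError("empty prompt")
    return cleaned[:max_chars]


def build_ui(text_to_image):
    """Wrap a (prompt, steps) -> image callable in a small Gradio app."""
    # Imported lazily so normalise_prompt is usable without Gradio installed.
    import gradio as gr

    def run(prompt, steps):
        # Sliders may hand back floats, so coerce to int for the pipeline.
        return text_to_image(normalise_prompt(prompt), int(steps))

    return gr.Interface(
        fn=run,
        inputs=[
            gr.Textbox(label="Prompt", lines=2),
            gr.Slider(minimum=1, maximum=50, value=30, step=1, label="Steps"),
        ],
        outputs=gr.Image(label="Result"),
        title="Local diffusion sketch",
    )
```

With a suitable `text_to_image` function in hand, `build_ui(text_to_image).launch()` should serve the form on localhost. Keeping the model call behind a plain callable means the same UI can front a different pipeline without changes.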
Fancier UIs are possible. Ones I bookmarked at some stage, including Mac-local options:
DiffusionBee - Stable Diffusion App for AI Art / divamgupta/diffusionbee-stable-diffusion-ui
Diffusion Bee is the easiest way to run Stable Diffusion locally on your M1 Mac. Comes with a one-click installer. No dependencies or technical knowledge needed.
invoke-ai/InvokeAI (Linux, Windows and macOS)
InvokeAI is a leading creative engine for Stable Diffusion models, empowering professionals, artists, and enthusiasts to generate and create visual media using the latest AI-driven technologies. The solution offers an industry-leading WebUI, supports terminal use through a CLI, and serves as the foundation for multiple commercial products.
AUTOMATIC1111/stable-diffusion-webui: Stable Diffusion web UI
A browser interface based on Gradio library for Stable Diffusion.
NMKD Stable Diffusion GUI - AI Image Generator by N00MKRAD
A handy GUI to run Stable Diffusion, a machine learning toolkit to generate images from text, locally on your own hardware.
It is completely uncensored and unfiltered - I am not responsible for any of the content generated with it. No data is shared/collected by me or any third party.
1.2 Model customisation
- That Pokemon diffusion post: Adventures in Finetuning Stable Diffusion.
1.3 Model suppliers
Hugging Face is the heavy-hitter. See also Civitai:
Civitai is a labour of love from a small team. After being inspired daily by the incredible progress of the Stable Diffusion community and the explosion of custom fine-tuned models, textual inversions, and more, we wanted to see if we could create something that would continue to help the community grow and thrive.
After seeing a gap around sharing the custom models that were being made by the community, we decided to try our hand at putting together a tool that would make it easy for anyone to share, find, and review models. While there were existing services like HuggingFace that allowed users to expose their models as repositories, we felt that it was missing a few key features that would really allow it to serve as a home for the growing community and use case:
A way for creators to tag models with things that make sense to the SD community
A good way for people interested in the model to review and share their creations
A simpler upload and download interface (how many of us are really familiar with code repos)
An indexed and visual browsing experience of all the models available
An API that can be used by SD tools to tap into the growing library of models, embeds, aesthetic gradients, and hyper networks available
2 Hosted models
Just go to a website, give someone money and get images back. Trade privacy for convenience.
2.1 Runway.ml
a platform for creators of all kinds to use machine learning tools in intuitive ways without any coding experience. Find resources here to start creating with RunwayML quickly.
In particular, it plugs into Blender and Photoshop and allows you to use those programs as a UI for ML-backed algorithms. Nice.
2.2 Midjourney
Midjourney produces high-quality images from text prompts. Addictive in that you can get better at it, which feels like mastering a real skill.
2.3 Nightcafe
Stable Diffusion, DALL-E 2, CLIP-Guided Diffusion, VQGAN+CLIP and Neural Style Transfer are all available on NightCafe.
2.4 Playgroundai
3 Punditry
4 Theory
5 Folk history of Stability
6 Incoming
- Reddit for AI-generated and manipulated content