Wait, Stable Diffusion has HOW many forks? The open source explosion in numbers.
Three months after Stable Diffusion's release, I counted 847 forks and derivative projects on GitHub. The rate of open source AI proliferation is unlike anything I've seen in tech.
I went down a rabbit hole this weekend.
I wanted to count how many derivative projects have spawned from Stable Diffusion since its August release. I started with GitHub's fork count. Then I started searching for repos that use Stable Diffusion as a dependency without forking. Then custom UIs. Then fine-tuned models.
Three hours later I had a spreadsheet with 847 entries. Three months. 847 derivative projects.
That number needs context.
The fork count
| Repository | GitHub stars | Forks | Created |
|-----------|-------------|-------|---------|
| CompVis/stable-diffusion (original) | 42,000+ | 6,500+ | Aug 10, 2022 |
| AUTOMATIC1111/stable-diffusion-webui | 38,000+ | 7,200+ | Aug 16, 2022 |
| Stability-AI/stablediffusion | 18,000+ | 2,100+ | Oct 2022 |
AUTOMATIC1111's web UI has more forks than the original repository (7,200+ vs. 6,500+). That's unusual. It suggests the community UI became the de facto starting point for most developers, not the official code.
But these are just direct forks. The real number is much bigger.
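If you want to re-pull the star and fork counts yourself, here is a minimal sketch. It is not the script behind the table above, just one way to do it, assuming GitHub's public REST endpoint for repositories and its `stargazers_count`, `forks_count`, and `created_at` fields. The helper names (`repo_url`, `summarize`, `fetch_repo`) are made up for illustration.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com/repos"


def repo_url(full_name: str) -> str:
    """Build the REST endpoint for a repository given as 'owner/name'."""
    return f"{API_ROOT}/{full_name}"


def summarize(payload: dict) -> dict:
    """Pick the table's columns out of a repository JSON payload."""
    return {
        "repo": payload["full_name"],
        "stars": payload["stargazers_count"],
        "forks": payload["forks_count"],
        "created": payload["created_at"][:10],  # keep only YYYY-MM-DD
    }


def fetch_repo(full_name: str) -> dict:
    """Fetch live metadata. Needs network access; unauthenticated
    requests are rate-limited by GitHub."""
    request = urllib.request.Request(
        repo_url(full_name),
        headers={"Accept": "application/vnd.github+json"},
    )
    with urllib.request.urlopen(request) as response:
        return summarize(json.load(response))
```

Calling `fetch_repo("CompVis/stable-diffusion")` returns today's numbers, so they will not match a snapshot taken on a different day. It also only covers direct forks, which is exactly the limitation noted above.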
Beyond forks: the derivative project universe
I categorized the 847 projects I found into types:
| Category | Count | Examples |
|----------|-------|---------|
| Alternative UIs | 34 | InvokeAI, DiffusionBee, Easy Diffusion |
| Fine-tuned models | 280+ | Dreamlike Diffusion, Openjourney, Anything V3 |
| Plugins/extensions | 120+ | ControlNet (later), img2img tools, upscalers |
| Integration tools | 95 | Photoshop plugins, Blender addons, Figma tools |
| Mobile/edge deployment | 28 | iOS apps, Android wrappers, Raspberry Pi builds |
| Training tools | 45 | DreamBooth scripts, LoRA trainers, textual inversion |
| Hosting/inference services | 62 | RunPod templates, Replicate models, HF Spaces |
| Wrappers/APIs | 78 | Discord bots, Slack bots, web APIs |
| Research/experiments | 105 | Video generation, 3D from 2D, animation |
The fine-tuned models category is the largest at 280+. Sites like CivitAI are becoming model marketplaces. You can find Stable Diffusion variants trained on anime, architecture, fashion, landscapes, portraits, and hundreds of other niches.
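As a quick consistency check, the category counts can be tallied and turned into shares of the total. This is a small sketch using the numbers from the table, with one assumption: the "280+" and "120+" entries are treated as exactly 280 and 120, so the shares are lower bounds for those two categories.

```python
# Counts copied from the category table; "+" entries use their lower bound.
category_counts = {
    "Alternative UIs": 34,
    "Fine-tuned models": 280,
    "Plugins/extensions": 120,
    "Integration tools": 95,
    "Mobile/edge deployment": 28,
    "Training tools": 45,
    "Hosting/inference services": 62,
    "Wrappers/APIs": 78,
    "Research/experiments": 105,
}

total = sum(category_counts.values())

# Percentage share of each category, rounded to one decimal place.
shares = {
    name: round(100 * count / total, 1)
    for name, count in category_counts.items()
}

largest = max(category_counts, key=category_counts.get)

for name, count in sorted(category_counts.items(), key=lambda kv: -kv[1]):
    print(f"{name:<28} {count:>4} {shares[name]:>5}%")
print(f"{'Total':<28} {total:>4}")
```

With these inputs the categories sum to 847, and fine-tuned models come out at roughly a third of everything in the spreadsheet.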
Speed of proliferation
This is what really caught my attention. I plotted when the derivative projects were created:
| Week (post-release) | New projects in period | Cumulative |
|---------------------|----------------------|------------|
| Week 1 (Aug 10-16) | 12 | 12 |
| Week 2 (Aug 17-23) | 38 | 50 |
| Week 3 (Aug 24-30) | 67 | 117 |
| Week 4 (Aug 31-Sep 6) | 89 | 206 |
| Week 5-8 (Sep) | 245 | 451 |
| Week 9-12 (Oct) | 218 | 669 |
| Week 13-15 (Nov 1-20) | 178 | 847 |
The single-week peak was 89 new projects in week 4 (Aug 31-Sep 6). Across weeks 5-8 the average was about 61 new projects per week, and it has only dipped slightly since: about 55 per week in October and roughly 59 per week in November so far. That's 8-9 per day.
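The per-week averages and the cumulative column can be recomputed from the table. A minimal sketch, assuming the multi-week rows cover 4, 4, and 3 weeks respectively, and that Nov 1-20 is 20 days for the per-day figure:

```python
# (label, number of weeks covered, new projects in that period)
periods = [
    ("Week 1", 1, 12),
    ("Week 2", 1, 38),
    ("Week 3", 1, 67),
    ("Week 4", 1, 89),
    ("Weeks 5-8", 4, 245),
    ("Weeks 9-12", 4, 218),
    ("Weeks 13-15", 3, 178),
]


def weekly_rates(rows):
    """Return (label, average new projects per week, running total)."""
    result = []
    cumulative = 0
    for label, weeks, new_projects in rows:
        cumulative += new_projects
        result.append((label, new_projects / weeks, cumulative))
    return result


rates = weekly_rates(periods)
for label, per_week, cumulative in rates:
    print(f"{label:<12} {per_week:6.1f} per week   cumulative {cumulative}")

# November so far: 178 projects over 20 days.
november_per_day = 178 / 20
print(f"November: {november_per_day:.1f} per day")
```

Keep in mind that the multi-week rows are averages, so a single strong week inside them would be smoothed out. They are not directly comparable to the single-week rows at the top of the table.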
For comparison, I checked a few other major open source releases:
| Project | Forks after 3 months | Derivative projects (est.) |
|---------|---------------------|---------------------------|
| Stable Diffusion (2022) | 6,500+ | 847+ projects |
| GPT-J (2021) | ~400 | ~50 projects |
| TensorFlow (2015) | ~2,000 | ~200 projects |
| React (2013) | ~800 | ~100 projects |
| Linux kernel (1991) | Different era | Not comparable |
Stable Diffusion's derivative project count in 3 months is roughly 4x my estimate for TensorFlow and roughly 17x my estimate for GPT-J over the same timeframe. And TensorFlow had Google's marketing machine behind it.
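To put the gap in multiples rather than raw counts, here is a short sketch that divides Stable Diffusion's numbers by each comparison project's. It uses the table's values as-is, dropping the "~" and "+" qualifiers, so the ratios are only as good as those estimates.

```python
# (forks after 3 months, estimated derivative projects), from the table.
projects = {
    "Stable Diffusion (2022)": (6500, 847),
    "GPT-J (2021)": (400, 50),
    "TensorFlow (2015)": (2000, 200),
    "React (2013)": (800, 100),
}

sd_forks, sd_derivatives = projects["Stable Diffusion (2022)"]

# How many times larger Stable Diffusion is on each measure.
ratios = {
    name: (sd_forks / forks, sd_derivatives / derivatives)
    for name, (forks, derivatives) in projects.items()
    if name != "Stable Diffusion (2022)"
}

for name, (fork_ratio, derivative_ratio) in ratios.items():
    print(f"{name:<20} forks x{fork_ratio:.2f}   derivatives x{derivative_ratio:.2f}")
```

The Linux kernel row is left out because the table marks it as not comparable.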
Why Stable Diffusion spread faster than other open source AI
Three factors, all visible in the data.
1. Low hardware bar. Stable Diffusion runs on consumer GPUs with 8GB VRAM. GPT-J requires at least a V100 for reasonable inference. Stable Diffusion can run on a $300 used RTX 3060. This expanded the potential contributor pool from "people with access to cloud GPUs" to "anyone with a gaming PC."
2. Visual outputs are shareable. When someone fine-tunes a language model, the results are text. When someone fine-tunes Stable Diffusion, the results are images you can post on Reddit, Twitter, and Discord. Visual outputs drive social sharing, which drives awareness, which drives more contributors.
3. Stability AI chose an open license. The CreativeML Open RAIL-M license allows commercial use, modification, and redistribution, subject to its use-based restrictions. If they'd used a non-commercial license (like some academic releases), the integration tools and hosting services categories would be nearly empty.
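The hardware gap in the first factor can be sanity-checked with back-of-envelope arithmetic on weight memory alone. This is a rough sketch, not a benchmark. The parameter counts are approximate, commonly cited figures and are my assumption, not numbers from the spreadsheet: about 0.86B (UNet) + 0.12B (text encoder) + 0.08B (VAE) for Stable Diffusion v1, and about 6B for GPT-J. It also assumes half-precision weights and ignores activations and framework overhead.

```python
BYTES_PER_PARAM_FP16 = 2  # assumption: weights stored in half precision
CONSUMER_VRAM_GB = 8      # the consumer GPU bar mentioned in factor 1

# Approximate parameter counts (assumptions, see the note above).
param_counts = {
    "Stable Diffusion v1": 0.86e9 + 0.123e9 + 0.084e9,
    "GPT-J-6B": 6.05e9,
}


def weights_gb(num_params: float) -> float:
    """Memory needed just to hold the weights, in GB (10**9 bytes)."""
    return num_params * BYTES_PER_PARAM_FP16 / 1e9


for name, params in param_counts.items():
    size = weights_gb(params)
    verdict = "fits" if size < CONSUMER_VRAM_GB else "does not fit"
    print(f"{name:<20} ~{size:.1f} GB of weights, {verdict} in {CONSUMER_VRAM_GB} GB")
```

Under these assumptions Stable Diffusion's weights leave plenty of room on an 8GB card, while GPT-J's weights alone exceed it, which is consistent with the gaming-PC versus datacenter-GPU split described above.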
The CivitAI phenomenon
I need to call out CivitAI specifically because it represents something new. CivitAI is a marketplace for Stable Diffusion model variants. Users upload fine-tuned models (typically trained with DreamBooth or textual inversion on specific aesthetic styles) and others download and use them.
As of late November 2022:
| CivitAI metric | Count |
|---------------|-------|
| Listed models | 500+ |
| Total downloads (est.) | 2M+ |
| Active uploaders | 200+ |
| Unique styles/niches | 100+ |
This is a model marketplace built by the community in three months. It didn't exist in August. Nobody planned it. It emerged because there was demand for specialized versions of Stable Diffusion and no central authority controlling distribution.
What this means
Open source AI proliferation at this speed has implications I'm still processing.
First, the cat is genuinely out of the bag. You cannot recall 847 derivative projects. Even if Stability AI changed the license tomorrow (they won't, but hypothetically), the code is forked, the models are trained, and the community is self-sustaining.
Second, the speed creates a moderation problem. Those 280+ fine-tuned models include some trained on datasets that raise ethical questions. The open nature means anyone can fine-tune on anything. The community is self-policing to some degree (CivitAI has content policies), but enforcement is spotty.
Third, the rate of improvement is community-driven now. Stability AI is one contributor among hundreds. The AUTOMATIC1111 web UI adds features faster than the official release. Community extensions and models are pushing the technology in directions no single organization would prioritize.
I've been tracking open source AI adoption for two years. Nothing I've seen comes close to this. The spreadsheet is 847 rows and growing. I'll do a 6-month check-in.
-- dataku