The GPU shortage data: who has capacity and who's lying about it
I surveyed 40 AI companies about GPU access. 78% reported 'severe constraints.' But cloud provider utilization data tells a slightly different story. Some companies have more H100s than they're admitting.
Everyone in AI talks about the GPU shortage. "We can't get H100s." "The wait time is 6 months." "NVIDIA is allocation-constrained."
I wanted to move past anecdotes and get actual numbers. So I surveyed 40 AI companies (startups, mid-size, and a few large enterprises) and cross-referenced their responses with cloud provider utilization data.
The picture is more complicated than "there aren't enough GPUs."
Survey results: 40 AI companies on GPU access
| Question | Response | Count | % |
|----------|----------|-------|---|
| GPU access status | "Severe constraints" | 31 | 78% |
| | "Some constraints" | 6 | 15% |
| | "Adequate access" | 3 | 8% |
| Wait time for new H100s | Under 1 month | 4 | 10% |
| | 1-3 months | 9 | 23% |
| | 3-6 months | 16 | 40% |
| | Over 6 months | 8 | 20% |
| | "We can't get any" | 3 | 8% |
| Primary GPU source | Cloud (AWS/GCP/Azure) | 22 | 55% |
| | Dedicated (CoreWeave, Lambda) | 10 | 25% |
| | Self-owned | 5 | 13% |
| | Multiple sources | 3 | 8% |
Source: My survey of 40 AI companies, conducted September-October 2023. Companies ranged from seed-stage to public, all with active AI workloads.
78% report "severe constraints." But what does that mean in practice? When I dug deeper:
| What "severe constraints" actually means | Count (of 31) | % |
|----------------------------------------|---------------|---|
| "Can't get enough GPUs to train our next model" | 14 | 45% |
| "Wait times for cloud instances are too long" | 9 | 29% |
| "Costs are too high, not availability" | 5 | 16% |
| "Genuinely cannot access any H100s" | 3 | 10% |
Only 10% of "severely constrained" companies literally cannot access H100s. For 16%, the problem isn't availability but pricing. They could get GPUs but can't afford them at current rates. The remaining 74% can get GPUs but not as many or as fast as they want.
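To make the arithmetic behind those percentages easy to check, here is a minimal Python sketch that recomputes them from the counts in the table above. The dictionary keys and the `share` helper are my own labels for illustration, not survey fields.

```python
# Counts from the follow-up table: what "severe constraints" means (n = 31).
# The key names are illustrative labels, not fields from the survey itself.
reasons = {
    "cannot_train_next_model": 14,
    "cloud_wait_too_long": 9,
    "cost_not_availability": 5,
    "no_h100_access": 3,
}

total = sum(reasons.values())


def share(count, total):
    """Return count as a whole-number percentage of total."""
    return round(100 * count / total)


shares = {name: share(count, total) for name, count in reasons.items()}

# Companies that can get GPUs, just not as many or as fast as they want.
partial_access = shares["cannot_train_next_model"] + shares["cloud_wait_too_long"]

for name, pct in shares.items():
    print(f"{name}: {pct}%")
print(f"can get GPUs, but not enough or not fast enough: {partial_access}%")
```

The shares are rounded before being added, which is how the 74% figure is built from 45% and 29%.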
There's a meaningful difference between "GPU shortage" and "GPU pricing is too high for our budget."
H100 pricing data
| Provider | H100 80GB price (per GPU-hour) | Availability (Oct 2023) | Min commitment |
|----------|------------------------|------------------------|----------------|
| AWS p5.48xlarge (8x H100) | ~$7.50/GPU | Wait list (2-4 weeks) | On-demand |
| Google Cloud a3-highgpu-8g | ~$7.20/GPU | Limited regions | 1 year reserved |
| Azure ND H100 v5 | ~$7.00/GPU | Wait list (3-6 weeks) | On-demand or reserved |
| CoreWeave | ~$2.78/GPU | Available (spot) | Monthly |
| Lambda Labs | ~$2.49/GPU | Available (some regions) | On-demand |
| Colocation (own hardware) | ~$0.80-$1.20/GPU (amortized) | Need to buy H100s first | Capital expense |
Sources: Provider pricing pages, conversations with customers, SemiAnalysis, October 2023.
The price spread is remarkable. AWS charges 3x what Lambda Labs charges for the same GPU. CoreWeave and Lambda have H100s available at spot or near-spot prices when the big three cloud providers have wait lists.
This suggests the "shortage" is partly an allocation problem, not a supply problem. H100s exist. They're just concentrated at certain providers, and the biggest providers (AWS, GCP, Azure) have the longest queues because demand there is highest.
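The price spread in the table is easier to see as ratios and monthly bills. The sketch below uses the approximate per-GPU prices listed above. The 8-GPU, 30-day workload is a hypothetical I picked for illustration, not something from the survey.

```python
# Approximate per-GPU hourly H100 prices from the table (USD, October 2023).
prices = {
    "AWS": 7.50,
    "Google Cloud": 7.20,
    "Azure": 7.00,
    "CoreWeave": 2.78,
    "Lambda Labs": 2.49,
}

cheapest = min(prices, key=prices.get)

# Ratio of each provider's price to the cheapest listed price.
ratios = {name: round(price / prices[cheapest], 2) for name, price in prices.items()}

# Hypothetical workload (an assumption, not survey data): 8 GPUs for 30 days.
GPU_COUNT = 8
HOURS = 24 * 30
monthly_cost = {name: round(price * GPU_COUNT * HOURS) for name, price in prices.items()}

for name in prices:
    print(f"{name}: {ratios[name]}x cheapest, ${monthly_cost[name]:,} per month")
```

On these numbers the same hypothetical workload costs about $43,200 a month at AWS list price and about $14,342 at Lambda Labs, before any reserved or negotiated discounts.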
The companies with more GPUs than they admit
Here's the part that makes some people uncomfortable. Several large AI companies that publicly discuss GPU constraints appear to have significant GPU clusters:
| Company (public info) | Reported GPU cluster size | Source |
|----------------------|--------------------------|--------|
| Meta | 21,000+ A100s, ordering H100s | Meta earnings calls, press |
| Google | Unknown but massive (TPUs + GPUs) | Google Cloud reports |
| Microsoft/OpenAI | Reportedly 10,000+ H100s | Press reporting |
| Oracle | Building "supercluster" with 16,000+ GPUs | Oracle Cloud announcements |
| CoreWeave | 14,000+ GPUs (mixed A100/H100) | CoreWeave funding announcements |
Sources: Earnings calls, press reporting, company announcements. Numbers are approximate and likely understated.
Meta has over 21,000 A100s and is actively ordering H100s. When Meta says they're "constrained," they mean something like wanting 100,000 GPUs and only being able to get 30,000 (illustrative numbers, not reported figures). That's a different kind of constraint than a startup that wants 8 GPUs and can't get any.
The demand side
Why is demand so high? I tracked what companies in my survey are using GPUs for:
| Use case | % of surveyed companies | Avg GPUs needed |
|----------|------------------------|-----------------|
| Model training (from scratch) | 15% | 128-1,024 |
| Fine-tuning existing models | 45% | 4-32 |
| Inference serving | 30% | 8-64 |
| Research/experimentation | 10% | 1-8 |
Most companies need GPUs for fine-tuning and inference, not training from scratch. The 15% training from scratch are the ones driving the mega-demand for thousands of GPUs. The other 85% (fine-tuning, inference, and research) need far fewer GPUs but still face availability issues because the training workloads are absorbing so much capacity.
NVIDIA's production data
NVIDIA's quarterly revenue tells the supply story:
| Quarter | Data center GPU revenue | YoY growth |
|---------|------------------------|-----------|
| Q3 FY2023 (Oct 2022) | $3.8B | +31% |
| Q4 FY2023 (Jan 2023) | $3.6B | +11% |
| Q1 FY2024 (Apr 2023) | $4.3B | +14% |
| Q2 FY2024 (Jul 2023) | $10.3B | +171% |
Source: NVIDIA quarterly earnings reports.
Q2 FY2024 (the quarter ending July 2023) saw data center revenue jump from $4.3B in the prior quarter to $10.3B. That's NVIDIA booking more than $10 billion of data center revenue in a single quarter. The supply is ramping. It's just not ramping fast enough to meet the demand surge.
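The table reports year-over-year growth, but the ramp is easier to see quarter over quarter. Here is a small sketch that derives those figures from the rounded revenue numbers in the table. The `qoq_growth` helper is my own name, and because the inputs are rounded to one decimal, the outputs are approximate.

```python
# NVIDIA data center revenue by fiscal quarter, in billions of USD (from the table).
revenue = [
    ("Q3 FY2023", 3.8),
    ("Q4 FY2023", 3.6),
    ("Q1 FY2024", 4.3),
    ("Q2 FY2024", 10.3),
]


def qoq_growth(series):
    """Return (quarter, percent change vs. the previous quarter) pairs."""
    out = []
    for (_, prev), (label, cur) in zip(series, series[1:]):
        out.append((label, round(100 * (cur - prev) / prev)))
    return out


growth = qoq_growth(revenue)

for label, pct in growth:
    print(f"{label}: {pct:+d}% vs. previous quarter")
```

On these inputs the sequence is roughly -5%, +19%, then +140%, so nearly all of the ramp lands in the final quarter.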
My take
The "GPU shortage" is real but overstated. Here's the more accurate picture:
- H100s are available if you look beyond the big three cloud providers. CoreWeave, Lambda Labs, and smaller providers have capacity.
- The shortage is really a pricing problem for many companies. H100s at $7/hour (AWS) are "unavailable" for most budgets. The same H100 at $2.50/hour (Lambda) is available today.
- Large companies are hoarding. Meta, Microsoft, and Google are buying H100s in the tens of thousands. This creates a squeeze on the open market, but it also means the "shortage" is partly a concentration problem.
- NVIDIA is shipping like crazy. $10.3B in one quarter. The supply is coming. By mid-2024, I expect spot H100 prices to drop 30-40% as supply catches up.
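For a sense of what a 30-40% drop would mean in dollars, this sketch applies that forecast range to the spot and near-spot prices from the pricing table. It is only arithmetic on my own prediction, not a pricing model, and `projected_range` is an illustrative helper.

```python
# Spot or near-spot H100 prices from the pricing table (USD per GPU-hour).
spot_prices = {"CoreWeave": 2.78, "Lambda Labs": 2.49}

# The forecast range from the text: a 30-40% drop by mid-2024.
DROP_LOW, DROP_HIGH = 0.30, 0.40


def projected_range(price, drop_low=DROP_LOW, drop_high=DROP_HIGH):
    """Return (lowest, highest) projected price after the forecast drop."""
    return (round(price * (1 - drop_high), 2), round(price * (1 - drop_low), 2))


projections = {name: projected_range(price) for name, price in spot_prices.items()}

for name, (low, high) in projections.items():
    print(f"{name}: ${low:.2f} to ${high:.2f} per GPU-hour")
```

If the forecast holds, that puts spot H100s somewhere between about $1.49 and $1.95 per GPU-hour.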
The real question isn't "are there enough GPUs?" It's "are there enough affordable GPUs for the companies that need them?" And the answer to that is: not yet, but soon.
-- dataku