Pricing Watch · January 22, 2024 · 6 min read

Every LLM API price drop in the last 12 months, in one chart

I logged every API price change since January 2023. There have been 23 price drops across 8 providers. The average price of a million output tokens fell 78%. I've never seen deflation this fast in tech.

I keep a spreadsheet. You know this about me by now.

This particular spreadsheet tracks every public LLM API price change from every major provider. I started it in January 2023 when there were three providers worth tracking. Now there are eight. The spreadsheet has 23 rows of price drops and exactly zero price increases.

Let me show you what 12 months of AI deflation looks like.

The complete price drop timeline

| Date | Provider | Model | Change | New price ($/M output tokens) |
|------|----------|-------|--------|-------------------------------|
| Jan 2023 | OpenAI | ChatGPT API launch | New product | $2.00 |
| Mar 2023 | OpenAI | GPT-4 launch | New product | $60.00 |
| Apr 2023 | Anthropic | Claude v1 API launch | New product | $32.68 |
| Jun 2023 | OpenAI | GPT-3.5-turbo | -25% | $1.50 |
| Jul 2023 | Anthropic | Claude 2 launch | -27% vs Claude 1 | $24.00 |
| Jul 2023 | Cohere | Command | -40% | $15.00 |
| Aug 2023 | AI21 Labs | Jurassic-2 Ultra | -30% | $12.00 |
| Sep 2023 | OpenAI | GPT-3.5-turbo (fine-tune) | New product | $6.00 |
| Oct 2023 | Anthropic | Claude 2.1 | -20% | $24.00 |
| Nov 2023 | OpenAI | GPT-4 Turbo | -67% vs GPT-4 | $20.00 |
| Nov 2023 | OpenAI | GPT-3.5-turbo-1106 | -50% | $1.00 |
| Nov 2023 | Google Cloud | Gemini Pro | Launch price | $0.50 |
| Dec 2023 | Mistral AI | Mistral Medium | Launch price | $8.10 |
| Dec 2023 | Mistral AI | Mistral Small | Launch price | $0.60 |

Sources: Official pricing pages, announcement blog posts, API documentation. All prices are per million output tokens for the standard tier.

That's just 2023. Every single entry is either a price cut or a new product launch. Not one is a price increase.

The deflation curve

Here's the number that matters: the lowest available price for "GPT-3.5 quality" output over time.

| Month | Best price for GPT-3.5-tier quality ($/M output) | Provider | Drop from prior |
|-------|--------------------------------------------------|----------|----------------|
| Jan 2023 | $2.00 | OpenAI GPT-3.5-turbo | Baseline |
| Jun 2023 | $1.50 | OpenAI GPT-3.5-turbo | -25% |
| Nov 2023 | $1.00 | OpenAI GPT-3.5-turbo-1106 | -33% |
| Dec 2023 | $0.60 | Mistral AI Mistral Small | -40% |
| Jan 2024 | $0.44 | Self-hosted Mixtral 8x7B (via providers) | -27% |

Sources: Provider pricing pages, my calculations for hosted open source inference.

From $2.00 to $0.44 in 12 months. That's a 78% decline. In one year.
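If you want to check the arithmetic, here is a minimal Python sketch that recomputes the "Drop from prior" column and the headline figure from the GPT-3.5-tier prices. The `prices` list and the `pct_change` helper are names made up for this illustration; they are not taken from the spreadsheet itself.

```python
# Best GPT-3.5-tier price per million output tokens, copied from the table above.
prices = [
    ("Jan 2023", 2.00),
    ("Jun 2023", 1.50),
    ("Nov 2023", 1.00),
    ("Dec 2023", 0.60),
    ("Jan 2024", 0.44),
]


def pct_change(old: float, new: float) -> float:
    """Percentage change from old to new (negative means a price drop)."""
    return (new - old) / old * 100


# Drop relative to the previous row, rounded to a whole percent.
step_drops = [
    (month, round(pct_change(prev_price, price)))
    for (_, prev_price), (month, price) in zip(prices, prices[1:])
]

# Drop from the first row to the last row.
total_drop = round(pct_change(prices[0][1], prices[-1][1]))

for month, drop in step_drops:
    print(f"{month}: {drop}%")
print(f"Jan 2023 to Jan 2024: {total_drop}%")
```

Note that the individual drops compound rather than add up: each percentage is taken against the previous (already lower) price, which is why four drops of 25% to 40% produce a 78% total and not something over 100%.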

And for GPT-4-tier quality:

| Month | Best price for GPT-4-tier quality ($/M output) | Provider | Drop from prior |
|-------|------------------------------------------------|----------|----------------|
| Mar 2023 | $60.00 | OpenAI GPT-4 | Baseline |
| Jul 2023 | $60.00 | OpenAI GPT-4 (no change) | 0% |
| Nov 2023 | $20.00 | OpenAI GPT-4 Turbo | -67% |
| Jan 2024 | $20.00 | OpenAI GPT-4 Turbo (still) | 0% |

GPT-4-tier pricing has been stickier. Only one significant drop (GPT-4 Turbo in November). But I think 2024 is going to change that fast. Open source models are approaching GPT-4 quality, and when they arrive, the same deflationary pressure will hit the premium tier.

How this compares to other tech deflation

I got curious whether this rate of price decline has any precedent. So I looked at other technology pricing curves:

| Technology | Time period | Price decline | Decline per year |
|-----------|------------|--------------|-----------------|
| Cloud storage (per GB) | 2010-2020 | -95% | ~26%/yr |
| LCD displays (per inch) | 2000-2010 | -90% | ~21%/yr |
| DNA sequencing (per genome) | 2007-2017 | -99.99% | ~58%/yr |
| LLM API (per M tokens, GPT-3.5 tier) | Jan 2023 - Jan 2024 | -78% | 78%/yr |
| Internet bandwidth (per Mbps) | 2000-2010 | -80% | ~15%/yr |

Sources: Our World in Data, Statista, NHGRI Genome Sequencing Costs database, my LLM pricing data.
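A note on the "Decline per year" column: the sketch below assumes it is a compound annual rate, meaning the constant yearly percentage drop that produces the total decline over the period. The post doesn't spell out the method, so treat that as an assumption. The `annualized_decline` function is a hypothetical helper written for this illustration.

```python
def annualized_decline(total_decline_pct: float, years: float) -> float:
    """Constant yearly decline (in %) that compounds to total_decline_pct over `years`."""
    remaining = 1 - total_decline_pct / 100  # fraction of the original price left
    return (1 - remaining ** (1 / years)) * 100


# (technology, total decline in %, period length in years), taken from the table above.
rows = [
    ("Cloud storage (per GB)", 95, 10),
    ("LCD displays (per inch)", 90, 10),
    ("Internet bandwidth (per Mbps)", 80, 10),
    ("LLM API (GPT-3.5 tier)", 78, 1),
]

rates = {name: annualized_decline(total, years) for name, total, years in rows}

for name, rate in rates.items():
    print(f"{name}: ~{rate:.0f}%/yr")
```

Under that assumption, the rounded results match the table for these four rows. The DNA sequencing row is left out on purpose: when the total decline is that close to 100%, the annualized figure swings by several points depending on how the total was rounded, so it can't be reproduced reliably from the "-99.99%" shown here.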

Only DNA sequencing during the post-Human-Genome-Project era showed comparable annual price deflation. LLM API pricing is declining at a rate that puts it in the same category as the most dramatic technology cost curves in history.

I did NOT expect to write that sentence when I started this spreadsheet a year ago.

What's driving the drops

Three forces, each reinforcing the others:

1. Competition. In January 2023, OpenAI had essentially a monopoly on quality LLM APIs. By January 2024, there are 8+ providers offering competitive products. More providers means price pressure.

2. Open source. Mixtral 8x7B set a price floor. If anyone can run a GPT-3.5-quality model for $0.18/M tokens self-hosted, API providers can't charge $2.00 for the same quality forever.

3. Efficiency improvements. Better serving frameworks (vLLM, TensorRT-LLM), quantization, and MoE architectures all reduce the cost of serving a given quality level. The hardware cost per token is dropping independently of pricing strategy.

My 2024 price predictions

Based on the trajectory:

  • GPT-3.5-tier quality will hit $0.10/M output tokens by July 2024 (currently $0.44)
  • GPT-4-tier quality will drop below $5/M output tokens by October 2024 (currently $20)
  • At least one provider will offer a "free tier" with meaningful limits for GPT-3.5-quality inference
  • The total addressable market for LLM APIs will grow 5-10x because lower prices make viable use cases that weren't possible at $2+/M tokens
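
For a sense of how aggressive the first two predictions are, here is a rough sketch that converts them into a constant monthly rate of decline and compares that with last year's pace. It assumes a smooth, constant-rate decline (real prices move in steps) and counts months from January 2024. `monthly_decline` is a hypothetical helper written for this illustration.

```python
def monthly_decline(start: float, end: float, months: int) -> float:
    """Constant monthly decline (in %) that takes a price from start to end."""
    return (1 - (end / start) ** (1 / months)) * 100


# Observed pace over the past year (GPT-3.5 tier, $2.00 down to $0.44).
observed = monthly_decline(2.00, 0.44, 12)

# Pace each prediction would require, counting months from January 2024.
needed_gpt35 = monthly_decline(0.44, 0.10, 6)  # $0.10 by July 2024
needed_gpt4 = monthly_decline(20.00, 5.00, 9)  # $5 by October 2024

print(f"Observed GPT-3.5-tier pace: {observed:.1f}%/month")
print(f"Needed for $0.10 by July:   {needed_gpt35:.1f}%/month")
print(f"Needed for $5 by October:   {needed_gpt4:.1f}%/month")
```

Under these assumptions, last year's GPT-3.5-tier pace works out to roughly 12% per month, while the $0.10 prediction needs roughly 22% per month. In other words, that prediction bets on the decline speeding up, not just continuing.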

I'll score these in December.

The spreadsheet gets a new row almost every week now. My favorite row is still the first one. $2.00 per million tokens felt expensive at the time. Looking back from $0.44, it feels like a different era.

That era was 12 months ago.


-- dataku
