GPT-4 Turbo is 3x cheaper. Here's what that means for the API pricing war.

OpenAI just announced GPT-4 Turbo at DevDay, and the price drop is aggressive.

GPT-4 input went from $0.03/1K tokens to $0.01/1K tokens. Output from $0.06 to $0.03. That's a 3x reduction on input and 2x on output. Plus 128K context. Plus a knowledge cutoff of April 2023 instead of September 2021.

Let me update the pricing table I've been maintaining all year.

The updated master pricing table (November 2023)

| Provider | Model | Input ($/1K tokens) | Output ($/1K tokens) | Context | Notes | |----------|-------|---------------------|----------------------|---------|-------| | OpenAI | GPT-4 Turbo | $0.010 | $0.030 | 128K | New | | OpenAI | GPT-4 (original) | $0.030 | $0.060 | 8K | Still available | | OpenAI | GPT-4 32K | $0.060 | $0.120 | 32K | Still available | | OpenAI | GPT-3.5-turbo | $0.001 | $0.002 | 16K | Price dropped | | Anthropic | Claude 2.1 | $0.008 | $0.024 | 200K | Extended context | | Anthropic | Claude Instant | $0.0008 | $0.0024 | 100K | Cheapest Claude | | Google | PaLM 2 (text-bison) | $0.00025 | $0.0005 | 8K | Cheapest major API | | Mistral AI | Mistral 7B (hosted) | ~$0.0002 | ~$0.0002 | 8K | Via third parties | | Open source (self-hosted) | Llama 2 70B | ~$0.0004 | ~$0.0004 | 4K | Lambda Labs pricing |

Sources: Official pricing pages from each provider, November 2023.

The price drops in 2023, in order

This is the chart I'm most proud of. Every significant API price change in 2023:

| Date | Provider | Model | Old price (output/1K) | New price (output/1K) | Drop | |------|----------|-------|-----------------------|-----------------------|------| | Jan | OpenAI | GPT-3.5-turbo launch | N/A | $0.002 | New (10x cheaper than davinci) | | Mar | OpenAI | GPT-4 launch | N/A | $0.060 | New | | Jun | OpenAI | GPT-3.5-turbo-16K | N/A | $0.004 | New (2x of 4K for 4x context) | | Jul | Anthropic | Claude 2 | $0.024 | $0.024 | No change (but 100K context) | | Aug | OpenAI | GPT-3.5-turbo fine-tuning | N/A | $0.012 (training) | New capability | | Nov | OpenAI | GPT-4 Turbo | $0.060 | $0.030 | -50% | | Nov | OpenAI | GPT-3.5-turbo | $0.002 | $0.002 | No change (input halved to $0.001) |

The trend is unmistakably downward. And the biggest drop happened at the frontier: GPT-4 went from $0.06 output to $0.03, while getting a 16x bigger context window (8K to 128K). More capability for less money.

What GPT-4 Turbo means for the competitive picture

Let me show you the cost per million output tokens for comparable-quality models, before and after DevDay:

| Quality tier | Before DevDay | After DevDay | Change | |-------------|--------------|-------------|--------| | GPT-4 level | $60.00/M tokens | $30.00/M tokens | -50% | | GPT-3.5 level (OpenAI) | $2.00/M tokens | $2.00/M tokens | No change | | GPT-3.5 level (open source hosted) | $0.40-0.90/M tokens | $0.40-0.90/M tokens | No change |

The GPT-4 tier halved in price, but the GPT-3.5 tier stayed flat. This compresses the price-to-quality ratio significantly.

Before DevDay, GPT-4 cost 30x more than GPT-3.5-turbo. After DevDay, it costs 15x more. For many applications where GPT-4 quality is needed, the math just got much more favorable.

The 128K context changes the economics too

GPT-4 Turbo with 128K context is actually cheaper than the old GPT-4 32K:

| Model | Cost for a 30K token prompt + 2K response | |-------|------------------------------------------| | GPT-4 32K (old) | $1.80 input + $0.24 output = $2.04 | | GPT-4 Turbo 128K (new) | $0.30 input + $0.06 output = $0.36 |

That's 5.7x cheaper for the same job, with 4x more available context. If you were using GPT-4 32K for long document tasks, the upgrade to GPT-4 Turbo is essentially a mandatory switch.

For the Anthropic comparison, Claude 2.1 with 200K context:

| Model | Cost for 100K token prompt + 2K response | |-------|------------------------------------------| | Claude 2.1 | $0.80 input + $0.048 output = $0.85 | | GPT-4 Turbo | $1.00 input + $0.06 output = $1.06 |

Claude 2.1 is still cheaper for very long context tasks (~20% cheaper at 100K tokens). But the gap has shrunk dramatically. And GPT-4 Turbo can now handle 128K tokens, closing most of Anthropic's context window advantage.

The open source cost gap is narrowing

This is the trend that matters most. The cost premium for using proprietary APIs vs. self-hosting open source:

| Comparison | Cost ratio (Jan 2023) | Cost ratio (Nov 2023) | |-----------|----------------------|-----------------------| | GPT-3.5 vs self-hosted Llama 2 | 5x | 5x | | GPT-4 vs self-hosted Llama 2 | 150x | 75x | | GPT-4 Turbo vs hosted Llama 2 (Together AI) | N/A | 33x |

GPT-4 Turbo at $30/M output tokens vs Llama 2 on Together AI at $0.90/M is still a 33x gap. But before DevDay it was 67x. And the quality gap between GPT-4 and Llama 2 70B is real, so some of that premium is justified.

The real question is whether Llama 2 (or Mistral, or the next open source model) can close the quality gap to within 90% of GPT-4 Turbo. If it does, the 33x cost premium becomes very hard to defend.

My updated pricing predictions

For year-end 2024:

| Prediction | Confidence | |-----------|------------| | GPT-4 quality will cost under $10/M output tokens | 80% | | Open source models matching GPT-4 quality on most tasks | 65% | | At least 5 providers offering GPT-4 quality APIs | 85% | | Someone offers GPT-3.5 quality for under $0.10/M tokens | 90% |

The pricing war is accelerating. DevDay was OpenAI's first real price cut at the frontier tier, and it won't be the last. When Anthropic, Google, and Mistral respond (which they will), prices come down further.

Good time to be building on LLMs. The cost curve is your friend.

If you found this interesting, you might also like:

-- dataku

GPT-4 Turbo is 3x cheaper. Here's what that means for the API pricing war.

The updated master pricing table (November 2023)

The price drops in 2023, in order

What GPT-4 Turbo means for the competitive picture

The 128K context changes the economics too

The open source cost gap is narrowing

My updated pricing predictions

More from dataku

The inference cost collapse, in one chart

The AI API price tracker: 5 years of data in one interactive chart

Every AI pricing change in Q4 2025, tracked