The Claude 3 model family pricing is actually brilliant. Here's why.
Haiku at $0.25/M tokens, Sonnet at $3, Opus at $15. Anthropic isn't just pricing models, they're pricing use cases. I compared the price-to-quality ratio across all three and the tiering makes perfect economic sense.
Anthropic didn't just release one model. They released three, and the pricing structure is the most interesting thing about the launch.
Haiku. Sonnet. Opus. Small, medium, large. Fast, balanced, best. $0.25, $3, $15 per million input tokens.
Most people looked at this and saw "cheap, medium, expensive." I looked at it and saw one of the smartest pricing strategies in AI.
The price-quality matrix
Let me lay it out:
| Model | Input $/M tokens | Output $/M tokens | MMLU | Speed (tokens/sec) | Context window | |-------|------------------|--------------------|------|--------------------|----| | Claude 3 Haiku | $0.25 | $1.25 | 75.2% | ~180 | 200K | | Claude 3 Sonnet | $3.00 | $15.00 | 79.0% | ~120 | 200K | | Claude 3 Opus | $15.00 | $75.00 | 86.8% | ~40 | 200K |
Sources: Anthropic pricing page, Anthropic Claude 3 technical report, my speed measurements.
The price jumps are dramatic. Sonnet costs 12x more than Haiku. Opus costs 5x more than Sonnet (60x more than Haiku). But the quality differences are not proportional to the price differences.
That's the whole point.
The quality-per-dollar analysis
Here's the math I ran. I used my standard 300-prompt evaluation on all three models, then calculated quality per dollar:
| Model | My overall score (1-5) | Output $/M tokens | Score per dollar spent | |-------|----------------------|--------------------|-----------------------| | Claude 3 Haiku | 3.41 | $1.25 | 2.728 | | Claude 3 Sonnet | 3.89 | $15.00 | 0.259 | | Claude 3 Opus | 4.22 | $75.00 | 0.056 | | GPT-4 Turbo | 4.07 | $30.00 | 0.136 | | GPT-3.5-turbo | 3.28 | $1.50 | 2.187 | | Mixtral 8x7B | 3.25 | $0.60 | 5.417 |
Source: My evaluation, 300 prompts each, blind rating, March 2024.
Haiku delivers 2.728 quality points per dollar. Opus delivers 0.056. That's a 49x difference in value efficiency.
But nobody buys Opus for value efficiency. They buy it for the last 0.81 points of quality (4.22 vs 3.41) that Haiku can't deliver. The premium is for the ceiling, not the floor.
The use case mapping (this is the clever part)
Anthropic isn't pricing models. They're pricing use cases. And the tiers map almost perfectly:
| Use case | Ideal tier | Why | Monthly cost at 10M tokens | |----------|-----------|-----|---------------------------| | Classification/routing | Haiku ($0.25 input) | Speed matters, quality threshold is low | $12.50 | | Customer support chat | Haiku or Sonnet | High volume, good-enough quality | $12.50 - $150 | | Content summarization | Sonnet ($3 input) | Needs understanding, not perfection | $150 | | Code generation | Sonnet or Opus | Accuracy matters for code | $150 - $750 | | Legal document analysis | Opus ($15 input) | Mistakes are expensive, quality is paramount | $750 | | Research synthesis | Opus ($15 input) | Needs deep reasoning and nuance | $750 | | Data extraction from forms | Haiku ($0.25 input) | Structured output, high volume | $12.50 | | Creative writing | Opus ($15 input) | Diminishing returns aren't an issue for quality | $750 |
I expected the tiers to feel like "budget, normal, premium." But when you map them to actual workloads, the pricing forces you to think about what quality level your task actually needs. Most tasks don't need Opus. The ones that do are exactly the ones where users will pay 60x more without blinking.
How Anthropic's tiering compares to OpenAI
OpenAI has a similar multi-tier approach, but the execution is different:
| Tier | Anthropic | OpenAI | Price ratio | |------|----------|--------|-------------| | Budget | Haiku ($0.25/$1.25) | GPT-3.5-turbo ($0.50/$1.50) | Haiku is 50% cheaper | | Mid | Sonnet ($3/$15) | GPT-4 Turbo ($10/$30) | Sonnet is 50-70% cheaper | | Premium | Opus ($15/$75) | GPT-4 ($30/$60) | Mixed (Opus input cheaper, output more expensive) |
Sources: Anthropic and OpenAI pricing pages, March 2024.
Anthropic is more aggressive on the low and mid tiers. Haiku at $0.25 input undercuts GPT-3.5-turbo by half. Sonnet at $3 input is significantly cheaper than GPT-4 Turbo at $10.
The Opus output pricing ($75/M tokens) is the most expensive in the industry. But it's positioned as a premium product where the buyer cares about quality, not cost. Anthropic is essentially saying: "If you need the best, here it is, and we'll charge accordingly."
The hidden strategy: Haiku as the gateway drug
I think Haiku is the most strategically important model in the lineup. Here's why.
At $0.25 per million input tokens, Haiku is cheap enough that developers will use it for tasks they currently handle with simple rules or regex. Classification, routing, basic extraction. These tasks don't need GPT-4. They barely need an LLM at all. But at $0.25/M tokens, the cost of using an LLM is so low that "why not?" becomes the answer.
Once developers build on Haiku, upgrading to Sonnet or Opus for harder tasks is a trivial code change (swap the model name in the API call). Same API. Same format. Same context window. The switching cost is literally one string.
| Developer journey | Model | Cost | Switching cost | |-------------------|-------|------|---------------| | Step 1: "Let me try Claude for this simple task" | Haiku | $12.50/month | Free | | Step 2: "This task needs better quality" | Sonnet | $150/month | Change one parameter | | Step 3: "This critical task needs the best" | Opus | $750/month | Change one parameter | | Step 4: "I'll use all three for different tasks" | All three | $912.50/month | Already done |
That's a textbook land-and-expand strategy. Get developers in with Haiku's price point, let them naturally discover use cases for the higher tiers. The 200K context window being the same across all three tiers removes the common "but the cheap model has limitations" objection.
One thing I didn't expect
I tested all three models on the same creative writing prompt: "Write a short story about a data analyst who discovers something unexpected in a spreadsheet."
Haiku gave me a competent 400-word story. Fine. Sonnet gave me a good 600-word story with better characterization. Opus gave me a 1,200-word story that made me feel something.
The quality difference between Haiku and Opus isn't just "more correct." It's "more human." The mid-tier Sonnet splits the difference. For tasks where "more human" matters (writing, advice, creative work), the premium is worth paying. For tasks where correct is correct (classification, extraction, formatting), Haiku is all you need.
That's the insight Anthropic built their pricing around, and I think it's exactly right.
If you found this interesting, you might also like:
- Wait, GPT-3 costs HOW much per token?
- Codex and the cost of code generation: my first pricing analysis
- The cost of running an AI startup in 2022: a data breakdown
- Stable Diffusion is free. The pricing math of open source image generation.
- The LLM pricing war just started. Here's every provider's cost per token.
-- dataku