How much does it cost to run a chatbot with 1M daily users? I did the math.
1 million daily users, 5 messages each, an average of 150 input tokens and 300 output tokens per message. At Claude 4 Sonnet pricing, that's about $24,750/day. At GPT-4o mini, it's about $1,013/day. I modeled the economics for 6 different API models, plus a self-hosted option.
I keep seeing startups pitch "AI chatbot for X" without knowing what it actually costs at scale. So I did the math.
Assumptions
| Parameter | Value | Rationale |
|-----------|-------|-----------|
| Daily active users | 1,000,000 | Mid-size consumer app |
| Messages per user per day | 5 | Industry average for chatbots |
| Avg input tokens per message | 150 | Short to medium queries |
| Avg output tokens per message | 300 | ~225 words per response |
| Total daily messages | 5,000,000 | 1M x 5 |
| Total daily input tokens | 750M | 5M messages x 150 |
| Total daily output tokens | 1,500M | 5M messages x 300 |
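If you want to change the assumptions, the volume rows of the table can be reproduced in a few lines. This is a minimal sketch; the function and constant names are mine, and it treats every message as an independent 150-in / 300-out exchange, exactly as the table does.

```python
# Traffic assumptions copied from the table above.
DAILY_USERS = 1_000_000
MESSAGES_PER_USER = 5
INPUT_TOKENS_PER_MESSAGE = 150
OUTPUT_TOKENS_PER_MESSAGE = 300


def daily_token_volume(users, messages_per_user, input_tokens, output_tokens):
    """Return (messages, input tokens, output tokens) per day."""
    messages = users * messages_per_user
    return messages, messages * input_tokens, messages * output_tokens


messages, input_total, output_total = daily_token_volume(
    DAILY_USERS, MESSAGES_PER_USER, INPUT_TOKENS_PER_MESSAGE, OUTPUT_TOKENS_PER_MESSAGE
)
print(f"{messages:,} messages/day")
print(f"{input_total / 1e6:,.0f}M input tokens/day")
print(f"{output_total / 1e6:,.0f}M output tokens/day")
```

Everything downstream scales linearly with these three numbers, so this is the place to plug in your own traffic.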
The cost per model tier
| Model | Input/M | Output/M | Daily input cost | Daily output cost | Daily total | Monthly |
|-------|---------|----------|------------------|-------------------|-------------|---------|
| Claude Opus 4 | $15.00 | $75.00 | $11,250 | $112,500 | $123,750 | $3.7M |
| Claude 4 Sonnet | $3.00 | $15.00 | $2,250 | $22,500 | $24,750 | $742K |
| GPT-4o | $2.50 | $10.00 | $1,875 | $15,000 | $16,875 | $506K |
| Gemini 2.5 Pro | $1.25 | $10.00 | $938 | $15,000 | $15,938 | $478K |
| GPT-4o mini | $0.15 | $0.60 | $113 | $900 | $1,013 | $30K |
| Gemini 2.5 Flash | $0.15 | $0.60 | $113 | $900 | $1,013 | $30K |
Sources: Anthropic, OpenAI, Google, Together AI.
The range: $30K per month (Gemini Flash or GPT-4o mini) to $3.7M per month (Claude Opus 4).
That's roughly a 122x cost difference ($123,750 vs. $1,013 per day) for the same user base and the same conversation volume.
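Here is a rough sketch of the arithmetic behind the table, in case you want to swap in current prices. The prices are copied from the table; the names are mine, and the 30-day month is the multiplier that reproduces the monthly column.

```python
# Prices in USD per million tokens (input, output), copied from the table above.
PRICES = {
    "Claude Opus 4": (15.00, 75.00),
    "Claude 4 Sonnet": (3.00, 15.00),
    "GPT-4o": (2.50, 10.00),
    "Gemini 2.5 Pro": (1.25, 10.00),
    "GPT-4o mini": (0.15, 0.60),
    "Gemini 2.5 Flash": (0.15, 0.60),
}

INPUT_MTOK_PER_DAY = 750     # millions of input tokens per day
OUTPUT_MTOK_PER_DAY = 1_500  # millions of output tokens per day
DAYS_PER_MONTH = 30          # reproduces the monthly column


def daily_cost(input_price, output_price):
    """Daily API bill in USD for the assumed token volume."""
    return INPUT_MTOK_PER_DAY * input_price + OUTPUT_MTOK_PER_DAY * output_price


costs = {model: daily_cost(*price) for model, price in PRICES.items()}
for model, cost in costs.items():
    print(f"{model:<18} ${cost:>11,.2f}/day  ${cost * DAYS_PER_MONTH:>12,.0f}/month")

# Ratio between the most and least expensive model.
spread = max(costs.values()) / min(costs.values())
print(f"Most expensive vs. cheapest: {spread:.0f}x")
```

Output tokens dominate every row because there are twice as many of them and they cost 4-8x more per token, so response length is the first thing to look at if you need to cut the bill.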
Self-hosted alternative
What if you run open source models instead?
| Setup | Hardware | Monthly hardware cost | Model | Effective cost/M tokens |
|-------|----------|-----------------------|-------|-------------------------|
| 8x A100 cluster | Rented (AWS) | $24,000 | Llama 4 Maverick | ~$0.08 output |
| 8x H100 cluster | Rented (Lambda) | $18,000 | Llama 4 Maverick | ~$0.06 output |
| 8x A100 (owned) | Purchased ($120K) | ~$3,400 (depreciation + power) | Llama 4 Maverick | ~$0.02 output |

| Self-hosted option | Daily cost (1M users) | Monthly cost |
|--------------------|-----------------------|--------------|
| Rented A100s | ~$270 | ~$8,100 |
| Rented H100s | ~$200 | ~$6,000 |
| Owned hardware | ~$68 | ~$2,040 |
Sources: AWS, Lambda Labs, hardware pricing estimates.
Self-hosting Llama 4 Maverick on rented H100s costs about $6,000/month for 1M daily users. That's about 5x cheaper than the GPT-4o mini API and about 84x cheaper than the GPT-4o API.
The catch: you need an ops team to maintain the infrastructure. At 1M users, you need redundancy, load balancing, and on-call engineers. The "hidden costs" of self-hosting add $5-15K/month in engineering time.
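To see how much of the advantage survives the hidden costs, here is a quick sketch that adds the $5-15K engineering overhead to the rented-H100 figure and compares it with the economy API. All inputs come from the numbers above; the names are mine.

```python
RENTED_H100_MONTHLY = 6_000           # from the self-hosted table
OPS_OVERHEAD_RANGE = (5_000, 15_000)  # engineering time, low and high estimate
GPT_4O_MINI_MONTHLY = 30_375          # economy API tier, 30-day month


def all_in_range(base, overhead_range):
    """Return (low, high) monthly cost once ops overhead is included."""
    low, high = overhead_range
    return base + low, base + high


low, high = all_in_range(RENTED_H100_MONTHLY, OPS_OVERHEAD_RANGE)
print(f"All-in self-hosting: ${low:,} to ${high:,} per month")
print(
    f"GPT-4o mini costs {GPT_4O_MINI_MONTHLY / high:.1f}x to "
    f"{GPT_4O_MINI_MONTHLY / low:.1f}x as much"
)
```

Under these assumptions self-hosting still wins against GPT-4o mini, but the 5x gap shrinks to somewhere between 1.4x and 2.8x.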
The cost per user per month
| Model tier | Monthly cost (1M users) | Cost per user per month |
|------------|-------------------------|-------------------------|
| Flagship (Claude Opus 4) | $3,712,500 | $3.71 |
| Premium (Claude 4 Sonnet) | $742,500 | $0.74 |
| Standard (GPT-4o) | $506,250 | $0.51 |
| Economy (GPT-4o mini) | $30,375 | $0.03 |
| Self-hosted (Llama 4 Maverick) | ~$8,000 | $0.008 |
At the economy tier, it costs 3 cents per user per month. At the flagship tier, $3.71. For a subscription product charging $10-20/month, the economy models leave plenty of margin. The flagship models take 19-37% of the subscription fee for inference alone.
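The per-user figures and their share of a subscription are simple divisions. A minimal sketch, using the monthly costs from the table and the $10 and $20 price points as the two subscription levels (function names are mine):

```python
USERS = 1_000_000
MONTHLY_COST = {
    "Flagship (Claude Opus 4)": 3_712_500,
    "Premium (Claude 4 Sonnet)": 742_500,
    "Standard (GPT-4o)": 506_250,
    "Economy (GPT-4o mini)": 30_375,
    "Self-hosted (Llama 4 Maverick)": 8_000,
}
SUBSCRIPTION_PRICES = (10, 20)  # USD per user per month


def cost_per_user(monthly_cost, users=USERS):
    """AI cost per user per month in USD."""
    return monthly_cost / users


def share_of_subscription(per_user_cost, price):
    """Fraction of the subscription price consumed by AI cost."""
    return per_user_cost / price


for tier, cost in MONTHLY_COST.items():
    per_user = cost_per_user(cost)
    shares = ", ".join(
        f"{share_of_subscription(per_user, p):.1%} of ${p}" for p in SUBSCRIPTION_PRICES
    )
    print(f"{tier:<32} ${per_user:.3f}/user/month  ({shares})")
```

Note that "user" here means a daily active user sending 5 messages every day; a product where most subscribers are less active would see a lower cost per paying user.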
Breakeven analysis
| Pricing tier | Monthly revenue per user needed to break even (AI cost only) |
|--------------|--------------------------------------------------------------|
| Flagship | $3.71 (need $10+ subscription to be viable) |
| Premium | $0.74 (viable at $5+ subscription) |
| Economy API | $0.03 (viable even with ad-supported free tier) |
| Self-hosted | $0.008 (essentially free per user) |
The economy models (GPT-4o mini, Gemini Flash) make AI chatbots viable as free, ad-supported products. At $0.03/user/month, even a $0.50 CPM covers AI costs once each user sees about 60 ad impressions a month, well under the 150 messages per user per month assumed here.
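The ad math is worth spelling out, since CPM is revenue per 1,000 impressions. A small sketch under the economy-tier numbers above; the one-ad-per-message case at the end is a hypothetical I picked for illustration, not something from my spreadsheet.

```python
ECONOMY_COST_PER_USER = 0.030375  # USD per user per month (GPT-4o mini)
CPM = 0.50                        # USD per 1,000 ad impressions
MESSAGES_PER_USER_PER_MONTH = 5 * 30


def breakeven_impressions(cost_per_user, cpm):
    """Ad impressions per user per month needed to cover AI cost."""
    revenue_per_impression = cpm / 1000
    return cost_per_user / revenue_per_impression


needed = breakeven_impressions(ECONOMY_COST_PER_USER, CPM)
print(f"Break-even: {needed:.0f} impressions per user per month")

# Hypothetical: one ad impression shown per message.
ad_revenue = MESSAGES_PER_USER_PER_MONTH * CPM / 1000
print(f"One ad per message: ${ad_revenue:.3f} revenue vs ${ECONOMY_COST_PER_USER:.3f} cost")
```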
Flagship models require paid subscriptions and careful usage limits. This is likely one reason ChatGPT Plus costs $20/month and still limits GPT-4 usage.
My take
The cost curve keeps dropping, and it changes what's possible at each price point. A year ago, the economy tier cost $0.30/user/month. Now it's $0.03. That 10x reduction opened the door for AI features in free consumer apps.
The question isn't "can you afford AI?" anymore. The question is "which tier gives you enough quality for your use case?"
My spreadsheet for this analysis has 47 rows and I keep adding "what if" scenarios. What if users send 10 messages instead of 5? (Double the cost.) What if you cache common responses? (30-50% savings.) What if token prices drop another 50% in 6 months? (They probably will.)
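Those three scenarios fit in one small function. This is a sketch rather than my actual spreadsheet: it uses GPT-4o mini prices as the baseline, models caching as a flat percentage off the bill (an assumption; real savings depend on hit rate and on what exactly is cached), and models a price drop as a multiplier on both input and output prices.

```python
def monthly_cost(
    messages_per_user=5,
    cache_savings=0.0,     # fraction of the bill avoided by caching (assumed flat)
    price_multiplier=1.0,  # 0.5 means token prices halve
    input_price=0.15,      # USD per million input tokens (GPT-4o mini)
    output_price=0.60,     # USD per million output tokens (GPT-4o mini)
    users=1_000_000,
    days=30,
):
    """Monthly API bill in USD under the article's token assumptions."""
    messages = users * messages_per_user
    input_mtok = messages * 150 / 1e6
    output_mtok = messages * 300 / 1e6
    daily = (input_mtok * input_price + output_mtok * output_price) * price_multiplier
    return daily * (1 - cache_savings) * days


scenarios = {
    "Baseline": monthly_cost(),
    "10 messages per user": monthly_cost(messages_per_user=10),
    "Caching, 30% savings": monthly_cost(cache_savings=0.30),
    "Caching, 50% savings": monthly_cost(cache_savings=0.50),
    "Prices drop 50%": monthly_cost(price_multiplier=0.5),
}
for name, cost in scenarios.items():
    print(f"{name:<22} ${cost:>10,.2f}/month")
```

Passing a different `input_price` and `output_price` reruns the same scenarios for any other row of the pricing table.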
If you found this interesting, you might also like:
- Wait, GPT-3 costs HOW much per token?
- Codex and the cost of code generation: my first pricing analysis
- The cost of running an AI startup in 2022: a data breakdown
- Stable Diffusion is free. The pricing math of open source image generation.
- The LLM pricing war just started. Here's every provider's cost per token.
-- dataku