2023 AI data roundup: the year the dam broke
GPT-4, Llama 2, Mistral, Claude 2, SDXL, and ChatGPT hitting 100M users. I compiled 20 charts that tell the story of 2023. This was the year AI stopped being a niche interest.
I've been writing these annual roundups since 2021. Those were easy. A dozen or so developments, a few interesting charts, done in a day.
2023 took me a week. My spreadsheet has over 400 data points for this year alone. The pace was relentless. Every month felt like it had enough developments for a normal year.
Here are the 20 numbers that tell the story.
The user adoption numbers
Chart 1: ChatGPT user growth
| Date | Monthly active users | Source |
|------|---------------------|--------|
| Dec 2022 | ~10M | SimilarWeb estimates |
| Jan 2023 | ~57M | SimilarWeb |
| Feb 2023 | ~100M | OpenAI announcement |
| Apr 2023 | ~173M | SimilarWeb |
| Jul 2023 | ~200M+ | SimilarWeb |
| Oct 2023 | ~180M | SimilarWeb (slight decline) |
Sources: SimilarWeb, OpenAI announcements, press reporting.
ChatGPT hit 100 million monthly active users by February. At the time, that was the fastest consumer product adoption on record, and it wasn't even close. Usage then settled around 180-200M MAU for the rest of the year. For a product that launched 13 months ago, sustained 180M+ MAU is remarkable.
Chart 2: AI product adoption beyond ChatGPT
| Product | Est. MAU (Dec 2023) | Launch date | Category |
|---------|-------------------|-------------|----------|
| ChatGPT | ~180M | Nov 2022 | Chatbot |
| Midjourney | ~16M | Jul 2022 | Image gen |
| Character.ai | ~12M | Sep 2022 | Chatbot |
| Perplexity AI | ~10M | Jan 2023 | Search |
| Claude (Anthropic) | ~5M | Mar 2023 | Chatbot |
| Bard (Google) | ~50M | Mar 2023 | Chatbot |
| Bing Chat | ~100M | Feb 2023 | Search |
Sources: SimilarWeb, Statista, press reporting, company announcements. All numbers approximate.
ChatGPT dominates, but it's not alone. Bing Chat and Bard have significant user bases because they're embedded in existing products. The standalone AI product market (Midjourney, Character.ai, Perplexity) is growing fast but still an order of magnitude smaller.
The model releases
Chart 3: Notable model releases by quarter
| Quarter | Total notable releases | Open source % | Closed % |
|---------|----------------------|---------------|----------|
| Q1 2023 | 58 | 62% | 38% |
| Q2 2023 | 67 | 73% | 27% |
| Q3 2023 | 72 | 78% | 22% |
| Q4 2023 | 85+ (projected) | 81% | 19% |
| Full year | 282+ | 74% | 26% |
Source: My tracking spreadsheet, cross-referenced with Hugging Face, arXiv, Papers With Code.
282 notable model releases in one year. The open source share went from 62% in Q1 to 81% in Q4. By the end of the year, the community was shipping roughly four open models for every closed one from the big labs.
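If you want to check the full-year figures against the quarterly rows, here is a minimal Python sketch. It assumes the Q4 count is exactly 85 (the table says "85+ (projected)"), so treat the output as approximate. The variable names are mine, not from any tracking tool.

```python
# Quarterly release counts and open source shares copied from Chart 3.
# Assumption: Q4 is taken as exactly 85, although the table lists "85+ (projected)".
quarters = {
    "Q1 2023": (58, 0.62),
    "Q2 2023": (67, 0.73),
    "Q3 2023": (72, 0.78),
    "Q4 2023": (85, 0.81),
}

# Full-year total and the release-weighted open source share.
total_releases = sum(count for count, _ in quarters.values())
open_releases = sum(count * share for count, share in quarters.values())
full_year_open_share = open_releases / total_releases

# Open-to-closed ratio within each quarter.
open_to_closed = {q: share / (1 - share) for q, (_, share) in quarters.items()}

print(f"Total releases: {total_releases}")
print(f"Full-year open source share: {full_year_open_share:.0%}")
for quarter, ratio in open_to_closed.items():
    print(f"{quarter}: {ratio:.1f} open releases per closed release")
```

Weighting each quarter by its release count reproduces the 74% full-year share, and the Q4 ratio comes out a little above 4 to 1.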
Chart 4: The major model timeline
| Date | Model | Parameters | Who | Open source? |
|------|-------|-----------|-----|-------------|
| Jan 10 | Flan-T5 XL | 3B | Google | Yes |
| Feb 24 | LLaMA | 7-65B | Meta | Leaked |
| Mar 1 | ChatGPT API (GPT-3.5-turbo) | Unknown | OpenAI | No |
| Mar 13 | Stanford Alpaca | 7B | Stanford | Yes |
| Mar 14 | GPT-4 | Unknown | OpenAI | No |
| Mar 14 | Claude v1 | Unknown | Anthropic | No |
| Mar 28 | Vicuna-13B | 13B | LMSYS | Yes |
| Apr 16 | Stable Diffusion XL (beta) | 3.5B | Stability AI | Yes |
| May 10 | PaLM 2 | Unknown | Google | No |
| May 23 | Falcon 40B | 40B | TII | Yes |
| Jul 11 | Claude 2 | Unknown | Anthropic | No |
| Jul 18 | Llama 2 | 7-70B | Meta | Yes |
| Aug 24 | Code Llama | 7-34B | Meta | Yes |
| Sep 27 | Mistral 7B | 7.2B | Mistral AI | Yes |
| Nov 6 | GPT-4 Turbo | Unknown | OpenAI | No |
| Nov 21 | Claude 2.1 | Unknown | Anthropic | No |
| Dec 6 | Gemini (Pro/Ultra announced) | Unknown | Google | No |
| Dec 11 | Mixtral 8x7B | 46.7B | Mistral AI | Yes |
18 major releases. I could have listed 50 more. This table only includes models that genuinely moved the needle.
The benchmark progression
Chart 5: MMLU scores over time
| Date | Model | MMLU (5-shot) | Open source? |
|------|-------|--------------|-------------|
| Jan 2023 | GPT-3.5-turbo | 70.0% | No |
| Feb 2023 | LLaMA 65B | 63.4% | Yes (leaked) |
| Mar 2023 | GPT-4 | 86.4% | No |
| May 2023 | Falcon 40B | 55.4% | Yes |
| Jul 2023 | Llama 2 70B | 68.9% | Yes |
| Sep 2023 | Mistral 7B | 60.1% | Yes |
| Dec 2023 | Mixtral 8x7B | 70.6% | Yes |
| Dec 2023 | Gemini Ultra | 90.0%* | No |
*Gemini Ultra uses CoT@32 methodology, not standard 5-shot. Standard 5-shot is 83.7%.
Sources: Model papers, Hugging Face Leaderboard, LMSYS, Epoch AI.
The open source MMLU trajectory: 63.4% (Feb) to 70.6% (Dec). A 7.2-point improvement in 10 months, crossing the GPT-3.5 threshold with Mixtral. The frontier also moved: from 70.0% (GPT-3.5, January) to 86.4% (GPT-4, March) to potentially 90% (Gemini Ultra, December).
Chart 6: LMSYS Chatbot Arena Elo changes
| Model | Elo (March 2023) | Elo (December 2023) | Change |
|-------|------------------|--------------------|---------|
| GPT-4 | ~1230 | ~1260 | +30 |
| Claude 2.1 | N/A | ~1155 | New |
| GPT-3.5-turbo | ~1120 | ~1130 | +10 |
| Mixtral 8x7B | N/A | ~1120 | New |
| Llama 2 70B-chat | N/A | ~1065 | New |
Source: LMSYS Chatbot Arena, approximate Elo scores at each date.
The pricing data
Chart 7: GPT-4 class pricing over 2023
| Date | Model/tier | Output $/1M tokens | Trend |
|------|-----------|---------------------|-------|
| Mar | GPT-4 8K | $60.00 | Launch |
| Mar | GPT-4 32K | $120.00 | Premium tier |
| Jul | Claude 2 | $24.00 | 60% cheaper than GPT-4 |
| Nov | GPT-4 Turbo | $30.00 | 50% cut from GPT-4 |
| Dec | Mixtral 8x7B (hosted) | $0.28-$0.60 | 98-99% cheaper |
Chart 8: GPT-3.5 class pricing over 2023
| Date | Model/option | Output $/1M tokens | vs Jan baseline |
|------|-------------|---------------------|-----------------|
| Jan | GPT-3.5-turbo | $2.00 | Baseline |
| Jul | Llama 2 70B (Together AI) | $0.90 | -55% |
| Sep | Mistral 7B (self-hosted) | $0.08 | -96% |
| Nov | GPT-3.5-turbo (updated) | $2.00 | No change |
| Dec | Mixtral 8x7B (self-hosted) | $0.18 | -91% |
Sources: OpenAI pricing, Anthropic pricing, Together AI, community calculations.
The cost of GPT-3.5-quality inference dropped 91-96% in 2023 if you use open source. The API price from OpenAI stayed flat. The gap between "use the API" and "host it yourself" went from 5x to 25x.
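For anyone who wants to redo the percentages, here is a small Python sketch using the Chart 8 prices. The helper names `pct_change` and `price_multiple` are hypothetical, and the self-hosted prices are the community estimates from the table, not measured costs.

```python
# Output prices in $ per 1M tokens, copied from Chart 8.
# Assumption: the self-hosted figures are community estimates, as in the table.
API_BASELINE = 2.00  # GPT-3.5-turbo, January 2023

alternatives = {
    "Llama 2 70B (Together AI)": 0.90,
    "Mistral 7B (self-hosted)": 0.08,
    "Mixtral 8x7B (self-hosted)": 0.18,
}


def pct_change(baseline: float, price: float) -> float:
    """Percentage change relative to the baseline (negative means cheaper)."""
    return (price - baseline) / baseline * 100


def price_multiple(baseline: float, price: float) -> float:
    """How many times more expensive the baseline is than the alternative."""
    return baseline / price


for name, price in alternatives.items():
    change = pct_change(API_BASELINE, price)
    multiple = price_multiple(API_BASELINE, price)
    print(f"{name}: {change:.0f}% vs baseline, API costs {multiple:.1f}x as much")
```

The 25x multiple corresponds to self-hosted Mistral 7B at $0.08. Self-hosted Mixtral at $0.18 works out to roughly 11x.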
The funding data
Chart 9: Quarterly AI funding
| Quarter | Total AI funding | Generative AI % | # of rounds over $100M |
|---------|-----------------|------------------|----------------------|
| Q1 2023 | $12.4B* | 73% | 8 |
| Q2 2023 | $4.8B | 68% | 5 |
| Q3 2023 | $5.2B | 65% | 6 |
| Q4 2023 | ~$6.5B (est.) | 62% | 7 |
| Full year | ~$28.9B | ~67% | 26 |
*Includes the Microsoft/OpenAI $10B deal.
Sources: Crunchbase, PitchBook, CB Insights.
$28.9 billion in AI funding for 2023. Remove the Microsoft/OpenAI deal and it's $18.9B, which is still the highest non-OpenAI AI funding year ever.
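Here is a quick Python sketch of how the full-year row adds up from the quarterly ones. It assumes the Q4 estimate of $6.5B holds and that the full $10B Microsoft/OpenAI deal is counted in Q1, as the footnote suggests. The variable names are mine.

```python
# Quarterly totals ($B) and generative AI shares copied from Chart 9.
# Assumption: Q4 uses the estimated $6.5B, so all results are approximate.
quarterly = {
    "Q1 2023": (12.4, 0.73),
    "Q2 2023": (4.8, 0.68),
    "Q3 2023": (5.2, 0.65),
    "Q4 2023": (6.5, 0.62),
}
MICROSOFT_OPENAI_DEAL = 10.0  # $B, counted in Q1 per the table footnote

full_year = sum(total for total, _ in quarterly.values())
ex_deal = full_year - MICROSOFT_OPENAI_DEAL

# Two ways to summarize the generative AI share for the year.
simple_mean = sum(share for _, share in quarterly.values()) / len(quarterly)
dollar_weighted = sum(total * share for total, share in quarterly.values()) / full_year

print(f"Full year: ${full_year:.1f}B, excluding the deal: ${ex_deal:.1f}B")
print(f"Generative AI share: {simple_mean:.0%} simple mean, {dollar_weighted:.0%} dollar-weighted")
```

The ~67% in the full-year row lines up with the simple mean of the four quarterly shares. Weighting by dollars gives about 68%, because Q1 was both the biggest quarter and the one with the highest generative AI share.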
Chart 10: Biggest rounds of 2023
| Company | Amount | Stage | Round date |
|---------|--------|-------|------------|
| OpenAI (Microsoft) | $10.0B | Growth | Jan 2023 |
| Anthropic (Google) | $2.0B | Growth | Oct 2023 |
| Anthropic (Amazon) | $1.25B (of $4B) | Growth | Sep 2023 |
| Inflection AI | $1.3B | Series B | Jun 2023 |
| Mistral AI | $415M | Series A | Dec 2023 |
| Adept AI | $350M | Series B | Mar 2023 |
| Cohere | $270M | Series C | Jun 2023 |
| Hugging Face | $235M | Series D | Aug 2023 |
Sources: Company announcements, Crunchbase.
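Anthropic alone secured over $6 billion in commitments in 2023 across multiple rounds, though only part of that arrived during the year (for example, $1.25B of Amazon's $4B). AI lab funding is now measured in billions, not millions.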
The open source data
Chart 11: Hugging Face growth
| Metric | Jan 2023 | Dec 2023 | Growth |
|--------|----------|----------|--------|
| Total models | ~120K | ~400K+ | +233% |
| Monthly downloads | 195M | ~800M+ | +310% |
| Registered users | ~280K | ~600K+ | +114% |
| Organizations | ~15K | ~40K+ | +167% |
Source: Hugging Face public metrics.
Chart 12: LLaMA/Llama family tree
| Metric | Value (Dec 2023) |
|--------|-----------------|
| Total Llama-family derivatives | 200+ |
| Most downloaded Llama derivative | TheBloke's GPTQ quantizations (~5M+ total) |
| Fine-tuning papers citing Llama | 100+ on arXiv |
| Countries with Llama-based startups | 25+ |
Source: Hugging Face, arXiv, my tracking.
The infrastructure data
Chart 13: GPU pricing trajectory
| GPU | Jan 2023 cloud price ($/hr) | Dec 2023 cloud price ($/hr) | Change |
|-----|---------------------------|---------------------------|--------|
| A100 80GB | $3.00 (AWS) | $1.50 (Lambda) | -50% |
| H100 80GB | Not widely available | $2.49-$7.50 | New |
| RTX 4090 (consumer) | $1,599 retail | $1,599 retail | Flat |
Chart 14: NVIDIA revenue
| Quarter | Data center revenue | QoQ growth |
|---------|-------------------|-----------|
| Q1 2023 (Apr) | $4.3B | +18% |
| Q2 2023 (Jul) | $10.3B | +141% |
| Q3 2023 (Oct) | $14.5B | +41% |
Source: NVIDIA earnings reports.
NVIDIA's data center revenue went from $4.3B to $14.5B in two quarters. That's the hardware demand story in one number.
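A minimal Python sketch of the quarter-over-quarter math, using the rounded revenue figures from Chart 14. The Q1 growth figure can't be recomputed here because the table doesn't list the prior quarter.

```python
# Data center revenue in $B, copied from Chart 14 (rounded to one decimal).
revenue = [
    ("Q1 2023 (Apr)", 4.3),
    ("Q2 2023 (Jul)", 10.3),
    ("Q3 2023 (Oct)", 14.5),
]

# Quarter-over-quarter growth, pairing each quarter with the one before it.
qoq_growth = {}
for (_, previous), (label, current) in zip(revenue, revenue[1:]):
    qoq_growth[label] = (current / previous - 1) * 100

# Overall multiple from the first to the last quarter in the table.
two_quarter_multiple = revenue[-1][1] / revenue[0][1]

for label, growth in qoq_growth.items():
    print(f"{label}: {growth:+.1f}% QoQ")
print(f"Q1 to Q3 multiple: {two_quarter_multiple:.1f}x")
```

With these rounded inputs, Q2 comes out near +140% instead of the table's +141%, presumably because the reported growth rate was calculated from unrounded revenue. Either way, revenue more than tripled in two quarters.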
The 6 numbers that matter most
If I had to pick just 6 numbers from all of 2023:
| # | Number | What it means |
|---|--------|--------------|
| 1 | 100M MAU | ChatGPT reached 100M users in 2 months (Feb 2023) |
| 2 | 86.4% | GPT-4's MMLU score, the new frontier benchmark to beat |
| 3 | 70.6% | Mixtral's MMLU, open source matching GPT-3.5 for the first time |
| 4 | $0.18/M | Cost to run Mixtral (GPT-3.5 quality), down from $2.00 API price |
| 5 | 282 | Notable model releases in 2023 (vs ~78 in 2022) |
| 6 | $28.9B | Total AI funding in 2023 |
What 2023 meant
2022 was the year AI left the lab. 2023 was the year it became an industry.
The model quality ceiling rose (GPT-4), the open source floor rose faster (Llama 2, Mistral, Mixtral), prices fell anywhere from 2x (GPT-4 Turbo versus GPT-4) to 25x (self-hosted Mistral 7B versus the GPT-3.5 API) depending on your tier, and hundreds of millions of people used AI products for the first time.
The data tells me 2023 was an inflection point, not a peak. The trends in open source quality, inference costs, and funding are all still accelerating. 2024 is going to be even wilder.
I need a bigger spreadsheet.
-- dataku