Data Stories · December 26, 2023 · 6 min read

My 2023 prediction scorecard

I predicted open source would stay 2 years behind closed source. I was wrong by a lot. Llama 2 closed the gap in months. Here's my full scorecard for 2023.

Another year, another public reckoning with my own predictions.

If you missed it, I make 10 predictions every January and score them every December. My historical hit rate is 55-60%, which is barely better than a coin flip. Let's see if 2023 was different.

(It was not.)

The scorecard

| # | Prediction (from Dec 31, 2022) | Result | Score |
|---|-------------------------------|--------|-------|
| 1 | GPT-4 will launch in H1 2023 | Launched March 14. Nailed it. | Right |
| 2 | At least 3 major tech companies will integrate ChatGPT-like features | Google (Bard), Microsoft (Bing Chat), Meta (embedded AI), Amazon (Bedrock), Snap, etc. Way more than 3. | Right |
| 3 | OpenAI will cut API prices by at least 30% | GPT-4 Turbo is 50% cheaper than GPT-4. GPT-3.5-turbo input halved. Easily over 30%. | Right |
| 4 | An open source model will match ChatGPT quality | Mixtral 8x7B matches or beats GPT-3.5-turbo on most benchmarks. Yes. | Right |
| 5 | AI-generated image detection will become a major product category | Some products exist but it's not a "major" category. No breakout product. | Wrong |
| 6 | Total AI VC funding will exceed $40B in 2023 | $28.9B including OpenAI deal. Without it, $18.9B. Neither exceeds $40B. | Wrong |
| 7 | Midjourney will launch a standalone product outside Discord | Midjourney alpha web app announced but still primarily Discord. I'll call this half. | Half |
| 8 | At least one country will pass AI-specific legislation | EU AI Act agreed in December 2023. Biden's Executive Order on AI. Yes. | Right |
| 9 | The average person will be unable to tell AI text from human text | This is probably true in practice but hard to verify with data. I'll call it half. | Half |
| 10 | Someone will train a model with over 100 trillion tokens | No confirmed model trained on 100T+ tokens. GPT-4's training data is unknown, but most estimates are well under 100T. | Wrong |

Final score: 5 right, 2 half-right, 3 wrong.

Year-over-year accuracy

| Year | Right | Half | Wrong | Hit rate (right + half*0.5) |
|------|-------|------|-------|---------------------------|
| 2020 | 5 | 2 | 3 | 60% |
| 2021 | 4 | 4 | 2 | 60% |
| 2022 | 4 | 3 | 3 | 55% |
| 2023 | 5 | 2 | 3 | 60% |

Back to 60%. My ceiling, apparently. I'm a 60% accurate predictor of AI trends, which, to be fair, is better than most pundits but worse than I'd like.
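If you want to check the hit-rate column yourself, here is a minimal sketch of the scoring rule in Python. The tallies are copied from the year-over-year table; the names `SCORES` and `hit_rate` are made up for this example and are not taken from my actual spreadsheet.

```python
# Yearly tallies copied from the year-over-year table: (right, half, wrong).
# The names SCORES and hit_rate are illustrative only.
SCORES = {
    2020: (5, 2, 3),
    2021: (4, 4, 2),
    2022: (4, 3, 3),
    2023: (5, 2, 3),
}


def hit_rate(right: int, half: int, wrong: int) -> float:
    """Return the hit rate as a percentage, counting each half-right as 0.5."""
    total = right + half + wrong
    return 100 * (right + 0.5 * half) / total


for year, tally in SCORES.items():
    print(year, f"{hit_rate(*tally):.0f}%")
```

Dividing by the total rather than hard-coding 10 keeps the rule usable if a year ever ends up with a different number of predictions.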

Where I was right

Prediction 4 was my best call. In December 2022, I wrote "an open source model will match ChatGPT quality." At the time, the best open source model (BLOOM 176B) scored 28.3% on MMLU vs ChatGPT's 70%. The gap was enormous.

Twelve months later, Mixtral 8x7B scored 70.6% on MMLU. The gap closed to zero. I expected this to happen by late 2024 at the earliest. It happened in December 2023. I got the prediction right but the timeline wrong (it happened faster than I thought).

Prediction 8 was easy in retrospect. The EU AI Act and Biden's Executive Order were both significant AI-specific policy actions. The regulatory wheels were already turning in early 2023. This was the safest prediction on my list.

Where I was wrong

Prediction 6 was my biggest miss. I predicted $40B+ in AI funding. The actual number was $28.9B (or $18.9B without the OpenAI/Microsoft deal). I overestimated because I extrapolated Q1 2023's $12.4B quarterly rate to the full year, which works out to roughly $50B. Q1 was an outlier driven by one massive deal.

Lesson: don't extrapolate outlier quarters. The funding pace normalized in Q2-Q4 to $4.8-6.5B per quarter, which is historically high but not the $12B rate I used for my projection.
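To make the size of that mistake concrete, here is a rough sketch of the arithmetic, using only the figures quoted in this post. The function names are hypothetical, and treating Q2-Q4 as one flat average is a simplification, since the actual quarters ranged from $4.8B to $6.5B.

```python
# Figures quoted in this post, in billions of USD.
Q1_2023 = 12.4      # Q1 2023 funding, inflated by one massive deal
ACTUAL_2023 = 28.9  # full-year total, including the OpenAI deal


def naive_annualized(quarter: float) -> float:
    """Assume every quarter of the year looks like the one given."""
    return 4 * quarter


def implied_rest_of_year_avg(total: float, first_quarter: float) -> float:
    """Average quarterly funding over Q2-Q4 implied by the full-year total."""
    return (total - first_quarter) / 3


projection = naive_annualized(Q1_2023)
rest_avg = implied_rest_of_year_avg(ACTUAL_2023, Q1_2023)

print(f"Naive projection:      ${projection:.1f}B")
print(f"Actual total:          ${ACTUAL_2023:.1f}B")
print(f"Implied Q2-Q4 average: ${rest_avg:.1f}B per quarter")
print(f"Overshoot:             {projection / ACTUAL_2023 - 1:.0%}")
```

The naive projection comes out near $49.6B, while the remaining three quarters averaged about $5.5B each, less than half the Q1 rate.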

Prediction 10 was overly ambitious. 100 trillion training tokens seemed plausible given the trajectory from 1.4T (LLaMA) to 2T (Llama 2). But the constraint isn't willingness to train on more data. It's the availability of high-quality data. At 100T tokens, you'd need to include vast amounts of low-quality web content, which empirically hurts model performance. Quality over quantity won.

The prediction I'm proudest of (that nobody believed)

Prediction 4, the open source one. When I made it in December 2022, multiple people told me I was being unrealistic. The gap between open source and ChatGPT seemed insurmountable.

What I saw that others didn't: I knew Meta was working on open models. I didn't predict the LLaMA leak specifically, but I expected that once a strong base model reached the community, fine-tuners would close the gap much faster than anyone training from scratch could.

The community didn't need to match OpenAI's training budget. They just needed a good base model to fine-tune. LLaMA provided that, and Llama 2 made it official.

2024 predictions

Ten predictions for 2024, to be scored next December:

  1. GPT-5 (or equivalent next-gen OpenAI model) will launch in 2024
  2. An open source model will match GPT-4 on MMLU
  3. The API price for GPT-4-quality output will drop below $5 per million output tokens
  4. At least one AI company valued at $1B+ will fail or be acquired at a major discount
  5. Context windows will exceed 1 million tokens for at least one commercial model
  6. AI-generated video will go from demo to product (publicly available, consumer-priced)
  7. Total AI VC funding will be $20-35B (I'm being less aggressive this time)
  8. The EU AI Act will cause at least one major AI product to restrict European access
  9. Mistral AI will become a top-5 AI company by market presence
  10. I will need to expand my tracking spreadsheet to over 500 models

The spreadsheet is ready. The prediction hubris is calibrated. See you in December 2024.


-- dataku
