About dataku

I have a problem. I can't stop looking at AI benchmark data.

My morning starts with checking pricing tables. My lunch break disappears into model comparison spreadsheets. And I fall asleep wondering why Claude costs 5x more than GPT-4o-mini but only scores 12% higher on MMLU. That kind of thing keeps me up.

Before this, I spent years as a data analyst in Tokyo. Numbers were my job. Patterns were my obsession. When AI started moving fast enough that even the benchmarks couldn't keep up, I knew somebody needed to sit down and actually track what was happening. Not the hype. Not the press releases. The data.

That's dataku.

What I cover

  • Benchmark Analysis – What the scores actually mean, where models excel, where they quietly fall short
  • Pricing Watch – Tracking cost per token across providers, spotting kaizen moments before they become obvious
  • Model Comparisons – Head-to-head breakdowns with real data, not vibes
  • Data Engineering – The infrastructure side of AI that doesn't get enough attention
  • Industry Trends – Where the data says the industry is going (vs. where the hype says)

My data philosophy

There's a Japanese concept called ikigai. Roughly, it means "reason for being." Mine turned out to be spreadsheets. I'm only half joking.

I believe every number deserves context. A benchmark score on its own tells you nothing. The same score compared to last quarter, plotted against cost per token, and filtered by task type? Now you're getting somewhere.
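
Here is a minimal sketch of what "context" can look like in practice. Everything in it is an assumption made for illustration: the Snapshot fields, the contextualize helper, the "points per dollar" ratio, and above all the numbers, which are invented placeholders and not real benchmark or pricing data.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    """One benchmark observation for one model (hypothetical schema)."""
    model: str
    task: str
    score: float         # benchmark score this quarter, in percent
    prev_score: float    # same benchmark, previous quarter
    usd_per_mtok: float  # price in USD per million tokens


def add_context(s: Snapshot) -> dict:
    """Turn a bare score into a score with context."""
    return {
        "model": s.model,
        "task": s.task,
        # Compared to last quarter
        "change": round(s.score - s.prev_score, 2),
        # Weighed against cost per token
        "points_per_dollar": round(s.score / s.usd_per_mtok, 2),
    }


def contextualize(snapshots: list[Snapshot], task: str) -> list[dict]:
    """Filter by task type, then rank by score per dollar."""
    rows = [add_context(s) for s in snapshots if s.task == task]
    return sorted(rows, key=lambda r: r["points_per_dollar"], reverse=True)


# Placeholder values, invented purely for illustration.
snapshots = [
    Snapshot("model-a", "reasoning", 80.0, 76.0, 10.0),
    Snapshot("model-b", "reasoning", 72.0, 71.0, 2.0),
    Snapshot("model-a", "coding", 65.0, 60.0, 10.0),
]

rows = contextualize(snapshots, "reasoning")
for row in rows:
    print(row)
```

With these made-up inputs, the model with the lower raw score ranks first once price is factored in, while the pricier one shows the bigger quarter-over-quarter gain. Neither fact is visible from the score alone.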

I also believe in showing my work. Every claim here has a source. Every table links to where the data came from. If I'm guessing, I'll say so. And when I get it wrong (it happens), I'll correct it publicly.

The data

Benchmark and pricing data comes from official sources: model papers, provider documentation, arXiv preprints, and public benchmark leaderboards. I cross-reference everything. If a number appears here, it has a source.

Contact

Got a data question about AI? Found an error in my analysis? (It happens. I once miscounted GPT-4's benchmark wins and had to redo an entire article.) Reach out at hello@dataku.ai.

— dataku