Model Size Visualizer

How big is 405 billion parameters compared to 7 billion? This tool makes the scale differences visible.

[Bar chart: parameter counts for nine models, from Phi-3 Mini (3.8B) up to GPT-4 (1.8T), drawn on a logarithmic scale]

Logarithmic scale, so the differences between small models stay visible alongside the largest ones.
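As a rough sketch of how a chart like this can be drawn (illustrative only, not the tool's actual implementation), the snippet below plots the same parameter counts on a logarithmic axis with matplotlib:

```python
import matplotlib.pyplot as plt

# Parameter counts in billions, matching the table below.
models = {
    "GPT-4": 1760, "DeepSeek V3": 685, "Llama 3.1 405B": 405,
    "Mistral Large": 123, "Llama 3.1 70B": 70, "Gemma 2 9B": 9.2,
    "Llama 3.1 8B": 8, "Mistral 7B": 7.3, "Phi-3 Mini": 3.8,
}

fig, ax = plt.subplots(figsize=(8, 4))
ax.barh(list(models), list(models.values()))
ax.set_xscale("log")  # without this, every model under ~70B collapses into a sliver
ax.invert_yaxis()     # largest model at the top, as in the chart above
ax.set_xlabel("Parameters (billions, log scale)")
plt.tight_layout()
plt.show()
```

On a linear axis, GPT-4's bar would be roughly 460 times longer than Phi-3 Mini's, which is why the log scale is used.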

Size comparison

| Model | Provider | Parameters | vs. smallest | Released |
| --- | --- | --- | --- | --- |
| GPT-4 | OpenAI | 1,760B (est.) | 463.2x | 2023 |
| DeepSeek V3 | DeepSeek | 685B | 180.3x | 2024 |
| Llama 3.1 405B | Meta | 405B | 106.6x | 2024 |
| Mistral Large | Mistral | 123B | 32.4x | 2024 |
| Llama 3.1 70B | Meta | 70B | 18.4x | 2024 |
| Gemma 2 9B | Google | 9.2B | 2.4x | 2024 |
| Llama 3.1 8B | Meta | 8B | 2.1x | 2024 |
| Mistral 7B | Mistral | 7.3B | 1.9x | 2023 |
| Phi-3 Mini | Microsoft | 3.8B | 1x | 2024 |
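The "vs. smallest" column is just each model's parameter count divided by the smallest entry (Phi-3 Mini at 3.8B). A minimal sketch of that calculation:

```python
params_b = {"GPT-4": 1760, "Llama 3.1 405B": 405, "Phi-3 Mini": 3.8}  # billions

smallest = min(params_b.values())
for name, size in params_b.items():
    print(f"{name}: {size / smallest:.1f}x")
# GPT-4: 463.2x
# Llama 3.1 405B: 106.6x
# Phi-3 Mini: 1.0x
```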

What are parameters?

Parameters are the numerical values a model learns during training. Think of them as the model's "memory." More parameters generally mean the model can store more knowledge and capture more subtle patterns in language.
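Since each parameter is just a stored number, parameter count translates directly into memory. A back-of-envelope sketch, assuming 2 bytes per parameter (fp16/bf16, a common serving precision) and ignoring activations and other runtime overhead:

```python
def weights_gb(params_billions: float, bytes_per_param: float = 2) -> float:
    """Approximate size of the model weights alone, in GB."""
    return params_billions * bytes_per_param

print(weights_gb(7.3))   # Mistral 7B: ~14.6 GB
print(weights_gb(405))   # Llama 3.1 405B: ~810 GB
```

That 810 GB figure is why a 405B model needs a multi-GPU server, while an 8B model fits on a single consumer card.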

But parameter count isn't everything. A 7B model trained on high-quality data with good techniques can outperform a 13B model trained poorly. Llama 3.1 8B, for example, beats many older 13B models on standard benchmarks.

The real question isn't "how many parameters?" but "how much capability per dollar?" That's where the LLM Cost Calculator comes in.