LLM Leaderboard - Comparison of over 100 AI models from OpenAI, Google, DeepSeek & others
Comparison and ranking of over 100 AI models (LLMs) across key metrics including intelligence, price, performance and speed (output speed in tokens per second and latency as time to first token, TTFT), context window, and others.
For more details, including our methodology, see our FAQs.
Intelligence
Claude Opus 4.8 (max) and GPT-5.5 (xhigh) are the highest intelligence models, followed by GPT-5.5 (high) and Claude Opus 4.7 (max).
Output Speed
Mercury 2 and gpt-oss-120b (low) are the fastest models, followed by Granite 4.0 H Small and Granite 3.3 8B.
Latency
Command A+ and Qwen3.5 4B are the lowest latency models, followed by Gemini 2.5 Flash-Lite and NVIDIA Nemotron 3 Nano.
Price
Qwen3.5 0.8B (Non-reasoning) and Qwen3.5 0.8B (Reasoning) are the cheapest models, followed by Gemma 3n E4B and Nova Micro.
Context Window
Llama 4 Scout and Grok 4.20 0309 support the largest context windows, followed by Gemini 1.5 Pro (May) and Grok 4.1 Fast.
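Model | Context Window | Intelligence Index | Price (USD per 1M tokens, blended) | Output Speed (tokens/s) | Latency (TTFT, s) | Total Response Time (s) | | Further Analysis |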
|---|---|---|---|---|---|---|---|---|
Claude Opus 4.8 (max) | 1M | 61 | $4.10 | 54 | 8.02 | 17.20 | ||
GPT-5.5 (xhigh) | 922k | 60 | $4.35 | 57 | 68.29 | 76.99 | ||
GPT-5.5 (high) | 922k | 59 | $4.35 | 57 | 16.35 | 25.07 | ||
Claude Opus 4.7 (max) | 1M | 57 | $4.10 | 43 | 9.72 | 21.32 | ||
Gemini 3.1 Pro Preview | 1M | 57 | $1.74 | 139 | 19.85 | 23.45 | ||
GPT-5.5 (medium) | 922k | 57 | $4.35 | 54 | 6.65 | 15.89 | ||
Qwen3.7 Max | 1M | 57 | $1.43 | 187 | 2.60 | 18.19 | ||
Gemini 3.5 Flash | 1M | 55 | $1.31 | 164 | 18.79 | 21.83 | ||
Gemini 3.5 Flash (medium) | 1M | 55 | $1.31 | 173 | 13.85 | 16.74 | ||
Kimi K2.6 | 256k | 54 | $0.70 | 47 | 2.24 | 108.71 | ||
MiMo-V2.5-Pro | 1M | 54 | $0.18 | 50 | 2.93 | 53.09 | ||
GPT-5.3 Codex (xhigh) | 400k | 54 | $1.87 | 87 | 89.67 | 95.43 | ||
Grok 4.3 (high) | 1M | 53 | $0.64 | 143 | 13.63 | 17.13 | ||
Muse Spark | 262k | 52 | -- | -- | -- | -- | ||
Claude Opus 4.7 (Non-reasoning, high) | 1M | 52 | $4.10 | 42 | 1.25 | 13.08 | ||
Claude Sonnet 4.6 (max) | 1M | 52 | $2.46 | 45 | 101.78 | 112.93 | ||
DeepSeek V4 Pro (Max) | 1M | 52 | $0.18 | 46 | 1.81 | 108.56 | ||
GLM-5.1 | 200k | 51 | $0.90 | 58 | 1.51 | 75.80 | ||
GPT-5.5 (low) | 922k | 51 | $4.35 | 55 | 1.68 | 10.85 | ||
Qwen3.6 Plus | 1M | 50 | $0.43 | 53 | 2.96 | 117.04 | ||
DeepSeek V4 Pro (High) | 1M | 50 | $0.18 | 45 | 1.80 | 57.41 | ||
MiniMax-M2.7 | 205k | 50 | $0.22 | 101 | 2.74 | 32.01 | ||
MiMo-V2.5 | 1M | 49 | $0.06 | 93 | 2.83 | 29.67 | ||
GPT-5.4 mini (xhigh) | 400k | 49 | $0.65 | 164 | 4.06 | 7.11 | ||
Grok 4.3 (medium) | 1M | 49 | $0.64 | 146 | 7.08 | 10.50 | ||
GLM-5-Turbo | 200k | 47 | -- | -- | -- | -- | ||
DeepSeek V4 Flash (Max) | 1M | 47 | $0.06 | 128 | 1.22 | 48.88 | ||
DeepSeek V4 Flash (High) | 1M | 46 | $0.08 | -- | -- | -- | ||
Qwen3.6 27B | 262k | 46 | $0.90 | 53 | 3.85 | 120.68 | ||
Qwen3.5 397B A17B | 262k | 45 | $0.90 | 53 | 2.47 | 72.47 | ||
MiMo-V2-Omni-0327 | 256k | 45 | $0.34 | 94 | 2.75 | 29.38 | ||
Claude Sonnet 4.6 (Non-reasoning) | 1M | 44 | $2.46 | 42 | 1.15 | 12.92 | ||
GPT-5.4 nano (xhigh) | 400k | 44 | $0.18 | 152 | 3.78 | 7.07 | ||
Grok 4.3 (low) | 1M | 44 | $0.64 | 135 | 3.74 | 7.43 | ||
GLM-5.1 | 200k | 44 | $0.90 | 49 | 1.78 | 11.93 | ||
Qwen3.6 35B A3B | 262k | 43 | $0.37 | 171 | 2.38 | 17.01 | ||
MiMo-V2-Omni | 256k | 43 | $0.00 | 94 | 3.65 | 30.30 | ||
Gemini 3.5 Flash (minimal) | 1M | 43 | $1.31 | 169 | 0.89 | 3.85 | ||
Kimi K2.6 | 256k | 43 | $0.70 | 45 | 2.48 | 13.64 | ||
GLM 5V Turbo | 200k | 43 | -- | -- | -- | -- | ||
Claude Sonnet 4.6 (Non-reasoning, Low Effort) | 1M | 43 | $2.46 | 43 | 1.28 | 12.87 | ||
Hy3-preview | 256k | 42 | $0.10 | 94 | 4.01 | 30.49 | ||
GPT-5.5 Instant (May 2026) | 400k | 42 | $4.35 | -- | -- | -- | ||
Qwen3.5 122B A10B | 262k | 42 | $0.68 | 140 | 2.46 | 20.36 | ||
MiMo-V2-Flash (Feb 2026) | 256k | 41 | $0.06 | 135 | 2.08 | 20.57 | ||
GPT-5.5 (Non-reasoning) | 922k | 41 | $4.35 | 54 | 0.90 | 10.16 | ||
Qwen3.5 397B A17B | 262k | 40 | $0.90 | 53 | 2.48 | 11.85 | ||
DeepSeek V4 Pro | 1M | 39 | $0.18 | 43 | 1.79 | 13.43 | ||
Mistral Medium 3.5 | 256k | 39 | $2.10 | 150 | 2.12 | 18.78 | ||
Gemma 4 31B | 256k | 39 | $0.00 | 35 | 1.09 | 64.71 | ||
Qwen3.5 Omni Plus | 256k | 39 | $0.84 | 55 | 2.45 | 11.57 | ||
Step 3.5 Flash 2603 | 256k | 38 | $0.00 | 204 | 1.20 | 13.44 | ||
Ring-2.6-1T | 262k | 38 | $0.52 | 121 | 3.30 | 23.99 | ||
o3 | 200k | 38 | $1.55 | 134 | 5.06 | 8.78 | ||
GPT-5.4 nano | 400k | 38 | $0.18 | 148 | 2.89 | 6.26 | ||
GPT-5.4 mini (medium) | 400k | 38 | $0.65 | 162 | 3.42 | 6.51 | ||
Command A+ | 192k | 37 | $0.00 | 199 | 0.29 | 12.85 | ||
Qwen3.6 27B | 262k | 37 | $0.90 | 58 | 3.87 | 12.45 | ||
Claude 4.5 Haiku | 200k | 37 | $0.82 | 94 | 23.86 | 29.18 | ||
DeepSeek V4 Flash | 1M | 36 | $0.06 | 123 | 1.25 | 5.31 | ||
JT-35B-Flash | 256k | 36 | -- | -- | -- | -- | ||
NVIDIA Nemotron 3 Super | 1M | 36 | $0.28 | 219 | 1.87 | 13.26 | ||
Qwen3.5 122B A10B | 262k | 36 | $0.68 | 162 | 2.52 | 5.60 | ||
Nova 2.0 Pro Preview (medium) | 256k | 36 | $1.47 | 117 | 10.97 | 32.29 | ||
MiMo-V2.5-Pro | 1M | 36 | $0.58 | 48 | 3.24 | 13.72 | ||
Gemini 2.5 Pro | 1M | 35 | $1.34 | 130 | 22.09 | 25.95 | ||
Nova 2.0 Lite (high) | 1M | 35* | $0.52 | 147 | 16.04 | 33.00 | ||
Hy3-preview | 256k | 34 | $0.10 | 88 | 4.10 | 9.79 | ||
Ling-2.6-1T | 262k | 34 | $0.52 | -- | -- | -- | ||
Doubao Seed Code | 256k | 34 | -- | -- | -- | -- | ||
Gemini 3.1 Flash-Lite | 1M | 34 | $0.22 | 282 | 5.30 | 7.07 | ||
gpt-oss-120b (high) | 131k | 33 | $0.20 | 341 | 0.85 | 8.18 | ||
Mercury 2 | 128k | 33 | $0.14 | 785 | 2.66 | 3.30 | ||
Qwen3.5 9B | 262k | 32 | $0.11 | 76 | 2.27 | 35.31 | ||
Gemma 4 31B | 256k | 32 | $0.17 | 18 | 1.44 | 28.79 | ||
K-EXAONE | 256k | 32 | -- | -- | -- | -- | ||
Nova 2.0 Pro Preview (low) | 256k | 32 | $2.13 | 119 | 10.41 | 31.46 | ||
Trinity Large Thinking | 512k | 32 | $0.24 | 153 | 1.14 | 17.50 | ||
Qwen3.6 35B A3B | 262k | 32 | $0.56 | 179 | 2.41 | 5.21 | ||
Gemma 4 26B A4B | 256k | 31 | $0.14 | -- | -- | -- | ||
Claude 4.5 Haiku | 200k | 31 | $0.82 | 92 | 0.76 | 6.22 | ||
Grok 4.3 | 1M | 31 | $0.64 | 109 | 0.63 | 5.21 | ||
Qwen3.5 35B A3B | 262k | 31 | $0.42 | 149 | 2.30 | 5.66 | ||
MiMo-V2-Flash | 256k | 30 | $0.12 | 130 | 2.36 | 6.21 | ||
EXAONE 4.5 33B | 262k | 30 | -- | -- | -- | -- | ||
Nova 2.0 Lite (medium) | 1M | 30 | $0.52 | 151 | 21.81 | 38.35 | ||
ERNIE 5.0 Thinking Preview | 128k | 29 | -- | -- | -- | -- | ||
Nemotron Cascade 2 30B A3B | 1M | 28 | -- | -- | -- | -- | ||
Qwen3 Coder Next | 256k | 28 | $0.43 | 126 | 1.66 | 5.64 | ||
Nova 2.0 Omni (medium) | 1M | 28 | $0.52 | -- | -- | -- | ||
Mistral Small 4 | 256k | 28 | $0.20 | 179 | 0.72 | 14.70 | ||
Qwen3.5 9B | 262k | 27 | -- | -- | -- | -- | ||
Magistral Medium 1.2 | 128k | 27 | $2.30 | 39 | 1.72 | 66.17 | ||
Gemma 4 26B A4B | 256k | 27 | $0.16 | 77 | 1.60 | 8.13 | ||
Qwen3.5 4B | 262k | 27 | $0.04 | 196 | 0.39 | 13.16 | ||
Qwen3 Next 80B A3B | 262k | 27 | $1.05 | 159 | 2.33 | 18.10 | ||
Ling 2.6 Flash | 262k | 26 | $0.06 | -- | -- | -- | ||
Solar Pro 3 | 128k | 26 | -- | -- | -- | -- | ||
Qwen3.5 Omni Flash | 256k | 26 | $0.17 | 232 | 1.90 | 4.06 | ||
JT-MINI | 128k | 25 | -- | -- | -- | -- | ||
Nova 2.0 Lite (low) | 1M | 25 | $0.52 | 158 | 10.31 | 26.15 | ||
gpt-oss-20B (high) | 131k | 24 | $0.07 | 229 | 0.74 | 11.65 | ||
gpt-oss-120b (low) | 131k | 24 | $0.20 | 364 | 0.86 | 7.73 | ||
GPT-5.4 nano | 400k | 24 | $0.18 | 156 | 0.60 | 3.81 | ||
NVIDIA Nemotron 3 Nano | 1M | 24 | $0.07 | 132 | 1.69 | 20.61 | ||
LongCat Flash Lite | 256k | 24 | $0.00 | -- | -- | -- | ||
K-EXAONE | 256k | 23 | -- | -- | -- | -- | ||
GPT-5.4 mini | 400k | 23 | $0.65 | 153 | 0.62 | 3.88 | ||
Nova 2.0 Omni (low) | 1M | 23 | $0.52 | -- | -- | -- | ||
Nova 2.0 Pro Preview | 256k | 23 | $2.13 | 124 | 1.08 | 5.13 | ||
Mi:dm K 2.5 Pro | 128k | 23 | -- | -- | -- | -- | ||
Mistral Large 3 | 256k | 23 | $0.60 | 53 | 1.08 | 10.48 | ||
Qwen3.5 4B | 262k | 23 | $0.04 | 203 | 0.43 | 2.90 | ||
INTELLECT-3 | 131k | 22 | -- | -- | -- | -- | ||
Devstral 2 | 256k | 22 | $0.00 | 78 | 1.07 | 7.49 | ||
Solar Open 100B | 128k | 22 | -- | -- | -- | -- | ||
Nemotron 3 Nano Omni 30B A3B Reasoning | 256k | 21 | $0.10 | 290 | 1.05 | 9.67 | ||
gpt-oss-20B (low) | 131k | 21 | $0.07 | 243 | 0.80 | 11.11 | ||
Qwen3 Next 80B A3B | 262k | 20 | $0.65 | 150 | 2.26 | 5.59 | ||
Devstral Small 2 | 256k | 19 | $0.00 | 77 | 1.14 | 7.67 | ||
Motif-2-12.7B | 128k | 19 | -- | -- | -- | -- | ||
Nova Premier | 1M | 19 | $2.18 | 35 | 2.88 | 17.27 | ||
Gemma 4 E4B | 128k | 19 | -- | -- | -- | -- | ||
Llama Nemotron Super 49B v1.5 | 128k | 19 | $0.13 | 47 | 1.32 | 54.40 | ||
Mistral Small 4 | 256k | 19 | $0.20 | 157 | 0.70 | 3.88 | ||
Llama 4 Maverick | 1M | 18 | $0.34 | 114 | 0.95 | 5.34 | ||
Magistral Small 1.2 | 128k | 18 | $0.60 | 108 | 0.81 | 23.90 | ||
Sarvam 105B (high) | 128k | 18 | $0.04 | 95 | 2.09 | 28.34 | ||
Nova 2.0 Lite | 1M | 18 | $0.52 | 150 | 1.32 | 4.64 | ||
MiniCPM5-1B | 128k | 18 | -- | -- | -- | -- | ||
Llama 3.1 405B | 128k | 17 | $3.13 | 35 | 2.37 | 16.68 | ||
EXAONE 4.0 32B | 131k | 17 | -- | -- | -- | -- | ||
Nova 2.0 Omni | 1M | 17 | $0.52 | -- | -- | -- | ||
Qwen3.5 2B | 262k | 16 | $0.03 | -- | -- | -- | ||
Nanbeige4.1-3B | 256k | 16 | -- | -- | -- | -- | ||
Ministral 3 14B | 256k | 16 | $0.20 | 90 | 0.77 | 6.30 | ||
Falcon-H1R-7B | 256k | 16 | -- | -- | -- | -- | ||
Qwen3 Omni 30B A3B | 65.5k | 16 | $0.32 | 86 | 1.95 | 30.96 | ||
Step3 VL 10B | 65.5k | 15 | -- | -- | -- | -- | ||
Gemma 4 E2B | 128k | 15 | -- | -- | -- | -- | ||
Llama Nemotron Ultra | 128k | 15 | $0.72 | 52 | 2.41 | 50.53 | ||
ERNIE 4.5 300B A47B | 131k | 15 | $0.36 | 25 | 3.50 | 23.77 | ||
Solar Pro 2 | 65.5k | 15 | -- | -- | -- | -- | ||
NVIDIA Nemotron Nano 12B v2 VL | 128k | 15 | $0.24 | -- | -- | -- | ||
Ministral 3 8B | 256k | 15 | $0.15 | 99 | 0.65 | 5.68 | ||
Gemma 4 E4B | 128k | 15 | -- | -- | -- | -- | ||
NVIDIA Nemotron Nano 9B V2 | 131k | 15 | $0.05 | 120 | 0.70 | 21.61 | ||
Granite 4.1 30B | 131k | 15 | -- | -- | -- | -- | ||
NVIDIA Nemotron 3 Nano 4B | 262k | 15 | -- | -- | -- | -- | ||
Qwen3.5 2B | 262k | 15 | $0.03 | 254 | 0.41 | 2.38 | ||
Llama Nemotron Super 49B v1.5 | 128k | 15 | $0.13 | 48 | 1.27 | 11.66 | ||
Llama 3.3 70B | 128k | 14 | $0.60 | 84 | 1.60 | 7.59 | ||
Kimi Linear 48B A3B Instruct | 1M | 14* | -- | -- | -- | -- | ||
Ring-flash-2.0 | 128k | 14 | $0.18 | -- | -- | -- | ||
Solar Pro 2 | 65.5k | 14 | -- | -- | -- | -- | ||
Llama 4 Scout | 10M | 14 | $0.22 | 108 | 0.86 | 5.49 | ||
Command A | 256k | 13 | $3.25 | 53 | 1.81 | 11.17 | ||
Llama 3.1 Nemotron 70B | 128k | 13 | $1.20 | 292 | 0.51 | 2.22 | ||
NVIDIA Nemotron 3 Nano | 1M | 13 | $0.07 | 90 | 0.41 | 5.98 | ||
NVIDIA Nemotron Nano 9B V2 | 131k | 13 | $0.06 | 146 | 1.03 | 4.44 | ||
MiniCPM-V 4.6 1.3B | 262k | 13 | -- | -- | -- | -- | ||
Granite 4.1 8B | 131k | 12 | $0.06 | 112 | 0.81 | 5.28 | ||
Sarvam 30B (high) | 65.5k | 12 | $0.03 | 164 | 1.94 | 17.21 | ||
Gemma 4 E2B | 128k | 12 | -- | -- | -- | -- | ||
R1 1776 | 128k | 12* | -- | -- | -- | -- | ||
Llama 3.2 90B (Vision) | 128k | 12* | $1.38 | 58 | 1.18 | 9.74 | ||
EXAONE 4.0 32B | 131k | 12 | -- | -- | -- | -- | ||
Ministral 3 3B | 256k | 11 | $0.10 | 187 | 0.49 | 3.16 | ||
Jamba 1.7 Large | 256k | 11 | $2.60 | 62 | 1.54 | 9.56 | ||
Granite 4.0 H Small | 128k | 11 | $0.08 | 352 | 10.29 | 11.72 | ||
Qwen3 Omni 30B A3B | 65.5k | 11 | $0.32 | 97 | 2.04 | 7.21 | ||
Qwen3.5 0.8B | 262k | 11 | $0.01 | -- | -- | -- | ||
LFM2 24B A2B | 32.8k | 10 | $0.04 | 121 | 0.64 | 4.77 | ||
Phi-4 | 16k | 10 | $0.16 | 38 | 2.00 | 15.11 | ||
Nova Micro | 130k | 10 | $0.03 | 296 | 0.91 | 2.60 | ||
NVIDIA Nemotron Nano 12B v2 VL | 128k | 10 | $0.24 | 225 | 1.07 | 3.29 | ||
Phi-4 Multimodal | 128k | 10* | $0.00 | 17 | 0.87 | 31.04 | ||
Qwen3.5 0.8B | 262k | 10 | $0.01 | 79 | 0.42 | 6.77 | ||
Jamba Reasoning 3B | 262k | 10 | -- | -- | -- | -- | ||
Reka Flash 3 | 128k | 10 | $0.26 | -- | -- | -- | ||
Ling-mini-2.0 | 131k | 9 | -- | -- | -- | -- | ||
Llama 3.2 11B (Vision) | 128k | 9 | $0.25 | 52 | 0.70 | 10.27 | ||
Granite 4.1 3B | 131k | 9 | -- | -- | -- | -- | ||
Phi-4 Mini | 128k | 8 | $0.00 | -- | -- | -- | ||
Exaone 4.0 1.2B | 64k | 8 | -- | -- | -- | -- | ||
Exaone 4.0 1.2B | 64k | 8 | -- | -- | -- | -- | ||
LFM2.5-1.2B-Thinking | 32k | 8 | -- | -- | -- | -- | ||
Jamba 1.7 Mini | 258k | 8 | -- | -- | -- | -- | ||
LFM2 2.6B | 32.8k | 8 | $0.00 | -- | -- | -- | ||
LFM2.5-1.2B-Instruct | 32k | 8 | $0.00 | -- | -- | -- | ||
Granite 4.0 H 1B | 128k | 8 | -- | -- | -- | -- | ||
Gemma 3 270M | 32k | 8 | -- | -- | -- | -- | ||
Apertus 70B Instruct | 65.5k | 8 | $1.03 | -- | -- | -- | ||
Granite 4.0 Micro | 128k | 8 | -- | -- | -- | -- | ||
Granite 4.0 1B | 128k | 7 | -- | -- | -- | -- | ||
LFM2 8B A1B | 32.8k | 7 | $0.00 | -- | -- | -- | ||
LFM2.5-VL-1.6B | 32k | 6 | $0.00 | -- | -- | -- | ||
Granite 4.0 350M | 32.8k | 6 | -- | -- | -- | -- | ||
Apertus 8B Instruct | 65.5k | 6 | $0.11 | -- | -- | -- | ||
Granite 4.0 H 350M | 32.8k | 5 | -- | -- | -- | -- | ||
Tiny Aya Global | 8.19k | 5 | $0.00 | -- | -- | -- | ||
EXAONE 4.5 33B | 262k | -- | -- | -- | -- | -- | ||
Gemini 3 Deep Think | 128k | -- | -- | -- | -- | -- | ||
Mi:dm K 2.5 Pro Preview | 128k | -- | -- | -- | -- | -- | ||
GPT-5.5 Pro (xhigh) | 922k | -- | -- | -- | -- | -- | ||
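The rows above can be loaded into typed records for further analysis. The following is a minimal sketch, not the site's own tooling. It assumes the column order is model, context window, Intelligence Index, blended price, output speed, latency, and total response time (inferred from the summaries at the top of the page), that `k` and `M` mean thousands and millions of tokens (so values such as `262k` are rounded labels), and that `--` means no data. The helper names are hypothetical.

```python
from typing import Optional


def parse_context(text: str) -> int:
    """Convert a context-window label such as '922k' or '1M' to a token count.

    Assumption: 'k' = 1,000 and 'M' = 1,000,000 (the table shows rounded labels).
    """
    text = text.strip()
    multipliers = {"k": 1_000, "M": 1_000_000}
    suffix = text[-1]
    if suffix in multipliers:
        return round(float(text[:-1]) * multipliers[suffix])
    return int(text)


def parse_number(text: str) -> Optional[float]:
    """Parse a numeric cell. '--' or an empty cell becomes None.

    A leading '$' and a trailing '*' marker are stripped before conversion.
    """
    text = text.strip().lstrip("$").rstrip("*")
    if text in ("", "--"):
        return None
    return float(text)


def parse_row(line: str) -> dict:
    """Split one pipe-delimited table row into named fields (assumed column order)."""
    cells = [c.strip() for c in line.split("|")]
    return {
        "model": cells[0],
        "context_tokens": parse_context(cells[1]),
        "intelligence": parse_number(cells[2]),
        "price_usd": parse_number(cells[3]),
        "tokens_per_s": parse_number(cells[4]),
        "latency_s": parse_number(cells[5]),
        "response_s": parse_number(cells[6]),
    }


row = parse_row("Mercury 2 | 128k | 33 | $0.14 | 785 | 2.66 | 3.30 | ||")
print(row)
```

Representing missing cells as `None` rather than `0` keeps models without measurements out of speed or price rankings by default.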
Frequently Asked Questions
Claude Opus 4.8 (Adaptive Reasoning, Max Effort) currently ranks #1 on the Artificial Analysis LLM Leaderboard with an Intelligence Index score of 61, out of 370 models ranked.
The top models by Intelligence Index are: 1. Claude Opus 4.8 (Adaptive Reasoning, Max Effort) (61), 2. GPT-5.5 (xhigh) (60), 3. GPT-5.5 (high) (59), 4. Claude Opus 4.7 (Adaptive Reasoning, Max Effort) (57), 5. Gemini 3.1 Pro Preview (57).
Mercury 2 is the fastest at 784.9 tokens per second, followed by gpt-oss-120b (low) (363.8 t/s) and Granite 4.0 H Small (351.8 t/s).
Qwen3.5 0.8B (Non-reasoning) is the most affordable at $0.01 per 1M tokens (blended 7:2:1 cache hit/input/output ratio), followed by Qwen3.5 0.8B (Reasoning) ($0.01) and Gemma 3n E4B Instruct ($0.02).
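As an illustration of the blended price mentioned above, here is a minimal sketch that treats the 7:2:1 cache hit/input/output ratio as weights in a weighted average. The page states only the ratio, so the weighted-average formula is an assumption, and the per-token prices in the example are hypothetical, not taken from any model on the leaderboard.

```python
def blended_price(cache_hit: float, input_: float, output: float,
                  weights: tuple = (7, 2, 1)) -> float:
    """Weighted average of per-1M-token prices.

    Assumption: the 7:2:1 ratio weights cache-hit, input, and output prices
    in that order. All prices are in USD per 1M tokens.
    """
    w_cache, w_in, w_out = weights
    total = w_cache + w_in + w_out
    return (w_cache * cache_hit + w_in * input_ + w_out * output) / total


# Hypothetical prices: $0.10 cache hit, $1.00 input, $4.00 output per 1M tokens.
example = blended_price(cache_hit=0.10, input_=1.00, output=4.00)
print(f"${example:.2f} per 1M tokens")
```

Under these assumptions, the cache-hit price dominates the blended figure, so two models with the same output price can differ noticeably once caching discounts are taken into account.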
Kimi K2.6 is the highest-ranked open weights model with an Intelligence Index score of 54. There are 227 open weights models out of 370 total on the leaderboard.
The top open weights models by Intelligence Index are: 1. Kimi K2.6 (54), 2. MiMo-V2.5-Pro (54), 3. DeepSeek V4 Pro (Reasoning, Max Effort) (52).
Claude Opus 4.8 (Adaptive Reasoning, Max Effort) leads among 190 reasoning models with an Intelligence Index score of 61. Reasoning models use extended thinking to solve complex problems before responding.
The leaderboard includes filters to narrow results by model type (reasoning vs non-reasoning), openness (open weights vs proprietary), and other criteria. You can also adjust prompt options to see how performance varies with different input lengths.
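The same kind of filtering can be reproduced offline on a handful of records. This is a sketch only: the `Entry` type and `filter_entries` helper are hypothetical, the Intelligence Index scores come from the table, and the `reasoning` and `open_weights` flags are filled in by hand for illustration (the page states them explicitly for only some models).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Entry:
    model: str
    intelligence: int
    reasoning: bool      # illustrative flag, filled in by hand
    open_weights: bool   # illustrative flag, filled in by hand


ENTRIES = [
    Entry("Claude Opus 4.8 (max)", 61, reasoning=True, open_weights=False),
    Entry("Kimi K2.6", 54, reasoning=True, open_weights=True),
    Entry("DeepSeek V4 Pro (Max)", 52, reasoning=True, open_weights=True),
    Entry("GPT-5.5 (Non-reasoning)", 41, reasoning=False, open_weights=False),
]


def filter_entries(entries: List[Entry],
                   reasoning: Optional[bool] = None,
                   open_weights: Optional[bool] = None) -> List[Entry]:
    """Keep entries matching the given flags (None means 'do not filter'),
    then sort by Intelligence Index, highest first."""
    selected = [
        e for e in entries
        if (reasoning is None or e.reasoning == reasoning)
        and (open_weights is None or e.open_weights == open_weights)
    ]
    return sorted(selected, key=lambda e: e.intelligence, reverse=True)


for entry in filter_entries(ENTRIES, open_weights=True):
    print(entry.model, entry.intelligence)
```

Using `None` as the default for each flag lets one function cover unfiltered, single-filter, and combined-filter views.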
Click on any model name in the leaderboard to visit its dedicated comparison page with detailed charts covering intelligence, pricing, speed, latency, and more. You can also compare API providers for each model.