LLM Leaderboard - Comparison of over 100 AI models from OpenAI, Google, DeepSeek & others
Comparison and ranking of over 100 AI models (LLMs) across key metrics, including intelligence, price, performance (output speed in tokens per second and latency as time to first token, TTFT), context window, and others.
For more details, including our methodology, see our FAQs.
Intelligence
Claude Opus 4.7 (max) and Gemini 3.1 Pro Preview are the highest intelligence models, followed by GPT-5.4 (xhigh) and Kimi K2.6.
Output Speed
Mercury 2 and Granite 4.0 H Small are the fastest models, followed by Granite 3.3 8B and Gemini 3.1 Flash-Lite Preview.
Latency
Ministral 3 3B and Qwen3.5 2B are the lowest latency models, followed by LFM2 24B A2B and Qwen3.5 0.8B.
Price
Qwen3.5 0.8B (Non-reasoning) and Qwen3.5 0.8B (Reasoning) are the cheapest models, followed by Gemma 3n E4B and Qwen3.5 2B.
Context Window
Llama 4 Scout and Grok 4.1 Fast support the largest context windows, followed by Gemini 1.5 Pro (May) and Gemini 2.0 Flash Thinking exp. (Dec).
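Model | Context Window | Intelligence Index | Blended Price (USD per 1M tokens) | Output Speed (tokens/s) | Latency (TTFT, s) | Total Response Time (s) | Further Analysis | |
|---|---|---|---|---|---|---|---|---|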
Claude Opus 4.7 (max) | 1M | 57 | $10.00 | 44 | 19.67 | 31.15 | ||
Gemini 3.1 Pro Preview | 1M | 57 | $4.50 | 133 | 31.50 | 35.27 | ||
GPT-5.4 (xhigh) | 1.05M | 57 | $5.63 | 78 | 228.22 | 234.65 | ||
Kimi K2.6 | 256k | 54 | $1.71 | 134 | 1.04 | 37.97 | ||
GPT-5.3 Codex (xhigh) | 400k | 54 | $4.81 | 73 | 101.92 | 108.82 | ||
Claude Opus 4.6 (max) | 1M | 53 | $10.00 | 41 | 10.07 | 22.13 | ||
Muse Spark | 262k | 52 | -- | -- | -- | -- | ||
Claude Opus 4.7 (high) | 1M | 52 | $10.00 | 38 | 1.67 | 14.69 | ||
Qwen3.6 Max Preview | 256k | 52 | $2.92 | 58 | 3.29 | 46.70 | ||
Claude Sonnet 4.6 (max) | 1M | 52 | $6.00 | 57 | 135.28 | 144.04 | ||
GLM-5.1 | 200k | 51 | $2.15 | 47 | 1.61 | 93.33 | ||
Qwen3.6 Plus | 1M | 50 | $1.13 | 53 | 2.84 | 116.95 | ||
GLM-5 | 200k | 50 | $1.55 | 61 | 1.77 | 60.79 | ||
MiniMax-M2.7 | 205k | 50 | $0.53 | 47 | 2.47 | 65.94 | ||
Grok 4.20 0309 v2 | 2M | 49 | $3.00 | 185 | 13.44 | 16.14 | ||
MiMo-V2-Pro | 1M | 49 | $1.50 | 64 | 3.24 | 56.92 | ||
GPT-5.4 mini (xhigh) | 400k | 49 | $1.69 | 165 | 7.12 | 10.15 | ||
Kimi K2.5 | 256k | 47 | $1.20 | 37 | 3.08 | 97.14 | ||
GLM-5-Turbo | 200k | 47 | -- | -- | -- | -- | ||
Claude Opus 4.6 (high) | 1M | 46 | $10.00 | 38 | 1.82 | 14.81 | ||
Gemini 3 Flash | 1M | 46 | $1.13 | 161 | 7.86 | 10.96 | ||
Qwen3.5 397B A17B | 262k | 45 | $1.35 | 53 | 2.57 | 72.41 | ||
MiMo-V2-Omni-0327 | 256k | 45 | -- | -- | -- | -- | ||
Claude Sonnet 4.6 (Non-reasoning) | 1M | 44 | $6.00 | 44 | 1.37 | 12.84 | ||
GPT-5.4 nano (xhigh) | 400k | 44 | $0.46 | 153 | 4.80 | 8.06 | ||
GLM-5.1 | 200k | 44 | $2.15 | 41 | 2.33 | 14.54 | ||
Qwen3.6 35B A3B | 262k | 43 | $0.84 | 217 | 2.61 | 14.14 | ||
MiMo-V2-Omni | 256k | 43 | -- | -- | -- | -- | ||
GLM 5V Turbo | 200k | 43 | -- | -- | -- | -- | ||
Claude Sonnet 4.6 (Non-reasoning, Low Effort) | 1M | 43 | $6.00 | 46 | 1.52 | 12.46 | ||
Qwen3.5 27B | 262k | 42 | $0.82 | 85 | 5.56 | 34.83 | ||
DeepSeek V3.2 | 128k | 42 | $0.32 | 28 | 2.15 | 91.02 | ||
Qwen3.5 122B A10B | 262k | 42 | $1.10 | 152 | 2.43 | 18.92 | ||
MiMo-V2-Flash (Feb 2026) | 256k | 41 | $0.15 | 132 | 2.09 | 20.97 | ||
Gemini 3 Pro Preview (low) | 1M | 41 | $4.50 | -- | -- | -- | ||
GLM-5 | 200k | 41 | $1.55 | 53 | 2.21 | 11.68 | ||
Qwen3.5 397B A17B | 262k | 40 | $1.35 | 53 | 2.56 | 12.04 | ||
Qwen3 Max Thinking | 256k | 40 | $2.40 | 37 | 4.04 | 70.94 | ||
Gemma 4 31B | 256k | 39 | $0.00 | 36 | 1.74 | 71.77 | ||
Qwen3.5 Omni Plus | 256k | 39 | $1.50 | 56 | 2.49 | 11.47 | ||
Grok 4.1 Fast | 2M | 39 | $0.28 | 149 | 6.59 | 9.94 | ||
Step 3.5 Flash 2603 | 256k | 38 | $0.00 | 173 | 1.10 | 15.52 | ||
o3 | 200k | 38 | $3.50 | 89 | 10.41 | 16.05 | ||
GPT-5.4 nano | 400k | 38 | $0.46 | 160 | 4.03 | 7.16 | ||
Step 3.5 Flash | 256k | 38 | $0.15 | 173 | 1.03 | 15.48 | ||
GPT-5.4 mini (medium) | 400k | 38 | $1.69 | 169 | 5.66 | 8.62 | ||
Kimi K2.5 | 256k | 37 | $1.20 | 34 | 3.02 | 17.93 | ||
Qwen3.5 27B | 262k | 37 | $0.82 | 87 | 5.91 | 11.69 | ||
Qwen3.5 35B A3B | 262k | 37 | $0.69 | 176 | 2.22 | 16.44 | ||
Claude 4.5 Haiku | 200k | 37 | $2.00 | 100 | 19.19 | 24.20 | ||
NVIDIA Nemotron 3 Super | 1M | 36 | $0.41 | 153 | 1.23 | 17.59 | ||
Qwen3.5 122B A10B | 262k | 36 | $1.10 | 157 | 2.42 | 5.60 | ||
Nova 2.0 Pro Preview (medium) | 256k | 36 | $3.44 | 120 | 11.75 | 32.51 | ||
GPT-5.4 (Non-reasoning) | 1.05M | 35 | $5.63 | 58 | 0.88 | 9.48 | ||
Gemini 3 Flash | 1M | 35 | $1.13 | 186 | 1.39 | 4.08 | ||
Gemini 2.5 Pro | 1M | 35 | $3.44 | 128 | 24.78 | 28.70 | ||
Nova 2.0 Lite (high) | 1M | 35* | $0.85 | 125 | 24.38 | 44.31 | ||
Gemini 3.1 Flash-Lite Preview | 1M | 34 | $0.56 | 338 | 5.39 | 6.86 | ||
Doubao Seed Code | 256k | 34 | -- | -- | -- | -- | ||
gpt-oss-120B (high) | 131k | 33 | $0.26 | 216 | 0.89 | 12.45 | ||
Mercury 2 | 128k | 33 | $0.38 | 707 | 4.22 | 4.92 | ||
Qwen3.5 9B | 262k | 32 | $0.11 | 48 | 0.75 | 52.83 | ||
Gemma 4 31B | 256k | 32 | -- | -- | -- | -- | ||
K-EXAONE | 256k | 32 | -- | -- | -- | -- | ||
DeepSeek V3.2 | 128k | 32 | $0.32 | 28 | 2.16 | 19.89 | ||
Grok 3 mini Reasoning (high) | 1M | 32 | $0.35 | 208 | 0.65 | 12.64 | ||
Nova 2.0 Pro Preview (low) | 256k | 32 | $3.44 | 126 | 11.08 | 30.86 | ||
Trinity Large Thinking | 512k | 32 | $0.40 | 131 | 1.00 | 20.08 | ||
Qwen3.6 35B A3B | 262k | 32 | $0.84 | 192 | 2.43 | 5.03 | ||
Gemma 4 26B A4B | 256k | 31 | $0.20 | -- | -- | -- | ||
Claude 4.5 Haiku | 200k | 31 | $2.00 | 90 | 0.67 | 6.25 | ||
Qwen3.5 35B A3B | 262k | 31 | $0.69 | 161 | 2.34 | 5.45 | ||
MiMo-V2-Flash | 256k | 30 | $0.15 | 126 | 2.56 | 6.54 | ||
Nova 2.0 Lite (medium) | 1M | 30 | $0.85 | 153 | 14.32 | 30.67 | ||
DeepSeek V3.2 Speciale | 128k | 29 | -- | -- | -- | -- | ||
ERNIE 5.0 Thinking Preview | 128k | 29 | -- | -- | -- | -- | ||
Grok 4.20 0309 v2 | 2M | 29 | $3.00 | 161 | 0.63 | 3.73 | ||
Grok Code Fast 1 | 256k | 29 | $0.53 | 154 | 3.36 | 6.61 | ||
Nemotron Cascade 2 30B A3B | 262k | 28 | -- | -- | -- | -- | ||
Qwen3 Coder Next | 256k | 28 | $0.60 | 160 | 0.96 | 4.08 | ||
Nova 2.0 Omni (medium) | 1M | 28 | $0.85 | -- | -- | -- | ||
Mistral Small 4 | 256k | 28 | $0.26 | 159 | 0.71 | 16.43 | ||
Qwen3.5 9B | 262k | 27 | -- | -- | -- | -- | ||
Magistral Medium 1.2 | 128k | 27 | $2.75 | 77 | 1.66 | 34.23 | ||
Gemma 4 26B A4B | 256k | 27 | -- | -- | -- | -- | ||
Qwen3.5 4B | 262k | 27 | $0.06 | 175 | 0.55 | 14.85 | ||
DeepSeek R1 0528 | 128k | 27 | $2.36 | -- | -- | -- | ||
Qwen3 Next 80B A3B | 262k | 27 | $1.88 | 162 | 2.29 | 17.72 | ||
Ling 2.6 Flash | 262k | 26 | $0.15 | 209 | 1.07 | 3.46 | ||
Solar Pro 3 | 128k | 26 | -- | -- | -- | -- | ||
Qwen3.5 Omni Flash | 256k | 26 | $0.28 | 165 | 2.02 | 5.06 | ||
JT-MINI | 128k | 25 | -- | -- | -- | -- | ||
Qwen3 Coder 480B | 262k | 25 | $3.00 | 59 | 3.04 | 11.48 | ||
Nova 2.0 Lite (low) | 1M | 25 | $0.85 | 155 | 9.95 | 26.07 | ||
gpt-oss-120B (low) | 131k | 24 | $0.26 | 214 | 0.87 | 12.56 | ||
gpt-oss-20B (high) | 131k | 24 | $0.10 | 261 | 0.69 | 10.26 | ||
GPT-5.4 nano | 400k | 24 | $0.46 | 150 | 0.94 | 4.27 | ||
NVIDIA Nemotron 3 Nano | 1M | 24 | $0.10 | 138 | 1.51 | 19.60 | ||
LongCat Flash Lite | 256k | 24 | $0.00 | 114 | 6.01 | 10.40 | ||
Grok 4.1 Fast | 2M | 24 | $0.28 | 135 | 0.61 | 4.33 | ||
K-EXAONE | 256k | 23 | -- | -- | -- | -- | ||
GPT-5.4 mini | 400k | 23 | $1.69 | 159 | 0.68 | 3.83 | ||
Nova 2.0 Omni (low) | 1M | 23 | $0.85 | -- | -- | -- | ||
Nova 2.0 Pro Preview | 256k | 23 | $3.44 | 120 | 1.04 | 5.22 | ||
Mi:dm K 2.5 Pro | 128k | 23 | -- | -- | -- | -- | ||
Mistral Large 3 | 256k | 23 | $0.75 | 47 | 23.76 | 34.30 | ||
Ring-1T | 128k | 23 | -- | -- | -- | -- | ||
Qwen3.5 4B | 262k | 23 | $0.06 | 179 | 0.54 | 3.33 | ||
INTELLECT-3 | 131k | 22 | -- | -- | -- | -- | ||
Devstral 2 | 256k | 22 | $0.00 | 74 | 1.08 | 7.83 | ||
Solar Open 100B | 128k | 22 | -- | -- | -- | -- | ||
Gemini 2.5 Flash-Lite (Sep) | 1M | 22 | $0.17 | -- | -- | -- | ||
Mistral Medium 3.1 | 128k | 21 | $0.80 | 87 | 1.24 | 6.97 | ||
gpt-oss-20B (low) | 131k | 21 | $0.10 | 277 | 0.77 | 9.80 | ||
Qwen3 Next 80B A3B | 262k | 20 | $0.88 | 152 | 2.32 | 5.60 | ||
Devstral Small 2 | 256k | 19 | $0.00 | 74 | 1.01 | 7.75 | ||
Gemini 2.5 Flash-Lite (Sep) | 1M | 19 | $0.17 | -- | -- | -- | ||
Motif-2-12.7B | 128k | 19 | -- | -- | -- | -- | ||
Ling-1T | 128k | 19 | -- | -- | -- | -- | ||
Nova Premier | 1M | 19 | $5.00 | 27 | 2.97 | 21.48 | ||
Gemma 4 E4B | 128k | 19 | -- | -- | -- | -- | ||
Llama Nemotron Super 49B v1.5 | 128k | 19 | $0.17 | 55 | 1.30 | 46.58 | ||
Mistral Small 4 | 256k | 19 | $0.26 | 135 | 0.71 | 4.42 | ||
Llama 3.3 Nemotron Super 49B | 128k | 18* | -- | -- | -- | -- | ||
Llama 4 Maverick | 1M | 18 | $0.46 | 113 | 1.03 | 5.45 | ||
Magistral Small 1.2 | 128k | 18 | $0.75 | 160 | 0.83 | 16.49 | ||
Sarvam 105B (high) | 128k | 18 | $0.00 | 117 | 2.38 | 23.70 | ||
Nova 2.0 Lite | 1M | 18 | $0.85 | 129 | 1.28 | 5.15 | ||
Llama 3.1 405B | 128k | 17 | $3.69 | 29 | 2.46 | 19.44 | ||
EXAONE 4.0 32B | 131k | 17 | -- | -- | -- | -- | ||
Nova 2.0 Omni | 1M | 17 | $0.85 | 178 | 1.28 | 4.09 | ||
DeepSeek R1 0528 Qwen3 8B | 32.8k | 16* | -- | -- | -- | -- | ||
Qwen3.5 2B | 262k | 16 | $0.04 | -- | -- | -- | ||
Nanbeige4.1-3B | 256k | 16 | -- | -- | -- | -- | ||
Ministral 3 14B | 256k | 16 | $0.20 | 86 | 0.66 | 6.51 | ||
DeepSeek R1 Distill Llama 70B | 128k | 16* | $0.88 | 40 | 1.90 | 63.98 | ||
Falcon-H1R-7B | 256k | 16 | -- | -- | -- | -- | ||
Ling-flash-2.0 | 128k | 16 | $0.25 | 91 | 2.20 | 7.69 | ||
Qwen3 Omni 30B A3B | 65.5k | 16 | $0.43 | 78 | 1.99 | 34.16 | ||
Step3 VL 10B | 65.5k | 15 | -- | -- | -- | -- | ||
Gemma 4 E2B | 128k | 15 | -- | -- | -- | -- | ||
Llama Nemotron Ultra | 128k | 15 | $0.90 | 42 | 2.54 | 62.61 | ||
ERNIE 4.5 300B A47B | 131k | 15 | $0.48 | 29 | 3.94 | 20.90 | ||
Solar Pro 2 | 65.5k | 15 | -- | -- | -- | -- | ||
NVIDIA Nemotron Nano 12B v2 VL | 128k | 15 | $0.30 | 139 | 1.22 | 19.27 | ||
Ministral 3 8B | 256k | 15 | $0.15 | 174 | 0.58 | 3.44 | ||
Gemma 4 E4B | 128k | 15 | -- | -- | -- | -- | ||
NVIDIA Nemotron Nano 9B V2 | 131k | 15 | $0.07 | 115 | 0.68 | 22.35 | ||
NVIDIA Nemotron 3 Nano 4B | 262k | 15 | -- | -- | -- | -- | ||
Qwen3.5 2B | 262k | 15 | $0.04 | 232 | 0.45 | 2.60 | ||
Llama Nemotron Super 49B v1.5 | 128k | 15 | $0.17 | 52 | 1.30 | 10.96 | ||
Llama 3.3 70B | 128k | 14 | $0.68 | 90 | 1.41 | 6.99 | ||
Llama 3.1 Nemotron Nano 4B v1.1 | 128k | 14* | -- | -- | -- | -- | ||
Kimi Linear 48B A3B Instruct | 1M | 14* | -- | -- | -- | -- | ||
Llama 3.3 Nemotron Super 49B | 128k | 14* | -- | -- | -- | -- | ||
Ring-flash-2.0 | 128k | 14 | $0.25 | 88 | 2.25 | 30.74 | ||
Solar Pro 2 | 65.5k | 14 | -- | -- | -- | -- | ||
Llama 4 Scout | 10M | 14 | $0.29 | 129 | 0.82 | 4.68 | ||
Command A | 256k | 13 | $4.38 | 40 | 2.04 | 14.61 | ||
Llama 3.1 Nemotron 70B | 128k | 13 | $1.20 | 44 | 1.73 | 13.08 | ||
NVIDIA Nemotron 3 Nano | 1M | 13 | $0.09 | 74 | 0.50 | 7.25 | ||
NVIDIA Nemotron Nano 9B V2 | 131k | 13 | $0.09 | 146 | 1.06 | 4.47 | ||
Sarvam 30B (high) | 65.5k | 12 | $0.00 | 185 | 2.01 | 15.50 | ||
Gemma 4 E2B | 128k | 12 | -- | -- | -- | -- | ||
R1 1776 | 128k | 12* | -- | -- | -- | -- | ||
Llama 3.2 90B (Vision) | 128k | 12* | $0.72 | 51 | 0.99 | 10.79 | ||
EXAONE 4.0 32B | 131k | 12 | -- | -- | -- | -- | ||
Ministral 3 3B | 256k | 11 | $0.10 | 286 | 0.44 | 2.19 | ||
Jamba 1.7 Large | 256k | 11 | $3.50 | 61 | 1.41 | 9.68 | ||
Granite 4.0 H Small | 128k | 11 | $0.11 | 426 | 10.26 | 11.43 | ||
Qwen3 Omni 30B A3B | 65.5k | 11 | $0.43 | 96 | 1.91 | 7.11 | ||
Qwen3.5 0.8B | 262k | 11 | $0.02 | -- | -- | -- | ||
LFM2 24B A2B | 32.8k | 10 | $0.05 | 187 | 0.48 | 3.14 | ||
Phi-4 | 16k | 10 | $0.22 | 22 | 2.22 | 25.02 | ||
Nova Micro | 130k | 10 | $0.06 | 285 | 0.90 | 2.65 | ||
NVIDIA Nemotron Nano 12B v2 VL | 128k | 10 | $0.30 | 169 | 1.13 | 4.08 | ||
Phi-4 Multimodal | 128k | 10* | $0.00 | 17 | 0.84 | 30.40 | ||
Qwen3.5 0.8B | 262k | 10 | $0.02 | 276 | 0.50 | 2.31 | ||
Jamba Reasoning 3B | 262k | 10 | -- | -- | -- | -- | ||
Reka Flash 3 | 128k | 10 | $0.35 | -- | -- | -- | ||
Ling-mini-2.0 | 131k | 9 | -- | -- | -- | -- | ||
Llama 3.2 11B (Vision) | 128k | 9 | $0.24 | 50 | 0.78 | 10.79 | ||
Phi-4 Mini | 128k | 8 | $0.00 | 42 | 0.84 | 12.63 | ||
Exaone 4.0 1.2B | 64k | 8 | -- | -- | -- | -- | ||
Exaone 4.0 1.2B | 64k | 8 | -- | -- | -- | -- | ||
LFM2.5-1.2B-Thinking | 32k | 8 | -- | -- | -- | -- | ||
Jamba 1.7 Mini | 258k | 8 | -- | -- | -- | -- | ||
LFM2.5-1.2B-Instruct | 32k | 8 | $0.00 | -- | -- | -- | ||
LFM2 2.6B | 32.8k | 8 | $0.00 | -- | -- | -- | ||
Granite 4.0 H 1B | 128k | 8 | -- | -- | -- | -- | ||
Gemma 3 270M | 32k | 8 | -- | -- | -- | -- | ||
Apertus 70B Instruct | 65.5k | 8 | $1.34 | -- | -- | -- | ||
Granite 4.0 Micro | 128k | 8 | -- | -- | -- | -- | ||
Granite 4.0 1B | 128k | 7 | -- | -- | -- | -- | ||
LFM2 8B A1B | 32.8k | 7 | $0.00 | -- | -- | -- | ||
LFM2.5-VL-1.6B | 32k | 6 | $0.00 | -- | -- | -- | ||
Granite 4.0 350M | 32.8k | 6 | -- | -- | -- | -- | ||
Apertus 8B Instruct | 65.5k | 6 | $0.13 | -- | -- | -- | ||
Granite 4.0 H 350M | 32.8k | 5 | -- | -- | -- | -- | ||
Tiny Aya Global | 8.19k | 5 | -- | -- | -- | -- | ||
Gemini 3 Deep Think | 128k | -- | -- | -- | -- | -- | ||
GPT-5.4 Pro (xhigh) | 1.05M | -- | $67.50 | -- | -- | -- | ||
Mi:dm K 2.5 Pro Preview | 128k | -- | -- | -- | -- | -- | ||
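
The rows above can be loaded into a script for custom rankings. The following is a minimal sketch, not an official client: it assumes the column order is model, context window, Intelligence Index, blended price, output speed, and latency, and it ignores the remaining cells. The function names (`parse_context`, `parse_number`, `parse_row`) and the "intelligence per dollar" ranking are illustrative assumptions, not part of the leaderboard.

```python
from typing import Optional


def parse_context(text: str) -> Optional[float]:
    """Convert a context window such as '256k' or '1M' into a token count."""
    text = text.strip()
    if text == "--":
        return None
    multipliers = {"k": 1_000, "M": 1_000_000}
    return float(text[:-1]) * multipliers[text[-1]]


def parse_number(text: str) -> Optional[float]:
    """Parse a numeric cell, dropping a leading '$' and a trailing '*'."""
    text = text.strip().lstrip("$").rstrip("*")
    return None if text in ("", "--") else float(text)


def parse_row(line: str) -> dict:
    """Split one pipe-delimited row into named fields (assumed column order)."""
    cells = [cell.strip() for cell in line.split("|")]
    return {
        "model": cells[0],
        "context_tokens": parse_context(cells[1]),
        "intelligence": parse_number(cells[2]),
        "price_usd": parse_number(cells[3]),
        "tokens_per_s": parse_number(cells[4]),
        "latency_s": parse_number(cells[5]),
    }


# A few rows copied from the table above.
rows = [
    "Kimi K2.6 | 256k | 54 | $1.71 | 134 | 1.04 | 37.97 | ||",
    "Mercury 2 | 128k | 33 | $0.38 | 707 | 4.22 | 4.92 | ||",
    "Muse Spark | 262k | 52 | -- | -- | -- | -- | ||",
    "Nova 2.0 Lite (high) | 1M | 35* | $0.85 | 125 | 24.38 | 44.31 | ||",
]

parsed = [parse_row(row) for row in rows]

# Keep only rows with a non-zero price; missing ('--') and $0.00 entries
# are skipped so the ratio below never divides by zero.
priced = [r for r in parsed if r["price_usd"]]

# Hypothetical ranking: Intelligence Index points per blended dollar.
ranked = sorted(
    priced,
    key=lambda r: r["intelligence"] / r["price_usd"],
    reverse=True,
)

for r in ranked:
    print(r["model"], round(r["intelligence"] / r["price_usd"], 1))
```

Rows with `--` in a cell come back as `None` for that field, so any ranking has to decide explicitly how to treat models without published measurements.
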
Frequently Asked Questions
Claude Opus 4.7 (Adaptive Reasoning, Max Effort) currently ranks #1 on the Artificial Analysis LLM Leaderboard with an Intelligence Index score of 57, out of 328 models ranked.
The top models by Intelligence Index are: 1. Claude Opus 4.7 (Adaptive Reasoning, Max Effort) (57), 2. Gemini 3.1 Pro Preview (57), 3. GPT-5.4 (xhigh) (57), 4. Kimi K2.6 (54), 5. GPT-5.3 Codex (xhigh) (54).
Mercury 2 is the fastest at 707.2 tokens per second, followed by Granite 4.0 H Small (426.0 t/s) and Granite 3.3 8B (Non-reasoning) (379.5 t/s).
Qwen3.5 0.8B (Non-reasoning) is the most affordable at $0.02 per 1M tokens (blended 3:1 input-to-output), followed by Qwen3.5 0.8B (Reasoning) ($0.02) and Gemma 3n E4B Instruct ($0.03).
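The "blended 3:1 input-to-output" price mentioned above can be read as a weighted average of the per-token prices. The sketch below assumes exactly that interpretation (three parts input price to one part output price); the function name `blended_price` and the example prices are hypothetical and do not refer to any specific model.

```python
def blended_price(
    input_price: float,
    output_price: float,
    input_weight: float = 3.0,
    output_weight: float = 1.0,
) -> float:
    """Weighted average of input and output prices (USD per 1M tokens).

    Assumes the blend is a simple weighted mean with the given ratio.
    """
    total_weight = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total_weight


# Hypothetical prices: $1.00 per 1M input tokens, $5.00 per 1M output tokens.
# (3 * 1.00 + 1 * 5.00) / 4 = 2.00
print(blended_price(1.00, 5.00))
```

Under this reading, a model whose output tokens cost far more than its input tokens still shows a blended price closer to the input price, because input tokens carry three quarters of the weight.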
Kimi K2.6 is the highest-ranked open weights model with an Intelligence Index score of 54. There are 202 open weights models out of 328 total on the leaderboard.
The top open weights models by Intelligence Index are: 1. Kimi K2.6 (54), 2. GLM-5.1 (Reasoning) (51), 3. GLM-5 (Reasoning) (50).
Claude Opus 4.7 (Adaptive Reasoning, Max Effort) leads among 164 reasoning models with an Intelligence Index score of 57. Reasoning models use extended thinking to solve complex problems before responding.
The leaderboard includes filters to narrow results by model type (reasoning vs non-reasoning), openness (open weights vs proprietary), and other criteria. You can also adjust prompt options to see how performance varies with different input lengths.
Click on any model name in the leaderboard to visit its dedicated comparison page with detailed charts covering intelligence, pricing, speed, latency, and more. You can also compare API providers for each model.