Artificial Analysis LLM Performance Leaderboard
Independent performance benchmarks & pricing across API providers of LLMs. Definitions are below the table.
For further analysis and methodology, see artificialanalysis.ai.
Model | Context window | Intelligence index | Price (USD per 1M tokens) | Output speed (tokens/s) | Latency (s) | Total response time (s) | Reasoning time (s)
---|---|---|---|---|---|---|---
GPT-5 (high) | 400k | 69 | $3.44 | 143.2 | 63.76 | 67.25 | N/A
GPT-5 (high) | 272k | 69 | $3.44 | 218.3 | 42.87 | 45.16 | N/A
GPT-5 (medium) | 400k | 68 | $3.44 | 179.7 | 35.26 | 38.04 | N/A
GPT-5 (medium) | 272k | 68 | $3.44 | 216.9 | 23.40 | 25.71 | N/A
Grok 4 | 256k | 68 | $6.00 | 48.6 | 6.53 | 16.81 | N/A
o3 | 200k | 67 | $3.50 | 218.8 | 11.05 | 13.34 | N/A
o3 | 200k | 67 | $3.50 | 85.6 | 28.26 | 34.10 | N/A
o4-mini (high) | 200k | 65 | $1.93 | 107.3 | 48.44 | 53.10 | N/A
o4-mini (high) | 200k | 65 | $1.93 | 160.4 | 23.75 | 26.86 | N/A
Gemini 2.5 Pro (AI_Studio) | 1m | 65 | $3.44 | 140.2 | 30.36 | 33.93 | N/A
Gemini 2.5 Pro Vertex | 1m | 65 | $3.44 | 140.3 | 34.26 | 37.82 | N/A
GPT-5 mini (medium) | 400k | 64 | $0.69 | 73.6 | 29.17 | 35.96 | N/A
GPT-5 mini (medium) | 400k | 64 | $0.69 | 146.3 | 14.50 | 17.92 | N/A
Qwen3 235B 2507 (Reasoning) | 256k | 64 | $1.24 | 66.7 | 0.43 | 37.91 | 29.99
Qwen3 235B 2507 (Reasoning) | 131k | 64 | $0.75 | 1,490.2 | 0.26 | 1.94 | 1.34
Qwen3 235B 2507 (Reasoning) | 262k | 64 | $0.39 | 114.1 | 0.65 | 22.55 | 17.52
Qwen3 235B 2507 (Reasoning) (FP8) | 262k | 64 | $0.25 | 28.5 | 0.59 | 88.28 | 70.15
Qwen3 235B 2507 (Reasoning) | 131k | 64 | $0.97 | 49.7 | 0.80 | 51.09 | 40.23
Qwen3 235B 2507 (Reasoning) (FP8) | 131k | 64 | $1.20 | 66.3 | 0.53 | 38.24 | 30.17
Qwen3 235B 2507 (Reasoning) | 262k | 64 | $1.24 | 60.0 | 0.31 | 41.98 | 33.34
GPT-5 (low) | 400k | 63 | $3.44 | 184.8 | 14.17 | 16.88 | N/A
gpt-oss-120B (high) | 131k | 61 | $0.26 | 169.9 | 0.46 | 15.18 | 11.77
gpt-oss-120B (high) | 131k | 61 | $0.36 | 3,188.1 | 0.27 | 1.05 | 0.63
gpt-oss-120B (high) | 131k | 61 | $0.26 | 223.4 | 0.38 | 11.57 | 8.95
gpt-oss-120B (high) Base | 128k | 61 | $0.26 | 193.2 | 0.54 | 13.48 | 10.35
gpt-oss-120B (high) Vertex | 131k | 61 | $0.26 | 230.0 | 0.18 | 11.05 | 8.70
gpt-oss-120B (high) | 131k | 61 | $0.10 | 193.0 | 0.21 | 13.16 | 10.36
gpt-oss-120B (high) | 131k | 61 | $0.26 | 270.3 | 36.92 | 46.17 | 7.40
gpt-oss-120B (high) | 131k | 61 | $0.26 | 307.2 | 0.44 | 8.58 | 6.51
gpt-oss-120B (high) | 131k | 61 | $0.18 | 173.0 | 0.23 | 14.68 | 11.56
gpt-oss-120B (high) | 131k | 61 | $0.20 | 194.1 | 0.68 | 13.56 | 10.30
gpt-oss-120B (high) | 131k | 61 | $0.26 | 228.9 | 33.58 | 44.51 | 8.74
gpt-oss-120B (high) | 131k | 61 | $0.30 | 498.8 | 0.19 | 5.21 | 4.01
gpt-oss-120B (high) | 131k | 61 | $0.26 | 333.7 | 43.10 | 50.59 | 5.99
gpt-oss-120B (high) | 128k | 61 | $0.45 | 125.3 | 105.76 | 125.72 | 15.97
DeepSeek V3.1 (Reasoning) | 128k | 60 | $0.96 | 19.4 | 2.95 | 131.61 | 102.92
DeepSeek V3.1 (Reasoning) (FP8) | 164k | 60 | $0.90 | 22.4 | 1.09 | 112.71 | 89.30
DeepSeek V3.1 (Reasoning) | 33k | 60 | $3.38 | 169.7 | 1.63 | 16.37 | 11.79
Claude 4 Sonnet Thinking | 1m | 59 | $6.00 | 48.6 | 1.20 | 52.65 | 41.16
Claude 4 Sonnet Thinking Vertex | 1m | 59 | $6.00 | 50.1 | 1.44 | 51.38 | 39.95
Claude 4 Sonnet Thinking | 1m | 59 | $6.00 | 44.9 | 1.78 | 57.43 | 44.52
DeepSeek R1 0528 | 164k | 59 | $0.92 | 39.8 | 0.40 | 63.14 | 50.19
DeepSeek R1 0528 | 64k | 59 | $0.96 | 19.1 | 2.82 | 133.76 | 104.75
DeepSeek R1 0528 | 164k | 59 | $1.59 | 60.9 | 0.48 | 41.52 | 32.84
DeepSeek R1 0528 | 164k | 59 | $3.00 | 90.4 | 2.46 | 30.12 | 22.13
DeepSeek R1 0528 | 164k | 59 | $1.20 | 23.6 | 0.64 | 106.47 | 84.66
DeepSeek R1 0528 Fast | 164k | 59 | $3.00 | 281.5 | 1.07 | 9.95 | 7.10
DeepSeek R1 0528 (Vertex) | 164k | 59 | $2.36 | 213.9 | 0.50 | 12.19 | 9.35
DeepSeek R1 0528 | 128k | 59 | $0.53 | 48.8 | 0.26 | 51.47 | 40.97
DeepSeek R1 0528 | 128k | 59 | $2.36 | 76.9 | 0.54 | 33.06 | 26.02
DeepSeek R1 0528 Fast | 164k | 59 | $4.25 | 231.6 | 0.53 | 11.32 | 8.64
DeepSeek R1 0528 | 164k | 59 | $0.91 | 80.4 | 0.37 | 31.45 | 24.87
DeepSeek R1 0528 | 164k | 59 | $1.15 | 34.1 | 0.60 | 73.95 | 58.68
DeepSeek R1 0528 | 131k | 59 | $1.18 | 112.0 | 0.51 | 22.83 | 17.86
DeepSeek R1 0528 | 33k | 59 | $5.50 | 175.6 | 1.92 | 16.15 | 11.39
DeepSeek R1 0528 | 164k | 59 | $4.00 | 273.9 | 0.82 | 9.95 | 7.30
DeepSeek R1 0528 (Throughput) | 164k | 59 | $0.96 | 47.6 | 0.73 | 53.30 | 42.06
Gemini 2.5 Flash (Reasoning) (AI_Studio) | 1m | 58 | $0.85 | 253.0 | 15.70 | 17.68 | N/A
Gemini 2.5 Flash (Reasoning) (Vertex) | 1m | 58 | $0.85 | 304.8 | 13.30 | 14.94 | N/A
Grok 3 mini Reasoning (high) | 131k | 58 | $0.35 | 186.8 | 0.58 | 13.96 | 10.71
Grok 3 mini Reasoning (high) Fast | 131k | 58 | $1.45 | 187.9 | 0.69 | 14.00 | 10.64
Grok 3 mini Reasoning (high) | 32k | 58 | $0.00 | 168.1 | 0.41 | 15.28 | 11.90
GLM-4.5 (FP8) | 131k | 56 | $0.97 | 38.6 | 0.56 | 65.24 | 51.75
GLM-4.5 | 131k | 56 | $0.91 | 58.9 | 0.38 | 42.85 | 33.97
Claude 4 Opus Thinking | 200k | 55 | $30.00 | 18.0 | 2.84 | 141.90 | 111.25
Claude 4 Opus Thinking Vertex | 200k | 55 | $30.00 | 42.4 | 1.73 | 60.67 | 47.16
Claude 4 Opus Thinking | 200k | 55 | $30.00 | 43.4 | 1.74 | 59.30 | 46.05
Qwen3 30B 2507 (Reasoning) | 262k | 54 | $0.15 | 135.4 | 0.62 | 19.07 | 14.77
Qwen3 30B 2507 (Reasoning) (FP8) | 262k | 54 | $0.90 | 184.3 | 0.46 | 14.02 | 10.85
GPT-5 nano (medium) | 400k | 54 | $0.14 | 186.3 | 32.47 | 35.15 | N/A
GPT-5 nano (medium) | 400k | 54 | $0.14 | 253.3 | 25.85 | 27.82 | N/A
GLM-4.5-Air | 128k | 53 | $0.32 | 93.3 | 1.31 | 28.11 | 21.44
GLM-4.5-Air | 131k | 53 | $0.42 | 157.4 | 0.24 | 16.13 | 12.71
GLM-4.5-Air (FP8) | 131k | 53 | $0.42 | 201.4 | 0.37 | 12.78 | 9.93
Qwen3 235B 2507 (Non-reasoning) | 262k | 51 | $0.33 | 57.2 | 0.42 | 9.17 | N/A
Qwen3 235B 2507 (Non-reasoning) | 131k | 51 | $0.75 | 1,288.0 | 0.32 | 0.71 | N/A
Qwen3 235B 2507 (Non-reasoning) | 262k | 51 | $2.00 | 59.7 | 2.09 | 10.46 | N/A
Qwen3 235B 2507 (Non-reasoning) | 262k | 51 | $0.30 | 59.4 | 0.65 | 9.06 | N/A
Qwen3 235B 2507 (Non-reasoning) Vertex | 256k | 51 | $0.44 | 84.4 | 0.43 | 6.36 | N/A
Qwen3 235B 2507 (Non-reasoning) (FP8) | 262k | 51 | $0.39 | 100.2 | 0.64 | 5.63 | N/A
Qwen3 235B 2507 (Non-reasoning) | 262k | 51 | $0.25 | 18.0 | 0.38 | 28.19 | N/A
Qwen3 235B 2507 (Non-reasoning) | 262k | 51 | $0.31 | 27.8 | 0.99 | 18.99 | N/A
Qwen3 235B 2507 (Non-reasoning) (FP8) | 131k | 51 | $0.40 | 54.0 | 0.53 | 9.80 | N/A
Qwen3 235B 2507 (Non-reasoning) (FP8) | 262k | 51 | $0.30 | 36.0 | 0.29 | 14.16 | N/A
EXAONE 4.0 32B (Reasoning) | 131k | 51 | $0.70 | 81.3 | 0.27 | 31.02 | 24.60
gpt-oss-20B (high) (Fast) | 131k | 49 | $0.09 | 336.5 | 5.31 | 12.73 | 5.94
gpt-oss-20B (high) | 131k | 49 | $0.13 | 260.7 | 13.38 | 22.97 | 7.67
gpt-oss-20B (high) Base | 128k | 49 | $0.09 | 182.9 | 0.55 | 14.22 | 10.94
gpt-oss-20B (high) Vertex | 131k | 49 | $0.13 | 328.5 | 0.15 | 7.76 | 6.09
gpt-oss-20B (high) | 131k | 49 | $0.05 | 264.2 | 0.19 | 9.65 | 7.57
gpt-oss-20B (high) | 131k | 49 | $0.09 | 352.0 | 0.58 | 7.68 | 5.68
gpt-oss-20B (high) | 131k | 49 | $0.07 | 190.2 | 0.20 | 13.34 | 10.52
gpt-oss-20B (high) | 131k | 49 | $0.09 | 191.1 | 0.53 | 13.61 | 10.47
gpt-oss-20B (high) | 131k | 49 | $0.20 | 1,117.5 | 0.23 | 2.47 | 1.79
gpt-oss-20B (high) | 131k | 49 | $0.09 | 233.8 | 7.28 | 17.98 | 8.55
gpt-oss-20B (high) | 128k | 49 | $0.23 | 94.1 | 26.51 | 53.08 | 21.26
DeepSeek V3.1 (Non-reasoning) | 128k | 49 | $0.48 | 19.2 | 2.87 | 28.93 | N/A
DeepSeek V3.1 (Non-reasoning) (FP8) | 164k | 49 | $0.47 | 53.9 | 0.35 | 9.63 | N/A
DeepSeek V3.1 (Non-reasoning) | 33k | 49 | $3.38 | 133.4 | 2.25 | 6.00 | N/A
Kimi K2 | 131k | 49 | $2.13 | 42.2 | 0.51 | 12.36 | N/A
Kimi K2 | 131k | 49 | $0.97 | 51.6 | 0.64 | 10.33 | N/A
Kimi K2 | 131k | 49 | $1.07 | 77.2 | 0.54 | 7.02 | N/A
Kimi K2 | 131k | 49 | $0.88 | 44.4 | 0.26 | 11.51 | N/A
Kimi K2 | 131k | 49 | $1.00 | 57.2 | 1.23 | 9.96 | N/A
Kimi K2 | 131k | 49 | $1.50 | 25.4 | 0.58 | 20.26 | N/A
Kimi K2 | 131k | 49 | $1.50 | 346.2 | 0.23 | 1.67 | N/A
Kimi K2 | 131k | 49 | $1.50 | 19.1 | 0.72 | 26.84 | N/A
Kimi K2 | 131k | 49 | $1.07 | 72.6 | 0.19 | 7.07 | N/A
Gemini 2.5 Flash (AI_Studio) | 1m | 47 | $0.85 | 222.0 | 0.32 | 2.57 | N/A
Gemini 2.5 Flash (Vertex) | 1m | 47 | $0.85 | 233.7 | 0.32 | 2.46 | N/A
Gemini 2.5 Flash-Lite (Reasoning) (AI Studio) | 1m | 47 | $0.17 | 439.0 | 8.72 | 9.86 | N/A
GPT-4.1 | 1m | 47 | $3.50 | 117.0 | 0.53 | 4.80 | N/A
GPT-4.1 | 1m | 47 | $3.50 | 146.4 | 0.84 | 4.26 | N/A
Claude 4 Opus | 200k | 47 | $30.00 | 21.9 | 2.83 | 25.66 | N/A
Claude 4 Opus Vertex | 200k | 47 | $30.00 | 42.0 | 1.85 | 13.76 | N/A
Claude 4 Opus | 200k | 47 | $30.00 | 39.6 | 1.95 | 14.57 | N/A
Llama Nemotron Ultra Reasoning Base | 131k | 46 | $0.90 | 37.1 | 0.68 | 68.06 | 53.90
Claude 4 Sonnet | 1m | 46 | $6.00 | 69.1 | 1.25 | 8.48 | N/A
Claude 4 Sonnet Vertex | 1m | 46 | $6.00 | 68.3 | 1.50 | 8.82 | N/A
Claude 4 Sonnet | 1m | 46 | $6.00 | 70.3 | 1.52 | 8.63 | N/A
GPT-4.1 mini | 1m | 46 | $0.70 | 63.5 | 0.47 | 8.35 | N/A
GPT-4.1 mini | 1m | 46 | $0.70 | 104.2 | 0.83 | 5.63 | N/A
Qwen3 Coder 480B (FP8) | 262k | 45 | $1.63 | 54.2 | 0.43 | 9.66 | N/A
Qwen3 Coder 480B | 131k | 45 | $2.00 | 1,490.0 | 0.26 | 0.60 | N/A
Qwen3 Coder 480B (FP8) | 262k | 45 | $2.00 | 65.2 | 3.48 | 11.15 | N/A
Qwen3 Coder 480B | 262k | 45 | $0.75 | 62.6 | 0.62 | 8.60 | N/A
Qwen3 Coder 480B Vertex | 262k | 45 | $1.75 | 70.7 | 0.38 | 7.45 | N/A
Qwen3 Coder 480B | 262k | 45 | $0.79 | 114.1 | 0.51 | 4.89 | N/A
Qwen3 Coder 480B (FP8) | 262k | 45 | $0.70 | 64.2 | 0.34 | 8.12 | N/A
Qwen3 Coder 480B (Turbo, FP4) | 262k | 45 | $0.53 | 70.0 | 0.22 | 7.37 | N/A
Qwen3 Coder 480B | 262k | 45 | $1.10 | 67.9 | 0.75 | 8.12 | N/A
Qwen3 Coder 480B (FP8) | 262k | 45 | $1.10 | 84.9 | 0.51 | 6.40 | N/A
Qwen3 Coder 480B (FP8) | 262k | 45 | $2.00 | 46.7 | 0.43 | 11.13 | N/A
Qwen3 30B 2507 (Non-reasoning) | 262k | 44 | $0.15 | 125.0 | 0.59 | 4.59 | N/A
Qwen3 30B 2507 (Non-reasoning) (FP8) | 262k | 44 | $0.50 | 165.1 | 0.49 | 3.52 | N/A
GPT-5 (minimal) | 400k | 44 | $3.44 | 129.5 | 1.06 | 4.92 | N/A
Solar Pro 2 (Reasoning) | 66k | 43 | $0.50 | 101.6 | 1.33 | 25.93 | 19.68
QwQ-32B | 131k | 42 | $0.20 | 110.6 | 3.46 | 30.49 | 22.52
QwQ-32B Fast | 131k | 42 | $0.75 | 80.7 | 0.54 | 37.61 | 30.88
QwQ-32B Base | 131k | 42 | $0.23 | 50.4 | 0.60 | 59.89 | 49.39
QwQ-32B | 131k | 42 | $0.09 | 35.5 | 0.53 | 84.89 | 70.25
QwQ-32B | 131k | 42 | $0.75 | 47.3 | 0.40 | 63.59 | 52.63
QwQ-32B | 131k | 42 | $1.20 | 86.5 | 0.27 | 34.86 | 28.81
Llama 4 Maverick (FP8) | 1m | 42 | $0.28 | 131.4 | 0.23 | 4.03 | N/A
Llama 4 Maverick (FP8) | 1m | 42 | $0.35 | 125.2 | 0.37 | 4.36 | N/A
Llama 4 Maverick | 32k | 42 | $0.30 | 2,309.9 | 0.26 | 0.48 | N/A
Llama 4 Maverick | 128k | 42 | $0.42 | 273.8 | 0.62 | 2.44 | N/A
Llama 4 Maverick Vertex | 524k | 42 | $0.55 | 248.2 | 0.32 | 2.34 | N/A
Llama 4 Maverick (FP8) | 128k | 42 | $0.61 | 140.7 | 0.34 | 3.89 | N/A
Llama 4 Maverick (Base) | 1m | 42 | $0.39 | 26.4 | 2.23 | 21.17 | N/A
Llama 4 Maverick (FP8) | 1m | 42 | $0.26 | 60.1 | 0.30 | 8.62 | N/A
Llama 4 Maverick (Turbo, FP8) | 8k | 42 | $0.50 | 824.9 | 0.21 | 0.81 | N/A
Llama 4 Maverick (FP8) | 1m | 42 | $0.34 | 82.3 | 0.46 | 6.53 | N/A
Llama 4 Maverick (FP8) | 1m | 42 | $0.39 | 141.3 | 0.81 | 4.35 | N/A
Llama 4 Maverick | 131k | 42 | $0.30 | 501.1 | 0.16 | 1.16 | N/A
Llama 4 Maverick | 131k | 42 | $0.92 | 693.8 | 0.36 | 1.08 | N/A
Llama 4 Maverick | 1m | 42 | $0.41 | 78.5 | 0.21 | 6.58 | N/A
DeepSeek R1 0528 Qwen3 8B | 131k | 39 | $0.06 | 91.7 | 0.38 | 27.64 | 21.81
DeepSeek R1 0528 Qwen3 8B | 128k | 39 | $0.07 | 79.6 | 0.84 | 32.23 | 25.12
Mistral Medium 3 | 131k | 39 | $0.80 | 47.9 | 0.42 | 10.86 | N/A
Mistral Medium 3 | 128k | 39 | $0.80 | 47.9 | 0.43 | 10.87 | N/A
Mistral Medium 3.1 | 131k | 38 | $0.80 | 42.8 | 0.40 | 12.08 | N/A
Magistral Medium | 41k | 38 | $2.75 | 146.3 | 0.39 | 17.48 | 13.67
EXAONE 4.0 32B | 131k | 37 | $0.70 | 76.2 | 0.28 | 6.85 | N/A
Magistral Small | 40k | 36 | $0.75 | 181.6 | 0.35 | 14.12 | 11.01
Qwen3 Coder 30B | 262k | 36 | $0.15 | 127.2 | 0.54 | 4.47 | N/A
Qwen3 Coder 30B | 262k | 36 | $0.26 | 180.7 | 0.48 | 3.24 | N/A
Gemini 2.5 Flash-Lite (AI Studio) | 1m | 35 | $0.17 | 314.1 | 0.21 | 1.80 | N/A
Nova Premier | 1m | 35 | $5.00 | 75.9 | 0.87 | 7.46 | N/A
Solar Pro 2 | 66k | 33 | $0.50 | 106.1 | 1.30 | 6.01 | N/A
Llama 4 Scout | 1m | 33 | $0.14 | 88.4 | 0.18 | 5.83 | N/A
Llama 4 Scout | 32k | 33 | $0.70 | 2,068.4 | 0.27 | 0.52 | N/A
Llama 4 Scout | 128k | 33 | $0.29 | 137.3 | 0.54 | 4.18 | N/A
Llama 4 Scout Vertex | 1m | 33 | $0.36 | 129.4 | 0.51 | 4.38 | N/A
Llama 4 Scout | 10m | 33 | $0.11 | 113.3 | 0.53 | 4.94 | N/A
Llama 4 Scout | 128k | 33 | $0.34 | 103.3 | 0.32 | 5.16 | N/A
Llama 4 Scout (Base) | 10m | 33 | $0.26 | 137.5 | 0.53 | 4.17 | N/A
Llama 4 Scout | 328k | 33 | $0.14 | 49.8 | 0.30 | 10.34 | N/A
Llama 4 Scout | 131k | 33 | $0.20 | 63.1 | 0.83 | 8.76 | N/A
Llama 4 Scout | 1m | 33 | $0.18 | 112.6 | 0.41 | 4.85 | N/A
Llama 4 Scout | 131k | 33 | $0.17 | 435.6 | 0.20 | 1.35 | N/A
Llama 4 Scout | 1m | 33 | $0.28 | 93.0 | 0.19 | 5.57 | N/A
Llama 4 Scout | 131k | 33 | $0.41 | 109.2 | 0.71 | 5.29 | N/A
Mistral Small 3.2 | 131k | 32 | $0.15 | 140.0 | 0.30 | 3.87 | N/A
Mistral Small 3.2 (FP8) | 128k | 32 | $0.06 | 27.6 | 0.35 | 18.47 | N/A
Command A | 256k | 32 | $4.38 | 150.7 | 0.21 | 3.53 | N/A
Command A | 256k | 32 | $4.38 | 38.3 | 0.64 | 13.69 | N/A
GPT-4.1 nano | 1m | 32 | $0.17 | 158.0 | 0.40 | 3.56 | N/A
GPT-4.1 nano | 1m | 32 | $0.17 | 169.2 | 0.64 | 3.60 | N/A
Devstral Medium | 131k | 31 | $0.80 | 104.0 | 0.39 | 5.20 | N/A
Llama 3.3 70B (FP8) | 128k | 31 | $0.17 | 38.2 | 0.23 | 13.33 | N/A
Llama 3.3 70B (FP8) | 131k | 31 | $0.28 | 86.4 | 0.41 | 6.20 | N/A
Llama 3.3 70B | 128k | 31 | $0.94 | 2,143.8 | 0.24 | 0.47 | N/A
Llama 3.3 70B | 131k | 31 | $0.40 | 27.0 | 2.98 | 21.51 | N/A
Llama 3.3 70B | 128k | 31 | $0.71 | 177.0 | 0.54 | 3.37 | N/A
Llama 3.3 70B Fast | 128k | 31 | $0.38 | 199.5 | 0.57 | 3.08 | N/A
Llama 3.3 70B Base | 128k | 31 | $0.20 | 30.5 | 0.73 | 17.13 | N/A
Llama 3.3 70B Vertex | 128k | 31 | $0.72 | 115.6 | 0.21 | 4.53 | N/A
Llama 3.3 70B Snowflake | 8k | 31 | $0.58 | 148.5 | 0.32 | 3.69 | N/A
Llama 3.3 70B | 128k | 31 | $0.40 | 206.8 | 0.24 | 2.66 | N/A
Llama 3.3 70B | 128k | 31 | $0.71 | 41.0 | 0.43 | 12.64 | N/A
Llama 3.3 70B | 131k | 31 | $0.90 | 147.0 | 0.42 | 3.83 | N/A
Llama 3.3 70B (Turbo, FP8) | 131k | 31 | $0.06 | 41.4 | 0.36 | 12.44 | N/A
Llama 3.3 70B | 131k | 31 | $0.27 | 23.4 | 0.58 | 21.93 | N/A
Llama 3.3 70B | 128k | 31 | $0.60 | 142.2 | 0.29 | 3.80 | N/A
Llama 3.3 70B | 128k | 31 | $0.60 | 140.2 | 0.29 | 3.86 | N/A
Llama 3.3 70B | 131k | 31 | $0.20 | 36.1 | 0.60 | 14.43 | N/A
Llama 3.3 70B | 131k | 31 | $0.64 | 380.9 | 0.24 | 1.56 | N/A
Llama 3.3 70B | 128k | 31 | $0.75 | 369.3 | 0.42 | 1.78 | N/A
Llama 3.3 70B Turbo | 131k | 31 | $0.88 | 89.1 | 0.36 | 5.97 | N/A
Llama 3.3 70B | 24k | 31 | $0.78 | 27.1 | 0.93 | 19.40 | N/A
Llama 3.1 405B | 128k | 29 | $9.50 | 15.9 | 1.04 | 32.51 | N/A
Llama 3.1 405B | 131k | 29 | $4.00 | 78.1 | 2.27 | 8.67 | N/A
Llama 3.1 405B Standard | 128k | 29 | $2.40 | 25.6 | 1.83 | 21.32 | N/A
Llama 3.1 405B Latency Optimized | 128k | 29 | $3.00 | 75.9 | 0.44 | 7.03 | N/A
Llama 3.1 405B Base | 128k | 29 | $1.50 | 27.7 | 0.74 | 18.77 | N/A
Llama 3.1 405B Vertex | 128k | 29 | $7.75 | 25.1 | 0.41 | 20.29 | N/A
Llama 3.1 405B | 128k | 29 | $8.00 | 26.3 | 0.47 | 19.50 | N/A
Llama 3.1 405B | 131k | 29 | $3.00 | 78.9 | 0.55 | 6.88 | N/A
Llama 3.1 405B | 16k | 29 | $6.25 | 145.8 | 0.60 | 4.03 | N/A
Llama 3.1 405B | 128k | 29 | $7.50 | 31.4 | 0.89 | 16.81 | N/A
Llama 3.1 405B Turbo | 131k | 29 | $3.50 | 74.4 | 0.52 | 7.24 | N/A
Phi-4 | 16k | 28 | $0.15 | 96.9 | 0.56 | 5.72 | N/A
Phi-4 | 16k | 28 | $0.22 | 36.2 | 0.42 | 14.24 | N/A
Phi-4 | 16k | 28 | $0.09 | 37.3 | 0.25 | 13.64 | N/A
Llama 3.1 Nemotron 70B | 131k | 26 | $0.17 | 30.9 | 0.62 | 16.79 | N/A
Gemma 3 27B | 131k | 25 | $0.29 | 52.4 | 0.42 | 9.97 | N/A
Gemma 3 27B (AI_Studio) | 128k | 25 | $0.00 | 47.0 | 0.61 | 11.25 | N/A
Gemma 3 27B | 131k | 25 | $0.11 | 30.0 | 0.43 | 17.08 | N/A
Jamba 1.7 Large | 256k | 24 | $3.50 | 45.1 | 0.83 | 11.92 | N/A
Gemma 3 12B | 131k | 24 | $0.06 | 40.3 | 0.31 | 12.70 | N/A
Gemma 3 12B | 80k | 24 | $0.40 | 77.5 | 0.59 | 7.05 | N/A
Gemma 3n E4B | 33k | 18 | $0.03 | 72.9 | 0.32 | 7.18 | N/A
Gemma 3 4B | 131k | 18 | $0.03 | 57.7 | 0.58 | 9.24 | N/A
Granite 3.3 8B | 128k | 18 | $0.09 | 103.9 | 0.37 | 5.18 | N/A
Llama 3.2 11B (Vision) | 128k | 17 | $0.16 | 155.5 | 0.46 | 3.67 | N/A
Llama 3.2 11B (Vision) | 128k | 17 | $0.37 | 72.2 | 0.35 | 7.27 | N/A
Llama 3.2 11B (Vision) | 131k | 17 | $0.05 | 42.5 | 0.22 | 11.98 | N/A
GPT-4o mini | 128k | | $0.26 | 58.3 | 0.50 | 9.08 | N/A
GPT-4o mini | 128k | | $0.26 | 68.4 | 1.00 | 8.31 | N/A
o3-mini | 200k | | $1.93 | 158.4 | 13.46 | 16.62 | N/A
o3-mini | 200k | | $1.93 | 160.0 | 12.03 | 15.16 | N/A
o3-mini (high) | 200k | | $1.93 | 152.7 | 41.55 | 44.82 | N/A
o3-mini (high) | 200k | | $1.93 | 158.3 | 37.77 | 40.93 | N/A
o3-pro | 200k | | $35.00 | 20.1 | 121.55 | 146.40 | N/A
o3-pro | 200k | | $35.00 | 22.2 | 128.66 | 151.21 | N/A
Llama 3.2 90B (Vision) | 128k | | $0.72 | 49.5 | 0.50 | 10.60 | N/A
Llama 3.2 90B (Vision) Vertex | 128k | | $0.00 | 26.9 | 0.19 | 18.80 | N/A
Llama 3.2 90B (Vision) | 128k | | $2.04 | 35.3 | 0.35 | 14.52 | N/A
Llama 3.2 90B (Vision) | 33k | | $0.36 | 25.3 | 0.37 | 20.10 | N/A
Gemma 3n E2B (AI Studio) | 32k | | $0.00 | 47.1 | 0.27 | 10.89 | N/A
Claude 4.1 Opus Thinking | 200k | | $30.00 | 19.1 | 3.55 | 134.63 | 104.87
Claude 4.1 Opus Thinking Vertex | 200k | | $30.00 | 40.4 | 1.64 | 63.52 | 49.50
Claude 4.1 Opus Thinking | 200k | | $30.00 | 22.5 | 1.52 | 112.50 | 88.78
Claude 4.1 Opus | 200k | | $30.00 | 20.1 | 3.60 | 28.42 | N/A
Claude 4.1 Opus Vertex | 200k | | $30.00 | 42.1 | 1.63 | 13.51 | N/A
Claude 4.1 Opus | 200k | | $30.00 | 23.7 | 1.61 | 22.69 | N/A
Ministral 8B | 131k | | $0.10 | 164.5 | 0.30 | 3.34 | N/A
Ministral 3B | 131k | | $0.04 | 265.1 | 0.30 | 2.19 | N/A
Codestral (Jan '25) | 262k | | $0.45 | 169.5 | 0.31 | 3.26 | N/A
Codestral (Jan '25) Vertex | 128k | | $0.45 | 124.8 | 0.18 | 4.19 | N/A
Devstral Small | 131k | | $0.15 | 141.6 | 0.34 | 3.87 | N/A
Devstral Small | 128k | | $0.12 | 123.2 | 0.55 | 4.61 | N/A
Devstral Small | 128k | | $0.12 | 81.4 | 0.53 | 6.68 | N/A
Grok 3 | 131k | | $6.00 | 35.0 | 1.14 | 15.44 | N/A
Grok 3 Fast | 131k | | $10.00 | 67.7 | 0.73 | 8.12 | N/A
Grok 3 | 16k | | $6.00 | 59.2 | 0.53 | 8.98 | N/A
Grok 3 mini Reasoning (low) | 131k | | $0.35 | 141.4 | 0.59 | 18.27 | 14.15
Grok 3 mini Reasoning (low) Fast | 131k | | $1.45 | 188.5 | 0.57 | 13.84 | 10.61
Phi-4 Multimodal | 128k | | $0.00 | 17.4 | 0.34 | 29.05 | N/A
Aya Expanse 32B | 128k | | $0.75 | 100.2 | 0.17 | 5.16 | N/A
Aya Expanse 8B | 8k | | $0.75 | 135.1 | 0.14 | 3.84 | N/A
Jamba 1.7 Mini | 258k | | $0.25 | 137.0 | 0.72 | 4.37 | N/A
Key definitions
Context window: Maximum number of combined input and output tokens. Output tokens commonly have a significantly lower limit, which varies by model.
Output Speed: Tokens per second received while the model is generating tokens (i.e., after the first chunk has been received from the API, for models that support streaming).
Latency (Time to First Token): Time, in seconds, from sending the API request to receiving the first token. For reasoning models that share reasoning tokens, this is the first reasoning token. For models that do not support streaming, this is the time to receive the full completion.
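
As a rough illustration of the two definitions above, the sketch below derives both numbers from the arrival times of streamed chunks. The `Chunk` type and the function name are hypothetical, and excluding the first chunk's tokens from the speed calculation is an assumption about how "after the first chunk" is applied, not a description of the benchmark's actual code.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Chunk:
    received_at: float  # seconds elapsed since the API request was sent
    tokens: int         # output tokens carried by this chunk


def latency_and_output_speed(chunks: List[Chunk]) -> Tuple[float, float]:
    """Return (time_to_first_token_s, output_tokens_per_s) for one response."""
    if not chunks:
        raise ValueError("no chunks received")
    first, last = chunks[0], chunks[-1]
    # Latency: time until the first chunk arrives.
    latency_s = first.received_at
    # Output speed: tokens received after the first chunk, divided by the
    # time spent generating them (assumed interpretation).
    generation_s = last.received_at - first.received_at
    tokens_after_first = sum(c.tokens for c in chunks[1:])
    if generation_s <= 0:
        # Non-streaming response: only one chunk, so speed is undefined here.
        return latency_s, float("nan")
    return latency_s, tokens_after_first / generation_s


latency, speed = latency_and_output_speed(
    [Chunk(0.5, 1), Chunk(1.0, 50), Chunk(1.5, 50)]
)
print(f"latency={latency:.2f} s, output speed={speed:.1f} tokens/s")
```

For a non-streaming response the whole completion arrives as a single chunk, so the function reports the completion time as latency and leaves output speed undefined.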
Price: Price per token, represented as USD per million tokens. Price is a blend of input and output token prices, weighted 3:1 (three parts input to one part output).
Output Price: Price per token generated by the model (received from the API), represented as USD per million Tokens.
Input Price: Price per token included in the request/message sent to the API, represented as USD per million Tokens.
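
The blended price can be worked out as a weighted average. The sketch below reads the 3:1 ratio as three parts input to one part output; the function name and the example prices (1.25 and 10.00 USD per million tokens) are hypothetical values chosen for illustration.

```python
def blended_price(input_price: float, output_price: float,
                  input_weight: int = 3, output_weight: int = 1) -> float:
    """Weighted average of input and output prices (USD per million tokens)."""
    total_weight = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total_weight


# Hypothetical prices: 1.25 USD (input) and 10.00 USD (output) per 1M tokens.
# (3 * 1.25 + 1 * 10.00) / 4 = 3.4375
print(f"${blended_price(1.25, 10.00):.4f} per 1M tokens")
```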
Time period: Metrics are 'live' and are based on the past 72 hours of measurements. Measurements are taken 8 times per day for single requests and 2 times per day for parallel requests.
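
The timing columns in the table appear to be related: for the rows checked below, the total response time is close to latency plus reasoning time plus the time needed to generate roughly 500 output tokens at the listed output speed. The 500-token answer length is inferred from the numbers and is not stated in the definitions above, so the following is a sketch under that assumption. The function name is hypothetical, and rows with an N/A reasoning time are treated as 0.

```python
def estimated_total_response_s(latency_s: float, output_tokens_per_s: float,
                               reasoning_s: float = 0.0,
                               answer_tokens: int = 500) -> float:
    """Estimate end-to-end response time from the other timing columns.

    answer_tokens=500 is an assumption inferred from the table values.
    """
    return latency_s + reasoning_s + answer_tokens / output_tokens_per_s


# (label, latency s, output tokens/s, reasoning s, listed total s), from the table
rows = [
    ("GPT-5 (high), 400k", 63.76, 143.2, 0.0, 67.25),
    ("Qwen3 235B 2507 (Reasoning), 256k", 0.43, 66.7, 29.99, 37.91),
    ("gpt-oss-120B (high), 3,188.1 tokens/s", 0.27, 3188.1, 0.63, 1.05),
]

for label, latency, speed, reasoning, listed in rows:
    estimate = estimated_total_response_s(latency, speed, reasoning)
    print(f"{label}: estimated {estimate:.2f} s, listed {listed:.2f} s")
```

The estimates land within a few hundredths of a second of the listed totals for these rows, which is consistent with rounding in the published columns.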