Artificial Analysis LLM Performance Leaderboard
Independent performance benchmarks & pricing across API providers of LLMs. Definitions are below the table.
For further analysis and methodology, see artificialanalysis.ai.
Model | Context window | Intelligence | Blended price (USD/M tokens) | Output speed (tokens/s) | Latency (s)
---|---|---|---|---|---
o3-mini (high) | 200k | 66 | $1.93 | 202.9 | 36.43
o3-mini (high) | 200k | 66 | $1.93 | 129.0 | 53.32
o3-mini | 200k | 63 | $1.93 | 186.9 | 14.01
o3-mini | 200k | 63 | $1.93 | 125.4 | 21.73
o1 | 200k | 62 | $26.25 | 99.8 | 28.36
o1 | 200k | 62 | $26.25 | 111.2 | 26.09
DeepSeek R1 | 64k | 60 | $0.96 | 25.8 | 4.23
DeepSeek R1 | 128k | 60 | $2.00 | 109.9 | 1.40
DeepSeek R1 | 128k | 60 | $2.36 | 65.9 | 0.44
DeepSeek R1 Base | 128k | 60 | $1.20 | 24.0 | 0.66
DeepSeek R1 Fast | 128k | 60 | $3.00 | 54.0 | 0.68
DeepSeek R1 | 128k | 60 | $3.99 | 69.3 | 0.57
DeepSeek R1 | 128k | 60 | $2.36 | 62.9 | 0.57
DeepSeek R1 (Fast) | 164k | 60 | $4.25 | 108.3 | 0.83
DeepSeek R1 (Turbo, FP4) | 33k | 60 | $1.50 | 88.7 | 0.29
DeepSeek R1 | 64k | 60 | $0.96 | 13.6 | 0.47
DeepSeek R1 | 128k | 60 | $4.00 | 58.8 | 0.45
DeepSeek R1 Turbo | 64k | 60 | $1.15 | 32.8 | 0.78
DeepSeek R1 | 64k | 60 | $4.00 | 32.4 | 0.81
DeepSeek R1 | 16k | 60 | $5.50 | 256.4 | 2.22
DeepSeek R1 | 128k | 60 | $4.00 | 123.8 | 0.49
DeepSeek R1 | 128k | 60 | $7.00 | 25.2 | 0.68
QwQ-32B | 131k | 58 | $0.20 | 120.5 | 1.23
QwQ-32B Base | 131k | 58 | $0.23 | 44.3 | 0.64
QwQ-32B | 131k | 58 | $0.65 | 82.2 | 0.51
QwQ-32B | 131k | 58 | $0.90 | 131.1 | 0.63
QwQ-32B | 131k | 58 | $0.14 | 39.0 | 0.34
QwQ-32B | 33k | 58 | $0.18 | 22.4 | 1.17
QwQ-32B | 131k | 58 | $0.32 | 398.6 | 0.12
QwQ-32B | 16k | 58 | $0.63 | 352.4 | 0.93
QwQ-32B | 131k | 58 | $1.20 | 89.2 | 0.53
Claude 3.7 Sonnet Thinking | 200k | 57 | $6.00 | 57.4 | 0.00
Claude 3.7 Sonnet Thinking | 200k | 57 | $6.00 | 78.5 | 0.97
o1-mini | 128k | 54 | $1.93 | 225.3 | 10.34
o1-mini | 128k | 54 | $2.12 | 194.5 | 14.47
DeepSeek R1 Distill Qwen 32B | 128k | 52 | $0.14 | 49.4 | 0.24
DeepSeek R1 Distill Qwen 32B | 64k | 52 | $0.30 | 20.9 | 1.12
DeepSeek R1 Distill Qwen 32B | 128k | 52 | $0.69 | 138.1 | 0.38
DeepSeek V3 (Mar '25) | 64k | 52 | $0.48 | 32.9 | 4.07
DeepSeek V3 (Mar '25) | 128k | 52 | $1.45 | 43.9 | 1.21
DeepSeek V3 (Mar '25) | 128k | 52 | $1.25 | 36.1 | 0.82
DeepSeek V3 (Mar '25) | 128k | 52 | $0.75 | 29.9 | 0.66
DeepSeek V3 (Mar '25) | 164k | 52 | $0.80 | 76.3 | 0.57
DeepSeek V3 (Mar '25) | 160k | 52 | $0.90 | 63.6 | 0.84
DeepSeek V3 (Mar '25) | 164k | 52 | $0.52 | 21.4 | 0.39
DeepSeek V3 (Mar '25) | 64k | 52 | $1.20 | 31.7 | 0.65
DeepSeek V3 (Mar '25) | 8k | 52 | $1.13 | 267.5 | 2.21
DeepSeek V3 (Mar '25) | 128k | 52 | $1.25 | 27.1 | 0.70
Gemini 2.0 Pro Experimental (AI Studio) | 2m | 49 | $0.00 | |
DeepSeek R1 Distill Qwen 14B | 64k | 49 | $0.15 | 45.2 | 0.71
DeepSeek R1 Distill Qwen 14B | 128k | 49 | $1.60 | 150.0 | 0.26
DeepSeek R1 Distill Llama 70B | 128k | 48 | $0.30 | 67.3 | 0.50
DeepSeek R1 Distill Llama 70B | 66k | 48 | $0.94 | 1,928.1 | 0.18
DeepSeek R1 Distill Llama 70B Base | 128k | 48 | $0.38 | 42.4 | 0.63
DeepSeek R1 Distill Llama 70B | 128k | 48 | $0.34 | 33.0 | 0.42
DeepSeek R1 Distill Llama 70B | 32k | 48 | $0.39 | 21.5 | 1.15
DeepSeek R1 Distill Llama 70B | 128k | 48 | $0.81 | 275.2 | 0.22
DeepSeek R1 Distill Llama 70B (Spec decoding) | 128k | 48 | $0.81 | 1,365.6 | 0.41
DeepSeek R1 Distill Llama 70B | 16k | 48 | $0.88 | 124.0 | 0.98
DeepSeek R1 Distill Llama 70B | 128k | 48 | $2.00 | 113.0 | 0.33
Claude 3.7 Sonnet | 200k | 48 | $6.00 | 40.0 | 0.85
Claude 3.7 Sonnet | 200k | 48 | $6.00 | 78.5 | 0.96
Gemini 2.0 Flash (Vertex) | 1m | 48 | $0.26 | 247.4 | 0.31
Gemini 2.0 Flash (AI Studio) | 1m | 48 | $0.17 | 258.4 | 0.34
Reka Flash 3 | 128k | 47 | $0.35 | 56.7 | 0.95
DeepSeek V3 | 66k | 46 | $0.48 | 29.8 | 4.02
DeepSeek V3 (FP8) | 128k | 46 | $0.25 | 28.0 | 1.13
DeepSeek V3 | 128k | 46 | $0.75 | 18.1 | 0.70
DeepSeek V3 | 128k | 46 | $2.00 | 84.2 | 0.59
DeepSeek V3 | 128k | 46 | $1.31 | 64.3 | 0.86
DeepSeek V3 | 64k | 46 | $0.59 | 15.2 | 0.59
DeepSeek V3 Turbo | 64k | 46 | $0.63 | 28.8 | 0.86
DeepSeek V3 | 64k | 46 | $0.89 | 29.0 | 0.97
DeepSeek V3 (FP8) | 128k | 46 | $1.25 | 39.1 | 0.73
Qwen2.5 Max | 32k | 45 | $2.80 | 35.8 | 1.13
Gemini 1.5 Pro (Sep) (Vertex) | 2m | 45 | $2.19 | 94.4 | 0.51
Gemini 1.5 Pro (Sep) (AI Studio) | 2m | 45 | $2.19 | 95.6 | 0.47
Claude 3.5 Sonnet (Oct) | 200k | 44 | $6.00 | 47.5 | 0.93
Claude 3.5 Sonnet (Oct) (Vertex) | 200k | 44 | $6.00 | 79.4 | 0.92
Claude 3.5 Sonnet (Oct) | 200k | 44 | $6.00 | 79.0 | 1.40
Sonar | 127k | 43 | $1.00 | 71.5 | 2.25
Sonar Pro | 200k | 43 | $6.00 | 80.7 | 3.21
QwQ 32B-Preview | 33k | 43 | $0.20 | 58.9 | 0.85
QwQ 32B-Preview | 33k | 43 | $0.14 | 60.4 | 0.63
QwQ 32B-Preview | 33k | 43 | $0.90 | 53.1 | 0.52
QwQ 32B-Preview | 33k | 43 | $0.26 | 50.1 | 0.27
QwQ 32B-Preview | 16k | 43 | $1.88 | 387.1 | 0.78
QwQ 32B-Preview | 33k | 43 | $1.20 | 89.9 | 0.46
GPT-4o (Nov '24) | 128k | 41 | $4.38 | 138.5 | 0.51
GPT-4o (Nov '24) | 128k | 41 | $4.38 | 115.8 | 0.96
Gemini 2.0 Flash-Lite (Feb '25) (AI Studio) | 1m | 41 | $0.13 | 206.2 | 0.26
Llama 3.3 70B (FP8) | 128k | 41 | $0.17 | 34.7 | 0.57
Llama 3.3 70B | 33k | 41 | $0.94 | 2,597.0 | 0.16
Llama 3.3 70B | 128k | 41 | $0.40 | 40.4 | 1.56
Llama 3.3 70B | 128k | 41 | $0.71 | 139.0 | 0.58
Llama 3.3 70B Fast | 128k | 41 | $0.38 | 135.2 | 0.55
Llama 3.3 70B Base | 128k | 41 | $0.20 | 17.5 | 0.76
Llama 3.3 70B | 128k | 41 | $0.50 | 135.1 | 0.51
Llama 3.3 70B | 128k | 41 | $0.71 | 47.9 | 0.45
Llama 3.3 70B | 128k | 41 | $0.90 | 146.6 | 0.50
Llama 3.3 70B (Turbo, FP8) | 128k | 41 | $0.20 | 40.0 | 0.38
Llama 3.3 70B | 128k | 41 | $0.27 | 23.7 | 0.57
Llama 3.3 70B | 128k | 41 | $0.60 | 157.8 | 0.36
Llama 3.3 70B | 128k | 41 | $0.39 | 29.8 | 0.88
Llama 3.3 70B (Spec decoding) | 8k | 41 | $0.69 | 1,554.8 | 0.41
Llama 3.3 70B | 128k | 41 | $0.64 | 275.6 | 0.38
Llama 3.3 70B | 128k | 41 | $0.75 | 450.5 | 0.33
Llama 3.3 70B Turbo | 128k | 41 | $0.88 | 166.1 | 0.45
Llama 3.3 70B | 128k | 41 | $0.70 | 9.7 | 1.08
GPT-4o (May '24) | 128k | 41 | $7.50 | 137.0 | 0.40
GPT-4o (May '24) | 128k | 41 | $7.50 | 69.7 | 1.09
Llama 3.1 405B (FP8) | 128k | 40 | $0.80 | 34.8 | 0.65
Llama 3.1 405B | 128k | 40 | $9.50 | 19.1 | 0.97
Llama 3.1 405B | 128k | 40 | $4.00 | 12.9 | 0.79
Llama 3.1 405B Standard | 128k | 40 | $2.40 | 30.7 | 1.84
Llama 3.1 405B Latency Optimized | 128k | 40 | $3.00 | 64.6 | 0.73
Llama 3.1 405B Base | 128k | 40 | $1.50 | 32.8 | 0.72
Llama 3.1 405B (Vertex) | 128k | 40 | $7.75 | 29.7 | 0.39
Llama 3.1 405B | 128k | 40 | $8.00 | 31.0 | 0.49
Llama 3.1 405B | 128k | 40 | $3.00 | 84.6 | 0.64
Llama 3.1 405B | 33k | 40 | $0.90 | 24.7 | 0.48
Llama 3.1 405B | 16k | 40 | $6.25 | 170.3 | 1.24
Llama 3.1 405B | 128k | 40 | $7.50 | 37.6 | 0.72
Llama 3.1 405B Turbo | 128k | 40 | $3.50 | 114.9 | 0.68
Llama 3.1 405B | 128k | 40 | $3.50 | 14.0 | 1.05
Qwen2.5 72B | 131k | 40 | $0.40 | 20.6 | 1.57
Qwen2.5 72B | 131k | 40 | $0.20 | 30.2 | 0.73
Qwen2.5 72B Fast | 131k | 40 | $0.38 | 66.7 | 0.58
Qwen2.5 72B | 131k | 40 | $0.90 | 41.1 | 0.43
Qwen2.5 72B | 33k | 40 | $0.27 | 39.3 | 0.52
Qwen2.5 72B | 16k | 40 | $0.94 | 243.9 | 0.84
Qwen2.5 72B Turbo | 131k | 40 | $1.20 | 87.5 | 0.53
Qwen2.5 72B | 131k | 40 | $0.00 | 61.3 | 1.08
MiniMax-Text-01 | 1m | 40 | $0.42 | 34.1 | 0.93
Phi-4 | 16k | 40 | $0.15 | 118.0 | 0.53
Phi-4 | 16k | 40 | $0.22 | 37.6 | 0.47
Phi-4 | 16k | 40 | $0.09 | 39.8 | 0.61
Command A | 256k | 40 | $4.38 | 176.1 | 0.24
Tulu3 405B | 16k | 40 | $6.25 | 184.6 | 1.27
Mistral Large 2 (Nov '24) | 128k | 38 | $3.00 | 30.3 | 0.52
Mistral Large 2 (Nov '24) | 128k | 38 | $3.00 | 35.0 | 0.53
Gemma 3 27B (AI Studio) | 128k | 38 | $0.00 | 59.4 | 0.72
Gemma 3 27B | 128k | 38 | $0.07 | 45.9 | 0.48
Grok Beta | 128k | 38 | $7.50 | 61.3 | 0.30
Pixtral Large | 128k | 37 | $3.00 | 27.0 | 0.45
Qwen2.5 Instruct 32B Fast | 128k | 37 | $0.20 | 82.5 | 0.54
Qwen2.5 Instruct 32B Base | 128k | 37 | $0.10 | 59.7 | 0.57
Qwen2.5 Instruct 32B | 128k | 37 | $0.79 | 198.2 | 0.23
Llama 3.1 Nemotron 70B (FP8) | 128k | 37 | $0.17 | 35.5 | 0.62
Llama 3.1 Nemotron 70B Base | 128k | 37 | $0.20 | 40.0 | 0.65
Llama 3.1 Nemotron 70B Fast | 128k | 37 | $0.38 | 73.0 | 0.55
Llama 3.1 Nemotron 70B | 128k | 37 | $0.27 | 29.8 | 0.60
Nova Pro | 300k | 37 | $1.40 | 99.2 | 0.36
Nova Pro Latency Optimized | 300k | 37 | $1.75 | 126.8 | 0.64
Mistral Large 2 (Jul '24) | 128k | 37 | $3.00 | 37.2 | 0.50
Mistral Large 2 (Jul '24) | 128k | 37 | $3.00 | 33.0 | 0.45
Mistral Large 2 (Jul '24) | 128k | 37 | $3.00 | 35.6 | 0.51
Qwen2.5 Coder 32B | 33k | 36 | $0.09 | 63.1 | 0.54
Qwen2.5 Coder 32B | 131k | 36 | $0.20 | 58.1 | 0.74
Qwen2.5 Coder 32B | 33k | 36 | $0.90 | 58.1 | 0.39
Qwen2.5 Coder 32B | 33k | 36 | $0.10 | 48.4 | 0.58
Qwen2.5 Coder 32B | 131k | 36 | $0.79 | 197.8 | 0.37
Qwen2.5 Coder 32B | 16k | 36 | $0.63 | 339.9 | 0.74
Qwen2.5 Coder 32B | 131k | 36 | $0.80 | 73.6 | 0.45
GPT-4o mini | 128k | 36 | $0.26 | 71.2 | 0.36
GPT-4o mini | 128k | 36 | $0.26 | 144.4 | 0.99
Llama 3.1 70B (FP8) | 128k | 35 | $0.17 | 36.2 | 0.57
Llama 3.1 70B | 128k | 35 | $0.40 | 56.7 | 1.12
Llama 3.1 70B Standard | 128k | 35 | $0.72 | 31.5 | 0.66
Llama 3.1 70B Latency Optimized | 128k | 35 | $0.90 | 140.0 | 0.34
Llama 3.1 70B Base | 128k | 35 | $0.20 | 23.0 | 0.69
Llama 3.1 70B Fast | 128k | 35 | $0.38 | 91.9 | 0.57
Llama 3.1 70B (Vertex) | 128k | 35 | $0.00 | 73.0 | 0.26
Llama 3.1 70B | 128k | 35 | $2.90 | 58.2 | 0.44
Llama 3.1 70B | 128k | 35 | $0.90 | 148.4 | 0.45
Llama 3.1 70B (Turbo, FP8) | 128k | 35 | $0.20 | 29.2 | 0.45
Llama 3.1 70B | 128k | 35 | $0.27 | 36.2 | 0.34
Llama 3.1 70B | 128k | 35 | $0.60 | 194.7 | 0.34
Llama 3.1 70B | 32k | 35 | $0.35 | 42.2 | 1.14
Llama 3.1 70B | 128k | 35 | $0.75 | 450.4 | 0.30
Llama 3.1 70B | 128k | 35 | $1.50 | 54.1 | 0.50
Llama 3.1 70B Turbo | 128k | 35 | $0.88 | 188.0 | 0.44
Llama 3.1 70B | 128k | 35 | $0.90 | 125.6 | 0.51
Mistral Small 3.1 | 128k | 35 | $0.15 | 144.8 | 0.33
Mistral Small 3.1 (Vertex) | 128k | 35 | $0.15 | 207.6 | 0.18
Mistral Small 3 | 32k | 35 | $0.15 | 149.6 | 0.33
Mistral Small 3 | 32k | 35 | $0.90 | 39.7 | 0.55
Mistral Small 3 | 32k | 35 | $0.09 | 70.0 | 0.26
Mistral Small 3 | 32k | 35 | $0.80 | 95.2 | 0.22
Claude 3 Opus | 200k | 35 | $30.00 | 22.4 | 1.27
Claude 3 Opus (Vertex) | 200k | 35 | $30.00 | 27.2 | 1.13
Claude 3 Opus | 200k | 35 | $30.00 | 28.0 | 1.27
Claude 3.5 Haiku Standard | 200k | 35 | $1.60 | 42.8 | 1.22
Claude 3.5 Haiku Latency Optimized | 200k | 35 | $2.00 | 94.0 | 0.52
Claude 3.5 Haiku (Vertex) | 200k | 35 | $1.60 | 65.4 | 0.62
Claude 3.5 Haiku | 200k | 35 | $1.60 | 65.5 | 1.60
DeepSeek R1 Distill Llama 8B | 32k | 34 | $0.04 | 45.2 | 0.63
Gemini 1.5 Pro (May) (Vertex) | 2m | 34 | $2.19 | 66.2 | 0.42
Gemini 1.5 Pro (May) (AI Studio) | 2m | 34 | $2.19 | 65.9 | 0.44
Qwen Turbo | 1m | 34 | $0.09 | 99.8 | 1.10
Llama 3.2 90B (Vision) | 128k | 33 | $0.72 | 57.8 | 0.37
Llama 3.2 90B (Vision) (Vertex) | 128k | 33 | $0.00 | 32.7 | 0.19
Llama 3.2 90B (Vision) | 128k | 33 | $0.90 | 42.2 | 0.40
Llama 3.2 90B (Vision) | 33k | 33 | $0.36 | 33.7 | 0.47
Llama 3.2 90B (Vision) | 8k | 33 | $0.90 | 263.6 | 0.33
Llama 3.2 90B (Vision) Turbo | 128k | 33 | $1.20 | 34.1 | 0.32
Qwen2 72B | 33k | 33 | $0.90 | 66.6 | 0.35
Nova Lite | 300k | 33 | $0.10 | 285.9 | 0.33
Gemini 1.5 Flash-8B (AI Studio) | 1m | 31 | $0.07 | 291.5 | 0.20
Jamba 1.5 Large | 256k | 29 | $3.50 | 62.8 | 0.65
Jamba 1.5 Large | 256k | 29 | $3.50 | 51.0 | 0.69
Jamba 1.6 Large | 256k | 29 | $3.50 | 67.2 | 0.65
Gemini 1.5 Flash (May) (Vertex) | 1m | 28 | $0.13 | 303.4 | 0.29
Gemini 1.5 Flash (May) (AI Studio) | 1m | 28 | $0.13 | 298.8 | 0.21
Nova Micro | 130k | 28 | $0.06 | 317.6 | 0.31
Yi-Large | 32k | 28 | $3.00 | 68.8 | 0.46
Claude 3 Sonnet | 200k | 28 | $6.00 | 50.4 | 0.77
Claude 3 Sonnet | 200k | 28 | $6.00 | 59.4 | 0.56
Codestral (Jan '25) | 256k | 28 | $0.45 | 205.8 | 0.31
Codestral (Jan '25) (Vertex) | 128k | 28 | $0.45 | 155.3 | 0.16
Llama 3 70B | 8k | 27 | $1.18 | 44.4 | 0.41
Llama 3 70B | 8k | 27 | $0.40 | 23.6 | 1.26
Llama 3 70B | 8k | 27 | $2.86 | 49.1 | 0.42
Llama 3 70B | 8k | 27 | $2.90 | 18.9 | 0.77
Llama 3 70B | 8k | 27 | $0.90 | 141.8 | 0.37
Llama 3 70B | 8k | 27 | $0.27 | 42.9 | 0.40
Llama 3 70B | 8k | 27 | $0.57 | 21.1 | 1.15
Llama 3 70B | 8k | 27 | $0.64 | 331.5 | 0.25
Llama 3 70B (Reference, FP16) | 8k | 27 | $0.90 | 151.1 | 0.50
Llama 3 70B (Turbo, FP8) | 8k | 27 | $0.88 | 35.0 | 0.33
Mistral Small (Sep '24) | 33k | 27 | $0.30 | 85.8 | 0.34
Phi-4 Multimodal | 128k | 27 | $0.00 | 26.6 | 0.33
Qwen2.5 Coder 7B Fast | 131k | 27 | $0.04 | 221.9 | 0.48
Qwen2.5 Coder 7B Base | 131k | 27 | $0.01 | 195.4 | 0.51
Mistral Large (Feb '24) | 33k | 26 | $6.00 | 31.7 | 0.49
Mistral Large (Feb '24) | 33k | 26 | $6.00 | 41.2 | 0.39
Mistral Large (Feb '24) | 33k | 26 | $6.00 | 39.9 | 0.49
Mixtral 8x22B | 65k | 26 | $3.00 | 77.7 | 0.39
Mixtral 8x22B Base | 65k | 26 | $0.60 | 89.3 | 0.54
Mixtral 8x22B Fast | 65k | 26 | $1.05 | 104.6 | 0.55
Mixtral 8x22B | 65k | 26 | $1.20 | 82.1 | 0.39
Mixtral 8x22B | 65k | 26 | $1.20 | 70.6 | 0.45
Phi-4 Mini | 128k | 26 | $0.12 | 201.3 | 0.43
Phi-4 Mini | 128k | 26 | $0.00 | 52.6 | 0.34
Phi-3 Medium 14B | 128k | 25 | $0.30 | 52.7 | 0.42
Claude 2.1 | 200k | 24 | $12.00 | 22.6 | 1.81
Claude 2.1 | 200k | 24 | $12.00 | 13.9 | 0.84
Llama 3.1 8B | 128k | 24 | $0.03 | 133.0 | 0.41
Llama 3.1 8B | 33k | 24 | $0.10 | 2,180.3 | 0.28
Llama 3.1 8B | 128k | 24 | $0.10 | 70.6 | 0.90
Llama 3.1 8B | 128k | 24 | $0.22 | 90.8 | 0.37
Llama 3.1 8B Fast | 128k | 24 | $0.04 | 177.4 | 0.50
Llama 3.1 8B Base | 128k | 24 | $0.03 | 66.2 | 0.54
Llama 3.1 8B (Vertex) | 128k | 24 | $0.00 | 118.9 | 0.17
Llama 3.1 8B | 128k | 24 | $0.38 | 221.1 | 0.29
Llama 3.1 8B | 128k | 24 | $0.20 | 208.7 | 0.33
Llama 3.1 8B | 128k | 24 | $0.04 | 55.1 | 0.51
Llama 3.1 8B | 128k | 24 | $0.10 | 483.4 | 0.30
Llama 3.1 8B | 16k | 24 | $0.05 | 67.3 | 0.74
Llama 3.1 8B | 128k | 24 | $0.06 | 751.1 | 0.17
Llama 3.1 8B | 16k | 24 | $0.13 | 1,076.6 | 0.24
Llama 3.1 8B Turbo | 128k | 24 | $0.18 | 265.0 | 0.23
Llama 3.1 8B | 128k | 24 | $0.15 | 460.5 | 0.18
Llama 3.1 8B | 128k | 24 | $0.18 | 60.2 | 0.52
Pixtral 12B | 128k | 23 | $0.15 | 105.3 | 0.34
Pixtral 12B | 128k | 23 | $0.10 | 70.8 | 0.47
Mistral Small (Feb '24) | 33k | 23 | $1.50 | 141.1 | 0.32
Mistral Small (Feb '24) | 33k | 23 | $1.50 | 87.4 | 0.40
Mistral Medium | 33k | 23 | $4.09 | 41.1 | 0.42
Ministral 8B | 128k | 22 | $0.10 | 142.7 | 0.33
Gemma 2 9B Fast | 8k | 22 | $0.04 | 172.0 | 0.52
Gemma 2 9B Base | 8k | 22 | $0.03 | 166.0 | 0.52
Gemma 2 9B | 8k | 22 | $0.04 | 49.3 | 0.33
Gemma 2 9B | 8k | 22 | $0.20 | 652.8 | 0.22
Gemma 2 9B | 8k | 22 | $0.30 | 130.6 | 0.23
LFM 40B | 32k | 22 | $0.15 | 164.5 | 0.23
Command-R+ | 128k | 21 | $6.00 | 47.0 | 0.47
Command-R+ | 128k | 21 | $4.38 | 49.8 | 0.27
Llama 3 8B | 8k | 21 | $0.10 | 75.2 | 0.40
Llama 3 8B | 8k | 21 | $0.38 | 103.7 | 0.30
Llama 3 8B | 8k | 21 | $0.38 | 73.6 | 0.38
Llama 3 8B | 8k | 21 | $0.20 | 127.6 | 0.28
Llama 3 8B | 8k | 21 | $0.04 | 110.0 | 0.52
Llama 3 8B | 8k | 21 | $0.04 | 48.3 | 0.88
Llama 3 8B | 8k | 21 | $0.06 | 1,198.3 | 0.30
Llama 3 8B | 8k | 21 | $0.20 | 259.2 | 0.36
Gemini 1.0 Pro (Vertex) | 33k | 21 | $0.19 | 161.3 | 0.40
Codestral (May '24) | 33k | 20 | $0.30 | 105.6 | 0.35
Aya Expanse 32B | 128k | 20 | $0.75 | 119.3 | 0.15
Command-R+ (Apr '24) | 128k | 20 | $6.00 | 47.2 | 0.49
Command-R+ (Apr '24) | 128k | 20 | $6.00 | 56.5 | 0.23
Command-R+ (Apr '24) | 128k | 20 | $6.00 | 50.1 | 0.56
DBRX | 33k | 20 | $1.13 | 69.2 | 0.49
Ministral 3B | 128k | 20 | $0.04 | 222.8 | 0.33
Mistral NeMo | 128k | 20 | $0.15 | 138.4 | 0.33
Mistral NeMo Fast | 128k | 20 | $0.12 | 161.3 | 0.51
Mistral NeMo Base | 128k | 20 | $0.06 | 23.8 | 0.66
Mistral NeMo | 128k | 20 | $0.06 | 54.0 | 0.55
Llama 3.2 3B (FP8) | 128k | 20 | $0.02 | 224.0 | 0.38
Llama 3.2 3B | 128k | 20 | $0.10 | 164.8 | 0.69
Llama 3.2 3B | 128k | 20 | $0.15 | 72.1 | 0.33
Llama 3.2 3B Base | 128k | 20 | $0.01 | 123.6 | 0.50
Llama 3.2 3B | 128k | 20 | $0.06 | 227.4 | 0.43
Llama 3.2 3B | 128k | 20 | $0.10 | 166.2 | 0.25
Llama 3.2 3B | 128k | 20 | $0.02 | 120.5 | 0.24
Llama 3.2 3B | 32k | 20 | $0.04 | 77.3 | 0.69
Llama 3.2 3B | 8k | 20 | $0.06 | 1,534.0 | 0.34
Llama 3.2 3B | 8k | 20 | $0.10 | 1,529.6 | 0.21
Llama 3.2 3B Turbo | 128k | 20 | $0.06 | 107.4 | 0.54
DeepSeek R1 Distill Qwen 1.5B | 128k | 19 | $0.18 | 381.9 | 0.18
Jamba 1.5 Mini | 256k | 18 | $0.25 | 170.8 | 0.46
Jamba 1.5 Mini | 256k | 18 | $0.25 | 82.4 | 0.48
Jamba 1.6 Mini | 256k | 18 | $0.25 | 197.1 | 0.44
Mixtral 8x7B | 33k | 17 | $0.70 | 86.1 | 0.33
Mixtral 8x7B | 33k | 17 | $0.51 | 59.9 | 0.33
Mixtral 8x7B Fast | 33k | 17 | $0.23 | 162.2 | 0.51
Mixtral 8x7B Base | 33k | 17 | $0.12 | 53.3 | 0.60
Mixtral 8x7B | 33k | 17 | $0.50 | 177.7 | 0.28
Mixtral 8x7B | 33k | 17 | $0.24 | 98.8 | 0.34
Mixtral 8x7B | 33k | 17 | $0.63 | 94.0 | 0.45
Mixtral 8x7B | 33k | 17 | $0.60 | 80.0 | 0.31
Aya Expanse 8B | 8k | 16 | $0.75 | 165.7 | 0.12
Command-R | 128k | 15 | $0.75 | 109.1 | 0.35
Command-R | 128k | 15 | $0.26 | 63.0 | 0.19
Command-R (Mar '24) | 128k | 15 | $0.75 | 108.3 | 0.35
Command-R (Mar '24) | 128k | 15 | $0.75 | 116.3 | 0.16
Command-R (Mar '24) | 128k | 15 | $0.75 | 80.7 | 0.44
Codestral-Mamba | 256k | 14 | $0.25 | 95.5 | 0.49
Mistral 7B | 8k | 10 | $0.25 | 112.6 | 0.34
Mistral 7B | 8k | 10 | $0.16 | 92.9 | 0.32
Mistral 7B | 8k | 10 | $0.04 | 81.1 | 0.37
Mistral 7B | 32k | 10 | $0.06 | 104.3 | 0.80
Mistral 7B | 8k | 10 | $0.20 | 169.2 | 0.19
Llama 3.2 1B | 128k | 10 | $0.10 | 119.0 | 0.32
Llama 3.2 1B Base | 128k | 10 | $0.01 | 268.5 | 0.47
Llama 3.2 1B | 128k | 10 | $0.01 | 128.4 | 0.26
Llama 3.2 1B | 8k | 10 | $0.04 | 3,285.6 | 0.46
Llama 3.2 1B | 16k | 10 | $0.05 | 2,097.5 | 0.22
Llama 2 Chat 7B | 4k | 8 | $0.10 | 124.1 | 0.52
o1-preview | 128k | | $26.25 | 134.8 | 28.54
o1-preview | 128k | | $28.88 | 121.2 | 32.02
GPT-4o (Aug '24) | 128k | | $4.38 | 127.0 | 0.42
GPT-4o (Aug '24) | 128k | | $4.38 | 82.8 | 0.85
GPT-4.5 (Preview) | 128k | | $93.75 | 12.8 | 1.33
o1-pro | 200k | | $262.50 | 25.2 | 106.12
Llama 3.2 11B (Vision) | 128k | | $0.16 | 142.6 | 0.34
Llama 3.2 11B (Vision) | 128k | | $0.15 | 83.8 | 0.44
Llama 3.2 11B (Vision) | 128k | | $0.20 | 106.4 | 0.29
Llama 3.2 11B (Vision) | 128k | | $0.06 | 50.4 | 0.50
Llama 3.2 11B (Vision) | 8k | | $0.18 | 749.7 | 0.16
Llama 3.2 11B (Vision) Turbo | 128k | | $0.18 | 155.6 | 0.20
Gemini 2.0 Flash (exp) (AI Studio) | 1m | | $0.00 | 249.6 | 0.24
Gemini 1.5 Flash (Sep) (Vertex) | 1m | | $0.13 | 186.7 | 0.23
Gemini 1.5 Flash (Sep) (AI Studio) | 1m | | $0.13 | 171.8 | 0.31
Gemma 2 27B Fast | 8k | | $0.26 | 85.9 | 0.56
Gemma 2 27B Base | 8k | | $0.15 | 53.3 | 0.58
Gemma 2 27B | 8k | | $0.80 | 89.8 | 0.29
Gemini 2.5 Pro Experimental | 1m | | $0.00 | 190.4 | 30.82
Claude 3.5 Sonnet (June) | 200k | | $6.00 | 44.6 | 1.09
Claude 3.5 Sonnet (June) (Vertex) | 200k | | $6.00 | 78.8 | 0.92
Claude 3.5 Sonnet (June) | 200k | | $6.00 | 79.5 | 0.83
Claude 3 Haiku | 200k | | $0.50 | 104.7 | 1.04
Claude 3 Haiku | 200k | | $0.50 | 139.4 | 0.73
Mistral Saba | 32k | | $0.30 | 100.9 | 0.40
Mistral Saba | 32k | | $0.79 | 388.5 | 0.34
DeepSeek Coder V2 Lite (Fast, FP8) | 128k | | $0.12 | 113.2 | 0.60
DeepSeek Coder V2 Lite (Base, FP8) | 128k | | $0.06 | 108.6 | 0.59
Sonar Reasoning | 127k | | $2.00 | 85.3 | 1.92
Solar Mini | 4k | | $0.15 | 47.4 | 1.05
Qwen1.5 Chat 110B | 32k | | $0.00 | 29.6 | 1.11
GPT-4 Turbo | 128k | | $15.00 | 30.7 | 0.83
GPT-4 Turbo | 128k | | $15.00 | 47.4 | 1.57
GPT-4 | 8k | | $37.50 | 25.1 | 0.69
GPT-4o (ChatGPT) | 128k | | $7.50 | 103.0 | 0.47
Gemini 2.0 Flash-Lite (Preview) (AI Studio) | 1m | | $0.13 | 208.0 | 0.26
Claude 2.0 | 100k | | $12.00 | 31.0 | 0.87
OpenChat 3.5 | 8k | | $0.06 | 82.4 | 0.31
Jamba Instruct | 256k | | $0.55 | 169.3 | 0.43
Key definitions
Context window: Maximum number of combined input & output tokens. Output tokens commonly have a significantly lower limit (varies by model).
Output Speed: Tokens per second received while the model is generating tokens (i.e. after the first chunk has been received from the API, for models that support streaming).
Latency: Time to first token received, in seconds, after the API request is sent. For models that do not support streaming, this represents the time to receive the full completion.
Price: Blended price per token, represented as USD per million tokens, combining input and output token prices at a 3:1 input-to-output ratio.
Output Price: Price per token generated by the model (received from the API), represented as USD per million tokens.
Input Price: Price per token included in the request/message sent to the API, represented as USD per million tokens.
Time period: Metrics are 'live' and based on the past 72 hours of measurements. Measurements are taken 8 times per day for single requests and 2 times per day for parallel requests.
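The blended Price and Output Speed definitions above reduce to simple formulas. A minimal sketch in Python (hypothetical helper names; the 3:1 input-to-output weighting is taken from the Price definition):

```python
def blended_price(input_price: float, output_price: float) -> float:
    """Blend per-million-token prices at a 3:1 input-to-output ratio."""
    return (3 * input_price + output_price) / 4

def output_speed(output_tokens: int, total_seconds: float, ttft_seconds: float) -> float:
    """Tokens per second while generating, i.e. excluding time to first token."""
    return output_tokens / (total_seconds - ttft_seconds)

# Example: $0.50/M input and $1.50/M output blend to $0.75/M.
print(blended_price(0.50, 1.50))
```

So a request that returns 100 output tokens in 12 s total, with a 2 s time to first token, is reported at 10 tokens/s, not 100/12.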