LLM API Providers Leaderboard - Comparison of over 100 LLM endpoints
Comparison and ranking of API provider performance for over 100 LLM endpoints across key performance metrics, including price, output speed, latency, and context window. For more details, including our methodology, see our FAQs.
API providers compared: OpenAI, Playground AI, Mistral, Ideogram, Microsoft Azure, Amazon Bedrock, Hyperbolic, DeepSeek, Groq, FriendliAI, Together.ai, Anthropic, Black Forest Labs, Perplexity, Google, Fireworks, Lambda Labs, Leonardo.Ai, Cerebras, Recraft AI, Cohere, Upstage, Simplismart, Speechmatics, Fish Audio, Deepinfra, Replicate, Genmo, Nebius, Adobe, MiniMax, CentML, StepFun, Runpod, Zyphra, Murf AI, Rev AI, Speechify, fal.ai, AssemblyAI, Avian, Rime, kluster.ai, Prodia, Reka AI, Hume AI, Deepgram, Gladia, Baseten, Stability.ai, Midjourney, Halfmoon, Databricks, ElevenLabs, IBM, SambaNova, xAI, Cartesia, LMNT, PlayAI, 01.AI, Alibaba Cloud, Novita, and AI21 Labs.
Model | Context window | Model Intelligence | Price | Output tokens/s | Latency
---|---|---|---|---|---
o3-mini (high) | 200k | 66 | $1.93 | 148.4 | 51.92
o3-mini | 200k | 63 | $1.93 | 184.8 | 13.16
o3-mini | 200k | 63 | $1.93 | 143.1 | 20.10
o1 | 200k | 62 | $26.25 | 109.9 | 27.68
o1 | 200k | 62 | $26.25 | 113.3 | 24.94
DeepSeek R1 | 64k | 60 | $0.96 | 24.2 | 5.74
DeepSeek R1 | 128k | 60 | $2.00 | 101.1 | 1.44
DeepSeek R1 | 128k | 60 | $2.36 | 56.6 | 0.44
DeepSeek R1 Base | 128k | 60 | $1.20 | 24.1 | 0.67
DeepSeek R1 Fast | 128k | 60 | $3.00 | 59.4 | 0.67
DeepSeek R1 | 128k | 60 | $3.99 | 69.3 | 0.58
DeepSeek R1 | 128k | 60 | $2.36 | 74.4 | 0.61
DeepSeek R1 (Fast) | 164k | 60 | $4.25 | 107.0 | 0.75
DeepSeek R1 (Turbo, FP4) | 33k | 60 | $1.50 | 47.0 | 0.24
DeepSeek R1 | 64k | 60 | $0.96 | 13.8 | 0.54
DeepSeek R1 | 128k | 60 | $4.00 | 65.4 | 0.43
DeepSeek R1 Turbo | 64k | 60 | $1.15 | 33.1 | 0.72
DeepSeek R1 | 64k | 60 | $4.00 | 32.9 | 0.84
DeepSeek R1 | 16k | 60 | $5.50 | 255.5 | 2.26
DeepSeek R1 | 128k | 60 | $4.00 | 102.5 | 0.55
DeepSeek R1 | 128k | 60 | $7.00 | 21.9 | 0.68
QwQ-32B | 131k | 58 | $0.20 | 132.3 | 1.04
QwQ-32B Fast | 131k | 58 | $0.75 | 80.3 | 0.53
QwQ-32B Base | 131k | 58 | $0.23 | 34.9 | 0.63
QwQ-32B | 131k | 58 | $0.65 | 73.4 | 0.52
QwQ-32B | 131k | 58 | $0.90 | 143.8 | 0.58
QwQ-32B | 131k | 58 | $0.14 | 40.0 | 0.37
QwQ-32B | 33k | 58 | $0.18 | 29.1 | 1.08
QwQ-32B | 131k | 58 | $0.32 | 398.8 | 0.10
QwQ-32B | 16k | 58 | $0.63 | 341.7 | 0.92
QwQ-32B | 131k | 58 | $1.20 | 95.9 | 0.48
Claude 3.7 Sonnet Thinking | 200k | 57 | $6.00 | 66.8 | 0.00
Claude 3.7 Sonnet Thinking | 200k | 57 | $6.00 | 78.8 | 1.06
o1-mini | 128k | 54 | $1.93 | 221.1 | 10.62
o1-mini | 128k | 54 | $2.12 | 205.5 | 13.87
DeepSeek R1 Distill Qwen 32B | 128k | 52 | $0.14 | 47.4 | 0.27
DeepSeek R1 Distill Qwen 32B | 64k | 52 | $0.30 | 20.4 | 1.13
DeepSeek R1 Distill Qwen 32B | 128k | 52 | $0.69 | 137.8 | 0.39
DeepSeek V3 (Mar' 25) | 128k | 52 | $1.25 | 29.9 | 0.81
DeepSeek V3 (Mar' 25) | 128k | 52 | $0.75 | 30.7 | 0.66
DeepSeek V3 (Mar' 25) | 128k | 52 | $0.00 | 105.1 | 0.76
DeepSeek V3 (Mar' 25) | 64k | 52 | $0.52 | 30.1 | 0.36
Gemini 2.0 Pro Experimental (AI Studio) | 2m | 49 | $0.00 | 92.6 | 0.73
DeepSeek R1 Distill Qwen 14B | 64k | 49 | $0.15 | 45.0 | 0.72
DeepSeek R1 Distill Qwen 14B | 128k | 49 | $1.60 | 165.7 | 0.28
DeepSeek R1 Distill Llama 70B | 66k | 48 | $0.94 | 1,759.1 | 0.19
DeepSeek R1 Distill Llama 70B Base | 128k | 48 | $0.38 | 39.1 | 0.62
DeepSeek R1 Distill Llama 70B | 128k | 48 | $0.34 | 35.9 | 0.54
DeepSeek R1 Distill Llama 70B | 32k | 48 | $0.39 | 21.4 | 1.26
DeepSeek R1 Distill Llama 70B | 128k | 48 | $0.81 | 275.4 | 0.34
DeepSeek R1 Distill Llama 70B (Spec decoding) | 128k | 48 | $0.81 | 1,361.7 | 0.41
DeepSeek R1 Distill Llama 70B | 16k | 48 | $0.88 | 123.8 | 1.03
DeepSeek R1 Distill Llama 70B | 128k | 48 | $2.00 | 101.0 | 0.39
Claude 3.7 Sonnet | 200k | 48 | $6.00 | 39.1 | 0.77
Claude 3.7 Sonnet | 200k | 48 | $6.00 | 79.2 | 1.13
Gemini 2.0 Flash Vertex | 1m | 48 | $0.26 | 251.0 | 0.29
Gemini 2.0 Flash (AI Studio) | 1m | 48 | $0.17 | 253.8 | 0.30
Reka Flash 3 | 128k | 47 | $0.35 | 56.3 | 0.95
DeepSeek V3 | 66k | 46 | $0.48 | 27.4 | 4.88
DeepSeek V3 (FP8) | 128k | 46 | $0.25 | 29.5 | 1.10
DeepSeek V3 | 128k | 46 | $0.75 | 15.9 | 0.68
DeepSeek V3 | 128k | 46 | $2.00 | 83.7 | 0.57
DeepSeek V3 | 128k | 46 | $1.31 | 78.7 | 0.80
DeepSeek V3 | 64k | 46 | $0.59 | 22.1 | 0.39
DeepSeek V3 Turbo | 64k | 46 | $0.63 | 29.8 | 0.82
DeepSeek V3 | 64k | 46 | $0.89 | 27.1 | 0.91
DeepSeek V3 (FP8) | 128k | 46 | $1.25 | 33.9 | 0.67
Qwen2.5 Max | 32k | 45 | $2.80 | 34.5 | 1.14
Gemini 1.5 Pro (Sep) (Vertex) | 2m | 45 | $2.19 | 91.6 | 0.51
Gemini 1.5 Pro (Sep) (AI Studio) | 2m | 45 | $2.19 | 90.8 | 0.43
Claude 3.5 Sonnet (Oct) | 200k | 44 | $6.00 | 50.1 | 1.35
Claude 3.5 Sonnet (Oct) Vertex | 200k | 44 | $6.00 | 79.5 | 0.89
Claude 3.5 Sonnet (Oct) | 200k | 44 | $6.00 | 80.2 | 1.29
Sonar | 127k | 43 | $1.00 | 71.2 | 1.79
Sonar Pro | 200k | 43 | $6.00 | 80.0 | 3.16
QwQ 32B-Preview | 33k | 43 | $0.20 | 60.5 | 1.07
QwQ 32B-Preview | 33k | 43 | $0.14 | 59.6 | 0.63
QwQ 32B-Preview | 33k | 43 | $0.90 | 58.5 | 0.43
QwQ 32B-Preview | 33k | 43 | $0.26 | 47.7 | 0.33
QwQ 32B-Preview | 16k | 43 | $1.88 | 363.1 | 0.76
QwQ 32B-Preview | 33k | 43 | $1.20 | 64.5 | 0.58
GPT-4o (Nov '24) | 128k | 41 | $4.38 | 118.7 | 0.40
GPT-4o (Nov '24) | 128k | 41 | $4.38 | 114.8 | 0.99
Gemini 2.0 Flash-Lite (Feb '25) (AI Studio) | 1m | 41 | $0.13 | 172.2 | 0.25
Llama 3.3 70B (FP8) | 128k | 41 | $0.17 | 35.4 | 0.61
Llama 3.3 70B | 33k | 41 | $0.94 | 2,516.4 | 0.18
Llama 3.3 70B | 128k | 41 | $0.40 | 36.7 | 1.41
Llama 3.3 70B | 128k | 41 | $0.71 | 137.3 | 0.58
Llama 3.3 70B Fast | 128k | 41 | $0.38 | 135.2 | 0.56
Llama 3.3 70B Base | 128k | 41 | $0.20 | 18.5 | 0.70
Llama 3.3 70B | 128k | 41 | $0.50 | 132.8 | 0.51
Llama 3.3 70B | 128k | 41 | $0.71 | 49.8 | 0.45
Llama 3.3 70B | 128k | 41 | $0.90 | 130.2 | 0.56
Llama 3.3 70B (Turbo, FP8) | 128k | 41 | $0.20 | 39.3 | 0.44
Llama 3.3 70B | 128k | 41 | $0.27 | 26.5 | 0.48
Llama 3.3 70B | 128k | 41 | $0.60 | 179.3 | 0.33
Llama 3.3 70B | 128k | 41 | $0.39 | 35.0 | 0.88
Llama 3.3 70B (Spec decoding) | 8k | 41 | $0.69 | 1,604.9 | 0.42
Llama 3.3 70B | 128k | 41 | $0.64 | 275.5 | 0.36
Llama 3.3 70B | 128k | 41 | $0.75 | 448.9 | 0.35
Llama 3.3 70B Turbo | 128k | 41 | $0.88 | 144.0 | 0.67
Llama 3.3 70B | 128k | 41 | $0.70 | 8.1 | 1.13
GPT-4o (May '24) | 128k | 41 | $7.50 | 159.2 | 0.42
GPT-4o (May '24) | 128k | 41 | $7.50 | 86.3 | 0.84
Llama 3.1 405B (FP8) | 128k | 40 | $0.80 | 34.9 | 0.65
Llama 3.1 405B | 128k | 40 | $9.50 | 19.1 | 0.97
Llama 3.1 405B | 128k | 40 | $4.00 | 11.9 | 0.75
Llama 3.1 405B Standard | 128k | 40 | $2.40 | 30.4 | 1.84
Llama 3.1 405B Latency Optimized | 128k | 40 | $3.00 | 61.7 | 0.74
Llama 3.1 405B Base | 128k | 40 | $1.50 | 31.3 | 0.68
Llama 3.1 405B Vertex | 128k | 40 | $7.75 | 30.0 | 0.41
Llama 3.1 405B | 128k | 40 | $8.00 | 31.2 | 0.48
Llama 3.1 405B | 128k | 40 | $3.00 | 81.8 | 0.66
Llama 3.1 405B | 33k | 40 | $0.90 | 24.4 | 0.42
Llama 3.1 405B | 16k | 40 | $6.25 | 173.7 | 1.25
Llama 3.1 405B | 128k | 40 | $7.50 | 37.7 | 0.75
Llama 3.1 405B Turbo | 128k | 40 | $3.50 | 104.5 | 0.59
Llama 3.1 405B | 128k | 40 | $3.50 | 6.3 | 1.08
Qwen2.5 72B | 131k | 40 | $0.40 | 20.3 | 1.55
Qwen2.5 72B | 131k | 40 | $0.20 | 19.2 | 0.75
Qwen2.5 72B Fast | 131k | 40 | $0.38 | 70.5 | 0.55
Qwen2.5 72B | 131k | 40 | $0.90 | 40.9 | 0.42
Qwen2.5 72B | 33k | 40 | $0.27 | 38.3 | 0.57
Qwen2.5 72B | 16k | 40 | $0.94 | 234.8 | 0.82
Qwen2.5 72B Turbo | 131k | 40 | $1.20 | 92.5 | 0.43
Qwen2.5 72B | 131k | 40 | $0.00 | 61.0 | 1.06
MiniMax-Text-01 | 1m | 40 | $0.42 | 32.2 | 0.88
Phi-4 | 16k | 40 | $0.15 | 118.2 | 0.49
Phi-4 | 16k | 40 | $0.22 | 42.4 | 0.45
Phi-4 | 16k | 40 | $0.09 | 40.5 | 0.53
Command A | 256k | 40 | $4.38 | 180.6 | 0.24
Tulu3 405B | 16k | 40 | $6.25 | 176.3 | 1.28
Mistral Large 2 (Nov '24) | 128k | 38 | $3.00 | 28.2 | 0.56
Mistral Large 2 (Nov '24) | 128k | 38 | $3.00 | 36.6 | 0.54
Gemma 3 27B (AI_Studio) | 128k | 38 | $0.00 | 24.6 | 0.70
Gemma 3 27B | 128k | 38 | $0.07 | 57.9 | 0.56
Grok Beta | 128k | 38 | $7.50 | 62.5 | 0.27
Pixtral Large | 128k | 37 | $3.00 | 30.9 | 0.45
Qwen2.5 Instruct 32B Fast | 128k | 37 | $0.20 | 85.5 | 0.52
Qwen2.5 Instruct 32B Base | 128k | 37 | $0.10 | 60.4 | 0.56
Qwen2.5 Instruct 32B | 128k | 37 | $0.79 | 198.0 | 0.22
Llama 3.1 Nemotron 70B (FP8) | 128k | 37 | $0.17 | 35.3 | 0.64
Llama 3.1 Nemotron 70B Base | 128k | 37 | $0.20 | 37.1 | 0.66
Llama 3.1 Nemotron 70B Fast | 128k | 37 | $0.38 | 73.2 | 0.56
Llama 3.1 Nemotron 70B | 128k | 37 | $0.27 | 29.3 | 0.54
Nova Pro | 300k | 37 | $1.40 | 94.0 | 0.35
Nova Pro Latency Optimized | 300k | 37 | $1.75 | 124.3 | 0.64
Mistral Large 2 (Jul '24) | 128k | 37 | $3.00 | 38.9 | 0.58
Mistral Large 2 (Jul '24) | 128k | 37 | $3.00 | 34.0 | 0.44
Mistral Large 2 (Jul '24) | 128k | 37 | $3.00 | 35.6 | 0.52
Qwen2.5 Coder 32B | 33k | 36 | $0.09 | 64.2 | 0.54
Qwen2.5 Coder 32B | 131k | 36 | $0.20 | 49.0 | 0.74
Qwen2.5 Coder 32B | 33k | 36 | $0.90 | 58.7 | 0.37
Qwen2.5 Coder 32B | 33k | 36 | $0.10 | 48.3 | 0.29
Qwen2.5 Coder 32B | 131k | 36 | $0.79 | 197.4 | 0.36
Qwen2.5 Coder 32B | 16k | 36 | $0.63 | 331.8 | 0.76
Qwen2.5 Coder 32B | 131k | 36 | $0.80 | 73.4 | 0.50
GPT-4o mini | 128k | 36 | $0.26 | 70.1 | 0.37
GPT-4o mini | 128k | 36 | $0.26 | 155.7 | 0.99
Llama 3.1 70B (FP8) | 128k | 35 | $0.17 | 35.9 | 0.56
Llama 3.1 70B | 128k | 35 | $0.40 | 42.0 | 1.12
Llama 3.1 70B Standard | 128k | 35 | $0.72 | 31.5 | 0.65
Llama 3.1 70B Latency Optimized | 128k | 35 | $0.90 | 141.7 | 0.32
Llama 3.1 70B Base | 128k | 35 | $0.20 | 30.5 | 0.68
Llama 3.1 70B Fast | 128k | 35 | $0.38 | 145.4 | 0.56
Llama 3.1 70B Vertex | 128k | 35 | $0.00 | 73.0 | 0.27
Llama 3.1 70B | 128k | 35 | $2.90 | 58.2 | 0.44
Llama 3.1 70B | 128k | 35 | $0.90 | 144.9 | 0.42
Llama 3.1 70B (Turbo, FP8) | 128k | 35 | $0.20 | 28.6 | 0.32
Llama 3.1 70B | 128k | 35 | $0.27 | 32.6 | 0.61
Llama 3.1 70B | 128k | 35 | $0.60 | 196.8 | 0.33
Llama 3.1 70B | 32k | 35 | $0.35 | 45.7 | 1.11
Llama 3.1 70B | 128k | 35 | $0.75 | 447.2 | 0.30
Llama 3.1 70B | 128k | 35 | $1.50 | 68.2 | 0.42
Llama 3.1 70B Turbo | 128k | 35 | $0.88 | 211.7 | 0.32
Llama 3.1 70B | 128k | 35 | $0.90 | 124.9 | 0.51
Mistral Small 3.1 | 128k | 35 | $0.15 | 155.9 | 0.37
Mistral Small 3.1 Vertex | 128k | 35 | $0.15 | 203.0 | 0.16
Mistral Small 3 | 32k | 35 | $0.15 | 127.7 | 0.37
Mistral Small 3 | 32k | 35 | $0.90 | 38.7 | 0.52
Mistral Small 3 | 32k | 35 | $0.09 | 65.0 | 0.24
Mistral Small 3 | 32k | 35 | $0.80 | 95.5 | 0.24
Claude 3 Opus | 200k | 35 | $30.00 | 22.1 | 1.27
Claude 3 Opus Vertex | 200k | 35 | $30.00 | 27.7 | 1.12
Claude 3 Opus | 200k | 35 | $30.00 | 28.1 | 1.37
Claude 3.5 Haiku Standard | 200k | 35 | $1.60 | 49.3 | 1.03
Claude 3.5 Haiku Latency Optimized | 200k | 35 | $2.00 | 94.6 | 0.50
Claude 3.5 Haiku Vertex | 200k | 35 | $1.60 | 64.9 | 0.63
Claude 3.5 Haiku | 200k | 35 | $1.60 | 64.9 | 2.02
DeepSeek R1 Distill Llama 8B | 32k | 34 | $0.04 | 56.0 | 0.64
Gemini 1.5 Pro (May) (Vertex) | 2m | 34 | $2.19 | 66.7 | 0.41
Gemini 1.5 Pro (May) (AI Studio) | 2m | 34 | $2.19 | 66.6 | 0.43
Qwen Turbo | 1m | 34 | $0.09 | 102.2 | 1.05
Llama 3.2 90B (Vision) | 128k | 33 | $0.72 | 58.9 | 0.36
Llama 3.2 90B (Vision) Vertex | 128k | 33 | $0.00 | 32.2 | 0.20
Llama 3.2 90B (Vision) | 128k | 33 | $0.90 | 39.4 | 0.41
Llama 3.2 90B (Vision) | 33k | 33 | $0.36 | 32.8 | 0.56
Llama 3.2 90B (Vision) | 8k | 33 | $0.90 | 263.9 | 0.32
Llama 3.2 90B (Vision) Turbo | 128k | 33 | $1.20 | 31.4 | 0.33
Qwen2 72B | 33k | 33 | $0.90 | 66.9 | 0.32
Nova Lite | 300k | 33 | $0.10 | 290.4 | 0.32
Gemini 1.5 Flash-8B AI Studio | 1m | 31 | $0.07 | 275.8 | 0.22
Jamba 1.5 Large | 256k | 29 | $3.50 | 60.6 | 0.64
Jamba 1.5 Large | 256k | 29 | $3.50 | 50.7 | 0.70
Jamba 1.6 Large | 256k | 29 | $3.50 | 64.1 | 0.65
Gemini 1.5 Flash (May) (Vertex) | 1m | 28 | $0.13 | 302.3 | 0.30
Gemini 1.5 Flash (May) (AI Studio) | 1m | 28 | $0.13 | 298.6 | 0.21
Nova Micro | 130k | 28 | $0.06 | 314.8 | 0.31
Yi-Large | 32k | 28 | $3.00 | 65.8 | 0.53
Claude 3 Sonnet | 200k | 28 | $6.00 | 56.2 | 0.76
Claude 3 Sonnet | 200k | 28 | $6.00 | 57.5 | 0.53
Codestral (Jan '25) | 256k | 28 | $0.45 | 206.8 | 0.30
Codestral (Jan '25) Vertex | 128k | 28 | $0.45 | 153.3 | 0.15
Llama 3 70B | 8k | 27 | $1.18 | 48.4 | 0.38
Llama 3 70B | 8k | 27 | $0.40 | 32.4 | 0.85
Llama 3 70B | 8k | 27 | $2.86 | 53.7 | 0.43
Llama 3 70B | 8k | 27 | $2.90 | 18.9 | 0.76
Llama 3 70B | 8k | 27 | $0.90 | 142.4 | 0.37
Llama 3 70B | 8k | 27 | $0.27 | 50.7 | 0.57
Llama 3 70B | 8k | 27 | $0.57 | 31.7 | 0.61
Llama 3 70B | 8k | 27 | $0.64 | 348.2 | 0.29
Llama 3 70B (Reference, FP16) | 8k | 27 | $0.90 | 157.5 | 0.50
Llama 3 70B (Turbo, FP8) | 8k | 27 | $0.88 | 26.4 | 0.32
Mistral Small (Sep '24) | 33k | 27 | $0.30 | 65.7 | 0.38
Phi-4 Multimodal | 128k | 27 | $0.00 | 25.8 | 0.33
Qwen2.5 Coder 7B Fast | 131k | 27 | $0.04 | 220.0 | 0.51
Qwen2.5 Coder 7B Base | 131k | 27 | $0.01 | 193.4 | 0.50
Mistral Large (Feb '24) | 33k | 26 | $6.00 | 31.9 | 0.51
Mistral Large (Feb '24) | 33k | 26 | $6.00 | 43.1 | 0.40
Mistral Large (Feb '24) | 33k | 26 | $6.00 | 39.8 | 0.51
Mixtral 8x22B | 65k | 26 | $3.00 | 67.6 | 0.42
Mixtral 8x22B Base | 65k | 26 | $0.60 | 90.8 | 0.53
Mixtral 8x22B Fast | 65k | 26 | $1.05 | 107.4 | 0.53
Mixtral 8x22B | 65k | 26 | $1.20 | 77.4 | 0.41
Mixtral 8x22B | 65k | 26 | $1.20 | 70.2 | 0.51
Phi-4 Mini | 128k | 26 | $0.12 | 199.9 | 0.45
Phi-4 Mini | 128k | 26 | $0.00 | 49.3 | 0.34
Phi-3 Medium 14B | 128k | 25 | $0.30 | 52.6 | 0.42
Claude 2.1 | 200k | 24 | $12.00 | 29.1 | 1.76
Claude 2.1 | 200k | 24 | $12.00 | 13.8 | 0.84
Llama 3.1 8B | 128k | 24 | $0.03 | 149.6 | 0.40
Llama 3.1 8B | 33k | 24 | $0.10 | 2,175.4 | 0.28
Llama 3.1 8B | 128k | 24 | $0.10 | 68.6 | 0.90
Llama 3.1 8B | 128k | 24 | $0.22 | 90.3 | 0.37
Llama 3.1 8B Fast | 128k | 24 | $0.04 | 180.0 | 0.52
Llama 3.1 8B Base | 128k | 24 | $0.03 | 66.4 | 0.55
Llama 3.1 8B Vertex | 128k | 24 | $0.00 | 119.1 | 0.17
Llama 3.1 8B | 128k | 24 | $0.38 | 224.2 | 0.30
Llama 3.1 8B | 128k | 24 | $0.20 | 236.7 | 0.33
Llama 3.1 8B | 128k | 24 | $0.04 | 54.1 | 0.29
Llama 3.1 8B | 128k | 24 | $0.10 | 449.8 | 0.31
Llama 3.1 8B | 16k | 24 | $0.05 | 66.1 | 0.71
Llama 3.1 8B | 128k | 24 | $0.06 | 751.1 | 0.20
Llama 3.1 8B | 16k | 24 | $0.13 | 1,131.6 | 0.23
Llama 3.1 8B Turbo | 128k | 24 | $0.18 | 327.5 | 0.22
Llama 3.1 8B | 128k | 24 | $0.15 | 453.6 | 0.17
Llama 3.1 8B | 128k | 24 | $0.18 | 61.5 | 0.51
Pixtral 12B | 128k | 23 | $0.15 | 104.4 | 0.40
Pixtral 12B | 128k | 23 | $0.10 | 69.5 | 0.39
Mistral Small (Feb '24) | 33k | 23 | $1.50 | 148.2 | 0.35
Mistral Small (Feb '24) | 33k | 23 | $1.50 | 87.6 | 0.41
Mistral Medium | 33k | 23 | $4.09 | 42.5 | 0.46
Ministral 8B | 128k | 22 | $0.10 | 144.8 | 0.39
Gemma 2 9B Fast | 8k | 22 | $0.04 | 166.2 | 0.50
Gemma 2 9B Base | 8k | 22 | $0.03 | 163.2 | 0.49
Gemma 2 9B | 8k | 22 | $0.04 | 55.0 | 0.33
Gemma 2 9B | 8k | 22 | $0.20 | 651.5 | 0.21
Gemma 2 9B | 8k | 22 | $0.30 | 132.0 | 0.26
LFM 40B | 32k | 22 | $0.15 | 164.7 | 0.47
Command-R+ | 128k | 21 | $6.00 | 47.3 | 0.48
Command-R+ | 128k | 21 | $4.38 | 49.8 | 0.26
Llama 3 8B | 8k | 21 | $0.10 | 74.4 | 0.39
Llama 3 8B | 8k | 21 | $0.38 | 103.7 | 0.30
Llama 3 8B | 8k | 21 | $0.38 | 73.7 | 0.37
Llama 3 8B | 8k | 21 | $0.20 | 125.6 | 0.29
Llama 3 8B | 8k | 21 | $0.04 | 108.9 | 0.27
Llama 3 8B | 8k | 21 | $0.04 | 46.2 | 0.96
Llama 3 8B | 8k | 21 | $0.06 | 1,200.8 | 0.30
Llama 3 8B | 8k | 21 | $0.20 | 247.5 | 0.30
Gemini 1.0 Pro Vertex | 33k | 21 | $0.19 | 159.5 | 0.39
Codestral (May '24) | 33k | 20 | $0.30 | 106.9 | 0.39
Aya Expanse 32B | 128k | 20 | $0.75 | 119.8 | 0.15
Command-R+ (Apr '24) | 128k | 20 | $6.00 | 47.3 | 0.49
Command-R+ (Apr '24) | 128k | 20 | $6.00 | 56.2 | 0.22
Command-R+ (Apr '24) | 128k | 20 | $6.00 | 49.7 | 0.58
DBRX | 33k | 20 | $1.13 | 64.0 | 0.50
Ministral 3B | 128k | 20 | $0.04 | 217.4 | 0.33
Mistral NeMo | 128k | 20 | $0.15 | 143.8 | 0.36
Mistral NeMo Fast | 128k | 20 | $0.12 | 159.1 | 0.53
Mistral NeMo Base | 128k | 20 | $0.06 | 22.1 | 0.70
Mistral NeMo | 128k | 20 | $0.06 | 53.4 | 0.38
Llama 3.2 3B (FP8) | 128k | 20 | $0.02 | 222.4 | 0.38
Llama 3.2 3B | 128k | 20 | $0.10 | 153.4 | 0.85
Llama 3.2 3B | 128k | 20 | $0.15 | 70.9 | 0.33
Llama 3.2 3B Base | 128k | 20 | $0.01 | 85.4 | 0.52
Llama 3.2 3B | 128k | 20 | $0.06 | 227.6 | 0.42
Llama 3.2 3B | 128k | 20 | $0.10 | 166.7 | 0.23
Llama 3.2 3B | 128k | 20 | $0.02 | 121.7 | 0.46
Llama 3.2 3B | 32k | 20 | $0.04 | 71.9 | 0.64
Llama 3.2 3B | 8k | 20 | $0.06 | 1,537.5 | 0.34
Llama 3.2 3B | 8k | 20 | $0.10 | 1,534.4 | 0.22
Llama 3.2 3B Turbo | 128k | 20 | $0.06 | 60.5 | 0.53
DeepSeek R1 Distill Qwen 1.5B | 128k | 19 | $0.18 | 383.7 | 0.20
Jamba 1.5 Mini | 256k | 18 | $0.25 | 170.0 | 0.42
Jamba 1.5 Mini | 256k | 18 | $0.25 | 81.9 | 0.48
Jamba 1.6 Mini | 256k | 18 | $0.25 | 195.3 | 0.41
Mixtral 8x7B | 33k | 17 | $0.70 | 90.2 | 0.34
Mixtral 8x7B | 33k | 17 | $0.51 | 86.6 | 0.32
Mixtral 8x7B Fast | 33k | 17 | $0.23 | 158.7 | 0.52
Mixtral 8x7B Base | 33k | 17 | $0.12 | 53.8 | 0.60
Mixtral 8x7B | 33k | 17 | $0.50 | 180.0 | 0.30
Mixtral 8x7B | 33k | 17 | $0.24 | 97.6 | 0.50
Mixtral 8x7B | 33k | 17 | $0.63 | 94.4 | 0.37
Mixtral 8x7B | 33k | 17 | $0.60 | 93.0 | 0.30
Aya Expanse 8B | 8k | 16 | $0.75 | 164.3 | 0.12
Command-R | 128k | 15 | $0.75 | 108.4 | 0.34
Command-R | 128k | 15 | $0.26 | 54.4 | 0.20
Command-R (Mar '24) | 128k | 15 | $0.75 | 108.3 | 0.34
Command-R (Mar '24) | 128k | 15 | $0.75 | 117.9 | 0.16
Command-R (Mar '24) | 128k | 15 | $0.75 | 80.9 | 0.44
Codestral-Mamba | 256k | 14 | $0.25 | 94.8 | 0.57
Mistral 7B | 8k | 10 | $0.25 | 113.5 | 0.33
Mistral 7B | 8k | 10 | $0.16 | 93.3 | 0.32
Mistral 7B | 8k | 10 | $0.04 | 84.5 | 0.20
Mistral 7B | 32k | 10 | $0.06 | 118.6 | 0.79
Mistral 7B | 8k | 10 | $0.20 | 177.5 | 0.18
Llama 3.2 1B | 128k | 10 | $0.10 | 118.3 | 0.31
Llama 3.2 1B Base | 128k | 10 | $0.01 | 265.7 | 0.49
Llama 3.2 1B | 128k | 10 | $0.01 | 126.0 | 0.32
Llama 3.2 1B | 8k | 10 | $0.04 | 3,408.7 | 0.46
Llama 3.2 1B | 16k | 10 | $0.05 | 2,475.1 | 0.20
Llama 2 Chat 7B | 4k | 8 | $0.10 | 121.7 | 0.45
o1-preview | 128k | | $26.25 | 143.7 | 20.64
o1-preview | 128k | | $28.88 | 124.3 | 31.41
GPT-4o (Aug '24) | 128k | | $4.38 | 157.5 | 0.40
GPT-4o (Aug '24) | 128k | | $4.38 | 115.5 | 0.73
GPT-4o (ChatGPT) | 128k | | $7.50 | 124.5 | 0.41
GPT-4.5 (Preview) | 128k | | $93.75 | 10.4 | 1.63
Llama 3.2 11B (Vision) | 128k | | $0.16 | 144.0 | 0.34
Llama 3.2 11B (Vision) | 128k | | $0.15 | 84.8 | 0.44
Llama 3.2 11B (Vision) | 128k | | $0.20 | 106.1 | 0.28
Llama 3.2 11B (Vision) | 128k | | $0.06 | 56.0 | 0.28
Llama 3.2 11B (Vision) | 8k | | $0.18 | 750.2 | 0.18
Llama 3.2 11B (Vision) Turbo | 128k | | $0.18 | 155.0 | 0.25
Gemini 2.0 Flash (exp) (AI Studio) | 1m | | $0.00 | 252.8 | 0.24
Gemini 1.5 Flash (Sep) (Vertex) | 1m | | $0.13 | 189.4 | 0.23
Gemini 1.5 Flash (Sep) (AI Studio) | 1m | | $0.13 | 182.5 | 0.29
Gemma 2 27B Fast | 8k | | $0.26 | 85.1 | 0.52
Gemma 2 27B Base | 8k | | $0.15 | 52.0 | 0.58
Gemma 2 27B | 8k | | $0.80 | 91.3 | 0.23
Claude 3.5 Sonnet (June) | 200k | | $6.00 | 46.2 | 0.88
Claude 3.5 Sonnet (June) Vertex | 200k | | $6.00 | 79.1 | 0.87
Claude 3.5 Sonnet (June) | 200k | | $6.00 | 79.3 | 0.90
Claude 3 Haiku | 200k | | $0.50 | 108.7 | 0.50
Claude 3 Haiku | 200k | | $0.50 | 138.6 | 0.61
Mistral Saba | 32k | | $0.30 | 99.2 | 0.37
Mistral Saba | 32k | | $0.79 | 384.8 | 0.34
DeepSeek Coder V2 Lite Fast, FP8 | 128k | | $0.12 | 111.0 | 0.61
DeepSeek Coder V2 Lite Base, FP8 | 128k | | $0.06 | 109.0 | 0.57
Sonar Reasoning | 127k | | $2.00 | 91.4 | 2.20
Solar Mini | 4k | | $0.15 | 47.9 | 1.08
Qwen1.5 Chat 110B | 32k | | $0.00 | 29.5 | 1.18
GPT-4 Turbo | 128k | | $15.00 | 43.9 | 0.71
GPT-4 Turbo | 128k | | $15.00 | 50.5 | 1.48
GPT-4 | 8k | | $37.50 | 27.4 | 0.78
Gemini 2.0 Flash-Lite (Preview) (AI Studio) | 1m | | $0.13 | 182.7 | 0.25
Claude 2.0 | 100k | | $12.00 | 31.0 | 0.87
OpenChat 3.5 | 8k | | $0.06 | 82.1 | 0.30
Jamba Instruct | 256k | | $0.55 | 163.4 | 0.45