LLM API Providers Leaderboard - Comparison of over 500 AI Model endpoints
A comparison of API providers across more than 500 AI model endpoints, including those from OpenAI, Google, DeepSeek and others, on key performance metrics such as price, output speed, latency and context window. For further details, including our methodology, see our FAQs.
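As a rough consistency check on how these metrics relate, the sketch below assumes that end-to-end response time is approximately latency, plus any separately reported reasoning time, plus the time needed to stream a fixed-length answer at the reported output speed. The 500-token answer length, the reading of the last numeric column as reasoning time in seconds, and the function name `estimate_end_to_end` are assumptions made for illustration, not statements from this page; the FAQs remain the reference for the actual methodology.

```python
def estimate_end_to_end(latency_s, speed_tps, reasoning_s=0.0, answer_tokens=500):
    """Estimate the seconds needed to receive a complete answer.

    answer_tokens=500 is an assumed answer length, not a documented value.
    reasoning_s is the assumed meaning of the table's last numeric column.
    """
    if speed_tps <= 0:
        raise ValueError("speed_tps must be positive")
    return latency_s + reasoning_s + answer_tokens / speed_tps


# Values copied from two rows of the table:
# Cerebras GLM-4.7 (latency 0.24, speed 1,789, last column 1.12, listed total 1.64)
# OpenAI GPT-5.2 (xhigh) (latency 27.93, speed 100, last column N/A, listed total 32.91)
cerebras = estimate_end_to_end(latency_s=0.24, speed_tps=1789, reasoning_s=1.12)
openai = estimate_end_to_end(latency_s=27.93, speed_tps=100)

print(f"Cerebras GLM-4.7: estimated {cerebras:.2f} s, listed 1.64 s")
print(f"OpenAI GPT-5.2 (xhigh): estimated {openai:.2f} s, listed 32.91 s")
```

For these two rows the estimate lands within a few hundredths of a second of the listed figure, which suggests the assumed relationship is a reasonable reading of the table, though it is not a confirmed formula.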
API Provider | Model | Context Window | License | Intelligence Index | Price (USD per 1M tokens) | Output Speed (tokens/s) | Latency (s) | End-to-End Response Time (s) | Reasoning Time (s) | Further Analysis |
|---|---|---|---|---|---|---|---|---|---|---|
OpenAI | GPT-5.2 (xhigh) | 400k | Proprietary | 51 | $4.81 | 100 | 27.93 | 32.91 | N/A | |
Microsoft Azure | GPT-5.2 (xhigh) | 400k | Proprietary | 51 | $4.81 | 107 | 50.89 | 55.55 | N/A | |
Databricks | GPT-5.2 (xhigh) | 400k | Proprietary | 51 | $3.44 | 133 | 34.68 | 38.45 | N/A | |
Google Vertex | Claude Opus 4.5 Vertex | 200k | Proprietary | 49 | $10.00 | 58 | 0.92 | 43.70 | 34.22 | |
Anthropic | Claude Opus 4.5 | 200k | Proprietary | 49 | $10.00 | 82 | 1.70 | 32.30 | 24.47 | |
Amazon Bedrock | Claude Opus 4.5 | 200k | Proprietary | 49 | $10.00 | 99 | 1.68 | 26.95 | 20.22 | |
Google (Vertex) | Gemini 3 Pro Preview (high) (Vertex) | 1m | Proprietary | 48 | $4.50 | 142 | 31.67 | 35.21 | N/A | |
Google (AI Studio) | Gemini 3 Pro Preview (high) (AI Studio) | 1m | Proprietary | 48 | $4.50 | 128 | 32.19 | 36.08 | N/A | |
Databricks | Gemini 3 Pro Preview (high) | 1m | Proprietary | 48 | $3.44 | 131 | 4.45 | 8.27 | N/A | |
Databricks | GPT-5.1 (high) | 400k | Proprietary | 47 | $3.44 | 128 | 32.27 | 36.17 | N/A | |
OpenAI | GPT-5.1 (high) | 400k | Proprietary | 47 | $3.44 | 126 | 26.40 | 30.39 | N/A | |
Microsoft Azure | GPT-5.1 (high) | 400k | Proprietary | 47 | $3.44 | 264 | 16.71 | 18.60 | N/A | |
Google (AI Studio) | Gemini 3 Flash (AI Studio) | 1m | Proprietary | 46 | $1.13 | 223 | 11.62 | 13.86 | N/A | |
OpenAI | GPT-5.2 (medium) | 400k | Proprietary | 45 | $4.81 | 79 | 0.56 | 6.87 | N/A | |
Databricks | Claude Opus 4.5 | 200k | Proprietary | 43 | $10.00 | 86 | 2.15 | 7.95 | N/A | |
Anthropic | Claude Opus 4.5 | 200k | Proprietary | 43 | $10.00 | 68 | 1.77 | 9.13 | N/A | |
Google Vertex | Claude Opus 4.5 Vertex | 200k | Proprietary | 43 | $10.00 | 46 | 1.19 | 12.13 | N/A | |
Amazon Bedrock | Claude Opus 4.5 | 200k | Proprietary | 43 | $10.00 | 81 | 1.75 | 7.90 | N/A | |
Amazon Bedrock | Claude 4.5 Sonnet | 1m | Proprietary | 42 | $6.00 | 102 | 1.74 | 26.15 | 19.53 | |
Anthropic | Claude 4.5 Sonnet | 1m | Proprietary | 42 | $6.00 | 80 | 1.63 | 32.91 | 25.02 | |
Google Vertex | Claude 4.5 Sonnet Vertex | 1m | Proprietary | 42 | $6.00 | 66 | 0.87 | 38.56 | 30.16 | |
GMI (FP8) | GLM-4.7 (FP8) | 203k | Open | 42 | $0.80 | 44 | 1.40 | 57.65 | 45.00 | |
Baseten | GLM-4.7 | 200k | Open | 42 | $1.00 | 386 | 0.67 | 7.16 | 5.19 | |
Novita | GLM-4.7 | 205k | Open | 42 | $1.00 | 93 | 0.72 | 27.53 | 21.45 | |
Fireworks | GLM-4.7 | 203k | Open | 42 | $1.00 | 422 | 0.46 | 6.38 | 4.74 | |
SiliconFlow | GLM-4.7 | 200k | Open | 42 | $0.88 | 70 | 1.25 | 36.83 | 28.47 | |
DeepInfra (FP4) | GLM-4.7 (FP4) | 203k | Open | 42 | $0.76 | 54 | 0.53 | 46.77 | 36.99 | |
Parasail (FP8) | GLM-4.7 (FP8) | 131k | Open | 42 | $0.86 | 59 | 0.49 | 42.97 | 33.98 | |
Cerebras | GLM-4.7 | 131k | Open | 42 | $2.38 | 1,789 | 0.24 | 1.64 | 1.12 | |
OpenAI | GPT-5.1 Codex (high) | 400k | Proprietary | 42 | $3.44 | 200 | 11.93 | 14.44 | N/A | |
Microsoft Azure | GPT-5.1 Codex (high) | 400k | Proprietary | 42 | $3.44 | 94 | 19.47 | 24.81 | N/A | |
xAI | Grok 4 | 256k | Proprietary | 41 | $6.00 | 36 | 6.82 | 20.82 | N/A | |
Microsoft Azure | Grok 4 | 256k | Proprietary | 41 | $11.00 | 26 | 12.22 | 31.70 | N/A | |
SiliconFlow (FP8) | DeepSeek V3.2 (FP8) | 164k | Open | 41 | $0.31 | 43 | 2.06 | 59.89 | 46.27 | |
DeepSeek | DeepSeek V3.2 | 128k | Open | 41 | $0.32 | 31 | 1.41 | 82.14 | 64.59 | |
Baseten | DeepSeek V3.2 | 131k | Open | 41 | $0.34 | 62 | 3.58 | 43.62 | 32.02 | |
Google Vertex | DeepSeek V3.2 Vertex | 164k | Open | 41 | $0.84 | 51 | 0.52 | 49.13 | 38.89 | |
Novita | DeepSeek V3.2 | 164k | Open | 41 | $0.30 | 32 | 1.20 | 79.65 | 62.76 | |
Parasail (FP8) | DeepSeek V3.2 (FP8) | 164k | Open | 41 | $0.32 | 8 | 1.15 | 318.85 | 254.16 | |
Fireworks | DeepSeek V3.2 | 164k | Open | 41 | $0.84 | 198 | 2.26 | 14.87 | 10.09 | |
Databricks | GPT-5 mini (high) | 400k | Proprietary | 41 | $0.69 | 75 | 122.60 | 129.31 | N/A | |
OpenAI | GPT-5 mini (high) | 400k | Proprietary | 41 | $0.69 | 72 | 113.01 | 119.93 | N/A | |
Microsoft Azure | GPT-5 mini (high) | 400k | Proprietary | 41 | $0.69 | 126 | 65.24 | 69.21 | N/A | |
Google (AI Studio) | Gemini 3 Pro Preview (low) (AI Studio) | 1m | Proprietary | 41 | $4.50 | 121 | 3.30 | 7.44 | N/A | |
Google (Vertex) | Gemini 3 Pro Preview (low) (Vertex) | 1m | Proprietary | 41 | $4.50 | 133 | 4.11 | 7.87 | N/A | |
Kimi Turbo | Kimi K2 Thinking Turbo | 262k | Open | 40 | $2.86 | 82 | 1.64 | 32.31 | 24.54 | |
Kimi | Kimi K2 Thinking | 262k | Open | 40 | $1.07 | 18 | 1.93 | 140.59 | 110.93 | |
Google Vertex | Kimi K2 Thinking Vertex | 262k | Open | 40 | $1.07 | 224 | 0.30 | 11.48 | 8.94 | |
Amazon Bedrock | Kimi K2 Thinking | 256k | Open | 40 | $1.07 | 84 | 0.61 | 30.52 | 23.93 | |
Together.ai | Kimi K2 Thinking | 262k | Open | 40 | $1.90 | 62 | 0.51 | 41.05 | 32.44 | |
Fireworks | Kimi K2 Thinking | 262k | Open | 40 | $1.07 | 224 | 0.46 | 11.60 | 8.91 | |
Baseten | Kimi K2 Thinking | 262k | Open | 40 | $1.07 | 138 | 0.69 | 18.86 | 14.54 | |
Parasail | Kimi K2 Thinking | 262k | Open | 40 | $0.94 | 59 | 0.51 | 42.92 | 33.93 | |
Nebius (FP8) | Kimi K2 Thinking (FP8) | 262k | Open | 40 | $1.07 | 77 | 0.59 | 33.20 | 26.09 | |
DeepInfra | Kimi K2 Thinking | 131k | Open | 40 | $0.85 | 79 | 0.44 | 32.16 | 25.37 | |
GMI | Kimi K2 Thinking | 262k | Open | 40 | $0.36 | 97 | 1.11 | 26.98 | 20.70 | |
Novita | Kimi K2 Thinking | 262k | Open | 40 | $1.07 | 28 | 0.88 | 91.30 | 72.34 | |
MiniMax | MiniMax-M2.1 | 205k | Open | 39 | $0.53 | 71 | 1.59 | 36.94 | 28.28 | |
Fireworks | MiniMax-M2.1 | 205k | Open | 39 | $0.53 | 212 | 0.45 | 12.22 | 9.42 | |
Novita | MiniMax-M2.1 | 205k | Open | 39 | $0.53 | 68 | 1.19 | 38.11 | 29.54 | |
GMI (FP8) | MiniMax-M2.1 (FP8) | 197k | Open | 39 | $0.21 | 96 | 1.29 | 27.39 | 20.88 | |
DeepInfra (FP8) | MiniMax-M2.1 (FP8) | 197k | Open | 39 | $0.51 | 86 | 0.28 | 29.24 | 23.17 | |
Xiaomi | MiMo-V2-Flash | 256k | Open | 39 | $0.15 | 108 | 1.69 | 24.85 | 18.53 | |
xAI | Grok 4.1 Fast | 2m | Proprietary | 38 | $0.28 | 147 | 5.84 | 9.24 | N/A | |
OpenAI | GPT-5.1 Codex mini (high) | 400k | Proprietary | 38 | $0.69 | 179 | 8.26 | 11.05 | N/A | |
Microsoft Azure | GPT-5.1 Codex mini (high) | 400k | Proprietary | 38 | $0.69 | 140 | 13.98 | 17.55 | N/A | |
Amazon Bedrock | Claude 4.5 Sonnet | 1m | Proprietary | 37 | $6.00 | 92 | 1.62 | 7.08 | N/A | |
Anthropic | Claude 4.5 Sonnet | 1m | Proprietary | 37 | $6.00 | 73 | 1.51 | 8.39 | N/A | |
Google Vertex | Claude 4.5 Sonnet Vertex | 1m | Proprietary | 37 | $6.00 | 49 | 0.86 | 11.16 | N/A | |
Databricks | Claude 4.5 Sonnet | 1m | Proprietary | 37 | $6.60 | 93 | 2.20 | 7.59 | N/A | |
Amazon Bedrock | Claude 4.5 Haiku | 200k | Proprietary | 37 | $2.00 | 113 | 0.61 | 22.83 | 17.77 | |
Google Vertex | Claude 4.5 Haiku Vertex | 200k | Proprietary | 37 | $2.00 | 130 | 0.55 | 19.74 | 15.35 | |
Anthropic | Claude 4.5 Haiku | 200k | Proprietary | 37 | $2.00 | 106 | 0.48 | 24.06 | 18.87 | |
Novita | KAT-Coder-Pro V1 | 256k | Proprietary | 36 | $0.00 | 63 | 1.09 | 9.03 | N/A | |
DeepInfra | MiniMax-M2 | 262k | Open | 36 | $0.45 | 94 | 0.25 | 26.83 | 21.27 | |
Amazon Bedrock | MiniMax-M2 | 205k | Open | 36 | $0.53 | 72 | 0.62 | 35.17 | 27.64 | |
Novita | MiniMax-M2 | 205k | Open | 36 | $0.53 | 108 | 0.99 | 24.22 | 18.59 | |
Fireworks | MiniMax-M2 | 197k | Open | 36 | $0.53 | 105 | 0.56 | 24.37 | 19.05 | |
Google Vertex | MiniMax-M2 Vertex | 197k | Open | 36 | $0.53 | 176 | 0.25 | 14.47 | 11.38 | |
MiniMax | MiniMax-M2 | 205k | Open | 36 | $0.53 | 108 | 1.28 | 24.44 | 18.53 | |
Amazon Bedrock | Nova 2.0 Pro Preview (medium) | 256k | Proprietary | 35 | $3.44 | 133 | 21.03 | 39.82 | 15.03 | |
Google (AI Studio) | Gemini 3 Flash (AI Studio) | 1m | Proprietary | 35 | $1.13 | 196 | 0.74 | 3.29 | N/A | |
xAI | Grok 4 Fast | 2m | Proprietary | 35 | $0.28 | 166 | 5.06 | 8.08 | N/A | |
Google Vertex | Gemini 2.5 Pro Vertex | 1m | Proprietary | 34 | $3.44 | 139 | 31.90 | 35.49 | N/A | |
Google (AI Studio) | Gemini 2.5 Pro (AI Studio) | 1m | Proprietary | 34 | $3.44 | 160 | 32.90 | 36.01 | N/A | |
GMI (FP8) | GLM-4.7 (FP8) | 203k | Open | 34 | $0.32 | 39 | 1.56 | 14.47 | N/A | |
DeepInfra (FP4) | GLM-4.7 (FP4) | 203k | Open | 34 | $0.76 | 52 | 0.44 | 10.09 | N/A | |
Novita | GLM-4.7 | 205k | Open | 34 | $1.00 | 75 | 0.73 | 7.42 | N/A | |
Baseten | GLM-4.7 | 200k | Open | 34 | $1.00 | 369 | 0.70 | 2.05 | N/A | |
SiliconFlow | GLM-4.7 | 200k | Open | 34 | $0.88 | 66 | 83.81 | 91.33 | N/A | |
Parasail (FP8) | GLM-4.7 (FP8) | 131k | Open | 34 | $0.86 | 50 | 0.46 | 10.55 | N/A | |
Fireworks | GLM-4.7 | 203k | Open | 34 | $1.00 | 360 | 0.45 | 1.84 | N/A | |
Cerebras | GLM-4.7 | 131k | Open | 34 | $2.38 | 991 | 0.25 | 0.76 | N/A | |
SambaNova | DeepSeek V3.1 Terminus | 131k | Open | 33 | $3.38 | 168 | 1.96 | 16.84 | 11.91 | |
Novita (FP8) | DeepSeek V3.1 Terminus (FP8) | 131k | Open | 33 | $0.45 | 41 | 1.34 | 62.10 | 48.61 | |
Eigen AI | DeepSeek V3.1 Terminus | 128k | Open | 33 | $0.80 | 159 | 1.61 | 17.31 | 12.56 | |
OpenAI | GPT-5.2 | 400k | Proprietary | 33 | $4.81 | 75 | 0.53 | 7.24 | N/A | |
Microsoft Azure | GPT-5.2 | 400k | Proprietary | 33 | $4.81 | 55 | 0.75 | 9.76 | N/A | |
OpenAI | GPT-5.2 | 400k | Proprietary | 33 | $4.81 | 80 | 0.53 | 6.78 | N/A | |
Cerebras | gpt-oss-120B (high) | 131k | Open | 33 | $0.45 | 3,110 | 0.28 | 1.09 | 0.64 | |
Parasail | gpt-oss-120B (high) | 131k | Open | 33 | $0.26 | 331 | 0.43 | 7.98 | 6.04 | |
Databricks | gpt-oss-120B (high) | 128k | Open | 33 | $0.26 | 156 | 0.44 | 16.44 | 12.80 | |
Cloudflare | gpt-oss-120B (high) | 128k | Open | 33 | $0.45 | 113 | 86.79 | 108.83 | 17.62 | |
Nebius Base | gpt-oss-120B (high) Base | 128k | Open | 33 | $0.26 | 322 | 0.54 | 8.29 | 6.20 | |
Microsoft Azure | gpt-oss-120B (high) | 131k | Open | 33 | $0.26 | 323 | 0.38 | 8.11 | 6.18 | |
Baseten | gpt-oss-120B (high) | 128k | Open | 33 | $0.20 | 242 | 0.58 | 10.91 | 8.27 | |
SambaNova | gpt-oss-120B (high) | 131k | Open | 33 | $0.31 | 614 | 0.46 | 4.53 | 3.26 | |
Eigen AI | gpt-oss-120B (high) | 131k | Open | 33 | $0.20 | 779 | 1.73 | 4.94 | 2.57 | |
Hyperbolic | gpt-oss-120B (high) | 131k | Open | 33 | $0.30 | 404 | 0.66 | 6.84 | 4.95 | |
Amazon Bedrock | gpt-oss-120B (high) | 131k | Open | 33 | $0.26 | 255 | 0.58 | 10.38 | 7.84 | |
Together.ai | gpt-oss-120B (high) | 131k | Open | 33 | $0.26 | 896 | 0.90 | 3.68 | 2.23 | |
Lightning AI | gpt-oss-120B (high) | 128k | Open | 33 | $0.17 | 766 | 0.20 | 3.46 | 2.61 | |
DeepInfra (Turbo) | gpt-oss-120B (high) (Turbo) | 131k | Open | 33 | $0.26 | 372 | 0.20 | 6.91 | 5.37 | |
Google Vertex | gpt-oss-120B (high) Vertex | 131k | Open | 33 | $0.16 | 364 | 0.26 | 7.13 | 5.50 | |
Snowflake | gpt-oss-120B (high) | 131k | Open | 33 | $0.22 | 302 | 0.56 | 8.85 | 6.63 | |
Groq | gpt-oss-120B (high) | 131k | Open | 33 | $0.26 | 466 | 0.16 | 5.53 | 4.29 | |
Clarifai | gpt-oss-120B (high) | 131k | Open | 33 | $0.16 | 558 | 0.24 | 4.72 | 3.58 | |
DeepInfra | gpt-oss-120B (high) | 131k | Open | 33 | $0.08 | 81 | 0.24 | 31.14 | 24.72 | |
Novita | gpt-oss-120B (high) | 131k | Open | 33 | $0.10 | 88 | 0.44 | 28.78 | 22.68 | |
Fireworks | gpt-oss-120B (high) | 131k | Open | 33 | $0.26 | 782 | 0.33 | 3.52 | 2.56 | |
Alibaba Cloud | Qwen3 Max Thinking | 262k | Proprietary | 32 | $2.40 | 38 | 1.74 | 68.26 | 53.22 | |
xAI Fast | Grok 3 mini Reasoning (high) Fast | 131k | Proprietary | 32 | $1.45 | 194 | 0.72 | 13.58 | 10.29 | |
xAI | Grok 3 mini Reasoning (high) | 131k | Proprietary | 32 | $0.35 | 189 | 0.71 | 13.92 | 10.56 | |
Microsoft Azure | Grok 3 mini Reasoning (high) | 32k | Proprietary | 32 | $0.00 | 138 | 0.34 | 18.42 | 14.46 | |
Amazon Bedrock | Nova 2.0 Pro Preview (low) | 256k | Proprietary | 32 | $3.44 | 134 | 12.34 | 30.96 | 14.90 | |
FriendliAI | K-EXAONE | 256k | Open | 32 | $0.00 | 114 | 0.31 | 22.31 | 17.60 | |
SiliconFlow (FP8) | DeepSeek V3.2 (FP8) | 164k | Open | 32 | $0.31 | 44 | 12.69 | 24.09 | N/A | |
DeepSeek | DeepSeek V3.2 | 128k | Open | 32 | $0.32 | 31 | 1.24 | 17.58 | N/A | |
Google Vertex | DeepSeek V3.2 Vertex | 164k | Open | 32 | $0.84 | 55 | 0.51 | 9.64 | N/A | |
Baseten | DeepSeek V3.2 | 164k | Open | 32 | $0.34 | 54 | 4.06 | 13.40 | N/A | |
Fireworks | DeepSeek V3.2 | 164k | Open | 32 | $0.84 | 224 | 2.01 | 4.24 | N/A | |
Novita | DeepSeek V3.2 | 164k | Open | 32 | $0.32 | 42 | 1.06 | 12.88 | N/A | |
GMI | DeepSeek V3.2 | 164k | Open | 32 | $0.12 | 118 | 1.24 | 5.47 | N/A | |
DeepInfra | DeepSeek V3.2 | 164k | Open | 32 | $0.29 | 16 | 0.46 | 31.24 | N/A | |
Alibaba Cloud | Qwen3 Max | 258k | Proprietary | 31 | $2.40 | 26 | 1.84 | 21.11 | N/A | |
Novita | Qwen3 Max | 262k | Proprietary | 31 | $3.69 | 25 | 1.00 | 21.05 | N/A | |
Amazon Bedrock | Claude 4.5 Haiku | 200k | Proprietary | 30 | $2.00 | 92 | 0.61 | 6.02 | N/A | |
Google Vertex | Claude 4.5 Haiku Vertex | 200k | Proprietary | 30 | $2.00 | 89 | 0.53 | 6.15 | N/A | |
Anthropic | Claude 4.5 Haiku | 200k | Proprietary | 30 | $2.00 | 104 | 0.42 | 5.23 | N/A | |
Xiaomi | MiMo-V2-Flash | 256k | Open | 30 | $0.15 | 90 | 1.48 | 7.05 | N/A | |
Amazon Bedrock | Nova 2.0 Lite (medium) | 1m | Proprietary | 30 | $0.85 | 253 | 15.27 | 25.15 | 7.90 | |
Novita | Qwen3 235B A22B 2507 | 131k | Open | 29 | $0.97 | 65 | 0.76 | 39.01 | 30.60 | |
Alibaba Cloud | Qwen3 235B A22B 2507 | 131k | Open | 29 | $2.63 | 84 | 1.20 | 30.89 | 23.75 | |
Fireworks | Qwen3 235B A22B 2507 | 262k | Open | 29 | $0.39 | 154 | 0.87 | 17.07 | 12.96 | |
Nebius | Qwen3 235B A22B 2507 | 262k | Open | 29 | $0.35 | 49 | 0.70 | 51.98 | 41.02 | |
Hyperbolic (FP8) | Qwen3 235B A22B 2507 (FP8) | 41k | Open | 29 | $0.40 | 89 | 0.99 | 29.13 | 22.51 | |
DeepInfra (FP8) | Qwen3 235B A22B 2507 (FP8) | 262k | Open | 29 | $0.77 | 45 | 0.33 | 56.05 | 44.58 | |
Fireworks | DeepSeek V3.1 Terminus | 164k | Open | 28 | $0.84 | 142 | 0.91 | 4.42 | N/A | |
SambaNova | DeepSeek V3.1 Terminus | 128k | Open | 28 | $3.38 | 276 | 1.16 | 2.97 | N/A | |
Novita (FP8) | DeepSeek V3.1 Terminus (FP8) | 131k | Open | 28 | $0.45 | 41 | 1.25 | 13.58 | N/A | |
Eigen AI | DeepSeek V3.1 Terminus | 128k | Open | 28 | $0.80 | 153 | 1.67 | 4.94 | N/A | |
DeepInfra (FP4) | DeepSeek V3.1 Terminus (FP4) | 164k | Open | 28 | $0.35 | 18 | 0.54 | 28.02 | N/A | |
Parasail | Kimi K2 0905 | 262k | Open | 28 | $1.49 | 41 | 0.56 | 12.62 | N/A | |
Fireworks | Kimi K2 0905 | 262k | Open | 28 | $1.20 | 178 | 0.40 | 3.21 | N/A | |
DeepInfra | Kimi K2 0905 | 131k | Open | 28 | $0.80 | 31 | 0.64 | 16.72 | N/A | |
Baseten (FP4) | Kimi K2 0905 (FP4) | 262k | Open | 28 | $1.07 | 104 | 0.70 | 5.52 | N/A | |
Groq | Kimi K2 0905 | 262k | Open | 28 | $1.50 | 287 | 0.33 | 2.07 | N/A | |
Together.ai | Kimi K2 0905 | 262k | Open | 28 | $1.50 | 51 | 0.54 | 10.41 | N/A | |
Novita | Kimi K2 0905 | 262k | Open | 28 | $1.07 | 25 | 0.83 | 21.04 | N/A | |
Together.ai | Apriel-v1.6-15B-Thinker | 131k | Open | 28 | $0.00 | 149 | 0.28 | 17.05 | 13.42 | |
Alibaba Cloud | Qwen3 VL 235B A22B | 131k | Open | 27 | $2.63 | 43 | 1.10 | 59.78 | 46.94 | |
Fireworks | Qwen3 VL 235B A22B | 262k | Open | 27 | $0.39 | 46 | 0.45 | 54.63 | 43.34 | |
Novita | Qwen3 VL 235B A22B | 131k | Open | 27 | $1.72 | 36 | 1.01 | 70.60 | 55.67 | |
Mistral | Magistral Medium 1.2 | 128k | Proprietary | 27 | $2.75 | 37 | 0.49 | 67.70 | 53.77 | |
Nebius | DeepSeek R1 0528 | 164k | Open | 27 | $1.20 | 23 | 0.64 | 110.58 | 87.96 | |
Hyperbolic | DeepSeek R1 0528 | 164k | Open | 27 | $3.00 | 69 | 0.96 | 37.00 | 28.84 | |
DeepInfra | DeepSeek R1 0528 | 164k | Open | 27 | $0.91 | 61 | 0.30 | 41.02 | 32.57 | |
SambaNova | DeepSeek R1 0528 | 131k | Open | 27 | $5.50 | 167 | 2.64 | 17.61 | 11.98 | |
Together.ai (Throughput) | DeepSeek R1 0528 (Throughput) | 164k | Open | 27 | $0.96 | 68 | 0.71 | 37.29 | 29.26 | |
Google Vertex | DeepSeek R1 0528 Vertex | 164k | Open | 27 | $2.36 | 215 | 0.31 | 11.93 | 9.29 | |
Together.ai | DeepSeek R1 0528 | 164k | Open | 27 | $4.00 | 310 | 0.39 | 8.45 | 6.44 | |
Nebius Fast, FP4 | DeepSeek R1 0528 Fast, FP4 | 164k | Open | 27 | $3.00 | 192 | 1.15 | 14.20 | 10.44 | |
Novita | DeepSeek R1 0528 | 164k | Open | 27 | $1.15 | 30 | 1.58 | 84.29 | 66.17 | |
Microsoft Azure | DeepSeek R1 0528 | 128k | Open | 27 | $2.36 | 105 | 0.82 | 24.58 | 19.00 | |
Databricks | GPT-5 nano (high) | 400k | Proprietary | 27 | $0.14 | 116 | 139.05 | 143.36 | N/A | |
OpenAI | GPT-5 nano (high) | 400k | Proprietary | 27 | $0.14 | 121 | 112.98 | 117.11 | N/A | |
Microsoft Azure | GPT-5 nano (high) | 400k | Proprietary | 27 | $0.14 | 187 | 83.77 | 86.44 | N/A | |
Google Vertex | Qwen3 Next 80B A3B Vertex | 262k | Open | 27 | $0.41 | 174 | 0.31 | 14.65 | 11.47 | |
Novita | Qwen3 Next 80B A3B | 131k | Open | 27 | $0.49 | 211 | 1.03 | 12.87 | 9.47 | |
Alibaba Cloud | Qwen3 Next 80B A3B | 262k | Open | 27 | $1.88 | 177 | 1.07 | 15.23 | 11.33 | |
Hyperbolic | Qwen3 Next 80B A3B | 262k | Open | 27 | $0.30 | 332 | 0.49 | 8.02 | 6.02 | |
Nebius (FP8) | Qwen3 Next 80B A3B (FP8) | 262k | Open | 27 | $0.41 | 120 | 0.61 | 21.45 | 16.67 | |
Amazon Bedrock | Nova 2.0 Lite (low) | 1m | Proprietary | 25 | $0.85 | 235 | 7.47 | 18.12 | 8.52 | |
DeepInfra (FP8) | Qwen3 Coder 480B (FP8) | 262k | Open | 25 | $0.70 | 25 | 15.56 | 35.45 | N/A | |
Hyperbolic (FP8) | Qwen3 Coder 480B (FP8) | 262k | Open | 25 | $2.00 | 77 | 0.81 | 7.32 | N/A | |
Baseten (FP8) | Qwen3 Coder 480B (FP8) | 262k | Open | 25 | $0.67 | 76 | 0.67 | 7.29 | N/A | |
Together.ai (FP8) | Qwen3 Coder 480B (FP8) | 262k | Open | 25 | $2.00 | 157 | 0.40 | 3.58 | N/A | |
Alibaba Cloud | Qwen3 Coder 480B | 262k | Open | 25 | $3.00 | 48 | 1.71 | 12.05 | N/A | |
Nebius | Qwen3 Coder 480B | 262k | Open | 25 | $0.75 | 60 | 0.61 | 9.02 | N/A | |
Amazon Bedrock | Qwen3 Coder 480B | 262k | Open | 25 | $0.61 | 37 | 0.64 | 14.28 | N/A | |
Google Vertex | Qwen3 Coder 480B Vertex | 262k | Open | 25 | $0.61 | 171 | 0.32 | 3.25 | N/A | |
DeepInfra (Turbo, FP4) | Qwen3 Coder 480B (Turbo, FP4) | 262k | Open | 25 | $0.51 | 77 | 0.23 | 6.72 | N/A | |
Novita | Qwen3 Coder 480B | 262k | Open | 25 | $0.55 | 37 | 0.95 | 14.51 | N/A | |
Novita | gpt-oss-20B (high) | 131k | Open | 25 | $0.07 | 258 | 1.03 | 10.73 | 7.76 | |
Google Vertex | gpt-oss-20B (high) Vertex | 131k | Open | 25 | $0.12 | 427 | 0.16 | 6.01 | 4.68 | |
Amazon Bedrock | gpt-oss-20B (high) | 131k | Open | 25 | $0.13 | 298 | 26.80 | 35.18 | 6.70 | |
Databricks | gpt-oss-20B (high) | 131k | Open | 25 | $0.13 | 229 | 0.40 | 11.34 | 8.75 | |
Groq | gpt-oss-20B (high) | 131k | Open | 25 | $0.13 | 928 | 0.17 | 2.86 | 2.15 | |
Nebius Base | gpt-oss-20B (high) Base | 128k | Open | 25 | $0.09 | 396 | 0.53 | 6.84 | 5.05 | |
DeepInfra | gpt-oss-20B (high) | 131k | Open | 25 | $0.06 | 190 | 0.18 | 13.36 | 10.55 | |
Together.ai | gpt-oss-20B (high) | 131k | Open | 25 | $0.09 | 129 | 0.47 | 19.80 | 15.46 | |
Cloudflare | gpt-oss-20B (high) | 128k | Open | 25 | $0.23 | 138 | 71.76 | 89.81 | 14.45 | |
Hyperbolic | gpt-oss-20B (high) | 131k | Open | 25 | $0.10 | 116 | 0.66 | 22.18 | 17.21 | |
Lightning AI | gpt-oss-20B (high) | 128k | Open | 25 | $0.09 | 309 | 0.39 | 8.48 | 6.47 | |
DeepInfra | NVIDIA Nemotron 3 Nano | 262k | Open | 25 | $0.10 | 166 | 0.33 | 15.42 | 12.07 | |
Parasail | Qwen3 235B 2507 | 262k | Open | 24 | $0.33 | 61 | 0.48 | 8.74 | N/A | |
Scaleway | Qwen3 235B 2507 | 250k | Open | 24 | $1.31 | 77 | 0.73 | 7.27 | N/A | |
Nebius | Qwen3 235B 2507 | 262k | Open | 24 | $0.30 | 124 | 0.60 | 4.64 | N/A | |
Alibaba Cloud | Qwen3 235B 2507 | 131k | Open | 24 | $1.23 | 41 | 1.08 | 13.31 | N/A | |
Together.ai (FP8) | Qwen3 235B 2507 (FP8) | 262k | Open | 24 | $0.30 | 252 | 0.37 | 2.35 | N/A | |
Fireworks (FP8) | Qwen3 235B 2507 (FP8) | 262k | Open | 24 | $0.39 | 42 | 0.91 | 12.82 | N/A | |
Cerebras | Qwen3 235B 2507 | 131k | Open | 24 | $0.75 | 1,373 | 0.29 | 0.65 | N/A | |
Hyperbolic | Qwen3 235B 2507 | 262k | Open | 24 | $2.00 | 29 | 0.69 | 17.66 | N/A | |
Baseten (FP8) | Qwen3 235B 2507 (FP8) | 262k | Open | 24 | $0.36 | 85 | 0.68 | 6.57 | N/A | |
Amazon Bedrock | Qwen3 235B 2507 | 256k | Open | 24 | $0.39 | 60 | 0.64 | 9.02 | N/A | |
DeepInfra | Qwen3 235B 2507 | 262k | Open | 24 | $0.17 | 21 | 0.38 | 23.69 | N/A | |
Google Vertex | Qwen3 235B 2507 Vertex | 262k | Open | 24 | $0.39 | 121 | 0.37 | 4.49 | N/A | |
Novita | Qwen3 235B 2507 | 131k | Open | 24 | $0.21 | 29 | 1.40 | 18.87 | N/A | |
Amazon Bedrock | Nova 2.0 Omni (low) | 1m | Proprietary | 24 | $0.85 | 238 | 1.94 | 12.46 | 8.42 | |
Novita | gpt-oss-120B (low) | 131k | Open | 24 | $0.20 | 97 | 0.43 | 26.31 | 20.71 | |
Baseten | gpt-oss-120B (low) | 128k | Open | 24 | $0.20 | 242 | 0.59 | 10.91 | 8.25 | |
Lightning AI | gpt-oss-120B (low) | 128k | Open | 24 | $0.17 | 732 | 0.17 | 3.58 | 2.73 | |
Amazon Bedrock | gpt-oss-120B (low) | 131k | Open | 24 | $0.26 | 206 | 0.58 | 12.72 | 9.71 | |
Parasail | gpt-oss-120B (low) | 131k | Open | 24 | $0.26 | 350 | 0.40 | 7.54 | 5.71 | |
Databricks | gpt-oss-120B (low) | 128k | Open | 24 | $0.26 | 165 | 0.44 | 15.58 | 12.11 | |
Together.ai | gpt-oss-120B (low) | 131k | Open | 24 | $0.26 | 884 | 0.88 | 3.70 | 2.26 | |
Hyperbolic | gpt-oss-120B (low) | 131k | Open | 24 | $0.30 | 422 | 0.64 | 6.57 | 4.74 | |
Eigen AI | gpt-oss-120B (low) | 131k | Open | 24 | $0.20 | 732 | 1.71 | 5.13 | 2.73 | |
Fireworks | gpt-oss-120B (low) | 131k | Open | 24 | $0.26 | 848 | 0.29 | 3.24 | 2.36 | |
Groq | gpt-oss-120B (low) | 131k | Open | 24 | $0.26 | 465 | 0.16 | 5.54 | 4.30 | |
Cerebras | gpt-oss-120B (low) | 131k | Open | 24 | $0.45 | 2,757 | 0.24 | 1.14 | 0.73 | |
Clarifai | gpt-oss-120B (low) | 131k | Open | 24 | $0.16 | 274 | 0.23 | 9.36 | 7.30 | |
SambaNova | gpt-oss-120B (low) | 131k | Open | 24 | $0.31 | 608 | 0.94 | 5.05 | 3.29 | |
Nebius Base | gpt-oss-120B (low) Base | 128k | Open | 24 | $0.26 | 219 | 0.53 | 11.93 | 9.12 | |
Snowflake | gpt-oss-120B (low) | 131k | Open | 24 | $0.22 | 302 | 0.56 | 8.83 | 6.62 | |
Google Vertex | gpt-oss-120B (low) Vertex | 131k | Open | 24 | $0.16 | 231 | 0.26 | 11.07 | 8.65 | |
Cloudflare | gpt-oss-120B (low) | 128k | Open | 24 | $0.45 | 110 | 10.09 | 32.77 | 18.14 | |
Microsoft Azure | gpt-oss-120B (low) | 131k | Open | 24 | $0.26 | 317 | 0.39 | 8.28 | 6.31 | |
SiliconFlow | GLM-4.6V | 128k | Open | 24 | $0.45 | 24 | 1.36 | 104.97 | 82.89 | |
DeepInfra (FP8) | GLM-4.6V (FP8) | 131k | Open | 24 | $0.45 | 45 | 0.27 | 55.65 | 44.30 | |
Parasail (FP8) | GLM-4.6V (FP8) | 131k | Open | 24 | $0.45 | 102 | 0.62 | 25.20 | 19.67 | |
Novita | GLM-4.6V | 131k | Open | 24 | $0.45 | 58 | 0.76 | 44.17 | 34.73 | |
xAI | Grok 4.1 Fast | 2m | Proprietary | 23 | $0.28 | 137 | 0.84 | 4.49 | N/A | |
DeepInfra | GLM-4.5-Air | 131k | Open | 23 | $0.42 | 188 | 0.18 | 13.45 | 10.61 | |
Nebius Base | GLM-4.5-Air Base | 128k | Open | 23 | $0.45 | 96 | 0.56 | 26.72 | 20.93 | |
SiliconFlow | GLM-4.5-Air | 128k | Open | 23 | $0.32 | 81 | 1.37 | 32.35 | 24.79 | |
FriendliAI | K-EXAONE | 256k | Open | 23 | $0.00 | 104 | 0.30 | 5.10 | N/A | |
Amazon Bedrock | Nova 2.0 Pro Preview | 256k | Proprietary | 23 | $3.44 | 163 | 0.48 | 3.56 | N/A | |
Microsoft Azure | Grok 4 Fast | 2m | Proprietary | 23 | $0.28 | 113 | 0.44 | 4.88 | N/A | |
xAI | Grok 4 Fast | 2m | Proprietary | 23 | $0.28 | 129 | 0.59 | 4.46 | N/A | |
Nebius | Qwen3 30B A3B 2507 | 262k | Open | 23 | $0.15 | 113 | 0.63 | 22.66 | 17.63 | |
Alibaba Cloud | Qwen3 30B A3B 2507 | 262k | Open | 23 | $0.75 | 164 | 0.99 | 16.24 | 12.20 | |
Clarifai | Qwen3 30B A3B 2507 | 262k | Open | 23 | $0.59 | 141 | 0.24 | 17.99 | 14.20 | |
Amazon Bedrock | Mistral Large 3 | 256k | Open | 22 | $0.75 | 89 | 0.63 | 6.28 | N/A | |
Mistral | Mistral Large 3 | 256k | Open | 22 | $0.75 | 48 | 0.53 | 10.84 | N/A | |
Nebius (FP8) | INTELLECT-3 (FP8) | 128k | Open | 22 | $0.42 | 85 | 0.57 | 30.04 | 23.57 | |
Google (AI Studio) | Gemini 2.5 Flash-Lite (Sep) (AI Studio) | 1m | Proprietary | 22 | $0.17 | 550 | 4.22 | 5.13 | N/A | |
Mistral | Devstral 2 | 256k | Open | 22 | $0.00 | 53 | 0.45 | 9.82 | N/A | |
Mistral | Mistral Medium 3.1 | 131k | Proprietary | 21 | $0.80 | 100 | 0.40 | 5.38 | N/A | |
Google Vertex | gpt-oss-20B (low) Vertex | 131k | Open | 21 | $0.12 | 455 | 0.16 | 5.65 | 4.40 | |
Amazon Bedrock | gpt-oss-20B (low) | 131k | Open | 21 | $0.13 | 187 | 5.47 | 18.82 | 10.67 | |
Novita | gpt-oss-20B (low) | 131k | Open | 21 | $0.07 | 273 | 1.20 | 10.34 | 7.32 | |
Together.ai | gpt-oss-20B (low) | 131k | Open | 21 | $0.09 | 815 | 0.81 | 3.88 | 2.45 | |
Databricks | gpt-oss-20B (low) | 131k | Open | 21 | $0.13 | 231 | 0.40 | 11.22 | 8.66 | |
Groq | gpt-oss-20B (low) | 131k | Open | 21 | $0.13 | 914 | 0.17 | 2.91 | 2.19 | |
Nebius Base | gpt-oss-20B (low) Base | 128k | Open | 21 | $0.09 | 251 | 0.58 | 10.53 | 7.96 | |
Cloudflare | gpt-oss-20B (low) | 128k | Open | 21 | $0.23 | 113 | 9.28 | 31.46 | 17.74 | |
Lightning AI | gpt-oss-20B (low) | 128k | Open | 21 | $0.09 | 308 | 0.40 | 8.51 | 6.49 | |
Hyperbolic | gpt-oss-20B (low) | 131k | Open | 21 | $0.10 | 112 | 0.66 | 22.95 | 17.84 | |
DeepInfra (FP8) | Qwen3 VL 235B A22B (FP8) | 262k | Open | 21 | $0.45 | 23 | 0.30 | 21.93 | N/A | |
Novita | Qwen3 VL 235B A22B | 131k | Open | 21 | $0.60 | 30 | 0.83 | 17.32 | N/A | |
Alibaba Cloud | Qwen3 VL 235B A22B | 131k | Open | 21 | $1.23 | 36 | 1.08 | 14.88 | N/A | |
Parasail (FP8) | Qwen3 VL 235B A22B (FP8) | 131k | Open | 21 | $1.00 | 32 | 0.53 | 16.22 | N/A | |
Eigen AI | Qwen3 VL 235B A22B | 262k | Open | 21 | $1.00 | 70 | 1.65 | 8.75 | N/A | |
Fireworks | Qwen3 VL 235B A22B | 262k | Open | 21 | $0.39 | 47 | 0.63 | 11.28 | N/A | |
GMI (FP8) | Qwen3 VL 235B A22B (FP8) | 262k | Open | 21 | $0.23 | 71 | 1.36 | 8.40 | N/A | |
Hyperbolic | Qwen3 Next 80B A3B | 262k | Open | 20 | $0.30 | 221 | 0.46 | 2.73 | N/A | |
Novita | Qwen3 Next 80B A3B | 131k | Open | 20 | $0.49 | 116 | 0.89 | 5.19 | N/A | |
Parasail | Qwen3 Next 80B A3B | 262k | Open | 20 | $0.46 | 96 | 0.46 | 5.69 | N/A | |
Alibaba Cloud | Qwen3 Next 80B A3B | 131k | Open | 20 | $0.88 | 174 | 1.13 | 3.99 | N/A | |
Google Vertex | Qwen3 Next 80B A3B Vertex | 262k | Open | 20 | $0.41 | 240 | 0.28 | 2.36 | N/A | |
DeepInfra | Qwen3 Next 80B A3B | 262k | Open | 20 | $0.34 | 122 | 0.32 | 4.43 | N/A | |
GMI | Qwen3 Next 80B A3B | 262k | Open | 20 | $0.20 | 228 | 1.25 | 3.45 | N/A | |
Nebius | Qwen3 Coder 30B A3B | 262k | Open | 20 | $0.15 | 73 | 0.55 | 7.43 | N/A | |
Alibaba Cloud | Qwen3 Coder 30B A3B | 262k | Open | 20 | $0.90 | 99 | 1.46 | 6.50 | N/A | |
DeepInfra (FP8) | Qwen3 Coder 30B A3B (FP8) | 262k | Open | 20 | $0.12 | 42 | 0.22 | 12.22 | N/A | |
Scaleway | Qwen3 Coder 30B A3B | 128k | Open | 20 | $0.41 | 88 | 0.62 | 6.29 | N/A | |
Amazon Bedrock | Qwen3 Coder 30B A3B | 262k | Open | 20 | $0.26 | 112 | 0.64 | 5.10 | N/A | |
Google (AI Studio) | Gemini 2.5 Flash-Lite (Sep) (AI Studio) | 1m | Proprietary | 20 | $0.17 | 455 | 0.30 | 1.40 | N/A | |
Alibaba Cloud | Qwen3 VL 30B A3B | 131k | Open | 20 | $0.75 | 103 | 0.91 | 25.09 | 19.34 | |
Fireworks | Qwen3 VL 30B A3B | 262k | Open | 20 | $0.50 | 153 | 0.53 | 16.87 | 13.07 | |
Novita | Qwen3 VL 30B A3B | 131k | Open | 20 | $0.40 | 96 | 0.79 | 26.89 | 20.87 | |
Mistral | Devstral Small 2 | 256k | Open | 19 | $0.00 | 196 | 0.35 | 2.90 | N/A | |
DeepInfra | Llama Nemotron Super 49B v1.5 | 131k | Open | 19 | $0.17 | 77 | 0.23 | 32.77 | 26.03 | |
Amazon Bedrock | Nova Premier | 1m | Proprietary | 19 | $5.00 | 80 | 0.80 | 7.08 | N/A | |
Mistral | Devstral Medium | 131k | Proprietary | 19 | $0.80 | 113 | 0.44 | 4.86 | N/A | |
Parasail (FP8) | Llama 4 Maverick (FP8) | 1m | Open | 19 | $0.35 | 138 | 0.37 | 4.00 | N/A | |
Google Vertex | Llama 4 Maverick Vertex | 524k | Open | 19 | $0.55 | 244 | 0.38 | 2.43 | N/A | |
DeepInfra (Turbo, FP8) | Llama 4 Maverick (Turbo, FP8) | 8k | Open | 19 | $0.50 | 60 | 0.29 | 8.65 | N/A | |
Together.ai | Llama 4 Maverick | 1m | Open | 19 | $0.41 | 71 | 0.41 | 7.49 | N/A | |
Microsoft Azure (FP8) | Llama 4 Maverick (FP8) | 128k | Open | 19 | $0.61 | 245 | 3.77 | 5.81 | N/A | |
Amazon Bedrock | Llama 4 Maverick | 128k | Open | 19 | $0.42 | 221 | 0.49 | 2.75 | N/A | |
Databricks | Llama 4 Maverick | 131k | Open | 19 | $0.75 | 100 | 0.61 | 5.60 | N/A | |
DeepInfra (FP8) | Llama 4 Maverick (FP8) | 1m | Open | 19 | $0.26 | 60 | 0.24 | 8.56 | N/A | |
SambaNova | Llama 4 Maverick | 131k | Open | 19 | $0.92 | 688 | 0.80 | 1.53 | N/A | |
Groq | Llama 4 Maverick | 131k | Open | 19 | $0.30 | 453 | 0.64 | 1.74 | N/A | |
Snowflake | Llama 4 Maverick | 131k | Open | 19 | $0.50 | 130 | 0.46 | 4.31 | N/A | |
Novita (FP8) | Llama 4 Maverick (FP8) | 1m | Open | 19 | $0.41 | 60 | 0.47 | 8.79 | N/A | |
Amazon Bedrock | Nova 2.0 Lite | 1m | Proprietary | 18 | $0.85 | 223 | 0.52 | 2.76 | N/A | |
SiliconFlow | GLM-4.6V | 128k | Open | 17 | $0.45 | 26 | 55.50 | 74.90 | N/A | |
Parasail (FP8) | GLM-4.6V (FP8) | 131k | Open | 17 | $0.45 | 99 | 0.59 | 5.64 | N/A | |
Novita | GLM-4.6V | 131k | Open | 17 | $0.45 | 57 | 0.75 | 9.55 | N/A | |
Alibaba Cloud | Qwen3 VL 32B | 131k | Open | 17 | $1.23 | 46 | 0.96 | 11.92 | N/A | |
Together.ai | Qwen3 VL 32B | 262k | Open | 17 | $0.75 | 54 | 0.94 | 10.20 | N/A | |
FriendliAI | EXAONE 4.0 32B | 131k | Open | 17 | $0.70 | 100 | 0.31 | 25.21 | 19.93 | |
Amazon Bedrock | Nova 2.0 Omni | 1m | Proprietary | 17 | $0.85 | 226 | 0.74 | 2.95 | N/A | |
Alibaba Cloud | Qwen3 VL 8B | 131k | Open | 17 | $0.66 | 65 | 0.94 | 39.29 | 30.68 | |
Fireworks | Qwen3 VL 30B A3B | 262k | Open | 16 | $0.50 | 114 | 0.47 | 4.85 | N/A | |
Novita | Qwen3 VL 30B A3B | 131k | Open | 16 | $0.33 | 86 | 0.71 | 6.50 | N/A | |
DeepInfra (FP8) | Qwen3 VL 30B A3B (FP8) | 262k | Open | 16 | $0.26 | 45 | 0.23 | 11.25 | N/A | |
Alibaba Cloud | Qwen3 VL 30B A3B | 131k | Open | 16 | $0.35 | 98 | 0.91 | 6.03 | N/A | |
Mistral | Ministral 14B (Dec '25) | 256k | Open | 16 | $0.20 | 142 | 0.29 | 3.80 | N/A | |
Amazon Bedrock | Ministral 14B (Dec '25) | 256k | Open | 16 | $0.20 | 170 | 0.58 | 3.52 | N/A | |
Alibaba Cloud | Qwen3 Omni 30B A3B | 66k | Open | 16 | $0.43 | 97 | 0.95 | 26.85 | 20.72 | |
Mistral | Devstral Small | 131k | Open | 15 | $0.15 | 228 | 0.36 | 2.56 | N/A | |
DeepInfra | Devstral Small | 128k | Open | 15 | $0.12 | 54 | 0.27 | 9.55 | N/A | |
Nebius | Qwen3 30B A3B 2507 | 262k | Open | 15 | $0.15 | 93 | 0.61 | 5.97 | N/A | |
Alibaba Cloud | Qwen3 30B A3B 2507 | 262k | Open | 15 | $0.35 | 76 | 1.03 | 7.57 | N/A | |
Clarifai | Qwen3 30B A3B 2507 | 262k | Open | 15 | $0.35 | 106 | 0.26 | 4.99 | N/A | |
DeepInfra | NVIDIA Nemotron Nano 9B V2 | 131k | Open | 15 | $0.07 | 102 | 0.21 | 24.77 | 19.65 | |
DeepInfra | Llama Nemotron Super 49B v1.5 | 131k | Open | 15 | $0.17 | 70 | 0.22 | 7.39 | N/A | |
DeepInfra (FP8) | Mistral Small 3.2 (FP8) | 128k | Open | 15 | $0.11 | 59 | 0.24 | 8.67 | N/A | |
Mistral | Mistral Small 3.2 | 131k | Open | 15 | $0.15 | 101 | 0.29 | 5.23 | N/A | |
Mistral | Ministral 8B (Dec '25) | 256k | Open | 15 | $0.15 | 194 | 0.27 | 2.85 | N/A | |
Amazon Bedrock | Ministral 8B (Dec '25) | 256k | Open | 15 | $0.15 | 229 | 0.55 | 2.74 | N/A | |
SambaNova | Llama 3.3 70B | 128k | Open | 15 | $0.75 | 366 | 0.44 | 1.80 | N/A | |
Parasail (FP8) | Llama 3.3 70B (FP8) | 131k | Open | 15 | $0.28 | 32 | 0.49 | 15.99 | N/A | |
Amazon Bedrock | Llama 3.3 70B | 128k | Open | 15 | $0.71 | 148 | 0.45 | 3.82 | N/A | |
Groq | Llama 3.3 70B | 131k | Open | 15 | $0.64 | 347 | 0.18 | 1.62 | N/A | |
Nebius Fast | Llama 3.3 70B Fast | 128k | Open | 15 | $0.38 | 131 | 0.58 | 4.39 | N/A | |
Nebius Base | Llama 3.3 70B Base | 128k | Open | 15 | $0.20 | 19 | 0.77 | 27.13 | N/A | |
Snowflake Snowflake | Llama 3.3 70B Snowflake | 8k | Open | 15 | $0.58 | 137 | 0.46 | 4.12 | N/A | |
Fireworks | Llama 3.3 70B | 131k | Open | 15 | $0.90 | 123 | 0.47 | 4.53 | N/A | |
FriendliAI | Llama 3.3 70B | 128k | Open | 15 | $0.60 | 102 | 0.29 | 5.21 | N/A | |
Cerebras | Llama 3.3 70B | 128k | Open | 15 | $0.94 | 2,106 | 0.34 | 0.57 | N/A | |
Hyperbolic | Llama 3.3 70B | 131k | Open | 15 | $0.40 | 101 | 0.77 | 5.70 | N/A | |
Google Vertex | Llama 3.3 70B Vertex | 128k | Open | 15 | $0.72 | 158 | 0.19 | 3.36 | N/A | |
Cloudflare | Llama 3.3 70B | 24k | Open | 15 | $0.78 | 39 | 0.54 | 13.28 | N/A | |
CompactifAI | Llama 3.3 70B | 128k | Open | 15 | $0.40 | 144 | 0.29 | 3.76 | N/A | |
Databricks | Llama 3.3 70B | 128k | Open | 15 | $0.75 | 67 | 0.42 | 7.84 | N/A | |
Together.ai Turbo | Llama 3.3 70B Turbo | 131k | Open | 15 | $0.88 | 106 | 0.50 | 5.24 | N/A | |
Lightning AI | Llama 3.3 70B | 128k | Open | 15 | $0.30 | 150 | 0.19 | 3.54 | N/A | |
Microsoft Azure | Llama 3.3 70B | 128k | Open | 15 | $0.71 | 44 | 1.58 | 12.99 | N/A | |
Scaleway | Llama 3.3 70B | 100k | Open | 15 | $1.05 | 59 | 0.68 | 9.16 | N/A | |
DeepInfra (Turbo, FP8) | Llama 3.3 70B (Turbo, FP8) | 131k | Open | 15 | $0.15 | 26 | 0.33 | 19.55 | N/A | |
Novita | Llama 3.3 70B | 131k | Open | 15 | $0.20 | 26 | 0.83 | 19.86 | N/A | |
DeepInfra (FP8) | NVIDIA Nemotron Nano 12B v2 VL (FP8) | 131k | Open | 15 | $0.30 | 129 | 0.20 | 19.53 | 15.46 | |
Alibaba Cloud | Qwen3 VL 8B | 131k | Open | 15 | $0.31 | 109 | 0.89 | 5.48 | N/A | |
Together.ai Turbo | Llama 3.1 405B Turbo | 10k | Open | 14 | $3.50 | 25 | 0.52 | 20.72 | N/A | |
Amazon Bedrock Latency Optimized | Llama 3.1 405B Latency Optimized | 128k | Open | 14 | $3.00 | 73 | 0.44 | 7.27 | N/A | |
Amazon Bedrock Standard | Llama 3.1 405B Standard | 128k | Open | 14 | $2.40 | 24 | 1.80 | 22.70 | N/A | |
Replicate | Llama 3.1 405B | 128k | Open | 14 | $9.50 | 24 | 1.05 | 21.66 | N/A | |
Google Vertex | Llama 3.1 405B Vertex | 128k | Open | 14 | $7.75 | 25 | 0.39 | 20.65 | N/A | |
Hyperbolic | Llama 3.1 405B | 131k | Open | 14 | $4.00 | 25 | 0.90 | 21.29 | N/A | |
Databricks | Llama 3.1 405B | 128k | Open | 14 | $4.38 | 35 | 0.81 | 15.04 | N/A | |
Microsoft Azure | Llama 3.1 405B | 128k | Open | 14 | $8.00 | 25 | 0.45 | 20.32 | N/A | |
Google Vertex | Llama 4 Scout Vertex | 1m | Open | 14 | $0.36 | 179 | 0.41 | 3.21 | N/A | |
Microsoft Azure | Llama 4 Scout | 128k | Open | 14 | $0.34 | 209 | 4.03 | 6.43 | N/A | |
CompactifAI | Llama 4 Scout | 10m | Open | 14 | $0.11 | 112 | 0.61 | 5.06 | N/A | |
Amazon Bedrock | Llama 4 Scout | 128k | Open | 14 | $0.29 | 195 | 0.54 | 3.10 | N/A | |
Together.ai | Llama 4 Scout | 1m | Open | 14 | $0.28 | 78 | 0.36 | 6.81 | N/A | |
Cloudflare | Llama 4 Scout | 131k | Open | 14 | $0.41 | 115 | 0.36 | 4.70 | N/A | |
Groq | Llama 4 Scout | 131k | Open | 14 | $0.17 | 429 | 0.22 | 1.39 | N/A | |
DeepInfra | Llama 4 Scout | 328k | Open | 14 | $0.14 | 56 | 0.30 | 9.21 | N/A | |
Novita | Llama 4 Scout | 131k | Open | 14 | $0.28 | 61 | 0.48 | 8.72 | N/A | |
DeepInfra | Llama 3.1 Nemotron 70B | 131k | Open | 14 | $1.20 | 40 | 0.28 | 12.81 | N/A | |
DeepInfra | NVIDIA Nemotron 3 Nano | 262k | Open | 14 | $0.10 | 169 | 0.33 | 3.28 | N/A | |
Microsoft Azure | Command A | 256k | Open | 13 | $4.38 | 45 | 0.50 | 11.72 | N/A | |
Cohere | Command A | 288k | Open | 13 | $4.38 | 58 | 0.30 | 8.91 | N/A | |
Together.ai | NVIDIA Nemotron Nano 9B V2 | 131k | Open | 13 | $0.11 | 113 | 0.24 | 4.69 | N/A | |
DeepInfra | NVIDIA Nemotron Nano 9B V2 | 131k | Open | 13 | $0.07 | 94 | 0.24 | 5.55 | N/A | |
Amazon Bedrock | NVIDIA Nemotron Nano 9B V2 | 131k | Open | 13 | $0.10 | 74 | 0.64 | 7.41 | N/A | |
Mistral | Ministral 3B (Dec '25) | 256k | Open | 12 | $0.10 | 298 | 0.27 | 1.95 | N/A | |
Amazon Bedrock | Ministral 3B (Dec '25) | 256k | Open | 12 | $10.00 | 351 | 0.56 | 1.98 | N/A | |
FriendliAI | EXAONE 4.0 32B | 131k | Open | 12 | $0.70 | 87 | 0.30 | 6.06 | N/A | |
Replicate | Granite 4.0 H Small | 128k | Open | 11 | $0.11 | 239 | 8.88 | 10.98 | N/A | |
Alibaba Cloud | Qwen3 Omni 30B A3B | 66k | Open | 11 | $0.43 | 88 | 0.92 | 6.58 | N/A | |
Google (AI Studio) | Gemma 3 27B (AI Studio) | 128k | Open | 10 | $0.00 | 42 | 3.64 | 15.61 | N/A | |
Novita | Gemma 3 27B | 98k | Open | 10 | $0.14 | 41 | 0.76 | 13.09 | N/A | |
Parasail | Gemma 3 27B | 131k | Open | 10 | $0.29 | 53 | 0.40 | 9.77 | N/A | |
DeepInfra | Gemma 3 27B | 131k | Open | 10 | $0.11 | 35 | 0.33 | 14.44 | N/A | |
Amazon Bedrock | Gemma 3 27B | 128k | Open | 10 | $0.27 | 55 | 0.63 | 9.66 | N/A | |
GMI | Gemma 3 27B | 131k | Open | 10 | $0.04 | 31 | 1.34 | 17.65 | N/A | |
Amazon Bedrock | NVIDIA Nemotron Nano 12B v2 VL | 128k | Open | 10 | $0.30 | 71 | 0.58 | 7.66 | N/A | |
Nebius | NVIDIA Nemotron Nano 12B v2 VL | 128k | Open | 10 | $0.10 | 132 | 0.56 | 4.34 | N/A | |
DeepInfra (FP8) | NVIDIA Nemotron Nano 12B v2 VL (FP8) | 131k | Open | 10 | $0.30 | 126 | 0.20 | 4.17 | N/A | |
AI21 Labs | Jamba 1.7 Large | 256k | Open | 9 | $3.50 | 46 | 0.87 | 11.69 | N/A | |
Google (AI Studio) | Gemma 3 12B (AI Studio) | 128k | Open | 9 | $0.00 | 42 | 28.71 | 40.63 | N/A | |
DeepInfra | Gemma 3 12B | 131k | Open | 9 | $0.06 | 44 | 0.30 | 11.64 | N/A | |
Databricks | Gemma 3 12B | 128k | Open | 9 | $0.24 | 112 | 0.64 | 5.12 | N/A | |
Amazon Bedrock | Gemma 3 12B | 128k | Open | 9 | $0.14 | 77 | 0.56 | 7.07 | N/A | |
Cloudflare | Gemma 3 12B | 80k | Open | 9 | $0.40 | 80 | 0.39 | 6.61 | N/A | |
Parasail | Olmo 3 7B | 66k | Open | 8 | $0.13 | 32 | 0.43 | 16.06 | N/A | |
AI21 Labs | Jamba 1.7 Mini | 258k | Open | 7 | $0.25 | 124 | 0.65 | 4.68 | N/A | |
Alibaba Cloud | Qwen3 1.7B | 31k | Open | 7 | $0.19 | 116 | 0.90 | 5.20 | N/A | |
Google (AI Studio) | Gemma 3 4B (AI Studio) | 128k | Open | 6 | $0.00 | 40 | 0.94 | 13.52 | N/A | |
Amazon Bedrock | Gemma 3 4B | 128k | Open | 6 | $0.05 | 176 | 0.58 | 3.42 | N/A | |
DeepInfra | Gemma 3 4B | 131k | Open | 6 | $0.05 | 50 | 0.33 | 10.25 | N/A | |
Together.ai | Gemma 3n E4B | 33k | Open | 6 | $0.03 | 61 | 0.37 | 8.62 | N/A | |
Alibaba Cloud | Qwen3 0.6B | 33k | Open | 6 | $0.19 | 188 | 0.89 | 3.56 | N/A | |
Microsoft Azure | o3 | 200k | Proprietary | | $3.50 | 85 | 36.39 | 42.26 | N/A | |
OpenAI | o3 | 200k | Proprietary | | $3.50 | 270 | 11.80 | 13.66 | N/A | |
Google Vertex | Llama 3.2 90B (Vision) Vertex | 128k | Open | | $0.00 | 21 | 0.18 | 24.43 | N/A | |
Amazon Bedrock | Llama 3.2 90B (Vision) | 128k | Open | | $0.72 | 51 | 0.46 | 10.21 | N/A | |
Microsoft Azure | Llama 3.2 90B (Vision) | 128k | Open | | $2.04 | 34 | 0.33 | 15.11 | N/A | |
DeepInfra | Llama 3.2 90B (Vision) | 33k | Open | | $0.36 | 61 | 0.27 | 8.47 | N/A | |
Amazon Bedrock | Llama 3.2 11B (Vision) | 128k | Open | | $0.16 | 142 | 0.38 | 3.91 | N/A | |
Microsoft Azure | Llama 3.2 11B (Vision) | 128k | Open | | $0.37 | 70 | 0.38 | 7.51 | N/A | |
DeepInfra | Llama 3.2 11B (Vision) | 131k | Open | | $0.05 | 6 | 2.30 | 79.63 | N/A | |
Google (AI Studio) | Gemma 3 1B (AI Studio) | 32k | Open | | $0.00 | 38 | 0.51 | 13.67 | N/A | |
Google (AI Studio) | Gemma 3n E2B (AI Studio) | 32k | Open | | $0.00 | 42 | 0.37 | 12.38 | N/A | |
Mistral | Magistral Small 1.2 | 128k | Open | | $0.75 | 204 | 0.33 | 12.58 | 9.80 | |
Amazon Bedrock | Magistral Small 1.2 | 128k | Open | | $0.75 | 43 | 0.60 | 59.16 | 46.85 | |
SambaNova | DeepSeek R1 Distill Llama 70B | 131k | Open | | $0.88 | 355 | 0.82 | 7.87 | 5.64 | |
DeepInfra | DeepSeek R1 Distill Llama 70B | 131k | Open | | $0.75 | 42 | 0.30 | 60.09 | 47.83 | |
Scaleway | DeepSeek R1 Distill Llama 70B | 32k | Open | | $1.05 | 23 | 0.99 | 110.94 | 87.96 | |
DeepInfra | DeepSeek-OCR | 8k | Open | | $0.05 | 319 | 0.17 | 1.74 | N/A | |
Novita | DeepSeek-OCR | 8k | Open | | $0.03 | 330 | 0.75 | 2.27 | N/A | |
Novita | DeepSeek R1 0528 Qwen3 8B | 128k | Open | | $0.07 | 68 | 1.18 | 38.20 | 29.62 | |
Parasail (FP8) | DeepSeek V3.2 Speciale (FP8) | 164k | Open | | $0.42 | 27 | 0.78 | 92.20 | 73.13 | |
xAI | Grok Code Fast 1 | 256k | Proprietary | | $0.53 | 205 | 8.63 | 11.07 | N/A | |
Amazon Bedrock | Nova Micro | 130k | Proprietary | | $0.06 | 462 | 0.36 | 1.44 | N/A | |
DeepInfra | Phi-4 | 16k | Open | | $0.09 | 28 | 0.28 | 17.87 | N/A | |
Microsoft Azure | Phi-4 | 16k | Open | | $0.22 | 7 | 0.83 | 77.03 | N/A | |
Microsoft Azure | Phi-4 Multimodal | 128k | Open | | $0.00 | 17 | 0.32 | 29.26 | N/A | |
Microsoft Azure | Phi-4 Mini | 128k | Open | | $0.00 | 45 | 0.31 | 11.42 | N/A | |
Nebius Base | Llama Nemotron Ultra Base | 131k | Open | | $0.90 | 38 | 0.72 | 66.68 | 52.77 | |
Parasail | Olmo 3.1 32B Think | 66k | Open | | $0.00 | 60 | 0.54 | 42.21 | 33.34 | |
Parasail | Olmo 3 7B Think | 66k | Open | | $0.14 | 110 | 0.52 | 23.21 | 18.15 | |
Reka AI | Reka Flash 3 | 128k | Open | | $0.35 | 49 | 1.32 | 52.36 | 40.83 | |
Nebius (FP8) | Hermes 4 405B (FP8) | 128k | Open | | $1.50 | 31 | 0.73 | 16.79 | N/A | |
Nebius (FP8) | Hermes 4 405B (FP8) | 128k | Open | | $1.50 | 33 | 0.77 | 76.24 | 60.38 | |
Nebius (FP8) | Hermes 4 70B (FP8) | 128k | Open | | $0.20 | 72 | 0.57 | 7.53 | N/A | |
Nebius (FP8) | Hermes 4 70B (FP8) | 128k | Open | | $0.20 | 80 | 0.60 | 31.88 | 25.03 | |
Novita | ERNIE 4.5 300B A47B | 123k | Open | | $0.48 | 22 | 1.71 | 24.49 | N/A | |
SiliconFlow | ERNIE 4.5 300B A47B | 131k | Open | | $0.48 | 36 | 2.03 | 15.80 | N/A | |
Together.ai | Cogito v2.1 | 164k | Open | | $1.25 | 74 | 0.31 | 34.03 | 26.98 | |
Alibaba Cloud | Qwen3 1.7B | 33k | Open | | $0.40 | 126 | 0.90 | 20.73 | 15.86 | |
Alibaba Cloud | Qwen3 0.6B | 33k | Open | | $0.40 | 201 | 0.90 | 13.36 | 9.97 | |
Alibaba Cloud | Qwen3 VL 32B | 131k | Open | | $2.63 | 48 | 1.08 | 52.92 | 41.47 | |
SiliconFlow | Ling-flash-2.0 | 131k | Open | | $0.25 | 56 | 1.51 | 10.51 | N/A | |
SiliconFlow | Ring-flash-2.0 | 131k | Open | | $0.25 | 86 | 1.38 | 30.53 | 23.32 | |
SiliconFlow | Ling-mini-2.0 | 131k | Open | | $0.12 | 181 | 1.63 | 4.39 | N/A | |
Key definitions
Context window: Maximum number of combined input and output tokens. Output tokens commonly have a significantly lower limit, which varies by model.
Output speed: Tokens per second received while the model is generating tokens (i.e. after the first chunk has been received from the API, for models that support streaming).
Latency: Time to first token received, in seconds, after the API request is sent. For reasoning models that share reasoning tokens, this is the first reasoning token. For models that do not support streaming, this is the time to receive the completion.
Price: Price per token, in USD per million tokens. Price is a blend of input and output token prices (3:1 ratio).
Output price: Price per token generated by the model (received from the API), in USD per million tokens.
Input price: Price per token included in the request/message sent to the API, in USD per million tokens.
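As a rough illustration of the 3:1 blend mentioned in the price definitions above, the sketch below weights the input price three times as heavily as the output price. The ordering of the ratio (input:output), the function name `blended_price`, and the example prices are assumptions made for illustration; they are not taken from the leaderboard.

```python
def blended_price(input_price: float, output_price: float,
                  input_weight: float = 3.0, output_weight: float = 1.0) -> float:
    """Weighted average of input and output prices, in USD per million tokens.

    Assumes the 3:1 ratio means three parts input to one part output.
    """
    total_weight = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total_weight


# Hypothetical prices: $1.00 per million input tokens, $5.00 per million output tokens.
example = blended_price(1.00, 5.00)
print(f"${example:.2f}")  # (3 * 1.00 + 1 * 5.00) / 4 = $2.00
```

Because input tokens carry three quarters of the weight under this assumption, a provider with cheap input but expensive output can still show a low blended price.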
Metrics are 'live' and are based on the past 72 hours of measurements. Measurements are taken 8 times a day for single requests and 2 times per day for parallel requests.
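The time-to-first-token and tokens-per-second definitions above can be sketched from the arrival times of streamed chunks. This is a minimal illustration, not the site's measurement code: the `Chunk` type, the function names, and the sample timings are hypothetical, and it assumes output speed counts only tokens that arrive after the first chunk.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Chunk:
    arrival_s: float  # seconds elapsed since the API request was sent
    tokens: int       # tokens carried by this chunk


def time_to_first_token(chunks: List[Chunk]) -> float:
    """Latency: seconds from sending the request to receiving the first chunk."""
    return chunks[0].arrival_s


def output_speed(chunks: List[Chunk]) -> float:
    """Tokens per second, measured only after the first chunk has arrived."""
    if len(chunks) < 2:
        raise ValueError("need at least two chunks to measure generation speed")
    elapsed = chunks[-1].arrival_s - chunks[0].arrival_s
    if elapsed <= 0:
        raise ValueError("chunk arrival times must be increasing")
    tokens_after_first = sum(c.tokens for c in chunks[1:])
    return tokens_after_first / elapsed


# Hypothetical stream: first chunk at 0.5 s, then 50 tokens at 1.0 s and 1.5 s.
stream = [Chunk(0.5, 10), Chunk(1.0, 50), Chunk(1.5, 50)]
latency = time_to_first_token(stream)  # 0.5 s
speed = output_speed(stream)           # 100 tokens over 1.0 s
print(latency, speed)
```

Separating the two metrics this way explains why a provider can combine low latency with low output speed, or the reverse, in the table.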











