Artificial Analysis LLM Performance Leaderboard
Independent performance benchmarks & pricing across LLM API providers. Definitions are below the table.
For further analysis and methodology, see artificialanalysis.ai.
Model | Context window | Quality Index | Price (USD/1M tokens, blended) | Output tokens/s | Latency (s) | |
---|---|---|---|---|---|---|---|
o1-preview | 128k | 85 | $26.25 | 142.5 | 20.35 | ||
o1-mini | 128k | 82 | $5.25 | 219.6 | 10.69 | ||
GPT-4o (Aug '24) | 128k | 78 | $4.38 | 70.0 | 0.44 | ||
GPT-4o (Aug '24) | 128k | 78 | $4.38 | 53.3 | 0.98 | ||
GPT-4o (May '24) | 128k | 78 | $7.50 | 71.2 | 0.43 | ||
GPT-4o (May '24) | 128k | 78 | $7.50 | 106.4 | 0.74 | ||
GPT-4o mini | 128k | 73 | $0.26 | 76.3 | 0.40 | ||
GPT-4o mini | 128k | 73 | $0.26 | 157.0 | 0.77 | ||
GPT-4o (Nov '24) | 128k | 73 | $4.38 | 121.7 | 0.37 | ||
GPT-4o (Nov '24) | 128k | 73 | $4.38 | 0.0 | 1.04 | ||
Llama 3.3 70B | 33k | 74 | $0.94 | 2,160.5 | 0.25 | ||
Llama 3.3 70B | 128k | 74 | $0.40 | 23.6 | 0.54 | ||
Llama 3.3 70B | 128k | 74 | $0.71 | 31.0 | 0.95 | ||
Llama 3.3 70B Fast | 128k | 74 | $0.38 | 70.7 | 0.59 | ||
Llama 3.3 70B Base | 128k | 74 | $0.20 | 47.9 | 0.63 | ||
Llama 3.3 70B | 128k | 74 | $0.71 | 19.0 | 0.47 | ||
Llama 3.3 70B | 128k | 74 | $0.90 | 111.0 | 0.51 | ||
Llama 3.3 70B (Turbo, FP8) | 128k | 74 | $0.20 | 30.3 | 0.28 | ||
Llama 3.3 70B | 128k | 74 | $0.27 | 29.5 | 0.35 | ||
Llama 3.3 70B (Spec decoding) | 8k | 74 | $0.69 | 1,848.5 | 0.34 | ||
Llama 3.3 70B | 128k | 74 | $0.64 | 275.4 | 0.24 | ||
Llama 3.3 70B | 4k | 74 | $0.75 | 364.3 | 0.44 | ||
Llama 3.3 70B | 128k | 75 | $0.88 | 167.4 | 0.46 | ||
Llama 3.1 405B | 128k | 74 | $9.50 | 18.8 | 0.40 | ||
Llama 3.1 405B | 128k | 75 | $4.00 | 11.5 | 0.79 | ||
Llama 3.1 405B Standard | 128k | 74 | $2.40 | 30.2 | 1.96 | ||
Llama 3.1 405B Latency Optimized | 128k | 74 | $3.00 | 65.0 | 0.80 | ||
Llama 3.1 405B Base | 128k | 74 | $1.50 | 34.0 | 0.73 | ||
Llama 3.1 405B Vertex | 128k | 74 | $7.75 | 29.8 | 0.41 | ||
Llama 3.1 405B | 128k | 74 | $8.00 | 19.4 | 0.54 | ||
Llama 3.1 405B | 128k | 73 | $3.00 | 72.2 | 0.72 | ||
Llama 3.1 405B | 33k | 73 | $0.90 | 21.1 | 0.44 | ||
Llama 3.1 405B | 8k | 74 | $6.25 | 168.5 | 0.75 | ||
Llama 3.1 405B | 128k | 72 | $7.50 | 30.5 | 0.68 | ||
Llama 3.1 405B Turbo | 128k | 74 | $3.50 | 73.4 | 0.85 | ||
Llama 3.1 70B | 33k | 68 | $0.60 | 2,261.5 | 0.24 | ||
Llama 3.1 70B | 128k | 69 | $0.40 | 27.0 | 0.69 | ||
Llama 3.1 70B Standard | 128k | 68 | $0.72 | 31.4 | 0.71 | ||
Llama 3.1 70B Latency Optimized | 128k | 68 | $0.90 | 133.6 | 0.41 | ||
Llama 3.1 70B Base | 128k | 66 | $0.20 | 45.0 | 0.65 | ||
Llama 3.1 70B Fast | 128k | 66 | $0.38 | 72.6 | 0.58 | ||
Llama 3.1 70B Vertex | 128k | 68 | $0.00 | 71.6 | 0.28 | ||
Llama 3.1 70B | 128k | 68 | $2.90 | 30.9 | 0.57 | ||
Llama 3.1 70B | 128k | 67 | $0.90 | 134.2 | 0.41 | ||
Llama 3.1 70B (Turbo, FP8) | 128k | 66 | $0.20 | 42.5 | 0.28 | ||
Llama 3.1 70B | 128k | 66 | $0.27 | 38.8 | 0.29 | ||
Llama 3.1 70B | 128k | 68 | $0.60 | 228.7 | 0.44 | ||
Llama 3.1 70B | 128k | 54 | $0.64 | 275.4 | 0.25 | ||
Llama 3.1 70B (Spec decoding) | 8k | 54 | $0.69 | 1,849.9 | 0.34 | ||
Llama 3.1 70B | 64k | 65 | $0.75 | 370.5 | 0.43 | ||
Llama 3.1 70B | 128k | 43 | $1.50 | 60.8 | 0.56 | ||
Llama 3.1 70B | 128k | 68 | $1.00 | 53.2 | 0.32 | ||
Llama 3.1 70B Turbo | 128k | 68 | $0.88 | 228.2 | 0.35 | ||
Llama 3.1 70B | 128k | 62 | $0.90 | 126.0 | 0.35 | ||
Llama 3.2 90B (Vision) | 128k | 67 | $0.72 | 28.9 | 0.52 | ||
Llama 3.2 90B (Vision) Vertex | 128k | 68 | $0.00 | 33.7 | 0.20 | ||
Llama 3.2 90B (Vision) | 128k | 66 | $0.90 | 67.2 | 0.32 | ||
Llama 3.2 90B (Vision) | 33k | 68 | $0.36 | 38.7 | 0.28 | ||
Llama 3.2 90B (Vision) | 8k | 67 | $0.90 | 266.9 | 0.38 | ||
Llama 3.2 90B (Vision) Turbo | 128k | 66 | $1.20 | 55.6 | 0.29 | ||
Llama 3.2 11B (Vision) | 128k | 53 | $0.16 | 132.2 | 0.36 | ||
Llama 3.2 11B (Vision) | 128k | 54 | $0.20 | 121.6 | 0.25 | ||
Llama 3.2 11B (Vision) | 128k | 54 | $0.06 | 53.6 | 0.21 | ||
Llama 3.2 11B (Vision) | 8k | 53 | $0.18 | 750.1 | 0.27 | ||
Llama 3.2 11B (Vision) Turbo | 128k | 54 | $0.18 | 159.7 | 0.24 | ||
Llama 3.1 8B | 33k | 54 | $0.10 | 2,183.6 | 0.26 | ||
Llama 3.1 8B | 128k | 53 | $0.10 | 119.8 | 0.46 | ||
Llama 3.1 8B | 128k | 54 | $0.22 | 90.8 | 0.39 | ||
Llama 3.1 8B Fast | 128k | 54 | $0.04 | 185.2 | 0.50 | ||
Llama 3.1 8B Base | 128k | 54 | $0.03 | 42.9 | 0.64 | ||
Llama 3.1 8B Vertex | 128k | 54 | $0.00 | 119.9 | 0.18 | ||
Llama 3.1 8B | 128k | 54 | $0.38 | 161.5 | 0.31 | ||
Llama 3.1 8B | 128k | 53 | $0.20 | 196.4 | 0.26 | ||
Llama 3.1 8B | 128k | 54 | $0.04 | 55.6 | 0.23 | ||
Llama 3.1 8B | 128k | 54 | $0.10 | 539.9 | 0.41 | ||
Llama 3.1 8B | 128k | 53 | $0.06 | 750.0 | 0.28 | ||
Llama 3.1 8B | 16k | 52 | $0.13 | 1,008.5 | 0.28 | ||
Llama 3.1 8B | 128k | 54 | $0.20 | 155.6 | 0.31 | ||
Llama 3.1 8B Turbo | 128k | 53 | $0.18 | 291.1 | 0.24 | ||
Llama 3.1 8B | 128k | 52 | $0.15 | 463.9 | 0.29 | ||
Llama 3.2 3B | 128k | 48 | $0.10 | 198.6 | 0.44 | ||
Llama 3.2 3B | 128k | 49 | $0.15 | 143.6 | 0.41 | ||
Llama 3.2 3B Base | 128k | 49 | $0.01 | 120.8 | 0.52 | ||
Llama 3.2 3B | 128k | 50 | $0.10 | 272.0 | 0.28 | ||
Llama 3.2 3B | 128k | 49 | $0.02 | 155.0 | 0.17 | ||
Llama 3.2 3B | 8k | 49 | $0.06 | 1,615.1 | 0.36 | ||
Llama 3.2 3B | 4k | 49 | $0.10 | 1,362.1 | 0.23 | ||
Llama 3.2 3B Turbo | 128k | 49 | $0.06 | 45.7 | 0.68 | ||
Llama 3.2 1B | 128k | 26 | $0.10 | 313.6 | 0.35 | ||
Llama 3.2 1B Base | 128k | 26 | $0.01 | 254.7 | 0.50 | ||
Llama 3.2 1B | 128k | 26 | $0.01 | 183.9 | 0.23 | ||
Llama 3.2 1B | 8k | 26 | $0.04 | 3,332.0 | 0.50 | ||
Llama 3.2 1B | 4k | 26 | $0.05 | 2,052.6 | 0.26 | ||
Gemini 2.0 Flash (exp) (AI Studio) | 1m | 82 | $0.00 | 169.0 | 0.47 | ||
Gemini 1.5 Pro (Sep) (Vertex) | 2m | 80 | $2.19 | 58.2 | 0.41 | ||
Gemini 1.5 Pro (Sep) (AI Studio) | 2m | 80 | $2.19 | 63.5 | 0.76 | ||
Gemini 1.5 Flash (Sep) (Vertex) | 1m | 74 | $0.13 | 189.6 | 0.22 | ||
Gemini 1.5 Flash (Sep) (AI Studio) | 1m | 74 | $0.13 | 181.9 | 0.40 | ||
Gemma 2 27B | 8k | 61 | $0.80 | 59.3 | 0.40 | ||
Gemma 2 9B Fast | 8k | 55 | $0.04 | 183.7 | 0.49 | ||
Gemma 2 9B Base | 8k | 55 | $0.03 | 168.9 | 0.52 | ||
Gemma 2 9B | 8k | 54 | $0.04 | 56.0 | 0.32 | ||
Gemma 2 9B | 8k | 55 | $0.20 | 650.5 | 0.24 | ||
Gemma 2 9B | 8k | 55 | $0.30 | 130.1 | 0.28 | ||
Gemini 1.5 Flash-8B AI Studio | 1m | 47 | $0.07 | 279.8 | 0.36 | ||
Gemini 1.5 Pro (May) (Vertex) | 2m | 72 | $2.19 | 65.6 | 0.41 | ||
Gemini 1.5 Pro (May) (AI Studio) | 2m | 72 | $2.19 | 67.1 | 0.76 | ||
Gemini 1.5 Flash (May) (Vertex) | 1m | | $0.13 | 301.1 | 0.29 | ||
Gemini 1.5 Flash (May) (AI Studio) | 1m | | $0.13 | 311.7 | 0.29 | ||
Gemini Experimental (Nov) (AI Studio) | 2m | | $0.00 | 54.6 | 1.27 | ||
Claude 3.5 Sonnet (Oct) | 200k | 80 | $6.00 | 42.3 | 1.05 | ||
Claude 3.5 Sonnet (Oct) Vertex | 200k | 80 | $6.00 | 73.1 | 0.74 | ||
Claude 3.5 Sonnet (Oct) | 200k | 80 | $6.00 | 85.7 | 1.31 | ||
Claude 3.5 Sonnet (June) | 200k | 76 | $6.00 | 45.4 | 1.02 | ||
Claude 3.5 Sonnet (June) Vertex | 200k | 76 | $6.00 | 61.3 | 0.73 | ||
Claude 3.5 Sonnet (June) | 200k | 76 | $6.00 | 86.6 | 0.83 | ||
Claude 3 Opus | 200k | 70 | $30.00 | 23.4 | 1.51 | ||
Claude 3 Opus Vertex | 200k | 70 | $30.00 | 27.4 | 3.18 | ||
Claude 3 Opus | 200k | 70 | $30.00 | 27.5 | 2.11 | ||
Claude 3.5 Haiku Standard | 200k | 68 | $1.60 | 55.9 | 0.77 | ||
Claude 3.5 Haiku Latency Optimized | 200k | 68 | $2.00 | 100.9 | 0.58 | ||
Claude 3.5 Haiku Vertex | 200k | 68 | $1.60 | 64.9 | 0.95 | ||
Claude 3.5 Haiku | 200k | 68 | $1.60 | 64.8 | 0.78 | ||
Claude 3 Haiku | 200k | 55 | $0.50 | 108.9 | 0.78 | ||
Claude 3 Haiku | 200k | 55 | $0.50 | 136.3 | 0.45 | ||
Pixtral Large | 128k | 74 | $3.00 | 40.5 | 0.51 | ||
Mistral Large 2 (Jul '24) | 128k | 74 | $3.00 | 31.3 | 0.61 | ||
Mistral Large 2 (Jul '24) | 128k | 74 | $3.00 | 34.4 | 0.47 | ||
Mistral Large 2 (Jul '24) | 128k | 74 | $3.00 | 28.9 | 0.54 | ||
Mistral Large 2 (Nov '24) | 128k | 74 | $3.00 | 45.7 | 0.50 | ||
Mistral Large 2 (Nov '24) | 128k | 74 | $3.00 | 35.5 | 0.54 | ||
Mistral Small (Sep '24) | 33k | 61 | $0.30 | 63.4 | 0.43 | ||
Mixtral 8x22B | 65k | 62 | $3.00 | 81.2 | 0.42 | ||
Mixtral 8x22B Base | 65k | 60 | $0.60 | 87.8 | 0.60 | ||
Mixtral 8x22B Fast | 65k | 60 | $1.05 | 101.3 | 0.62 | ||
Mixtral 8x22B | 65k | 61 | $1.20 | 73.1 | 0.34 | ||
Mixtral 8x22B | 65k | 55 | $1.20 | 65.1 | 0.42 | ||
Pixtral 12B | 128k | 56 | $0.15 | 68.0 | 0.42 | ||
Pixtral 12B | 128k | 57 | $0.10 | 72.5 | 0.45 | ||
Ministral 8B | 128k | 56 | $0.10 | 138.6 | 0.37 | ||
Mistral NeMo | 128k | 53 | $0.15 | 119.4 | 0.41 | ||
Mistral NeMo Fast | 128k | 53 | $0.12 | 158.8 | 0.51 | ||
Mistral NeMo Base | 128k | 53 | $0.06 | 52.7 | 0.60 | ||
Mistral NeMo | 128k | 54 | $0.06 | 73.1 | 0.23 | ||
Ministral 3B | 128k | 53 | $0.04 | 169.4 | 0.38 | ||
Mixtral 8x7B | 33k | 42 | $0.70 | 99.1 | 0.38 | ||
Mixtral 8x7B | 33k | 41 | $0.51 | 73.5 | 0.35 | ||
Mixtral 8x7B Fast | 33k | 41 | $0.23 | 162.0 | 0.52 | ||
Mixtral 8x7B Base | 33k | 41 | $0.12 | 135.4 | 0.51 | ||
Mixtral 8x7B | 33k | 43 | $0.50 | 141.7 | 0.27 | ||
Mixtral 8x7B | 33k | 40 | $0.24 | 93.9 | 0.21 | ||
Mixtral 8x7B | 33k | 42 | $0.24 | 556.4 | 0.28 | ||
Mixtral 8x7B | 33k | 42 | $0.63 | 90.7 | 0.37 | ||
Mixtral 8x7B | 33k | 35 | $0.60 | 93.5 | 0.25 | ||
Codestral-Mamba | 256k | 33 | $0.25 | 94.2 | 0.56 | ||
Command-R+ | 128k | 55 | $6.00 | 47.4 | 0.52 | ||
Command-R+ | 128k | 55 | $4.38 | 75.2 | 0.24 | ||
Command-R+ (Apr '24) | 128k | 45 | $6.00 | 47.1 | 0.52 | ||
Command-R+ (Apr '24) | 128k | 47 | $6.00 | 74.3 | 0.26 | ||
Command-R+ (Apr '24) | 128k | 44 | $6.00 | 47.7 | 0.59 | ||
Command-R (Mar '24) | 128k | 36 | $0.75 | 108.3 | 0.36 | ||
Command-R (Mar '24) | 128k | 37 | $0.75 | 173.5 | 0.16 | ||
Command-R (Mar '24) | 128k | 36 | $0.75 | 77.3 | 0.46 | ||
Aya Expanse 32B | 128k | | $0.75 | 120.9 | 0.17 | ||
Aya Expanse 8B | 8k | | $0.75 | 165.9 | 0.15 | ||
Command-R | 128k | | $0.75 | 108.4 | 0.36 | ||
Command-R | 128k | 51 | $0.26 | 117.2 | 0.17 | ||
Sonar 3.1 Small | 127k | | $0.20 | 203.3 | 0.30 | ||
Sonar 3.1 Large | 127k | | $1.00 | 56.0 | 0.30 | ||
Grok Beta | 128k | 72 | $7.50 | 67.0 | 0.36 | ||
Nova Pro | 300k | 75 | $1.40 | 88.9 | 0.37 | ||
Nova Lite | 300k | 70 | $0.10 | 142.9 | 0.32 | ||
Nova Micro | 130k | 66 | $0.06 | 194.2 | 0.32 | ||
Phi-4 | 16k | 77 | $0.09 | 82.4 | 0.22 | ||
Phi-3 Medium 14B | 128k | | $0.30 | 36.4 | 0.43 | ||
DBRX | 33k | 50 | $1.13 | 68.3 | 0.47 | ||
DBRX | 33k | 44 | $1.20 | 82.9 | 0.31 | ||
Llama 3.1 Nemotron 70B Base | 128k | 72 | $0.20 | 48.1 | 0.60 | ||
Llama 3.1 Nemotron 70B Fast | 128k | 72 | $0.38 | 69.9 | 0.60 | ||
Llama 3.1 Nemotron 70B | 128k | 72 | $0.27 | 29.6 | 0.31 | ||
Jamba 1.5 Large | 256k | 64 | $3.50 | 52.8 | 0.56 | ||
Jamba 1.5 Large | 256k | 64 | $3.50 | 50.5 | 0.69 | ||
Jamba 1.5 Mini | 256k | 46 | $0.25 | 181.5 | 0.34 | ||
Jamba 1.5 Mini | 256k | | $0.25 | 81.4 | 0.49 | ||
DeepSeek V3 | 66k | 79 | $0.48 | 54.6 | 1.06 | ||
DeepSeek V3 (FP8) | 128k | 79 | $0.25 | 12.9 | 1.03 | ||
DeepSeek V3 | 128k | 79 | $0.90 | 21.4 | 1.01 | ||
DeepSeek V3 | 32k | 79 | $1.25 | 10.3 | 0.62 | ||
DeepSeek V3 (FP8) | 128k | 79 | $1.25 | 21.9 | 0.76 | ||
DeepSeek-V2.5 (Dec '24) | 64k | 72 | $0.17 | 56.1 | 1.08 | ||
DeepSeek-Coder-V2 | 128k | 71 | $0.17 | 54.9 | 1.03 | ||
DeepSeek-V2.5 | 128k | | $2.00 | 7.8 | 0.77 | ||
Qwen2.5 72B | 131k | 77 | $0.40 | 34.4 | 0.56 | ||
Qwen2.5 72B | 131k | 77 | $0.20 | 45.8 | 0.62 | ||
Qwen2.5 72B Fast | 131k | 77 | $0.38 | 68.3 | 0.55 | ||
Qwen2.5 72B | 131k | 77 | $0.90 | 79.4 | 0.36 | ||
Qwen2.5 72B | 33k | 78 | $0.27 | 34.7 | 0.30 | ||
Qwen2.5 72B | 8k | 77 | $2.50 | 225.5 | 0.62 | ||
Qwen2.5 72B | 131k | 77 | $1.20 | 87.3 | 0.40 | ||
Qwen2.5 Coder 32B | 131k | 72 | $0.20 | 37.4 | 0.47 | ||
Qwen2.5 Coder 32B | 33k | 72 | $0.90 | 95.9 | 0.33 | ||
Qwen2.5 Coder 32B | 33k | 71 | $0.10 | 49.3 | 0.25 | ||
Qwen2.5 Coder 32B | 8k | 72 | $1.88 | 310.4 | 0.33 | ||
Qwen2.5 Coder 32B | 131k | 72 | $0.80 | 82.0 | 0.51 | ||
Qwen2 72B | 33k | 69 | $0.90 | 64.5 | 0.34 | ||
QwQ 32B-Preview | 33k | | $0.20 | 35.4 | 0.50 | ||
QwQ 32B-Preview | 33k | | $0.90 | 105.1 | 0.35 | ||
QwQ 32B-Preview | 33k | | $0.26 | 59.3 | 0.26 | ||
QwQ 32B-Preview | 8k | | $0.13 | 299.6 | 0.54 | ||
QwQ 32B-Preview | 33k | | $1.20 | 58.9 | 0.52 | ||
Yi-Large | 32k | 61 | $3.00 | 68.6 | 0.43 | ||
GPT-4 Turbo | 128k | 75 | $15.00 | 35.7 | 0.70 | ||
GPT-4 Turbo | 128k | 75 | $15.00 | 40.4 | 1.47 | ||
GPT-4 | 8k | | $37.50 | 27.3 | 0.75 | ||
Llama 3 70B | 8k | 47 | $1.18 | 47.0 | 0.36 | ||
Llama 3 70B | 8k | 62 | $0.40 | 32.8 | 0.61 | ||
Llama 3 70B | 8k | 47 | $2.86 | 48.2 | 0.45 | ||
Llama 3 70B | 8k | 46 | $0.90 | 126.3 | 0.33 | ||
Llama 3 70B | 8k | 48 | $0.27 | 42.6 | 0.28 | ||
Llama 3 70B | 8k | 48 | $0.64 | 347.3 | 0.27 | ||
Llama 3 70B (Reference, FP16) | 8k | 48 | $0.90 | 164.9 | 0.37 | ||
Llama 3 70B (Turbo, FP8) | 8k | 48 | $0.88 | 23.9 | 0.36 | ||
Llama 3 8B | 8k | 44 | $0.10 | 65.9 | 0.37 | ||
Llama 3 8B | 8k | 45 | $0.38 | 102.7 | 0.33 | ||
Llama 3 8B | 8k | 45 | $0.38 | 73.6 | 0.38 | ||
Llama 3 8B | 8k | 45 | $0.20 | 187.7 | 0.27 | ||
Llama 3 8B | 8k | 45 | $0.04 | 110.9 | 0.20 | ||
Llama 3 8B | 8k | 45 | $0.06 | 1,201.6 | 0.34 | ||
Llama 3 8B | 8k | 46 | $0.20 | 213.9 | 0.30 | ||
Llama 2 Chat 7B | 4k | | $0.10 | 124.0 | 0.36 | ||
Gemini 1.0 Pro (AI Studio) | 33k | | $0.75 | 102.7 | 1.23 | ||
Claude 3 Sonnet | 200k | 57 | $6.00 | 51.5 | 0.81 | ||
Claude 3 Sonnet | 200k | 57 | $6.00 | 85.5 | 0.80 | ||
Claude 2.1 | 200k | | $12.00 | 29.0 | 1.81 | ||
Claude 2.1 | 200k | | $12.00 | 13.3 | 0.81 | ||
Claude 2.0 | 100k | | $12.00 | 29.6 | 0.81 | ||
Mistral Small (Feb '24) | 33k | 59 | $1.50 | 61.0 | 0.46 | ||
Mistral Small (Feb '24) | 33k | 59 | $1.50 | 51.9 | 0.39 | ||
Mistral Large (Feb '24) | 33k | 57 | $6.00 | 36.7 | 0.48 | ||
Mistral Large (Feb '24) | 33k | 56 | $6.00 | 43.5 | 0.41 | ||
Mistral Large (Feb '24) | 33k | 55 | $6.00 | 29.5 | 0.52 | ||
Mistral 7B | 8k | 24 | $0.25 | 130.7 | 0.40 | ||
Mistral 7B | 8k | 28 | $0.16 | 93.1 | 0.33 | ||
Mistral 7B | 8k | 28 | $0.04 | 109.4 | 0.20 | ||
Mistral 7B | 8k | 28 | $0.20 | 176.5 | 0.22 | ||
Codestral (May '24) | 33k | | $0.30 | 84.2 | 0.44 | ||
Mistral Medium | 33k | | $4.09 | 43.8 | 0.48 | ||
OpenChat 3.5 | 8k | 44 | $0.06 | 71.4 | 0.29 | ||
Jamba Instruct | 256k | | $0.55 | 183.9 | 0.35 | ||
Jamba Instruct | 256k | 28 | $0.55 | 74.1 | 0.52 |
Key definitions
Artificial Analysis Quality Index: Average result across our evaluations covering different dimensions of model intelligence. Currently includes MMLU, GPQA, Math & HumanEval. OpenAI o1 model figures are preliminary and are based on figures stated by OpenAI. See methodology for more details.
Context window: Maximum number of combined input & output tokens. Output tokens commonly have a significantly lower limit (varies by model).
Output Speed: Tokens per second received while the model is generating tokens (i.e. after the first chunk has been received from the API, for models that support streaming).
Latency: Time to first token received, in seconds, after the API request is sent. For models that do not support streaming, this represents the time to receive the full completion.
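The two streaming metrics above can be sketched in code. The snippet below is a minimal simulation, not a measurement against a real API: `fake_stream` is a hypothetical stand-in for a provider's streaming iterator, and token counts are taken per chunk as the definitions describe.

```python
import time

def measure_stream(stream):
    """Return (latency, output_speed) for an iterable of token batches.

    Latency: seconds from the start of iteration until the first chunk.
    Output speed: tokens received *after* the first chunk, divided by
    the time elapsed since the first chunk arrived.
    """
    start = time.perf_counter()
    first_at = None
    first_len = total = 0
    for batch in stream:
        now = time.perf_counter()
        if first_at is None:
            first_at, first_len = now, len(batch)
        total += len(batch)
    if first_at is None:  # empty stream: nothing to measure
        return float("nan"), 0.0
    elapsed = time.perf_counter() - first_at
    ttft = first_at - start
    speed = (total - first_len) / elapsed if elapsed > 0 else 0.0
    return ttft, speed

def fake_stream(n_batches=5, batch=8, delay=0.01):
    """Hypothetical stand-in for a streaming API response."""
    for _ in range(n_batches):
        time.sleep(delay)
        yield ["tok"] * batch

ttft, speed = measure_stream(fake_stream())
```

In practice the "stream" would be the chunk iterator returned by a provider's streaming endpoint, with each chunk's token count taken from the response rather than `len(batch)`.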
Price: Price per token, in USD per million tokens. Price is a blend of input and output token prices at a 3:1 input-to-output ratio.
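The 3:1 blend can be reproduced from a provider's published per-token prices. A minimal sketch; the GPT-4o mini figures of $0.15/1M input and $0.60/1M output are assumed from OpenAI's public pricing:

```python
def blended_price(input_price: float, output_price: float) -> float:
    """Blend per-million-token input and output prices at a
    3:1 input-to-output token ratio."""
    return (3 * input_price + output_price) / 4

# Assuming GPT-4o mini's published $0.15/1M input and $0.60/1M output:
print(blended_price(0.15, 0.60))  # 0.2625, i.e. the $0.26 in the table
```

The same weighting recovers other rows, e.g. o1-preview at $15/1M input and $60/1M output blends to $26.25.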
Output Price: Price per token generated by the model (received from the API), in USD per million tokens.
Input Price: Price per token included in the request/message sent to the API, in USD per million tokens.
Time period: Metrics are 'live' and are based on the past 14 days of measurements; measurements are taken 8 times per day for single requests and 2 times per day for parallel requests.