LLM API Providers Leaderboard

Comparison and ranking of API providers for LLMs across key metrics, including price, performance (throughput and latency), context window, and others. For more details, including our methodology, see our FAQs.

API providers compared: OpenAI, Mistral, Microsoft Azure, Amazon Bedrock, Groq, Together.ai, Anthropic, Perplexity, Google, Fireworks, Baseten, Cohere, Lepton AI, Speechmatics, Deepinfra, Replicate, NVIDIA NGC (Demo), Runpod, Rev AI, fal.ai, AssemblyAI, Deepgram, Gladia, Stability.ai, Midjourney, Databricks, and OctoAI.

| Provider | Model creator | Model | Context window (tokens) | Quality (index) | Price (USD per 1M tokens) | Throughput (tokens/s) | Latency (s) |
|---|---|---|---|---|---|---|---|
| OpenAI | OpenAI | GPT-4 | 8k | 90 | $37.50 | 19.6 | 0.76 |
| Microsoft Azure | OpenAI | GPT-4 | 8k | 90 | $37.50 | 16.3 | 0.43 |
| OpenAI | OpenAI | GPT-4 Turbo | 128k | 100 | $15.00 | 21.0 | 0.79 |
| Microsoft Azure | OpenAI | GPT-4 Turbo | 128k | 100 | $15.00 | 12.8 | 0.50 |
| OpenAI | OpenAI | GPT-4 Vision | 128k | 100 | $15.00 | 32.6 | 0.55 |
| OpenAI | OpenAI | GPT-3.5 Turbo | 16k | 67 | $0.75 | 56.2 | 0.43 |
| Microsoft Azure | OpenAI | GPT-3.5 Turbo | 16k | 67 | $0.75 | 53.1 | 0.30 |
| OpenAI | OpenAI | GPT-3.5 Turbo Instruct | 4k | 60 | $1.63 | 69.3 | 0.33 |
| Microsoft Azure | OpenAI | GPT-3.5 Turbo Instruct | 4k | 60 | $1.63 | 145.4 | 0.60 |
| Replicate | Meta | Llama 3 (70B) | 8k | 88 | $1.18 | 43.9 | 2.33 |
| OctoAI | Meta | Llama 3 (70B) | 8k | 88 | $0.93 | 36.0 | 0.28 |
| Fireworks | Meta | Llama 3 (70B) | 8k | 88 | $0.90 | 142.9 | 0.23 |
| Deepinfra | Meta | Llama 3 (70B) | 8k | 88 | $0.64 | 32.1 | 0.70 |
| Groq | Meta | Llama 3 (70B) | 8k | 88 | $0.64 | 303.3 | 0.28 |
| Perplexity | Meta | Llama 3 (70B) | 8k | 88 | $1.00 | 40.0 | 0.26 |
| Together.ai | Meta | Llama 3 (70B) | 8k | 88 | $0.90 | 98.2 | 0.52 |
| Replicate | Meta | Llama 2 Chat (70B) | 4k | 56 | $1.18 | 56.4 | 1.95 |
| Replicate | Meta | Llama 2 Chat (13B) | 4k | 37 | $0.20 | 119.8 | 1.97 |
| Replicate | Meta | Llama 3 (8B) | 8k | 58 | $0.10 | 245.0 | 1.40 |
| Amazon Bedrock | Meta | Llama 2 Chat (70B) | 4k | 56 | $2.10 | 40.5 | 0.42 |
| Amazon Bedrock | Meta | Llama 2 Chat (13B) | 4k | 37 | $0.81 | 52.5 | 0.30 |
| OctoAI | Meta | Llama 2 Chat (70B) | 4k | 56 | $0.93 | 24.9 | 0.28 |
| OctoAI | Meta | Llama 2 Chat (13B) | 4k | 37 | $0.28 | 50.1 | 0.26 |
| OctoAI | Meta | Llama 3 (8B) | 8k | 58 | $0.14 | 102.2 | 0.32 |
| Microsoft Azure | Meta | Llama 2 Chat (70B) | 4k | 56 | $1.60 | 16.8 | 2.75 |
| Microsoft Azure | Meta | Llama 2 Chat (13B) | 4k | 37 | $0.84 | 42.6 | 1.21 |
| Fireworks | Meta | Llama 2 Chat (70B) | 4k | 56 | $0.90 | 73.0 | 0.31 |
| Fireworks | Meta | Llama 2 Chat (13B) | 4k | 37 | $0.20 | 140.4 | 0.26 |
| Fireworks | Meta | Llama 3 (8B) | 8k | 58 | $0.20 | 303.7 | 0.25 |
| Deepinfra | Meta | Llama 2 Chat (70B) | 4k | 56 | $0.76 | 74.1 | 0.56 |
| Deepinfra | Meta | Llama 2 Chat (13B) | 4k | 37 | $0.35 | 41.5 | 0.64 |
| Deepinfra | Meta | Llama 3 (8B) | 8k | 58 | $0.10 | 114.8 | 0.68 |
| Groq | Meta | Llama 2 Chat (70B) | 4k | 56 | $0.68 | 253.5 | 0.37 |
| Groq | Meta | Llama 3 (8B) | 8k | 58 | $0.06 | 893.9 | 0.31 |
| Perplexity | Meta | Llama 2 Chat (70B) | 4k | 56 | $1.00 | - | - |
| Perplexity | Meta | Llama 3 (8B) | 8k | 58 | $0.20 | 121.5 | 0.21 |
| Together.ai | Meta | Llama 2 Chat (70B) | 4k | 56 | $0.90 | 42.7 | 0.51 |
| Together.ai | Meta | Llama 2 Chat (13B) | 4k | 37 | $0.23 | 48.8 | 0.30 |
| Together.ai | Meta | Llama 3 (8B) | 8k | 58 | $0.20 | 259.7 | 0.29 |
| Replicate | Meta | Llama 2 Chat (7B) | 4k | 27 | $0.10 | 234.5 | 1.62 |
| Microsoft Azure | Meta | Llama 2 Chat (7B) | 4k | 27 | $0.56 | 69.2 | 0.84 |
| Fireworks | Meta | Llama 2 Chat (7B) | 4k | 27 | $0.20 | 189.1 | 0.26 |
| Deepinfra | Meta | Llama 2 Chat (7B) | 4k | 27 | $0.20 | 25.2 | 0.68 |
| Together.ai | Meta | Llama 2 Chat (7B) | 4k | 27 | $0.20 | 89.8 | 0.29 |
| Fireworks | Meta | Code Llama (70B) | 4k | 58 | $0.90 | 31.2 | 0.30 |
| Deepinfra | Meta | Code Llama (70B) | 4k | 58 | $0.75 | 31.6 | 0.64 |
| Perplexity | Meta | Code Llama (70B) | 16k | 58 | $1.00 | 51.9 | 0.24 |
| Together.ai | Meta | Code Llama (70B) | 4k | 58 | $0.90 | 29.0 | 0.35 |
| Mistral | Mistral | Mistral Large | 33k | 84 | $12.00 | 24.9 | 0.20 |
| Amazon Bedrock | Mistral | Mistral Large | 33k | 84 | $12.00 | 31.1 | 0.36 |
| Mistral | Mistral | Mistral Medium | 33k | 76 | $4.05 | 21.2 | 0.21 |
| Mistral | Mistral | Mixtral 8x22B | 65k | 83 | $3.00 | 75.9 | 0.21 |
| OctoAI | Mistral | Mixtral 8x22B | 65k | 83 | $1.20 | 42.8 | 0.26 |
| Fireworks | Mistral | Mixtral 8x22B | 65k | 83 | $1.20 | 78.7 | 0.24 |
| Deepinfra | Mistral | Mixtral 8x22B | 65k | 83 | $0.65 | 46.6 | 0.68 |
| Perplexity | Mistral | Mixtral 8x22B | 16k | 83 | $1.00 | 61.0 | 0.23 |
| Together.ai | Mistral | Mixtral 8x22B | 65k | 83 | $1.20 | 45.6 | 0.82 |
| Mistral | Mistral | Mixtral 8x7B | 33k | 68 | $0.70 | 91.4 | 0.21 |
| Replicate | Mistral | Mixtral 8x7B | 33k | 68 | $0.47 | 143.9 | 1.48 |
| Amazon Bedrock | Mistral | Mixtral 8x7B | 33k | 68 | $0.51 | 68.8 | 0.31 |
| OctoAI | Mistral | Mixtral 8x7B | 33k | 68 | $0.35 | 47.0 | 0.38 |
| Lepton AI | Mistral | Mixtral 8x7B | 33k | 68 | $0.50 | 97.7 | 0.30 |
| Fireworks | Mistral | Mixtral 8x7B | 33k | 68 | $0.50 | 249.7 | 0.21 |
| Deepinfra | Mistral | Mixtral 8x7B | 33k | 68 | $0.27 | 57.9 | 0.66 |
| Groq | Mistral | Mixtral 8x7B | 33k | 68 | $0.27 | 475.0 | 0.26 |
| Perplexity | Mistral | Mixtral 8x7B | 16k | 68 | $0.60 | 117.8 | 0.21 |
| Together.ai | Mistral | Mixtral 8x7B | 33k | 68 | $0.60 | 115.9 | 0.40 |
| Mistral | Mistral | Mistral Small | 33k | 73 | $3.00 | 55.7 | 0.21 |
| Mistral | Mistral | Mistral 7B | 33k | 40 | $0.25 | 64.0 | 0.20 |
| Replicate | Mistral | Mistral 7B | 33k | 40 | $0.10 | 101.1 | 1.65 |
| Amazon Bedrock | Mistral | Mistral 7B | 33k | 40 | $0.16 | 94.7 | 0.28 |
| OctoAI | Mistral | Mistral 7B | 33k | 40 | $0.14 | 75.3 | 0.31 |
| Fireworks | Mistral | Mistral 7B | 33k | 40 | $0.20 | 242.7 | 0.18 |
| Deepinfra | Mistral | Mistral 7B | 33k | 40 | $0.13 | 59.2 | 0.58 |
| Perplexity | Mistral | Mistral 7B | 16k | 40 | $0.20 | 96.6 | 0.21 |
| Together.ai | Mistral | Mistral 7B | 8k | 40 | $0.20 | 77.2 | 0.43 |
| Baseten | Mistral | Mistral 7B | 4k | 40 | $0.20 | 230.9 | 0.11 |
| Google | Google | Gemini 1.5 Pro | 1000k | 88 | $10.50 | 43.3 | 1.29 |
| Google | Google | Gemini 1.0 Pro | 33k | 66 | $0.75 | 78.8 | 1.45 |
| Fireworks | Google | Gemma 7B | 8k | 59 | $0.20 | 207.9 | 0.25 |
| Deepinfra | Google | Gemma 7B | 8k | 59 | $0.13 | 61.3 | 0.62 |
| Groq | Google | Gemma 7B | 8k | 59 | $0.10 | 918.8 | 0.29 |
| Together.ai | Google | Gemma 7B | 8k | 59 | $0.20 | 116.4 | 0.31 |
| Amazon Bedrock | Anthropic | Claude 3 Opus | 200k | 100 | $30.00 | 26.1 | 0.93 |
| Anthropic | Anthropic | Claude 3 Opus | 200k | 100 | $30.00 | 25.9 | 1.31 |
| Amazon Bedrock | Anthropic | Claude 3 Sonnet | 200k | 85 | $6.00 | 65.4 | 0.58 |
| Google | Anthropic | Claude 3 Sonnet | 200k | 85 | $6.00 | - | - |
| Anthropic | Anthropic | Claude 3 Sonnet | 200k | 85 | $6.00 | 62.5 | 0.67 |
| Amazon Bedrock | Anthropic | Claude 3 Haiku | 200k | 78 | $0.50 | 85.3 | 0.51 |
| Google | Anthropic | Claude 3 Haiku | 200k | 78 | $0.50 | - | - |
| Anthropic | Anthropic | Claude 3 Haiku | 200k | 78 | $0.50 | 98.1 | 0.34 |
| Amazon Bedrock | Anthropic | Claude 2.1 | 200k | 66 | $12.00 | 41.1 | 0.52 |
| Anthropic | Anthropic | Claude 2.1 | 200k | 66 | $12.00 | 42.2 | 0.43 |
| Anthropic | Anthropic | Claude 2.0 | 100k | 72 | $12.00 | 39.4 | 0.46 |
| Amazon Bedrock | Anthropic | Claude Instant | 100k | 65 | $1.20 | 84.2 | 0.34 |
| Anthropic | Anthropic | Claude Instant | 100k | 65 | $1.20 | 90.5 | 0.48 |
| Cohere | Cohere | Command-R+ | 128k | 80 | $6.00 | 40.2 | 0.16 |
| Cohere | Cohere | Command-R | 128k | 67 | $0.75 | 110.9 | 0.16 |
| Amazon Bedrock | Cohere | Command | 4k | - | $1.63 | 28.4 | 0.32 |
| Cohere | Cohere | Command | 4k | - | $1.25 | 28.7 | 0.36 |
| Amazon Bedrock | Cohere | Command Light | 4k | - | $0.38 | 47.7 | 0.31 |
| Cohere | Cohere | Command Light | 4k | - | $0.38 | 81.4 | 0.16 |
| Lepton AI | Databricks | DBRX | 33k | 76 | $0.90 | 148.5 | 0.44 |
| Fireworks | Databricks | DBRX | 33k | 76 | $1.60 | 25.3 | 0.44 |
| Databricks | Databricks | DBRX | 33k | 76 | $3.38 | 118.4 | 0.57 |
| Together.ai | Databricks | DBRX | 33k | 76 | $1.20 | 79.7 | 0.43 |
| Deepinfra | OpenChat | OpenChat 3.5 | 8k | 56 | $0.13 | 51.3 | 0.61 |
| Together.ai | OpenChat | OpenChat 3.5 | 8k | 56 | $0.20 | 90.8 | 0.70 |
| Perplexity | Perplexity | PPLX-70B Online | 4k | 45 | $1.00 | 38.4 | 1.19 |
| Perplexity | Perplexity | PPLX-7B-Online | 4k | 35 | $0.20 | 95.3 | 0.94 |
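
To illustrate how rows like these can be ranked, here is a minimal Python sketch that orders the seven Llama 3 (70B) entries above by throughput, latency, and price. The `Row` class, the `rank` helper, and the tie-breaking rule for equal prices are assumptions made for this example, not part of the leaderboard's methodology; the figures are copied from the listing above.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Row:
    """One provider entry for a single model (hypothetical structure)."""
    provider: str
    price_usd_per_m: float   # blended USD per 1M tokens
    throughput_tps: float    # tokens per second
    latency_s: float         # seconds to first token


# Llama 3 (70B) figures as listed above.
LLAMA3_70B: List[Row] = [
    Row("Replicate", 1.18, 43.9, 2.33),
    Row("OctoAI", 0.93, 36.0, 0.28),
    Row("Fireworks", 0.90, 142.9, 0.23),
    Row("Deepinfra", 0.64, 32.1, 0.70),
    Row("Groq", 0.64, 303.3, 0.28),
    Row("Perplexity", 1.00, 40.0, 0.26),
    Row("Together.ai", 0.90, 98.2, 0.52),
]


def rank(rows: List[Row], key: Callable[[Row], object], descending: bool = False) -> List[Row]:
    """Return a new list sorted by the given key; the input is left untouched."""
    return sorted(rows, key=key, reverse=descending)


# Higher throughput is better; lower latency and lower price are better.
by_throughput = rank(LLAMA3_70B, key=lambda r: r.throughput_tps, descending=True)
by_latency = rank(LLAMA3_70B, key=lambda r: r.latency_s)
# Assumed tie-break: when prices are equal, prefer the higher throughput.
by_price = rank(LLAMA3_70B, key=lambda r: (r.price_usd_per_m, -r.throughput_tps))

for label, ranked in [("throughput", by_throughput),
                      ("latency", by_latency),
                      ("price", by_price)]:
    print(f"Top 3 by {label}: " + ", ".join(r.provider for r in ranked[:3]))
```

No single provider leads on every metric here, which is why the three orderings are kept separate rather than merged into one score.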

Key definitions

Quality: Index representing the normalized average relative performance across Chatbot Arena, MMLU, and MT-Bench.
Context window: Maximum number of combined input and output tokens. Output tokens commonly have a significantly lower limit (varies by model).
Throughput: Tokens per second received while the model is generating tokens (i.e., after the first chunk has been received from the API).
Latency: Time to first token received, in seconds, after the API request is sent.
Price: Price per token, represented as USD per million tokens. Price is a blend of input and output token prices (3:1 ratio).
Output price: Price per token generated by the model (received from the API), represented as USD per million tokens.
Input price: Price per token included in the request/message sent to the API, represented as USD per million tokens.
Time period: Metrics are 'live' and are based on the past 14 days of measurements. Measurements are taken 8 times per day for single requests and 2 times per day for parallel requests.
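
The price, latency, and throughput definitions above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the leaderboard's actual measurement code: the function names are hypothetical, the 3:1 ratio is read as three parts input price to one part output price, and throughput is read as the tokens that arrive after the first chunk divided by the time elapsed since that first chunk. The sample prices and timestamps are made up for the example.

```python
from typing import Sequence


def blended_price(input_price: float, output_price: float,
                  input_weight: float = 3.0, output_weight: float = 1.0) -> float:
    """Weighted average of input and output prices (USD per 1M tokens).

    Assumption: the 3:1 ratio means 3 parts input to 1 part output.
    """
    total_weight = input_weight + output_weight
    return (input_price * input_weight + output_price * output_weight) / total_weight


def latency_seconds(request_sent: float, chunk_times: Sequence[float]) -> float:
    """Time from sending the request to receiving the first chunk."""
    return chunk_times[0] - request_sent


def throughput_tokens_per_second(chunk_times: Sequence[float],
                                 chunk_tokens: Sequence[int]) -> float:
    """Tokens received after the first chunk, per second of generation time.

    Assumption: the first chunk's tokens mark the start of the timing window
    and are not counted in the numerator.
    """
    elapsed = chunk_times[-1] - chunk_times[0]
    if elapsed <= 0:
        raise ValueError("need at least two chunks with increasing timestamps")
    return sum(chunk_tokens[1:]) / elapsed


# Hypothetical streamed response: timestamps in seconds, token counts per chunk.
request_sent = 0.0
chunk_times = [0.5, 1.0, 1.5, 2.5]
chunk_tokens = [1, 10, 10, 20]

print(blended_price(30.0, 60.0))                                # 37.5
print(latency_seconds(request_sent, chunk_times))               # 0.5
print(throughput_tokens_per_second(chunk_times, chunk_tokens))  # 20.0
```

Keeping latency and throughput as separate functions mirrors the definitions: latency covers the wait before the first chunk, while throughput only looks at the generation phase that follows it.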