Independent analysis of AI language models and API providers
Understand the AI landscape and choose the best model and API provider for your use case
Highlights
Quality
Quality Index; Higher is better
Speed
Output Tokens per Second; Higher is better
Price
USD per 1M Tokens; Lower is better
Quality Comparison by Ability
Metrics vary by ability category; Higher is better
General Ability (Chatbot Arena)
Reasoning & Knowledge (MMLU)
Coding (HumanEval)
Median across providers: Figures represent median (P50) across all providers which support the model.
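As a minimal sketch of the "median across providers" figure, assuming each provider contributes one value per model: the provider names and numbers below are hypothetical placeholders, not measured data.

```python
from statistics import median

# Hypothetical output speeds (tokens per second) for one model,
# keyed by provider. Values are illustrative only.
provider_speeds = {
    "provider_a": 42.0,
    "provider_b": 55.5,
    "provider_c": 61.0,
    "provider_d": 120.0,
}


def median_across_providers(values_by_provider):
    """Return the P50 of a metric over all providers that support the model."""
    return median(values_by_provider.values())


model_median = median_across_providers(provider_speeds)
print(model_median)
```

The median is less sensitive than the mean to a single unusually fast or slow provider, which is presumably why a P50 is reported per model.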
Quality vs. Output speed
Quality: General reasoning index, Output Speed: Output Tokens per Second, Price: USD per 1M Tokens
Most attractive quadrant
Size represents Price (USD per 1M Tokens)
Quality: Index represents normalized average relative performance across Chatbot Arena, MMLU & MT-Bench.
Output Speed: Tokens per second received while the model is generating tokens (i.e. after the first chunk has been received from the API).
Price: Price per token, represented as USD per million Tokens. Price is a blend of Input & Output token prices (3:1 ratio).
Median across providers: Figures represent median (P50) across all providers which support the model.
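The blended price described above can be sketched as a weighted average. This assumes the "3:1 ratio" means three parts input price to one part output price; the function name and example prices are hypothetical.

```python
def blended_price(input_price, output_price, input_weight=3.0, output_weight=1.0):
    """Blend input and output prices (USD per 1M tokens) into one figure.

    Assumption: a 3:1 ratio weights the input price three times as
    heavily as the output price.
    """
    total_weight = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total_weight


# Hypothetical prices: 0.50 USD per 1M input tokens, 1.50 USD per 1M output tokens.
example_blend = blended_price(0.50, 1.50)
print(example_blend)
```

Under this reading, a model with cheap input tokens and expensive output tokens lands closer to its input price.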
Quality vs. Price
Quality: General reasoning index, Price: USD per 1M Tokens
Most attractive quadrant
Output Speed
Output Tokens per Second; Higher is better
Pricing: Input and Output Prices
USD per 1M Tokens
Input price
Output price
Input price: Price per token included in the request/message sent to the API, represented as USD per million Tokens.
Output price: Price per token generated by the model (received from the API), represented as USD per million Tokens.
API Provider Highlights: Llama 3 Instruct 70B
Output Speed vs. Price: Llama 3 Instruct 70B
Output Speed: Output Tokens per Second, Price: USD per 1M Tokens
Most attractive quadrant
Microsoft Azure
Groq
Together.ai
Perplexity
Fireworks
Lepton AI
Deepinfra
Replicate
Databricks
OctoAI
Median: Figures represent median (P50) measurement over the past 14 days.
Variance data is available on the model and API provider pages, among the detailed performance metrics. See 'Compare Models' and 'Compare API Providers' in the navigation menu for further analysis.
Pricing (Input and Output Prices): Llama 3 Instruct 70B
Price: USD per 1M Tokens; Lower is better
Input price
Output price
Output Speed, Over Time: Llama 3 Instruct 70B
Output Tokens per Second; Higher is better
Over time measurement: Median measurement per day, based on 8 measurements each day at different times. Labels mark the start of each week's measurements.
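A rough sketch of how one output-speed measurement and the daily median could be computed. It assumes output speed counts only tokens received after the first chunk, divided by the time elapsed since that chunk. The function names, timestamps and token counts are hypothetical, not the site's actual methodology.

```python
from statistics import median


def output_speed(chunk_times, chunk_tokens):
    """Tokens per second while the model is generating.

    chunk_times: arrival time of each streamed chunk, in seconds.
    chunk_tokens: number of tokens carried by each chunk.
    Assumption: the first chunk only starts the clock, so its tokens
    are excluded from the count.
    """
    if len(chunk_times) != len(chunk_tokens) or len(chunk_times) < 2:
        raise ValueError("need matching lists with at least two chunks")
    elapsed = chunk_times[-1] - chunk_times[0]
    if elapsed <= 0:
        raise ValueError("chunks must span a positive time interval")
    return sum(chunk_tokens[1:]) / elapsed


def daily_median(measurements):
    """Median (P50) of the measurements taken on one day."""
    return median(measurements)


# One hypothetical streamed response: 80 tokens arrive in the 2.0 s
# after the first chunk.
single_run = output_speed([0.5, 1.0, 1.5, 2.5], [1, 20, 20, 40])

# Eight hypothetical measurements taken at different times of one day.
day_value = daily_median([38.0, 40.0, 41.0, 39.0, 45.0, 44.0, 40.0, 42.0])
print(single_run, day_value)
```

Starting the clock at the first chunk separates generation throughput from the initial latency before any tokens arrive.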