Independent analysis of AI models and API providers
Understand the AI landscape to choose the best model and provider for your use case
Highlights
Quality: Artificial Analysis Quality Index; Higher is better
Speed: Output Tokens per Second; Higher is better
Price: USD per 1M Tokens; Lower is better
API Provider Highlights: Llama 3.1 Instruct 70B
Output Speed vs. Price: Llama 3.1 Instruct 70B
Output Speed: Output Tokens per Second; Price: USD per 1M Tokens. The most attractive quadrant is highlighted.
Price: Price per token, represented as USD per million Tokens. Price is a blend of Input and Output token prices at a 3:1 input-to-output ratio (see the worked example below).
Output Speed: Tokens per second received while the model is generating tokens (i.e. after the first chunk has been received from the API, for models that support streaming).
Median: Figures represent the median (P50) measurement over the past 14 days, or over a different window where needed to reflect sustained changes in performance.
Notes: Llama 3.1 70B (Cerebras): 8k context; Llama 3.1 70B (SambaNova): 8k context.
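As a minimal sketch of the 3:1 blend described above, the snippet below computes a blended per-1M-token price; the input and output prices used are hypothetical placeholders, not figures from this page.

```python
def blended_price(input_price_per_m: float, output_price_per_m: float) -> float:
    """Blend input and output prices (USD per 1M tokens) at a 3:1 input:output ratio."""
    return (3 * input_price_per_m + 1 * output_price_per_m) / 4

# Hypothetical prices: $0.60 per 1M input tokens, $0.80 per 1M output tokens
print(blended_price(0.60, 0.80))  # 0.65 USD per 1M tokens (blended)
```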
Pricing (Input and Output Prices): Llama 3.1 Instruct 70B
Price: USD per 1M Tokens; Lower is better
Input price: Price per token included in the request/message sent to the API, represented as USD per million Tokens.
Output price: Price per token generated by the model (received from the API), represented as USD per million Tokens.
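To show how the input and output prices defined above combine for a single call, here is a small illustrative sketch; the token counts and per-1M-token prices are hypothetical placeholders.

```python
def request_cost_usd(input_tokens: int, output_tokens: int,
                     input_price_per_m: float, output_price_per_m: float) -> float:
    """Cost in USD of one API call, given per-1M-token input and output prices."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000

# Hypothetical call: 2,000 input tokens and 500 output tokens at $0.60 / $0.80 per 1M tokens
print(f"{request_cost_usd(2_000, 500, 0.60, 0.80):.6f} USD")  # 0.001600 USD
```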
Output Speed: Llama 3.1 Instruct 70B
Output Speed: Output Tokens per Second
Median across providers: Figures represent the median (P50) across all providers that support the model.
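The sketch below shows one way the output speed definition used here could be measured against a streaming API: the clock starts at the first received chunk, so time-to-first-token is excluded. The `stream_chunks` iterable and its (text, token_count) items are hypothetical stand-ins for a real streaming client.

```python
import time
from typing import Iterable, Tuple

def measure_output_speed(stream_chunks: Iterable[Tuple[str, int]]) -> float:
    """Output tokens per second, timed from the first received chunk.

    `stream_chunks` is assumed to yield (text, token_count) pairs from a
    streaming API response; time-to-first-token is excluded from the rate.
    """
    first_chunk_time = None
    last_chunk_time = None
    tokens_after_first_chunk = 0
    for _text, token_count in stream_chunks:
        now = time.monotonic()
        if first_chunk_time is None:
            first_chunk_time = now  # clock starts once the first chunk arrives
        else:
            tokens_after_first_chunk += token_count
        last_chunk_time = now
    if first_chunk_time is None or last_chunk_time == first_chunk_time:
        return 0.0  # no streamed output to time
    return tokens_after_first_chunk / (last_chunk_time - first_chunk_time)
```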
Output Speed, Over Time: Llama 3.1 Instruct 70B
Output Tokens per Second; Higher is better
Over-time measurement: Median measurement per day, based on 8 measurements taken at different times each day. Labels represent the start of each week's measurements.
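A brief sketch of the daily aggregation described above, under the assumption that each day contributes several timestamped tokens-per-second measurements; the sample values are invented for illustration.

```python
from collections import defaultdict
from datetime import date
from statistics import median

def daily_medians(measurements: list[tuple[date, float]]) -> dict[date, float]:
    """Group (day, tokens-per-second) measurements by day and take each day's median (P50)."""
    by_day: dict[date, list[float]] = defaultdict(list)
    for day, tokens_per_second in measurements:
        by_day[day].append(tokens_per_second)
    return {day: median(values) for day, values in sorted(by_day.items())}

# Hypothetical measurements (two of the eight daily samples shown per day)
samples = [
    (date(2024, 9, 2), 410.0), (date(2024, 9, 2), 450.0),
    (date(2024, 9, 3), 395.0), (date(2024, 9, 3), 430.0),
]
print(daily_medians(samples))  # day -> median tokens per second (430.0 and 412.5)
```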
See more information on any of our supported models