Artificial Analysis

Independent analysis of AI models and API providers

Understand the AI landscape to choose the best model and provider for your use-case

Highlights

Quality: Artificial Analysis Quality Index; higher is better
Speed: Output Tokens per Second; higher is better
Price: USD per 1M Tokens; lower is better

API Provider Highlights: Llama 3.3 Instruct 70B

Output Speed vs. Price: Llama 3.3 Instruct 70B

Output Speed: Output Tokens per Second, Price: USD per 1M Tokens
Most attractive quadrant
Smaller, emerging providers are offering high output speed at competitive prices.
Price: Price per token, represented as USD per million Tokens. Price is a blend of Input & Output token prices (3:1 ratio).
Output Speed: Tokens per second received while the model is generating tokens (i.e. after the first chunk has been received from the API, for models that support streaming).
Median: Figures represent the median (P50) measurement over the past 14 days, or over a different window where needed to reflect sustained changes in performance.
Notes: Llama 3.3 70B, Cerebras: 33k context
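
The blended price described above can be sketched as a weighted average. This is a minimal illustration, not the site's own code: it assumes the 3:1 ratio means three parts input price to one part output price, and the function name and example prices are hypothetical.

```python
def blended_price(input_price: float, output_price: float,
                  input_weight: float = 3.0, output_weight: float = 1.0) -> float:
    """Weighted average of input and output prices, in USD per 1M tokens.

    Assumption: the 3:1 ratio weights input tokens three times as
    heavily as output tokens.
    """
    total_weight = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total_weight


# Hypothetical prices, not taken from any provider's price list.
example = blended_price(input_price=0.50, output_price=1.50)
print(f"Blended price: {example:.2f} USD per 1M tokens")
```

Under this assumption the blended figure sits closer to the input price than to the output price, so a provider with cheap input tokens looks better on the chart than one with cheap output tokens.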

Pricing (Input and Output Prices): Llama 3.3 Instruct 70B

Price: USD per 1M Tokens; Lower is better
The relative importance of input vs. output token prices varies by use case. For example, generation tasks are typically more output-token weighted, while document-focused tasks (e.g. RAG) are more input-token weighted.
Input price: Price per token included in the request/message sent to the API, represented as USD per million Tokens.
Output price: Price per token generated by the model (received from the API), represented as USD per million Tokens.
Notes: Llama 3.3 70B, Cerebras: 33k context
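
To see how the input and output prices combine for a given workload, the sketch below computes the cost of a single request and the share of that cost that comes from input tokens. The function names, token counts, and prices are hypothetical and chosen only for illustration.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price: float, output_price: float) -> float:
    """Cost in USD of one request; prices are in USD per 1M tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def input_cost_share(input_tokens: int, output_tokens: int,
                     input_price: float, output_price: float) -> float:
    """Fraction of the request cost attributable to input tokens."""
    input_cost = input_tokens * input_price
    output_cost = output_tokens * output_price
    return input_cost / (input_cost + output_cost)


# Hypothetical prices (USD per 1M tokens) and hypothetical workloads.
IN_PRICE, OUT_PRICE = 0.50, 1.50
long_prompt_short_answer = (8_000, 500)   # (input tokens, output tokens)
short_prompt_long_answer = (200, 2_000)

for name, (n_in, n_out) in [("long prompt, short answer", long_prompt_short_answer),
                            ("short prompt, long answer", short_prompt_long_answer)]:
    cost = request_cost(n_in, n_out, IN_PRICE, OUT_PRICE)
    share = input_cost_share(n_in, n_out, IN_PRICE, OUT_PRICE)
    print(f"{name}: {cost:.6f} USD, {share:.0%} of cost from input tokens")
```

A workload that sends many tokens and receives few is dominated by the input price, and the reverse holds for a workload that receives many tokens, which is why the two prices are charted separately.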

Output Speed: Llama 3.3 Instruct 70B

Output Speed: Output Tokens per Second
Output Speed: Tokens per second received while the model is generating tokens (i.e. after the first chunk has been received from the API, for models that support streaming).
Median across providers: Figures represent median (P50) across all providers which support the model.
Notes: Llama 3.3 70B, Cerebras: 33k context
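
One way to read the output speed definition is as tokens received after the first chunk, divided by the time elapsed since the first chunk arrived. The sketch below follows that reading and then takes the median across providers. It is an assumption about the method, not the site's measurement code, and the chunk timings and per-provider speeds are made up.

```python
from statistics import median


def output_speed(chunks: list[tuple[float, int]]) -> float:
    """Tokens per second after the first streamed chunk.

    chunks: (arrival_time_seconds, tokens_in_chunk) pairs in arrival order.
    Assumption: tokens in the first chunk are excluded, and the clock
    starts when the first chunk arrives.
    """
    if len(chunks) < 2:
        raise ValueError("need at least two chunks to measure output speed")
    elapsed = chunks[-1][0] - chunks[0][0]
    if elapsed <= 0:
        raise ValueError("chunk timestamps must be increasing")
    tokens_after_first = sum(tokens for _, tokens in chunks[1:])
    return tokens_after_first / elapsed


# Hypothetical stream: first chunk at 0.5 s, then 100 tokens over the next second.
stream = [(0.5, 10), (1.0, 50), (1.5, 50)]
print(output_speed(stream))

# Hypothetical per-provider speeds; the median (P50) summarises them.
speeds_by_provider = {"provider_a": 100.0, "provider_b": 250.0, "provider_c": 80.0}
print(median(speeds_by_provider.values()))
```

Starting the clock at the first chunk keeps time-to-first-token out of the figure, so the number reflects generation throughput rather than queueing or prompt processing.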

Output Speed, Over Time: Llama 3.3 Instruct 70B

Output Tokens per Second; Higher is better
Smaller, emerging providers offer high output speed, though precise speeds delivered vary day-to-day.
Output Speed: Tokens per second received while the model is generating tokens (i.e. after the first chunk has been received from the API, for models that support streaming).
Over time measurement: Median measurement per day, based on 8 measurements each day at different times. Labels represent start of week's measurements.
Notes: Llama 3.3 70B, Cerebras: 33k context
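
The per-day median described above can be sketched by grouping timestamped measurements by calendar day and taking the median of each group. The function name and the sample values are hypothetical; only the "8 measurements per day, median per day" scheme comes from the text.

```python
from collections import defaultdict
from datetime import date, datetime, timedelta
from statistics import median


def daily_medians(samples: list[tuple[datetime, float]]) -> dict[date, float]:
    """Median output speed per calendar day from timestamped measurements."""
    by_day: dict[date, list[float]] = defaultdict(list)
    for timestamp, speed in samples:
        by_day[timestamp.date()].append(speed)
    return {day: median(values) for day, values in sorted(by_day.items())}


# Hypothetical day of 8 measurements, taken every 3 hours.
start = datetime(2024, 12, 9, 0, 0)
speeds = [90.0, 95.0, 100.0, 105.0, 110.0, 115.0, 120.0, 125.0]
samples = [(start + timedelta(hours=3 * i), s) for i, s in enumerate(speeds)]

for day, value in daily_medians(samples).items():
    print(day, value)
```

With an even number of measurements, `statistics.median` averages the two middle values, so a single unusually slow or fast run does not move the daily figure much.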
See more information on any of our supported models

OpenAI
  o1-preview
  o1-mini
  GPT-4o (Aug '24)
  GPT-4o (May '24)
  GPT-4o (Nov '24)
  GPT-4o mini
  GPT-4 Turbo
  GPT-3.5 Turbo
  GPT-4

Meta
  Llama 3.3 Instruct 70B
  Llama 3.1 Instruct 405B
  Llama 3.1 Instruct 70B
  Llama 3.2 Instruct 90B (Vision)
  Llama 3.2 Instruct 11B (Vision)
  Llama 3.1 Instruct 8B
  Llama 3.2 Instruct 3B
  Llama 3.2 Instruct 1B
  Llama 3 Instruct 70B
  Llama 3 Instruct 8B
  Llama 2 Chat 13B
  Llama 2 Chat 7B

Google
  Gemini 1.5 Pro (Sep '24)
  Gemini 1.5 Flash (Sep '24)
  Gemma 2 27B
  Gemma 2 9B
  Gemini Experimental (Dec '24)
  Gemini 1.5 Pro (May '24)
  Gemini 1.5 Flash (May '24)
  Gemini 1.5 Flash-8B
  Gemini 1.0 Pro

Anthropic
  Claude 3.5 Sonnet (Oct '24)
  Claude 3.5 Sonnet (June '24)
  Claude 3 Opus
  Claude 3.5 Haiku
  Claude 3 Haiku
  Claude 3 Sonnet

Mistral
  Pixtral Large
  Mistral Large 2 (Jul '24)
  Mistral Large 2 (Nov '24)
  Mistral Small (Sep '24)
  Mixtral 8x22B Instruct
  Pixtral 12B (2409)
  Ministral 8B
  Mistral NeMo
  Ministral 3B
  Mixtral 8x7B Instruct
  Codestral-Mamba
  Mistral Small (Feb '24)
  Mistral Large (Feb '24)
  Mistral 7B Instruct
  Codestral
  Mistral Medium

Cohere
  Command-R+ (Aug '24)
  Command-R+ (Apr '24)
  Command-R (Mar '24)
  Command-R (Aug '24)
  Aya Expanse 32B
  Aya Expanse 8B

Perplexity
  Sonar 3.1 Large
  Sonar 3.1 Small

xAI
  Grok Beta

Amazon
  Nova Pro
  Nova Lite
  Nova Micro

Microsoft Azure
  Phi-3 Medium Instruct 14B

Upstage
  Solar Mini

Databricks
  DBRX Instruct

NVIDIA
  Llama 3.1 Nemotron Instruct 70B

Reka AI
  Reka Flash (Sep '24)
  Reka Core
  Reka Flash (Feb '24)
  Reka Edge

AI21 Labs
  Jamba 1.5 Large
  Jamba 1.5 Mini
  Jamba Instruct

DeepSeek
  DeepSeek-Coder-V2
  DeepSeek-V2-Chat
  DeepSeek-V2.5

Alibaba
  Qwen2.5 Instruct 72B
  Qwen2.5 Coder Instruct 32B
  Qwen2 Instruct 72B

01.AI
  Yi-Large

OpenChat
  OpenChat 3.5 (1210)