Model & API Providers Analysis | Artificial Analysis

Independent analysis of AI language models and API providers

Understand the AI landscape and choose the best model and API provider for your use-case

Highlights

Quality: Quality Index; higher is better

Speed: Output Tokens per Second; higher is better

Price: USD per 1M Tokens; lower is better

Language Models Comparison Highlights

Quality Comparison by Ability

Varied metrics by ability categorization; Higher is better

General Ability (Chatbot Arena)

Reasoning & Knowledge (MMLU)

Coding (HumanEval)

Different use cases call for different evaluations. Chatbot Arena is a good measure of communication ability, while MMLU tests reasoning and knowledge more comprehensively.

Median across providers: Figures represent median (P50) across all providers which support the model.
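
As an illustration of the median (P50) aggregation described above, here is a minimal sketch. The provider names and speeds are hypothetical placeholders, not real measurements, and averaging the two middle values for an even number of providers is an assumption (it is simply what Python's `statistics.median` does).

```python
from statistics import median


def median_across_providers(values_by_provider):
    """Return the P50 of one metric across all providers serving a model."""
    values = list(values_by_provider.values())
    if not values:
        raise ValueError("at least one provider measurement is required")
    # For an even count, statistics.median averages the two middle values
    # (an assumption; the page does not say how ties are handled).
    return median(values)


# Hypothetical output speeds in tokens per second (illustrative only).
example_speeds = {"provider_a": 40.0, "provider_b": 120.0, "provider_c": 65.0}
model_speed = median_across_providers(example_speeds)
```

Using the median rather than the mean keeps one unusually fast or slow provider from dominating the figure reported for a model.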

Quality vs. Output speed

Chart: Quality (general reasoning index) vs. Output Speed (output tokens per second), with the most attractive quadrant marked. Bubble size represents Price (USD per 1M tokens).

There is a trade-off between model quality and output speed, with higher quality models typically having lower output speed.

Quality: Index represents the normalized average of relative performance across Chatbot Arena, MMLU, and MT-Bench.

Output Speed: Tokens per second received while the model is generating tokens (i.e., after the first chunk has been received from the API).
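
The output speed definition above can be sketched as a small calculation. This is one possible reading, not the site's actual measurement code: the function name and parameters are hypothetical, and it assumes that only tokens arriving after the first chunk are counted, timed from the first chunk to the last.

```python
def output_speed(tokens_after_first_chunk, first_chunk_time, last_chunk_time):
    """Tokens per second during generation, excluding time to first chunk.

    Times are in seconds on the same clock (e.g. time.monotonic()).
    """
    elapsed = last_chunk_time - first_chunk_time
    if elapsed <= 0:
        raise ValueError("last chunk must arrive after the first chunk")
    return tokens_after_first_chunk / elapsed


# Hypothetical response: first chunk at 0.5 s, last chunk at 3.5 s,
# with 300 tokens received in between.
speed = output_speed(300, first_chunk_time=0.5, last_chunk_time=3.5)
```

Because the clock starts at the first chunk, this figure is independent of the initial latency before generation begins.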

Price: Price per token, represented as USD per million Tokens. Price is a blend of Input & Output token prices (3:1 ratio).
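
A minimal sketch of the blended price follows. It assumes the 3:1 ratio weights input tokens three times as heavily as output tokens, which is the natural reading of the definition above but is not spelled out there. The function name and example prices are hypothetical.

```python
def blended_price(input_price, output_price, input_weight=3.0, output_weight=1.0):
    """Weighted average of input and output prices, in USD per 1M tokens.

    The default 3:1 input-to-output weighting is an assumption about how
    the blend is computed.
    """
    total_weight = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total_weight


# Hypothetical prices: $1.00 per 1M input tokens, $3.00 per 1M output tokens.
price = blended_price(1.00, 3.00)
```

With these example prices the blend is (3 x 1.00 + 1 x 3.00) / 4 = 1.50 USD per 1M tokens.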

Quality vs. Price

Chart: Quality (general reasoning index) vs. Price (USD per 1M tokens), with the most attractive quadrant marked.

While higher quality models are typically more expensive, they do not all follow the same price-quality curve.

Output Speed

Output Tokens per Second; Higher is better

Pricing: Input and Output Prices

Chart: Input price and Output price for each model, in USD per 1M Tokens.

Prices vary considerably, both between models and between input and output tokens. The most expensive models can cost more than 10x as much as the cheapest.

Input price: Price per token included in the request/message sent to the API, represented as USD per million Tokens.

Output price: Price per token generated by the model (received from the API), represented as USD per million Tokens.

API Provider Highlights: Llama 3 Instruct 70B

Output Speed vs. Price: Llama 3 Instruct 70B

Chart: Output Speed (output tokens per second) vs. Price (USD per 1M tokens), with the most attractive quadrant marked.

Providers shown: Microsoft Azure, Groq, Together.ai, Perplexity, Fireworks, Lepton AI, Deepinfra, Replicate, Databricks, OctoAI

Smaller, emerging providers are offering high output speed at competitive prices.

Median: Figures represent median (P50) measurement over the past 14 days.

Variance data is available on the individual model and API provider pages, alongside the detailed performance metrics. See 'Compare Models' and 'Compare API Providers' in the navigation menu for further analysis.

Pricing (Input and Output Prices): Llama 3 Instruct 70B

Chart: Input price and Output price for each provider, in USD per 1M Tokens; lower is better.

Providers typically charge different prices for input and output tokens, so the ratio of input to output tokens in a given use case can significantly affect overall cost.
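
To make the effect of the input/output mix concrete, here is a rough sketch of a per-request cost calculation. The function name, token counts, and prices are hypothetical and do not correspond to any listed provider.

```python
def request_cost_usd(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """Cost of one request in USD, given prices in USD per 1M tokens."""
    return (
        input_tokens * input_price_per_m + output_tokens * output_price_per_m
    ) / 1_000_000


# Hypothetical pricing: $1.00 per 1M input tokens, $3.00 per 1M output tokens.
INPUT_PRICE = 1.0
OUTPUT_PRICE = 3.0

# Input-heavy request (e.g. summarizing a long document).
summarization_cost = request_cost_usd(10_000, 500, INPUT_PRICE, OUTPUT_PRICE)

# Output-heavy request (e.g. generating a long draft from a short prompt).
generation_cost = request_cost_usd(500, 10_000, INPUT_PRICE, OUTPUT_PRICE)
```

Both requests process 10,500 tokens in total, yet under these assumed prices the output-heavy request costs several times more than the input-heavy one.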

Output Speed, Over Time: Llama 3 Instruct 70B

Output Tokens per Second; Higher is better

Smaller, emerging providers offer high output speed, though precise speeds delivered vary day-to-day.

Over time measurement: Median measurement per day, based on 8 measurements taken at different times each day. Labels mark the start of each week's measurements.
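
The per-day median described above could be computed along these lines. This is a sketch under stated assumptions: the function name, the (timestamp, tokens per second) record layout, and the sample values are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from statistics import median


def daily_medians(measurements):
    """Group (timestamp, tokens_per_second) pairs by day and take each median."""
    by_day = defaultdict(list)
    for timestamp, tokens_per_second in measurements:
        by_day[timestamp.date()].append(tokens_per_second)
    # Sort by date so the result reads as a time series.
    return {day: median(values) for day, values in sorted(by_day.items())}


# Hypothetical day with 8 measurements taken every 3 hours.
sample = [
    (datetime(2024, 7, 1, hour), 90.0 + 10.0 * i)
    for i, hour in enumerate(range(0, 24, 3))
]
series = daily_medians(sample)
```

With an even number of measurements per day, `statistics.median` averages the two middle values; whether the site does the same is not stated.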

See more information on any of our supported models

Model Name	Creator	License	Context Window (tokens)

GPT-4o	OpenAI	Proprietary	128k
GPT-4 Turbo	OpenAI	Proprietary	128k
GPT-4o mini	OpenAI	Proprietary	128k
GPT-4	OpenAI	Proprietary	8k
GPT-3.5 Turbo Instruct	OpenAI	Proprietary	4k
GPT-3.5 Turbo	OpenAI	Proprietary	16k

Gemini 1.5 Pro	Google	Proprietary	2m
Gemini 1.5 Flash	Google	Proprietary	1m
Gemma 2 27B	Google	Open	8k
Gemma 2 9B	Google	Open	8k
Gemini 1.0 Pro	Google	Proprietary	33k
Gemma 7B Instruct	Google	Open	8k

Llama 3.1 Instruct 405B	Meta	Open	128k
Llama 3.1 Instruct 70B	Meta	Open	128k
Llama 3 Instruct 70B	Meta	Open	8k
Llama 3.1 Instruct 8B	Meta	Open	128k
Llama 3 Instruct 8B	Meta	Open	8k
Llama 2 Chat 70B	Meta	Open	4k
Llama 2 Chat 13B	Meta	Open	4k
Llama 2 Chat 7B	Meta	Open	4k

Mistral Large 2	Mistral	Open	128k
Codestral	Mistral	Open	33k
Codestral-Mamba	Mistral	Open	256k
Mistral Large	Mistral	Proprietary	33k
Mixtral 8x22B Instruct	Mistral	Open	65k
Mistral Small	Mistral	Proprietary	33k
Mistral Medium	Mistral	Proprietary	33k
Mistral NeMo	Mistral	Open	128k
Mixtral 8x7B Instruct	Mistral	Open	33k
Mistral 7B Instruct	Mistral	Open	33k

Claude 3.5 Sonnet	Anthropic	Proprietary	200k
Claude 3 Opus	Anthropic	Proprietary	200k
Claude 3 Sonnet	Anthropic	Proprietary	200k
Claude 3 Haiku	Anthropic	Proprietary	200k
Claude 2.0	Anthropic	Proprietary	100k
Claude Instant	Anthropic	Proprietary	100k
Claude 2.1	Anthropic	Proprietary	200k

Command Light	Cohere	Proprietary	4k
Command	Cohere	Proprietary	4k
Command-R+	Cohere	Open	128k
Command-R	Cohere	Open	128k

Sonar Large	Perplexity	Proprietary	33k
Sonar Small	Perplexity	Proprietary	33k

OpenChat 3.5 (1210)	OpenChat	Open	8k

Phi-3 Medium Instruct 14B	Microsoft Azure	Open	128k

DBRX Instruct	Databricks	Open	33k

Reka Core	Reka AI	Proprietary	128k
Reka Flash	Reka AI	Proprietary	128k
Reka Edge	Reka AI	Proprietary	64k

Jamba Instruct	AI21 Labs	Open	256k

DeepSeek-Coder-V2	DeepSeek	Open	128k
DeepSeek-V2-Chat	DeepSeek	Open	128k

Arctic Instruct	Snowflake	Open	4k

Qwen2 Instruct 72B	Alibaba	Open	128k

Yi-Large	01.AI	Proprietary	32k