DeepSeek: Models Quality, Performance & Price

Analysis of DeepSeek's models across key metrics including quality, price, output speed, latency, context window & more. This analysis is intended to support you in choosing the best model provided by DeepSeek for your use case. For more details, including our methodology, see our FAQs. Models analyzed: DeepSeek R1 and DeepSeek V3.

DeepSeek Model Comparison Summary

Quality: DeepSeek R1 is the highest quality model offered by DeepSeek, followed by DeepSeek V3.
Output Speed (tokens/s): DeepSeek R1 (44 t/s) is the fastest model offered by DeepSeek, followed by DeepSeek V3 (35 t/s).
Latency (seconds): DeepSeek V3 (3.83s) is the lowest latency model offered by DeepSeek, followed by DeepSeek R1 (60.52s).
Blended Price ($/M tokens): DeepSeek V3 ($0.48) is the cheapest model offered by DeepSeek, followed by DeepSeek R1 ($0.96).
Context Window Size: DeepSeek V3 (66k) supports the largest context window, followed by DeepSeek R1 (64k).

Highlights

[Highlights charts: Quality (Artificial Analysis Quality Index; higher is better), Speed (output tokens per second; higher is better), Price (USD per 1M tokens; lower is better).]
Model          Context Window   Quality Index   Blended Price (USD/M tokens)   Output Speed (tokens/s)   Latency (s)
DeepSeek R1    64k              89               $0.96                          44.2                      60.52
DeepSeek V3    66k              80               $0.48                          34.6                      3.83
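
To illustrate how the latency and output speed figures combine, the sketch below estimates total response time as time to first token plus generation time. The 500-token response length and the helper name are illustrative assumptions; the metric values are taken from the table above.

```python
# Rough end-to-end response time estimate:
# total ≈ latency (time to first token) + output_tokens / output_speed.
# The 500-token response length is an arbitrary illustrative assumption.

def estimate_response_time(latency_s: float, output_speed_tps: float, output_tokens: int) -> float:
    """Approximate seconds from sending a request to receiving the full response."""
    return latency_s + output_tokens / output_speed_tps

models = {
    "DeepSeek R1": {"latency_s": 60.52, "output_speed_tps": 44.2},
    "DeepSeek V3": {"latency_s": 3.83, "output_speed_tps": 34.6},
}

for name, metrics in models.items():
    total = estimate_response_time(metrics["latency_s"], metrics["output_speed_tps"], output_tokens=500)
    print(f"{name}: ~{total:.1f}s for a 500-token response")
```

Under these assumptions, DeepSeek R1's long time to first token dominates (roughly 72 seconds total), while DeepSeek V3 returns the same length of response in roughly 18 seconds, mirroring the latency gap shown in the table.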

Key definitions

Artificial Analysis Quality Index: Average result across our evaluations covering different dimensions of model intelligence. Currently includes MMLU, GPQA, Math & HumanEval. OpenAI o1 model figures are preliminary and are based on figures stated by OpenAI. See methodology for more details.
Context window: Maximum number of combined input & output tokens. Output tokens commonly have a significantly lower limit (varies by model).
Output Speed: Tokens per second received while the model is generating tokens (i.e., after the first chunk has been received from the API, for models which support streaming).
Latency: Time to first token received, in seconds, after the API request is sent. For models which do not support streaming, this represents the time to receive the completion.
Price: Price per token, represented as USD per million Tokens. Price is a blend of Input & Output token prices (3:1 ratio); a worked example follows these definitions.
Output Price: Price per token generated by the model (received from the API), represented as USD per million Tokens.
Input Price: Price per token included in the request/message sent to the API, represented as USD per million Tokens.
Time period: Metrics are 'live' and are based on the past 14 days of measurements. Measurements are taken 8 times per day for single requests and 2 times per day for parallel requests.
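
As a worked example of the blended price definition, the sketch below computes a simple 3:1 weighted average of input and output prices. The weighting scheme and the per-million-token rates used are assumptions for illustration (they are not stated on this page), though they reproduce the blended figures shown in the table above.

```python
# Blended price = weighted average of input and output prices at a 3:1
# input:output ratio (the exact weighting is assumed from the definition above).

def blended_price(input_price_per_m: float, output_price_per_m: float) -> float:
    """USD per 1M tokens, blending input and output prices 3:1."""
    return (3 * input_price_per_m + 1 * output_price_per_m) / 4

# Illustrative per-token rates (assumed, not taken from this page):
# ~$0.55/M input and ~$2.19/M output reproduce DeepSeek R1's $0.96 blended price.
print(round(blended_price(0.55, 2.19), 2))  # 0.96
# ~$0.27/M input and ~$1.10/M output reproduce DeepSeek V3's $0.48 blended price.
print(round(blended_price(0.27, 1.10), 2))  # 0.48
```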