
Perplexity: Models Intelligence, Performance & Price
Analysis of Perplexity's models across key metrics including quality, price, output speed, latency, context window & more. This analysis is intended to help you choose the best Perplexity model for your use case. For more details, including our methodology, see our FAQs. Models analyzed: Sonar Reasoning, Sonar, and Sonar Pro.
Perplexity Model Comparison Summary
Intelligence: Sonar and Sonar Pro are the highest quality models offered by Perplexity, followed by Sonar Reasoning.
Output Speed (tokens/s): Sonar Reasoning (91 t/s) and Sonar Pro (81 t/s) are the fastest models offered by Perplexity, followed by Sonar.
Latency (seconds): Sonar (1.79s) and Sonar Reasoning (2.20s) are the lowest latency models offered by Perplexity, followed by Sonar Pro.
Blended Price ($/M tokens): Sonar ($1.00) and Sonar Reasoning ($2.00) are the cheapest models offered by Perplexity, followed by Sonar Pro.
Context Window Size: Sonar Pro (200k) and Sonar Reasoning (127k) are the largest context window models offered by Perplexity, followed by Sonar.















Highlights
Intelligence: Artificial Analysis Intelligence Index; higher is better
Speed: Output tokens per second; higher is better
Price: USD per 1M tokens; lower is better
Key definitions
Context window: Maximum number of combined input & output tokens. Output tokens commonly have a significantly lower limit (varies by model).
Output Speed: Tokens per second received while the model is generating tokens (i.e., after the first chunk has been received from the API, for models which support streaming).
Latency: Time to first token received, in seconds, after the API request is sent. For models which do not support streaming, this represents the time to receive the completion.
Price: Price per token, represented as USD per million Tokens. Price is a blend of Input & Output token prices (3:1 ratio).
Output Price: Price per token generated by the model (received from the API), represented as USD per million Tokens.
Input Price: Price per token included in the request/message sent to the API, represented as USD per million Tokens.
Time period: Metrics are 'live' and are based on the past 72 hours of measurements. Measurements are taken 8 times a day for single requests and 2 times per day for parallel requests.
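The definitions above can be made concrete with a little arithmetic. The following is a minimal sketch, not the site's actual measurement code. It assumes the 3:1 ratio means three parts input price to one part output price, and that latency and output speed are derived from three timestamps (request sent, first chunk received, last chunk received). All function names and the example prices and timings are hypothetical, chosen only for illustration.

```python
def blended_price(input_price, output_price, input_weight=3.0, output_weight=1.0):
    """Weighted average of input and output prices, in USD per 1M tokens.

    Assumption: the 3:1 ratio weights input tokens three times as heavily
    as output tokens.
    """
    total_weight = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total_weight


def latency_seconds(request_sent, first_chunk_received):
    """Time to first token: seconds between sending the request and the first chunk."""
    return first_chunk_received - request_sent


def output_speed(tokens_after_first_chunk, first_chunk_received, last_chunk_received):
    """Tokens per second while the model is generating, measured after the first chunk."""
    duration = last_chunk_received - first_chunk_received
    if duration <= 0:
        raise ValueError("last chunk must arrive after the first chunk")
    return tokens_after_first_chunk / duration


# Hypothetical prices: $1.00 per 1M input tokens, $5.00 per 1M output tokens.
example_price = blended_price(1.00, 5.00)

# Hypothetical timings in seconds: request at 10.0, first chunk at 11.5, last chunk at 13.5.
example_latency = latency_seconds(10.0, 11.5)
example_speed = output_speed(180, 11.5, 13.5)
```

Because the blend weights input tokens more heavily, a model with an expensive output price can still show a moderate blended price when its input price is low.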