OpenAI: Models Quality, Performance & Price
Analysis of OpenAI's models across key metrics including quality, price, output speed, latency, context window & more. This analysis is intended to help you choose the best OpenAI model for your use case. For more details, including our methodology, see our FAQs. Models analyzed: o3-mini, o1-preview, o1-mini, GPT-4o (Aug '24), GPT-4o (May '24), GPT-4o (Nov '24), GPT-4o mini, GPT-4 Turbo, and GPT-4.
OpenAI Model Comparison Summary
Quality: o1-preview and o1-mini are the highest quality models offered by OpenAI, followed by GPT-4o (Aug '24), GPT-4o (May '24) & GPT-4 Turbo.
Output Speed (tokens/s): o3-mini (199 t/s) and o1-mini (198 t/s) are the fastest models offered by OpenAI, followed by o1-preview, GPT-4o (Nov '24) & GPT-4o mini.
Latency (seconds): GPT-4o (Nov '24) (0.48s) and GPT-4o (Aug '24) (0.49s) are the lowest latency models offered by OpenAI, followed by GPT-4o mini, GPT-4o (May '24) & GPT-4 Turbo.
Blended Price ($/M tokens): GPT-4o mini ($0.26) and o3-mini ($1.93) are the cheapest models offered by OpenAI, followed by GPT-4o (Aug '24), GPT-4o (Nov '24) & o1-mini.
Context Window Size: o3-mini (200k) and o1-preview (128k) are the largest context window models offered by OpenAI, followed by o1-mini, GPT-4o (Aug '24) & GPT-4o (May '24).
Highlights
Quality: Artificial Analysis Quality Index; higher is better
Speed: Output Tokens per Second; higher is better
Price: USD per 1M Tokens; lower is better
Model | Context window | Quality Index | Blended price (USD per 1M tokens) | Output speed (tokens/s) | Latency (s)
---|---|---|---|---|---
o3-mini | 200k | 89 | $1.93 | 199.0 | 10.85
o1-preview | 128k | 85 | $26.25 | 147.1 | 19.91
o1-mini | 128k | 82 | $5.25 | 197.8 | 11.32
GPT-4o (Aug '24) | 128k | 78 | $4.38 | 61.8 | 0.49
GPT-4o (May '24) | 128k | 78 | $7.50 | 61.0 | 0.54
GPT-4o (Nov '24) | 128k | 73 | $4.38 | 83.8 | 0.48
GPT-4o mini | 128k | 73 | $0.26 | 74.8 | 0.53
GPT-4 Turbo | 128k | 75 | $15.00 | 36.0 | 0.66
GPT-4 | 8k | | $37.50 | 25.5 | 0.76
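As a rough illustration of how the table above could drive a model choice, the sketch below hard-codes its rows and filters them by a few requirements. The `Model` tuple, the `pick` function and the thresholds in the example are hypothetical names and values chosen for illustration; they are not part of the Artificial Analysis methodology.

```python
from typing import NamedTuple, Optional


class Model(NamedTuple):
    name: str
    context_k: int           # context window, in thousands of tokens
    quality: Optional[int]   # Quality Index; None where the table has no value
    price: float             # blended price, USD per 1M tokens
    speed: float             # output tokens per second
    latency: float           # seconds to first token


# Rows copied from the comparison table.
MODELS = [
    Model("o3-mini", 200, 89, 1.93, 199.0, 10.85),
    Model("o1-preview", 128, 85, 26.25, 147.1, 19.91),
    Model("o1-mini", 128, 82, 5.25, 197.8, 11.32),
    Model("GPT-4o (Aug '24)", 128, 78, 4.38, 61.8, 0.49),
    Model("GPT-4o (May '24)", 128, 78, 7.50, 61.0, 0.54),
    Model("GPT-4o (Nov '24)", 128, 73, 4.38, 83.8, 0.48),
    Model("GPT-4o mini", 128, 73, 0.26, 74.8, 0.53),
    Model("GPT-4 Turbo", 128, 75, 15.00, 36.0, 0.66),
    Model("GPT-4", 8, None, 37.50, 25.5, 0.76),
]


def pick(models, min_quality=0, max_price=float("inf"), max_latency=float("inf")):
    """Return models that meet every threshold, cheapest first.

    Models without a Quality Index value are skipped, since they
    cannot be compared against min_quality.
    """
    ok = [
        m for m in models
        if m.quality is not None
        and m.quality >= min_quality
        and m.price <= max_price
        and m.latency <= max_latency
    ]
    return sorted(ok, key=lambda m: m.price)


if __name__ == "__main__":
    # Example: quality of at least 75 with sub-second time to first token.
    for m in pick(MODELS, min_quality=75, max_latency=1.0):
        print(f"{m.name}: ${m.price:.2f} per 1M tokens, {m.latency}s latency")
```

Sorting by price is only one possible tie-breaker; a latency-sensitive application might sort on `latency` or `speed` instead.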
Key definitions
Artificial Analysis Quality Index: Average result across our evaluations covering different dimensions of model intelligence. Currently includes MMLU, GPQA, Math & HumanEval. OpenAI o1 model figures are preliminary and are based on figures stated by OpenAI. See methodology for more details.
Context window: Maximum number of combined input & output tokens. Output tokens commonly have a significantly lower limit (varies by model).
Output Speed: Tokens per second received while the model is generating tokens (i.e., after the first chunk has been received from the API, for models which support streaming).
Latency: Time to first token received, in seconds, after the API request is sent. For models which do not support streaming, this represents the time to receive the completion.
Price: Price per token, represented as USD per million tokens. Price is a blend of input & output token prices (3:1 input-to-output ratio).
Output Price: Price per token generated by the model (received from the API), represented as USD per million Tokens.
Input Price: Price per token included in the request/message sent to the API, represented as USD per million Tokens.
Time period: Metrics are 'live' and are based on the past 14 days of measurements. Measurements are taken 8 times per day for single requests and 2 times per day for parallel requests.
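The Price, Output Speed and Latency definitions above can be sketched in a few lines of Python. This is a minimal illustration rather than the measurement code behind this analysis: the function names are hypothetical, the 3:1 blend is assumed to weight input three times as heavily as output, and output speed is approximated as the tokens received after the first chunk divided by the time elapsed since the first chunk.

```python
def blended_price(input_price: float, output_price: float) -> float:
    """Blend input and output prices (USD per 1M tokens) at a 3:1 ratio."""
    return (3 * input_price + output_price) / 4


def latency(request_sent: float, first_chunk: float) -> float:
    """Seconds between sending the request and receiving the first chunk."""
    return first_chunk - request_sent


def output_speed(chunks):
    """Tokens per second while the model is generating.

    chunks: list of (arrival_time_seconds, token_count) in arrival order.
    Only tokens received after the first chunk are counted, and the clock
    starts at the first chunk, so time to first token is excluded.
    """
    if len(chunks) < 2:
        raise ValueError("need at least two chunks to measure output speed")
    first_time = chunks[0][0]
    last_time = chunks[-1][0]
    if last_time <= first_time:
        raise ValueError("chunk timestamps must increase")
    tokens_after_first = sum(count for _, count in chunks[1:])
    return tokens_after_first / (last_time - first_time)


if __name__ == "__main__":
    # Illustrative prices and timestamps (assumptions, not measurements).
    print(round(blended_price(0.15, 0.60), 2))
    stream = [(0.5, 1), (1.0, 50), (1.5, 50)]  # request sent at t = 0.0
    print(latency(0.0, stream[0][0]))
    print(output_speed(stream))
```

With illustrative prices of $0.15 per 1M input tokens and $0.60 per 1M output tokens, the blend comes to about $0.26 per 1M tokens.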