
Mistral: Models Intelligence, Performance & Price
Analysis of Mistral's models across key metrics including quality, price, output speed, latency, context window & more. This analysis is intended to help you choose the best Mistral model for your use case. For more details, including our methodology, see our FAQs. Models analyzed: Pixtral Large, Mistral Large 2 (Nov '24), Mistral Large 2 (Jul '24), Mistral Small 3, Mistral Small (Sep '24), Mixtral 8x22B, Pixtral 12B, Ministral 8B, Mistral NeMo, Ministral 3B, Mixtral 8x7B, Codestral-Mamba, Mistral Saba, Codestral (Jan '25), Mistral Small 3.1, Mistral Small (Feb '24), Mistral Large (Feb '24), Mistral 7B, Codestral (May '24), and Mistral Medium.
Mistral Model Comparison Summary
Intelligence: Mistral Large 2 (Nov '24) and Pixtral Large are the highest quality models offered by Mistral, followed by Mistral Large 2 (Jul '24), Mistral Small 3.1 & Mistral Small 3.
Output Speed (tokens/s): Ministral 3B (220 t/s) and Codestral (Jan '25) (207 t/s) are the fastest models offered by Mistral, followed by Mistral Small 3.1, Mistral Small (Feb '24) & Mistral NeMo.
Latency (seconds): Mistral 7B (0.33s) and Codestral (Jan '25) (0.34s) are the lowest latency models offered by Mistral, followed by Mixtral 8x7B, Mistral Small (Feb '24) & Ministral 3B.
Blended Price ($/M tokens): Ministral 3B ($0.04) and Ministral 8B ($0.10) are the cheapest models offered by Mistral, followed by Mistral Small 3, Pixtral 12B & Mistral NeMo.
Context Window Size: Codestral-Mamba (256k) and Codestral (Jan '25) (256k) are the largest context window models offered by Mistral, followed by Pixtral Large, Mistral Large 2 (Nov '24) & Mistral Large 2 (Jul '24).

Highlights
Intelligence: Artificial Analysis Intelligence Index; higher is better
Speed: Output tokens per second; higher is better
Price: USD per 1M tokens; lower is better
Model | Context window | Intelligence Index | Blended price (USD per 1M tokens) | Output tokens/s | Latency (s)
---|---|---|---|---|---
Mistral Large 2 (Nov '24) | 128k | 38 | $3.00 | 26.9 | 0.56
Pixtral Large | 128k | 37 | $3.00 | 31.4 | 0.45
Mistral Large 2 (Jul '24) | 128k | 37 | $3.00 | 39.1 | 0.57
Mistral Small 3.1 | 128k | 35 | $0.15 | 158.7 | 0.40
Mistral Small 3 | 32k | 35 | $0.15 | 127.7 | 0.37
Codestral (Jan '25) | 256k | 28 | $0.45 | 206.8 | 0.34
Mistral Small (Sep '24) | 33k | 27 | $0.30 | 64.6 | 0.42
Mistral Large (Feb '24) | 33k | 26 | $6.00 | 31.9 | 0.51
Mixtral 8x22B | 65k | 26 | $3.00 | 69.0 | 0.42
Pixtral 12B | 128k | 23 | $0.15 | 104.4 | 0.37
Mistral Small (Feb '24) | 33k | 23 | $1.50 | 155.2 | 0.35
Mistral Medium | 33k | 23 | $4.09 | 43.3 | 0.46
Ministral 8B | 128k | 22 | $0.10 | 144.8 | 0.38
Codestral (May '24) | 33k | 20 | $0.30 | 107.1 | 0.38
Ministral 3B | 128k | 20 | $0.04 | 220.3 | 0.35
Mistral NeMo | 128k | 20 | $0.15 | 147.3 | 0.37
Mixtral 8x7B | 33k | 17 | $0.70 | 98.3 | 0.35
Codestral-Mamba | 256k | 14 | $0.25 | 94.8 | 0.51
Mistral 7B | 8k | 10 | $0.25 | 114.0 | 0.33
Mistral Saba | 32k | | $0.30 | 99.8 | 0.37
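One way to use the table for model selection is to filter on hard requirements first (context window, budget) and then rank what remains. The sketch below is illustrative only: it copies a handful of rows from the table above, and the `Model` type, the `shortlist` function, and the ranking rule (Intelligence Index first, output speed as tie-breaker) are assumptions made for this example, not part of the original analysis. Models without a listed Intelligence Index (Mistral Saba here) are skipped rather than guessed.

```python
from typing import List, NamedTuple, Optional


class Model(NamedTuple):
    name: str
    context_k: int                 # context window, in thousands of tokens
    intelligence: Optional[int]    # Intelligence Index; None if not listed
    price: float                   # blended USD per 1M tokens
    tokens_per_s: float            # output speed
    latency_s: float               # time to first token, seconds


# A subset of rows copied from the table above.
MODELS = [
    Model("Mistral Large 2 (Nov '24)", 128, 38, 3.00, 26.9, 0.56),
    Model("Mistral Small 3.1", 128, 35, 0.15, 158.7, 0.40),
    Model("Mistral Small 3", 32, 35, 0.15, 127.7, 0.37),
    Model("Codestral (Jan '25)", 256, 28, 0.45, 206.8, 0.34),
    Model("Ministral 8B", 128, 22, 0.10, 144.8, 0.38),
    Model("Ministral 3B", 128, 20, 0.04, 220.3, 0.35),
    Model("Mistral Saba", 32, None, 0.30, 99.8, 0.37),
]


def shortlist(models: List[Model], min_context_k: int, max_price: float) -> List[Model]:
    """Keep models meeting the context and price limits, best-scoring first."""
    candidates = [
        m for m in models
        if m.intelligence is not None
        and m.context_k >= min_context_k
        and m.price <= max_price
    ]
    # Rank by Intelligence Index (descending), then output speed (descending).
    return sorted(candidates, key=lambda m: (-m.intelligence, -m.tokens_per_s))


if __name__ == "__main__":
    for m in shortlist(MODELS, min_context_k=128, max_price=0.50):
        print(f"{m.name}: index {m.intelligence}, ${m.price:.2f}/M, {m.tokens_per_s} t/s")
```

Different workloads would weight these columns differently; a latency-sensitive chat application, for instance, might sort on `latency_s` instead.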
Key definitions
Context window: Maximum number of combined input & output tokens. Output tokens commonly have a significantly lower limit (varies by model).
Output Speed: Tokens per second received while the model is generating tokens (i.e. after the first chunk has been received from the API, for models which support streaming).
Latency: Time to first token, in seconds, measured from when the API request is sent until the first token is received. For models which do not support streaming, this represents the time to receive the completion.
Price: Price per token, represented as USD per million Tokens. Price is a blend of Input & Output token prices (3:1 ratio).
Output Price: Price per token generated by the model (received from the API), represented as USD per million Tokens.
Input Price: Price per token included in the request/message sent to the API, represented as USD per million Tokens.
Time period: Metrics are 'live' and are based on the past 72 hours of measurements. Measurements are taken 8 times a day for single requests and 2 times per day for parallel requests.
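The price, latency, and output speed definitions above can be sketched as small calculations. This is a minimal illustration, not the measurement code behind the figures on this page. It assumes the 3:1 blend means three parts input price to one part output price, and that output speed divides the tokens received after the first chunk by the time elapsed since that chunk. The `RequestTiming` type, the function names, and the sample numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RequestTiming:
    sent_at: float                 # seconds: API request sent
    first_chunk_at: float          # seconds: first chunk received
    finished_at: float             # seconds: last token received
    tokens_after_first_chunk: int  # tokens received after the first chunk


def blended_price(input_price: float, output_price: float,
                  input_weight: float = 3.0, output_weight: float = 1.0) -> float:
    """Weighted average of input and output prices (USD per 1M tokens).

    Assumes the 3:1 ratio is input:output.
    """
    total_weight = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total_weight


def latency_seconds(t: RequestTiming) -> float:
    """Time to first token: request sent until first chunk received."""
    return t.first_chunk_at - t.sent_at


def output_speed(t: RequestTiming) -> float:
    """Tokens per second while generating, i.e. after the first chunk."""
    generation_time = t.finished_at - t.first_chunk_at
    if generation_time <= 0:
        raise ValueError("generation time must be positive")
    return t.tokens_after_first_chunk / generation_time


if __name__ == "__main__":
    # Hypothetical prices: $2.00 input and $6.00 output per 1M tokens.
    print(blended_price(2.00, 6.00))
    # Hypothetical timing for a single streamed request.
    timing = RequestTiming(sent_at=0.0, first_chunk_at=0.5,
                           finished_at=2.5, tokens_after_first_chunk=200)
    print(latency_seconds(timing), output_speed(timing))
```

Because input tokens carry three times the weight, a model with a low input price can show a low blended price even when its output price is comparatively high; workloads dominated by generation may want to compare output prices directly.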