Mistral Small 3.1 vs. Llama 3.1 Instruct 405B
Comparison between Mistral Small 3.1 and Llama 3.1 Instruct 405B across intelligence, price, speed, context window and more.
For details relating to our methodology, see our Methodology page.
Highlights
Model Comparison
Metric | Mistral Small 3.1 | Llama 3.1 Instruct 405B | Analysis
---|---|---|---
Creator | Mistral AI | Meta |
Context Window | 128k tokens (~192 A4 pages in 12 pt Arial) | 128k tokens (~192 A4 pages in 12 pt Arial) | Both models have the same context window size
Release Date | March 2025 | July 2024 | Mistral Small 3.1 has the more recent release date
Parameters | 24B | 405B | Mistral Small 3.1 is a much smaller model than Llama 3.1 Instruct 405B
Image Input Support | Yes | No | Mistral Small 3.1 supports image input; Llama 3.1 Instruct 405B does not
Open Source (Weights) | Yes | Yes | Both Mistral Small 3.1 and Llama 3.1 Instruct 405B have openly available weights
License | Apache 2.0 | Llama 3.1 Community License |
License Supports Commercial Use Without Restrictions | Yes | Yes | Both models' licenses support commercial use without restrictions
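The "~192 A4 pages" figure in the table can be reproduced with rough conversion factors. The factors below (~0.75 words per token, ~500 words per 12 pt Arial A4 page) are common approximations assumed for this sketch, not constants published with the comparison:

```python
# Rough arithmetic behind the "~192 A4 pages" estimate.
# Assumed conversion factors (approximations, not site constants):
WORDS_PER_TOKEN = 0.75   # typical English tokenization ratio
WORDS_PER_PAGE = 500     # ~one A4 page of 12 pt Arial text

context_tokens = 128_000
words = context_tokens * WORDS_PER_TOKEN   # 96,000 words
pages = words / WORDS_PER_PAGE             # 192.0 pages
print(round(pages))  # → 192
```

Any token-to-page conversion depends heavily on the tokenizer and page layout, so treat the result as an order-of-magnitude illustration.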
Intelligence
Artificial Analysis Intelligence Index
Artificial Analysis Intelligence Index by Model Type
Artificial Analysis Intelligence Index by Open Weights vs Proprietary
Artificial Analysis Coding Index
Artificial Analysis Math Index
Intelligence Evaluations
Intelligence vs. Price
Intelligence vs. Output Speed
Intelligence vs. Total Response Time
Context Window
Context Window
Intelligence vs. Context Window
Pricing
Pricing: Input and Output Prices
Pricing: Cached Input Prompts
Pricing: Image Input Pricing
Performance Summary
Output Speed vs. Price
Latency vs. Output Speed
Speed
Measured by Output Speed (tokens per second)
Output Speed
Output Speed by Input Token Count (Context Length)
Output Speed Variance

Output Speed Over Time
Latency
Measured by Time (seconds) to First Token
Latency
Latency by Input Token Count (Context Length)
Latency Variance

Total Response Time
Time to receive 100 tokens output, calculated from latency and output speed metrics
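The metric described above combines the two measured quantities: time to first token (latency) plus the time to generate 100 tokens at the measured output speed. A minimal sketch of that calculation (function name and defaults are illustrative, not from the source):

```python
def total_response_time(latency_s: float, output_speed_tps: float,
                        output_tokens: int = 100) -> float:
    """Estimated seconds to receive `output_tokens` of output:
    time to first token plus generation time at the measured
    output speed (tokens per second)."""
    return latency_s + output_tokens / output_speed_tps

# e.g. 0.5 s to first token at 100 tokens/s:
print(total_response_time(0.5, 100.0))  # → 1.5
```

This assumes output speed is constant after the first token, which is a simplification of real serving behavior.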
Total Response Time
Total Response Time by Input Token Count (Context Length)
Total Response Time Variance

Comparisons to Mistral Small 3.1
o1
GPT-4o (Nov '24)
GPT-4o mini
o3-mini (high)
o1-pro
Llama 3.3 Instruct 70B
Llama 3.1 Instruct 8B
Gemini 2.0 Flash (Feb '25)
Gemma 3 27B Instruct
Gemini 2.5 Pro Experimental (Mar '25)
Claude 3.7 Sonnet (Standard)
Claude 3.7 Sonnet (Extended Thinking)
Mistral Large 2 (Nov '24)
DeepSeek R1
DeepSeek V3 0324 (Mar '25)
Grok 3 Reasoning Beta
Grok 3
Nova Pro
Nova Micro
Command A
QwQ 32B
DeepSeek V3
Comparisons to Llama 3.1 Instruct 405B
o1
GPT-4o (Nov '24)
GPT-4o mini
o3-mini (high)
o1-pro
Llama 3.3 Instruct 70B
Llama 3.1 Instruct 8B
Gemini 2.0 Flash (Feb '25)
Gemma 3 27B Instruct
Gemini 2.5 Pro Experimental (Mar '25)
Claude 3.7 Sonnet (Standard)
Claude 3.7 Sonnet (Extended Thinking)
Mistral Large 2 (Nov '24)
Mistral Small 3.1
DeepSeek R1
DeepSeek V3 0324 (Mar '25)
Grok 3 Reasoning Beta
Grok 3
Nova Pro
Nova Micro
Command A
QwQ 32B
DeepSeek V3