Comparisons of Large Open Source AI Models (>150B)
This page compares open source AI models with more than 150B parameters. Models are considered open source (also commonly referred to as open weights) when their weights are available to download. This allows self-hosting on your own infrastructure and customization of the model, for example through fine-tuning. For more details, including our methodology, see our FAQs.
Highlights
Openness
Artificial Analysis Openness Index: Score
Openness Index assesses model openness on a 0 to 100 normalized scale (higher is more open)
Reasoning models are indicated by a lightbulb icon
Intelligence
Artificial Analysis Intelligence Index
Artificial Analysis Intelligence Index v4.1 incorporates 9 evaluations: GDPval-AA v2, 𝜏³-Banking, Terminal-Bench v2.1, SciCode, Humanity's Last Exam, GPQA Diamond, CritPt, AA-Omniscience, AA-LCR
Estimate (independent evaluation forthcoming)
Intelligence Evaluations
Intelligence evaluations measured independently by Artificial Analysis · Higher is better
GDPval-AA v2 (Updated)
Agentic real-world work tasks, (Elo-500)/2000
𝜏³-Banking (New)
Agentic tool use
Agentic coding & terminal use
Coding
Reasoning & knowledge
Scientific reasoning
Physics reasoning
Knowledge
1 - hallucination rate
Long context reasoning
AA-Briefcase (New)
Agentic knowledge work, (Elo-500)/2000
Instruction following
Long-horizon agentic tasks
Kubernetes incident root-cause analysis
Visual reasoning
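The "(Elo-500)/2000" label on the Elo-based evaluations above describes a simple linear rescaling. A minimal sketch of that arithmetic follows; the function name `normalize_elo` and the sample ratings are illustrative assumptions, not values or code from Artificial Analysis.

```python
def normalize_elo(elo: float, offset: float = 500.0, scale: float = 2000.0) -> float:
    """Rescale an Elo rating using the (Elo - 500) / 2000 formula from the chart label."""
    return (elo - offset) / scale


if __name__ == "__main__":
    # Illustrative ratings only; these are not measured results.
    for elo in (500, 1500, 2500):
        print(f"Elo {elo} -> {normalize_elo(elo):.2f}")
```

Under this formula a rating of 500 maps to 0 and a rating of 2500 maps to 1, so the result can be read as a fraction and compared against the other evaluations.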
Size
Model Size: Total and Active Parameters
Comparison between total model parameters and parameters active during inference
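One way to read the total versus active comparison is as a ratio: the share of a model's parameters that are used for each token at inference time. The sketch below computes that share for a few rows taken from the table further down; the helper name `active_share` is a hypothetical choice for illustration.

```python
def active_share(total_b: float, active_b: float) -> float:
    """Return the fraction of parameters active at inference time.

    Both arguments are parameter counts in billions.
    """
    if total_b <= 0:
        raise ValueError("total parameter count must be positive")
    return active_b / total_b


if __name__ == "__main__":
    # (total, active) in billions, copied from the table on this page.
    models = {
        "GLM-5.2 (max)": (753, 40),
        "MiniMax-M3": (428, 23),
        "DeepSeek V4 Flash": (284, 13),
    }
    for name, (total_b, active_b) in models.items():
        print(f"{name}: {active_share(total_b, active_b):.1%} of parameters active")
```

A lower share means less computation per token relative to the total size of the model, although the full set of weights still has to be stored and loaded for self-hosting.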
Intelligence vs. Active Parameters
Active parameters at inference time · Artificial Analysis Intelligence Index
Most attractive quadrant
Intelligence vs. Total Parameters
Artificial Analysis Intelligence Index · Size in parameters (billions)
Most attractive quadrant
Creators: Alibaba, DeepSeek, Kimi, MiniMax, Nex AGI, NVIDIA, Xiaomi, Z AI
Context Window
Context window: token limit · Higher is better
Further details
Model | Intelligence Index | Parameters (total, active) | Context window | Price | | Weights | Provider Benchmarks | | |
|---|---|---|---|---|---|---|---|---|---|
GLM-5.2 (max) | 51 | 753B 40B active at inference time | 1.00M | $0.9 | 116 | +11 | |||
MiniMax-M3 | 44 | 428B 23B active at inference time | 1.00M | $0.2 | 78 | +4 | |||
DeepSeek V4 Pro (Reasoning, Max Effort) | 44 | 1.6T 49B active at inference time | 1.00M | $0.2 | 80 | +8 | |||
Kimi K2.6 | 43 | 1.0T 32B active at inference time | 256k | $0.7 | 65 | +14 | |||
MiMo-V2.5-Pro | 42 | 1.0T 42B active at inference time | 1.00M | $0.2 | 45 | ||||
Kimi K2.7 Code | 42 | 1.0T 32B active at inference time | 256k | $0.7 | 57 | +7 | |||
Nex-N2-Pro | 41 | 397B 17B active at inference time | 262k | $0.5 | 118 | ||||
DeepSeek V4 Pro (Reasoning, High Effort) | 41 | 1.6T 49B active at inference time | 1.00M | $0.2 | 81 | +8 | |||
DeepSeek V4 Flash (Reasoning, Max Effort) | 40 | 284B 13B active at inference time | 1.00M | $0.1 | 97 | +4 | |||
GLM-5.1 (Reasoning) | 40 | 744B 40B active at inference time | 200k | $0.9 | 83 | +9 | |||
MiMo-V2.5 | 40 | 310B 15B active at inference time | 1.00M | $0.1 | 87 | +2 | |||
MiniMax-M2.7 | 38 | 230B 10B active at inference time | 205k | $0.2 | 56 | +3 | |||
Nemotron 3 Ultra 550B A55B (Reasoning) | 38 | 550B 55B active at inference time | 262k | $0.6 | 155 | Not available | +5 | ||
DeepSeek V4 Flash (Reasoning, High Effort) | 37 | 284B 13B active at inference time | 1.00M | $0.1 | - | +5 | |||
GLM-5.1 (Non-reasoning) | 35 | 744B 40B active at inference time | 200k | $0.9 | 56 | +5 | |||
Kimi K2.6 (Non-reasoning) | 35 | 1.0T 32B active at inference time | 256k | $0.7 | 70 | +11 | |||
Qwen3.5 397B A17B (Reasoning) | 34 | 397B 17B active at inference time | 262k | $0.9 | 50 | +9 | |||
Hy3-preview (Reasoning) | 34 | 295B 21B active at inference time | 256k | $0.1 | 121 | ||||
MiMo-V2-Flash (Feb 2026) | 33 | 309B 15B active at inference time | 256k | $0.1 | 93 | ||||
Qwen3.5 397B A17B (Non-reasoning) | 32 | 397B 17B active at inference time | 262k | $0.9 | 53 | +6 | |||
DeepSeek V4 Pro (Non-reasoning) | 31 | 1.6T 49B active at inference time | 1.00M | $0.2 | 83 | +2 | |||
Ring-2.6-1T | 31 | 1.0T 63B active at inference time | 262k | $0.5 | 128 | ||||
Step 3.7 Flash | 30 | 198B 11B active at inference time | 262k | $0.2 | 388 | ||||
Command A+ | 29 | 218B 25B active at inference time | 192k | - | 155 | ||||
DeepSeek V4 Flash (Non-reasoning) | 29 | 284B 13B active at inference time | 1.00M | $0.1 | 104 | ||||
MiMo-V2.5-Pro (Non-reasoning) | 28 | 1.0T 41.7B active at inference time | 1.00M | $0.6 | 49 | ||||
Hy3-preview (Non-reasoning) | 26 | 295B 21B active at inference time | 256k | $0.1 | 124 | ||||
Ling-2.6-1T | 26 | 1.0T 63B active at inference time | 262k | $0.5 | - | ||||
K-EXAONE (Reasoning) | 25 | 236B 23B active at inference time | 256k | - | - | - | |||
MiMo-V2-Flash (Non-reasoning) | 25 | 309B 15B active at inference time | 256k | $0.1 | 92 | ||||
Trinity Large Thinking | 24 | 399B 13B active at inference time | 512k | $0.2 | 208 | ||||
K-EXAONE (Non-reasoning) | 17 | 236B 23B active at inference time | 256k | - | - | - | |||
Mistral Large 3 | 16 | 675B 41B active at inference time | 256k | $0.6 | 49 | ||||
Llama 4 Maverick | 14 | 402B 17B active at inference time | 1.00M | $0.3 | 103 | +6 | |||
Llama 3.1 Nemotron Ultra 253B v1 (Reasoning) | 9 | 253B | 128k | $0.7 | 51 | ||||
ERNIE 4.5 300B A47B | 9 | 300B 47B active at inference time | 131k | $0.4 | - | ||||
Hermes 4 - Llama-3.1 405B (Reasoning) | 9 | 406B | 128k | $1.2 | 41 | ||||
Hermes 4 - Llama-3.1 405B (Non-reasoning) | 9 | 406B | 128k | $1.2 | 38 | ||||
Llama 3.1 Instruct 405B | 9 | 405B | 128k | $3.1 | 75 | ||||
R1 1776 | 6 | 671B 37B active at inference time | 128k | - | - | - | |||
Jamba 1.7 Large | 5 | 398B 94B active at inference time | 256k | $2.6 | 60 | ||||
Cogito v2.1 (Reasoning) | - | 671B 37B active at inference time | 128k | $1.3 | 92 |
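To work with the table programmatically, the size and context window cells need to be converted from abbreviated strings into numbers. The sketch below is one possible approach, not an official parser: the function names are hypothetical, and it assumes the suffixes `k`, `M`, `B` and `T` stand for thousand, million, billion and trillion. Cells that do not match that pattern raise a `ValueError` rather than being guessed at.

```python
import re
from typing import Optional, Tuple

# Assumed meaning of the suffixes used in the table cells.
_SUFFIX = {"k": 1e3, "M": 1e6, "B": 1e9, "T": 1e12}
_COUNT = re.compile(r"\s*(\d+(?:\.\d+)?)\s*([kMBT])\s*")


def parse_count(text: str) -> float:
    """Convert a string such as '256k', '1.00M' or '753B' into a number."""
    match = _COUNT.fullmatch(text)
    if match is None:
        raise ValueError(f"unrecognized count: {text!r}")
    return float(match.group(1)) * _SUFFIX[match.group(2)]


def parse_size_cell(cell: str) -> Tuple[float, Optional[float]]:
    """Split '753B 40B active at inference time' into (total, active).

    Rows that list only a total (for example '253B') return None for active.
    """
    parts = cell.split()
    if not parts:
        raise ValueError("empty size cell")
    total = parse_count(parts[0])
    active = parse_count(parts[1]) if len(parts) > 1 else None
    return total, active


if __name__ == "__main__":
    print(parse_size_cell("753B 40B active at inference time"))
    print(parse_size_cell("253B"))
    print(parse_count("256k"))
```

Returning `None` for the active count keeps rows that list a single size distinct from rows that report both figures, instead of silently assuming the two are equal.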