Comparisons of Large Open Source AI Models (>150B)

Open source AI models with more than 150B parameters. We consider a model open source (also commonly called open weights) when its weights are available to download. This allows self-hosting on your own infrastructure and customizing the model, for example through fine-tuning. For more details, including our methodology, see our FAQs.
GLM-5.2 (max) and MiniMax-M3 are the highest-intelligence large open source models (defined as those with >150B parameters), followed by DeepSeek V4 Pro (Max) and Kimi K2.6.

Highlights

Charts: Artificial Analysis Openness Index (higher is better) · Artificial Analysis Intelligence Index (higher is better) · Trainable parameters in billions

Openness

Artificial Analysis Openness Index: Score

Openness Index assesses model openness on a 0 to 100 normalized scale (higher is more open)
Reasoning models are indicated by a lightbulb icon

Intelligence

Artificial Analysis Intelligence Index

Artificial Analysis Intelligence Index v4.1 incorporates 9 evaluations: GDPval-AA v2, 𝜏³-Banking, Terminal-Bench v2.1, SciCode, Humanity's Last Exam, GPQA Diamond, CritPt, AA-Omniscience, AA-LCR
Estimate (independent evaluation forthcoming)

See Intelligence Index methodology for further details, including a breakdown of each evaluation and how we run them.

Intelligence Evaluations

Intelligence evaluations measured independently by Artificial Analysis · Higher is better

Agentic real-world work tasks, (Elo-500)/2000

Agentic tool use

Agentic coding & terminal use

Coding

Reasoning & knowledge

Scientific reasoning

Physics reasoning

Long context reasoning

Agentic knowledge work, (Elo-500)/2000

Instruction following

Long-horizon agentic tasks

Kubernetes incident root-cause analysis

Visual reasoning

While model intelligence generally translates across use cases, specific evaluations may be more relevant for certain use cases.
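
Two of the chart titles above report Elo-based results on a rescaled axis, (Elo-500)/2000. A minimal sketch of that rescaling follows, assuming it is a plain linear transform; the function name and the sample ratings are hypothetical and are not taken from the page.

```python
def normalize_elo(elo: float, offset: float = 500.0, scale: float = 2000.0) -> float:
    """Rescale an Elo rating as (Elo - 500) / 2000, the form quoted in the chart titles."""
    return (elo - offset) / scale


# Hypothetical ratings, used only to illustrate the arithmetic.
for elo in (500, 1500, 2500):
    print(elo, normalize_elo(elo))
```

Under this transform an Elo of 500 maps to 0 and an Elo of 2500 maps to 1, so values on these charts read as fractions of that range.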

Size

Model Size: Total and Active Parameters

Comparison between total model parameters and parameters active during inference

Total parameters: The total number of trainable weights and biases in the model, expressed in billions. These parameters are learned during training and determine the model's ability to process and generate responses.

Active parameters: The number of parameters actually executed during each inference forward pass, expressed in billions. For Mixture of Experts (MoE) models, a routing mechanism selects a subset of experts per token, resulting in fewer active than total parameters. Dense models use all parameters, so active equals total.
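
To make the total versus active distinction concrete, the sketch below computes the share of parameters executed per forward pass. The helper name is hypothetical; the 753B and 40B figures are those listed for GLM-5.2 (max) on this page, and the 405B entry is treated as dense because no separate active count is listed for it.

```python
from typing import Optional


def active_share(total_b: float, active_b: Optional[float] = None) -> float:
    """Percentage of parameters active per forward pass.

    When no active count is given, the model is treated as dense,
    so active equals total.
    """
    if total_b <= 0:
        raise ValueError("total_b must be positive")
    if active_b is None:
        active_b = total_b
    return round(100.0 * active_b / total_b, 1)


print(active_share(753, 40))  # MoE: 40B active out of 753B total
print(active_share(405))      # dense: all parameters are active
```

A low share means each token touches only a small part of the model, which is why active parameters are plotted separately from total size.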

Intelligence vs. Active Parameters

Active parameters at inference time · Artificial Analysis Intelligence Index
Most attractive quadrant

Intelligence vs. Total Parameters

Artificial Analysis Intelligence Index · Size in parameters (billions)
Most attractive quadrant
Legend (model creators): Alibaba, DeepSeek, Kimi, MiniMax, Nex AGI, NVIDIA, Xiaomi, Z AI

Context Window

Context window: token limit · Higher is better

Larger context windows matter for RAG (Retrieval-Augmented Generation) workflows, which typically involve retrieving and reasoning over large amounts of data.

Context window: The maximum number of combined input and output tokens. Output tokens commonly have a significantly lower limit, which varies by model.
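
Because the window covers input and output together, the room left for generation shrinks as the prompt grows. The sketch below illustrates that budget under two assumptions: that "256k" means 256,000 tokens, and that the model has a separate output cap of 16,384 tokens. The function name and the cap are hypothetical.

```python
def output_budget(context_window: int, input_tokens: int, output_limit: int) -> int:
    """Tokens available for generation when input and output share one window."""
    if input_tokens < 0 or input_tokens > context_window:
        raise ValueError("input does not fit in the context window")
    # The budget is bounded both by the model's output cap and by
    # whatever the prompt leaves free in the shared window.
    return min(output_limit, context_window - input_tokens)


CONTEXT_WINDOW = 256_000  # assumption: "256k" read as 256,000 tokens
OUTPUT_LIMIT = 16_384     # hypothetical per-model output cap

print(output_budget(CONTEXT_WINDOW, 10_000, OUTPUT_LIMIT))   # bounded by the output cap
print(output_budget(CONTEXT_WINDOW, 250_000, OUTPUT_LIMIT))  # bounded by the remaining window
```

With a short prompt the output cap is the binding limit; with a prompt close to the window size, the remaining window is.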

Further details

| Model | Creator | Intelligence Index | Total parameters | Active parameters | Context window | Price (USD per 1M tokens) | Output speed (tokens/s) | Weights | Provider Benchmarks |
|---|---|---|---|---|---|---|---|---|---|
| GLM-5.2 (max) | Z AI | 51 | 753B | 40B | 1.00M | $0.9 | 116 | | GMI, Baseten, Nebius (+11 more) |
| MiniMax-M3 | MiniMax | 44 | 428B | 23B | 1.00M | $0.2 | 78 | | SiliconFlow, Novita, MiniMax (+4 more) |
| DeepSeek V4 Pro (Reasoning, Max Effort) | DeepSeek | 44 | 1.6T | 49B | 1.00M | $0.2 | 80 | | DeepInfra, Lightning AI, Microsoft Azure (+8 more) |
| Kimi K2.6 | Kimi | 43 | 1.0T | 32B | 256k | $0.7 | 65 | | Clarifai, Cloudflare, Parasail (+14 more) |
| MiMo-V2.5-Pro | Xiaomi | 42 | 1.0T | 42B | 1.00M | $0.2 | 45 | | Novita, GMI, DeepInfra, Xiaomi |
| Kimi K2.7 Code | Kimi | 42 | 1.0T | 32B | 256k | $0.7 | 57 | | Together AI, Parasail, Crusoe (+7 more) |
| Nex-N2-Pro | Nex AGI | 41 | 397B | 17B | 262k | $0.5 | 118 | | SiliconFlow |
| DeepSeek V4 Pro (Reasoning, High Effort) | DeepSeek | 41 | 1.6T | 49B | 1.00M | $0.2 | 81 | | Lightning AI, Makora, Novita (+8 more) |
| DeepSeek V4 Flash (Reasoning, Max Effort) | DeepSeek | 40 | 284B | 13B | 1.00M | $0.1 | 97 | | GMI, Makora, Novita (+4 more) |
| GLM-5.1 (Reasoning) | Z AI | 40 | 744B | 40B | 200k | $0.9 | 83 | | SiliconFlow, GMI, Parasail (+9 more) |
| MiMo-V2.5 | Xiaomi | 40 | 310B | 15B | 1.00M | $0.1 | 87 | | DeepInfra, Xiaomi, GMI (+2 more) |
| MiniMax-M2.7 | MiniMax | 38 | 230B | 10B | 205k | $0.2 | 56 | | Together AI, GMI, Fireworks (+3 more) |
| Nemotron 3 Ultra 550B A55B (Reasoning) | NVIDIA | 38 | 550B | 55B | 262k | $0.6 | 155 | Not available | DeepInfra, GMI, Lightning AI (+5 more) |
| DeepSeek V4 Flash (Reasoning, High Effort) | DeepSeek | 37 | 284B | 13B | 1.00M | $0.1 | - | | CoreWeave, Parasail, DeepSeek (+5 more) |
| GLM-5.1 (Non-reasoning) | Z AI | 35 | 744B | 40B | 200k | $0.9 | 56 | | DeepInfra, Parasail, SiliconFlow (+5 more) |
| Kimi K2.6 (Non-reasoning) | Kimi | 35 | 1.0T | 32B | 256k | $0.7 | 70 | | Kimi, Clarifai, Databricks (+11 more) |
| Qwen3.5 397B A17B (Reasoning) | Alibaba | 34 | 397B | 17B | 262k | $0.9 | 50 | | DigitalOcean, Clarifai, Wafer (+9 more) |
| Hy3-preview (Reasoning) | Tencent | 34 | 295B | 21B | 256k | $0.1 | 121 | | SiliconFlow, GMI |
| MiMo-V2-Flash (Feb 2026) | Xiaomi | 33 | 309B | 15B | 256k | $0.1 | 93 | | Xiaomi |
| Qwen3.5 397B A17B (Non-reasoning) | Alibaba | 32 | 397B | 17B | 262k | $0.9 | 53 | | Wafer, Together AI, Alibaba Cloud (+6 more) |
| DeepSeek V4 Pro (Non-reasoning) | DeepSeek | 31 | 1.6T | 49B | 1.00M | $0.2 | 83 | | Lightning AI, Nebius, Microsoft Azure (+2 more) |
| Ring-2.6-1T | InclusionAI | 31 | 1.0T | 63B | 262k | $0.5 | 128 | | InclusionAI |
| Step 3.7 Flash | StepFun | 30 | 198B | 11B | 262k | $0.2 | 388 | | StepFun |
| Command A+ | Cohere | 29 | 218B | 25B | 192k | - | 155 | | Cohere |
| DeepSeek V4 Flash (Non-reasoning) | DeepSeek | 29 | 284B | 13B | 1.00M | $0.1 | 104 | | Makora, CoreWeave, GMI, DeepSeek |
| MiMo-V2.5-Pro (Non-reasoning) | Xiaomi | 28 | 1.0T | 41.7B | 1.00M | $0.6 | 49 | | Xiaomi, Novita, DeepInfra, GMI |
| Hy3-preview (Non-reasoning) | Tencent | 26 | 295B | 21B | 256k | $0.1 | 124 | | GMI, SiliconFlow |
| Ling-2.6-1T | InclusionAI | 26 | 1.0T | 63B | 262k | $0.5 | - | | InclusionAI |
| K-EXAONE (Reasoning) | LG AI Research | 25 | 236B | 23B | 256k | - | - | | - |
| MiMo-V2-Flash (Non-reasoning) | Xiaomi | 25 | 309B | 15B | 256k | $0.1 | 92 | | Xiaomi |
| Trinity Large Thinking | Arcee AI | 24 | 399B | 13B | 512k | $0.2 | 208 | | Parasail, Arcee AI |
| K-EXAONE (Non-reasoning) | LG AI Research | 17 | 236B | 23B | 256k | - | - | | - |
| Mistral Large 3 | Mistral | 16 | 675B | 41B | 256k | $0.6 | 49 | | Amazon Bedrock, Microsoft Azure, Mistral |
| Llama 4 Maverick | Meta | 14 | 402B | 17B | 1.00M | $0.3 | 103 | | Together AI, Databricks, Amazon Bedrock (+6 more) |
| Llama 3.1 Nemotron Ultra 253B v1 (Reasoning) | NVIDIA | 9 | 253B | - | 128k | $0.7 | 51 | | Nebius |
| ERNIE 4.5 300B A47B | Baidu | 9 | 300B | 47B | 131k | $0.4 | - | | SiliconFlow, Novita |
| Hermes 4 - Llama-3.1 405B (Reasoning) | Nous Research | 9 | 406B | - | 128k | $1.2 | 41 | | Nebius |
| Hermes 4 - Llama-3.1 405B (Non-reasoning) | Nous Research | 9 | 406B | - | 128k | $1.2 | 38 | | Nebius |
| Llama 3.1 Instruct 405B | Meta | 9 | 405B | - | 128k | $3.1 | 75 | | Databricks, Amazon Bedrock, Microsoft Azure, Amazon Bedrock |
| R1 1776 | Perplexity | 6 | 671B | 37B | 128k | - | - | | - |
| Jamba 1.7 Large | AI21 Labs | 5 | 398B | 94B | 256k | $2.6 | 60 | | AI21 Labs |
| Cogito v2.1 (Reasoning) | Deep Cogito | - | 671B | 37B | 128k | $1.3 | 92 | | Together AI |