
Comparisons of Large Open Source AI Models (>150B)

This page compares open source AI models with more than 150B parameters. Models are considered open source (also commonly called open weights) when their weights are available to download. This allows self-hosting on your own infrastructure and customization of the model, for example through fine-tuning. For more details, including on our methodology, see our FAQs.

GLM-5 (Z AI) and Kimi K2.5 (Kimi) are the highest-intelligence large open source models, defined as those with >150B parameters, followed by Qwen3.5 397B A17B (Alibaba) and DeepSeek V3.2 (DeepSeek).

Intelligence
Artificial Analysis Intelligence Index; Higher is better
Total Parameters
Trainable parameters in billions

Openness

Artificial Analysis Openness Index: Results

Openness Index assesses model openness on a 0 to 100 normalized scale (higher is more open)

Intelligence

Artificial Analysis Intelligence Index

Artificial Analysis Intelligence Index v4.0 incorporates 10 evaluations: GDPval-AA, 𝜏²-Bench Telecom, Terminal-Bench Hard, SciCode, AA-LCR, AA-Omniscience, IFBench, Humanity's Last Exam, GPQA Diamond, CritPt. See Intelligence Index methodology for further details, including a breakdown of each evaluation and how we run them.
Reasoning models are indicated by a lightbulb icon.

Intelligence Evaluations

Intelligence evaluations measured independently by Artificial Analysis; Higher is better
Results claimed by AI Lab (not yet independently verified)
GDPval-AA (Agentic Real-World Work Tasks, (ELO-500)/2000)
Terminal-Bench Hard (Agentic Coding & Terminal Use)
𝜏²-Bench Telecom (Agentic Tool Use)
AA-LCR (Long Context Reasoning)
AA-Omniscience Accuracy (Knowledge)
AA-Omniscience Non-Hallucination Rate (1 - Hallucination Rate)
Humanity's Last Exam (Reasoning & Knowledge)
GPQA Diamond (Scientific Reasoning)
SciCode (Coding)
IFBench (Instruction Following)
CritPt (Physics Reasoning)
MMMU Pro (Visual Reasoning)

While model intelligence generally translates across use cases, specific evaluations may be more relevant for certain use cases.
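The GDPval-AA entry above is labelled "(ELO-500)/2000", which suggests the raw ELO rating is rescaled before it is shown alongside the other evaluations. A minimal sketch of that rescaling, assuming the label means exactly this linear transform (the function name `normalize_elo` is hypothetical and not part of any Artificial Analysis tooling):

```python
def normalize_elo(elo: float) -> float:
    """Rescale an ELO rating using the (ELO - 500) / 2000 formula from the label.

    Under this reading, an ELO of 500 maps to 0.0 and an ELO of 2500 maps to 1.0.
    """
    return (elo - 500) / 2000


# Illustrative inputs only; these are not published GDPval-AA results.
for elo in (500, 1500, 2500):
    print(elo, normalize_elo(elo))
```

Read this way, the transform puts ELO ratings on roughly the same 0 to 1 range as the accuracy-style evaluations in the list.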

Size

Model Size: Total and Active Parameters

Comparison between total model parameters and parameters active during inference
Active Parameters
Passive Parameters

Total parameters: the total number of trainable weights and biases in the model, expressed in billions. These parameters are learned during training and determine the model's ability to process and generate responses.

Active parameters: the number of parameters actually executed during each inference forward pass, expressed in billions. For Mixture of Experts (MoE) models, a routing mechanism selects a subset of experts per token, resulting in fewer active than total parameters. Dense models use all parameters, so active equals total.
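To make the total versus active distinction concrete, the share of a model's parameters used per forward pass can be computed directly from the two figures. The sketch below is illustrative only: `active_fraction` is a hypothetical helper, and treating a model with no listed active figure as dense is an assumption based on the definition above.

```python
from typing import Optional


def active_fraction(total_b: float, active_b: Optional[float] = None) -> float:
    """Return the share of parameters executed per forward pass.

    total_b  -- total parameters, in billions
    active_b -- active parameters, in billions; None is treated as a dense
                model, where active equals total (an assumption)
    """
    if active_b is None:
        active_b = total_b  # dense model: every parameter is used
    if not 0 < active_b <= total_b:
        raise ValueError("active parameters must be in the range (0, total]")
    return active_b / total_b


# Figures taken from the comparison table on this page.
moe_share = active_fraction(744, 40)  # GLM-5: 744B total, 40B active
dense_share = active_fraction(405)    # Llama 3.1 Instruct 405B: no active figure listed

print(f"GLM-5 active share: {moe_share:.3f}")
print(f"Llama 3.1 405B active share: {dense_share:.3f}")
```

For GLM-5 this gives roughly 5% of parameters active per token, which is why an MoE model can have a very large total size while costing far less per token to run than a dense model of the same size.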

Intelligence vs. Active Parameters

Active Parameters at Inference Time; Artificial Analysis Intelligence Index
Chart legend: most attractive quadrant highlighted; creators shown: Alibaba, DeepSeek, Kimi, LG AI Research, Meta, Mistral, StepFun, Xiaomi, Z AI.

Intelligence vs. Total Parameters

Artificial Analysis Intelligence Index; Size in Parameters (Billions)
Context Window

Context Window

Context Window: Tokens Limit; Higher is better

Larger context windows matter for Retrieval Augmented Generation (RAG) workflows, which typically involve reasoning over and retrieving information from large amounts of data.

Context window: the maximum number of combined input and output tokens. Output tokens commonly have a significantly lower limit, which varies by model.
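Because the context window covers input and output together, a request fits only if both fit at once, and the output may also be capped separately. A minimal sketch of that check, assuming token counts are already known (`fits_context` and its `max_output_tokens` parameter are hypothetical names, not part of any provider API):

```python
from typing import Optional


def fits_context(
    input_tokens: int,
    output_tokens: int,
    context_window: int,
    max_output_tokens: Optional[int] = None,
) -> bool:
    """Check whether a request fits a model's combined context window.

    max_output_tokens is an optional, model-specific output cap; the page
    notes that such caps exist but does not list their values.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if max_output_tokens is not None and output_tokens > max_output_tokens:
        return False  # output alone exceeds the separate output limit
    return input_tokens + output_tokens <= context_window


# A 200k window (as listed for GLM-5) with illustrative request sizes.
print(fits_context(190_000, 8_000, 200_000))
print(fits_context(195_000, 8_000, 200_000))
```

In a RAG pipeline, a check like this would typically run before sending the request, so that retrieved passages can be trimmed while leaving room for the response.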

Model | Creator | Intelligence Index | Parameters (total; active at inference time) | Context Window | Price (USD/1M tokens) | Speed (tokens/s) | Weights | Providers
GLM-5 (Reasoning) | Z AI | 50 | 744B (40B active) | 200k | $1.6 | 69 | 🤗 | DeepInfra, Nebius, Eigen AI, +10 more
Kimi K2.5 (Reasoning) | Kimi | 47 | 1,000B (32B active) | 256k | $1.2 | 47 | 🤗 | SiliconFlow, Clarifai, Kimi, +14 more
Qwen3.5 397B A17B (Reasoning) | Alibaba | 45 | 397B (17B active) | 262k | $1.4 | 84 | 🤗 | GMI, Nebius, Clarifai, +5 more
DeepSeek V3.2 (Reasoning) | DeepSeek | 42 | 685B (37B active) | 128k | $0.3 | 33 | 🤗 | SiliconFlow, DeepSeek, Novita, +8 more
MiMo-V2-Flash (Feb 2026) | Xiaomi | 41 | 309B (15B active) | 256k | $0.1 | 129 | 🤗 | Xiaomi
GLM-5 (Non-reasoning) | Z AI | 41 | 744B (40B active) | 200k | $1.6 | 68 | 🤗 | Novita, Eigen AI, Baseten, +4 more
Qwen3.5 397B A17B (Non-reasoning) | Alibaba | 40 | 397B (17B active) | 262k | $1.4 | 84 | 🤗 | Novita, Alibaba Cloud, Nebius, +2 more
Step 3.5 Flash | StepFun | 38 | 196B (11B active) | 256k | $0.1 | 117 | 🤗 | StepFun, SiliconFlow
Kimi K2.5 (Non-reasoning) | Kimi | 37 | 1,000B (32B active) | 256k | $1.2 | 45 | 🤗 | GMI, Eigen AI, Nebius, +6 more
K-EXAONE (Reasoning) | LG AI Research | 32 | 236B (23B active) | 256k | - | - | 🤗 | -
DeepSeek V3.2 (Non-reasoning) | DeepSeek | 32 | 685B (37B active) | 128k | $0.3 | 33 | 🤗 | SambaNova, Nebius, Novita, +11 more
MiMo-V2-Flash (Non-reasoning) | Xiaomi | 30 | 309B (15B active) | 256k | $0.1 | 138 | 🤗 | Xiaomi
Qwen3 235B A22B 2507 (Reasoning) | Alibaba | 30 | 235B (22B active) | 256k | $2.6 | 43 | 🤗 | Weights & Biases, Nebius, Novita, +4 more
DeepSeek V3.2 Speciale | DeepSeek | 29 | 685B (37B active) | 128k | - | - | 🤗 | -
Qwen3 VL 235B A22B (Reasoning) | Alibaba | 28 | 235B (22B active) | 262k | $2.6 | 52 | 🤗 | Alibaba Cloud, Novita
DeepSeek R1 0528 (May '25) | DeepSeek | 27 | 685B (37B active) | 128k | $2.4 | - | 🤗 | Nebius, SambaNova, Nebius, +6 more
Qwen3 235B A22B 2507 Instruct | Alibaba | 25 | 235B (22B active) | 256k | $1.2 | 62 | 🤗 | Together.ai, FriendliAI, Hyperbolic, +10 more
Qwen3 Coder 480B A35B Instruct | Alibaba | 25 | 480B (35B active) | 262k | $3.0 | 62 | 🤗 | DeepInfra, Novita, Together.ai, +8 more
K-EXAONE (Non-reasoning) | LG AI Research | 23 | 236B (23B active) | 256k | - | - | 🤗 | -
Mistral Large 3 | Mistral | 23 | 675B (41B active) | 256k | $0.8 | 49 | 🤗 | Mistral, Amazon Bedrock, Microsoft Azure
Ring-1T | InclusionAI | 23 | 1,000B (50B active) | 128k | - | - | 🤗 | -
Qwen3 VL 235B A22B Instruct | Alibaba | 21 | 235B (22B active) | 262k | $1.2 | 59 | 🤗 | Alibaba Cloud, Novita, Eigen AI, +2 more
Ling-1T | InclusionAI | 19 | 1,000B (50B active) | 128k | - | - | 🤗 | -
Hermes 4 - Llama-3.1 405B (Reasoning) | Nous Research | 19 | 406B | 128k | $1.5 | 29 | 🤗 | Nebius
Llama 4 Maverick | Meta | 18 | 402B (17B active) | 1.00M | $0.5 | 123 | 🤗 | SambaNova, DeepInfra, Microsoft Azure, +10 more
Hermes 4 - Llama-3.1 405B (Non-reasoning) | Nous Research | 18 | 406B | 128k | $1.5 | 31 | 🤗 | Nebius
Llama 3.1 Instruct 405B | Meta | 17 | 405B | 128k | $4.4 | 34 | 🤗 | Amazon Bedrock, Amazon Bedrock, Databricks, +2 more
Llama 3.1 Nemotron Ultra 253B v1 (Reasoning) | NVIDIA | 15 | 253B | 128k | $0.9 | 42 | 🤗 | Nebius
ERNIE 4.5 300B A47B | Baidu | 15 | 300B (47B active) | 131k | $0.5 | 34 | 🤗 | SiliconFlow, Novita
R1 1776 | Perplexity | 12 | 671B (37B active) | 128k | - | - | 🤗 | -
Jamba 1.7 Large | AI21 Labs | 11 | 398B (94B active) | 256k | $3.5 | 59 | 🤗 | AI21 Labs
Cogito v2.1 (Reasoning) | Deep Cogito | - | 671B (37B active) | 128k | $1.3 | 86 | 🤗 | Together.ai