Comparisons of Small Open Source AI Models (4B-40B)

This page compares open source AI models with between 4B and 40B parameters. Models are considered open source (also commonly referred to as open weights) when their weights are available to download. This allows self-hosting on your own infrastructure and enables customization of the model, for example through fine-tuning. For more details, including our methodology, see our FAQs.
Alibaba's Qwen3.6 27B (Reasoning) and Qwen3.6 35B A3B (Reasoning) are the highest-intelligence small open source models, defined as those with 4B-40B parameters, followed by Google's Gemma 4 31B (Reasoning) and Alibaba's Qwen3.6 27B (Non-reasoning).

Highlights

Artificial Analysis Openness Index · Higher is better
Artificial Analysis Intelligence Index · Higher is better
Trainable parameters in billions

Openness

Artificial Analysis Openness Index: Score

Openness Index assesses model openness on a 0 to 100 normalized scale (higher is more open)
Reasoning models are indicated by a lightbulb icon

Intelligence

Artificial Analysis Intelligence Index

Artificial Analysis Intelligence Index v4.1 incorporates 9 evaluations: GDPval-AA v2, 𝜏³-Banking, Terminal-Bench v2.1, SciCode, Humanity's Last Exam, GPQA Diamond, CritPt, AA-Omniscience, AA-LCR
Estimate (independent evaluation forthcoming)

See the Intelligence Index methodology for further details, including a breakdown of each evaluation and how we run them.

Intelligence Evaluations

Intelligence evaluations measured independently by Artificial Analysis · Higher is better

Agentic real-world work tasks, (Elo - 500) / 2000
Agentic coding & terminal use
Agentic tool use
Long context reasoning
Reasoning & knowledge
Scientific reasoning
Coding
Instruction following
Physics reasoning
Long-horizon agentic tasks
Kubernetes incident root-cause analysis
Visual reasoning

While model intelligence generally translates across use cases, specific evaluations may be more relevant for certain use cases.
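
The agentic real-world work tasks chart reports its scores as (Elo - 500) / 2000. The following is a minimal sketch of that rescaling, assuming the offset and divisor are applied exactly as written and that no clamping takes place. The function name `normalize_elo` and the sample ratings are hypothetical and are not Artificial Analysis data.

```python
def normalize_elo(elo: float, offset: float = 500.0, scale: float = 2000.0) -> float:
    """Rescale an Elo rating using the (Elo - 500) / 2000 formula from the chart title."""
    return (elo - offset) / scale


# Hypothetical ratings, chosen only to illustrate the arithmetic.
for elo in (500, 1500, 2500):
    print(elo, normalize_elo(elo))
```

Under this formula an Elo of 500 maps to 0 and an Elo of 2500 maps to 1, so ratings in that range land on a 0 to 1 scale.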

Size

Model Size: Total and Active Parameters

Comparison between total model parameters and parameters active during inference

Total parameters: the total number of trainable weights and biases in the model, expressed in billions. These parameters are learned during training and determine the model's ability to process and generate responses.

Active parameters: the number of parameters actually executed during each inference forward pass, expressed in billions. For Mixture of Experts (MoE) models, a routing mechanism selects a subset of experts per token, resulting in fewer active than total parameters. Dense models use all parameters, so active equals total.
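
The relationship between total and active parameters can be illustrated with a small sketch. The `ModelSize` class and its method names are hypothetical. The parameter counts are taken from the listing on this page, and a model with no separate active figure is assumed to be dense.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelSize:
    name: str
    total_b: float                    # total parameters, in billions
    active_b: Optional[float] = None  # active parameters, in billions; None is treated as dense

    def active_params_b(self) -> float:
        # Dense models use all parameters, so active equals total.
        return self.total_b if self.active_b is None else self.active_b

    def active_fraction(self) -> float:
        # Share of the model's parameters executed per forward pass.
        return self.active_params_b() / self.total_b


models = [
    ModelSize("Qwen3.6 27B", total_b=27.8),                    # assumed dense: no active figure listed
    ModelSize("Qwen3.6 35B A3B", total_b=36.0, active_b=3.0),  # MoE
    ModelSize("Gemma 4 26B A4B", total_b=25.2, active_b=3.8),  # MoE
]

for m in models:
    print(f"{m.name}: {m.active_params_b():g}B active of {m.total_b:g}B total "
          f"({m.active_fraction():.1%})")
```

For Qwen3.6 35B A3B this works out to 3 / 36, or roughly 8.3% of parameters active per token.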

Intelligence vs. Active Parameters

Active parameters at inference time · Artificial Analysis Intelligence Index
Most attractive quadrant

Intelligence vs. Total Parameters

Artificial Analysis Intelligence Index · Size in parameters (billions)
Most attractive quadrant
Creators shown: Alibaba, Cohere, Google, LG AI Research, NVIDIA, OpenAI, ServiceNow

Context Window

Context window: token limit · Higher is better

Larger context windows matter for Retrieval-Augmented Generation (RAG) workflows, which typically involve retrieving and reasoning over large amounts of data.

Context window: the maximum number of combined input and output tokens. Output tokens commonly have a significantly lower limit, which varies by model.
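
Because the context window covers input and output together, the room left for a response shrinks as the prompt grows. The sketch below shows that budget calculation under stated assumptions: the function name `remaining_output_budget` is hypothetical, the separate output cap is optional because it varies by model, and the 128,000-token window and 8,000-token cap are illustrative figures only.

```python
from typing import Optional


def remaining_output_budget(context_window: int,
                            input_tokens: int,
                            max_output_tokens: Optional[int] = None) -> int:
    """Return how many output tokens still fit in a combined input + output window."""
    if context_window < 0 or input_tokens < 0:
        raise ValueError("token counts must be non-negative")
    # Input and output share the same window.
    remaining = max(context_window - input_tokens, 0)
    # Some models also impose a lower, separate output limit.
    if max_output_tokens is not None:
        remaining = min(remaining, max_output_tokens)
    return remaining


# Illustrative numbers: a 128,000-token window and a 100,000-token prompt.
print(remaining_output_budget(128_000, 100_000))                           # 28000
print(remaining_output_budget(128_000, 100_000, max_output_tokens=8_000))  # 8000
```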

Further details

| Model | Creator | Intelligence Index | Parameters | Context window | Price (USD per 1M tokens) | Output speed (tokens/s) | Weights | Provider benchmarks |
|---|---|---|---|---|---|---|---|---|
| Qwen3.6 27B (Reasoning) | Alibaba | 37 | 27.8B | 262k | $0.9 | 56 | | DeepInfra, Novita, SiliconFlow, +2 more |
| Qwen3.6 35B A3B (Reasoning) | Alibaba | 32 | 36B (3B active) | 262k | $0.4 | 169 | | Scaleway, Alibaba Cloud, Clarifai, +6 more |
| Gemma 4 31B (Reasoning) | Google | 29 | 30.7B | 256k | - | 34 | | Novita, SiliconFlow, FriendliAI, +8 more |
| Qwen3.6 27B (Non-reasoning) | Alibaba | 29 | 27.8B | 262k | $0.9 | 57 | | Alibaba Cloud, Makora, DeepInfra, Novita |
| Gemma 4 26B A4B (Reasoning) | Google | 26 | 25.2B (3.8B active) | 256k | $0.1 | - | | Novita, Parasail, Cloudflare, +4 more |
| Qwen3.5 9B (Reasoning) | Alibaba | 25 | 9.65B | 262k | $0.1 | 57 | | SiliconFlow, Together AI |
| Gemma 4 31B (Non-reasoning) | Google | 25 | 30.7B | 256k | $0.2 | 36 | | Parasail, FriendliAI, Novita, +4 more |
| Qwen3.6 35B A3B (Non-reasoning) | Alibaba | 24 | 36B (3B active) | 262k | $0.6 | 188 | | Clarifai, Parasail, Novita, +5 more |
| Qwen3.5 35B A3B (Non-reasoning) | Alibaba | 23 | 36B (3B active) | 262k | $0.4 | 179 | | Alibaba Cloud, DeepInfra |
| EXAONE 4.5 33B | LG AI Research | 23 | 34.4B | 262k | - | - | | - |
| Gemma 4 12B (Reasoning) | Google | 22 | 12B | 256k | $0.1 | 121 | | SiliconFlow |
| Nemotron Cascade 2 30B A3B | NVIDIA | 21 | 31.6B (3B active) | 1.00M | - | - | | - |
| North Mini Code | Cohere | 21 | 30B (3B active) | 256k | - | 183 | Not available | Cohere |
| Apriel-v1.6-15B-Thinker | ServiceNow | 21 | 15B | 128k | - | - | | Together AI |
| Qwen3.5 9B (Non-reasoning) | Alibaba | 20 | 9.65B | 262k | - | - | | - |
| Gemma 4 26B A4B (Non-reasoning) | Google | 20 | 25.2B (3.8B active) | 256k | $0.2 | 43 | | GMI, Parasail, SiliconFlow, +4 more |
| Qwen3.5 4B (Reasoning) | Alibaba | 20 | 4.66B | 262k | $0.0 | 23 | | DeepInfra |
| NVIDIA Nemotron 3 Nano 30B A3B (Reasoning) | NVIDIA | 18 | 31.6B (3.6B active) | 1.00M | $0.1 | 47 | | Nebius, DeepInfra |
| HyperCLOVA X SEED Think (32B) | Naver | 17 | 32B | 128k | - | - | | - |
| Qwen3.5 4B (Non-reasoning) | Alibaba | 16 | 4.66B | 262k | $0.0 | 22 | | DeepInfra |
| Nemotron 3 Nano Omni 30B A3B Reasoning | NVIDIA | 15 | 30B (3B active) | 256k | $0.1 | 289 | | Clarifai, Nebius |
| gpt-oss-20B (high) | OpenAI | 15 | 21B (3.6B active) | 131k | $0.1 | 212 | | DeepInfra, Together AI, Novita, +10 more |
| gpt-oss-20B (low) | OpenAI | 14 | 21B (3.6B active) | 131k | $0.1 | 225 | | Novita, Cloudflare, Google, +9 more |
| Tri-21B-think Preview | Trillion Labs | 14 | 21B | 32.0k | - | - | | - |
| Gemma 4 12B (Non-reasoning) | Google | 13 | 12B | 262k | - | - | | - |
| Devstral Small 2 | Mistral | 13 | 24B | 256k | - | 50 | | Mistral |
| Gemma 4 E4B (Reasoning) | Google | 12 | 8B (4.5B active) | 128k | - | - | | - |
| Tri-21B-Think | Trillion Labs | 12 | 21B | 32.0k | - | - | | - |
| Magistral Small 1.2 | Mistral | 12 | 24B | 128k | $0.6 | 107 | | Amazon Bedrock, Mistral |
| EXAONE 4.0 32B (Reasoning) | LG AI Research | 11 | 32B | 131k | - | - | | - |
| Ministral 3 14B | Mistral | 10 | 14B | 256k | $0.2 | 93 | | Mistral, Amazon Bedrock |
| Falcon-H1R-7B | TII UAE | 10 | 7B | 256k | - | - | | - |
| Qwen3 Omni 30B A3B (Reasoning) | Alibaba | 10 | 35.3B (3B active) | 65.5k | $0.3 | 88 | | Alibaba Cloud |
| Step3 VL 10B | StepFun | 9 | 10.2B | 65.5k | - | - | | - |
| Gemma 4 E2B (Reasoning) | Google | 9 | 5.1B (2.3B active) | 128k | - | - | | - |
| NVIDIA Nemotron Nano 12B v2 VL (Reasoning) | NVIDIA | 9 | 13.2B | 128k | $0.2 | 280 | | DeepInfra |
| Ministral 3 8B | Mistral | 9 | 8B | 256k | $0.1 | 90 | | Amazon Bedrock, Mistral |
| Gemma 4 E4B (Non-reasoning) | Google | 9 | 8B (4.5B active) | 128k | - | - | | - |
| Granite 4.1 30B | IBM | 9 | 30B | 131k | - | - | | - |
| NVIDIA Nemotron Nano 9B V2 (Reasoning) | NVIDIA | 9 | 9B | 131k | $0.1 | 73 | | DeepInfra |
| Olmo 3.1 32B Think | Allen Institute for AI | 8 | 32.2B | 65.5k | - | - | | Parasail |
| NVIDIA Nemotron 3 Nano 30B A3B (Non-reasoning) | NVIDIA | 7 | 31.6B (3.6B active) | 1.00M | $0.1 | 61 | | DeepInfra |
| NVIDIA Nemotron Nano 9B V2 (Non-reasoning) | NVIDIA | 7 | 9B | 131k | $0.1 | 104 | | DeepInfra, Amazon Bedrock |
| Granite 4.1 8B | IBM | 7 | 8B | 131k | $0.1 | 120 | | CoreWeave |
| Sarvam 30B (high) | Sarvam | 7 | 32.2B (2.4B active) | 65.5k | $0.0 | 166 | | Sarvam |
| Olmo 3.1 32B Instruct | Allen Institute for AI | 6 | 32.2B | 65.5k | - | - | | - |
| Gemma 4 E2B (Non-reasoning) | Google | 6 | 5.1B (2.3B active) | 128k | - | - | | - |
| EXAONE 4.0 32B (Non-reasoning) | LG AI Research | 6 | 32B | 131k | - | - | | - |
| DeepHermes 3 - Mistral 24B Preview (Non-reasoning) | Nous Research | 5 | 24B | 32.0k | - | - | | - |
| Granite 4.0 H Small | IBM | 5 | 32B (9B active) | 128k | $0.1 | 400 | | Replicate |
| Qwen3 Omni 30B A3B Instruct | Alibaba | 5 | 35.3B (3B active) | 65.5k | $0.3 | 95 | | Alibaba Cloud |
| LFM2 24B A2B | Liquid AI | 5 | 23.8B (2.3B active) | 32.8k | $0.0 | 117 | | Together AI |
| Phi-4 | Microsoft | 5 | 14B | 16.0k | $0.2 | 35 | | DeepInfra, Microsoft Azure |
| NVIDIA Nemotron Nano 12B v2 VL (Non-reasoning) | NVIDIA | 5 | 13.2B | 128k | $0.2 | 215 | | Amazon Bedrock, DeepInfra |
| Phi-4 Multimodal Instruct | Microsoft | 5 | 5.6B | 128k | - | 14 | | Microsoft Azure |
| Reka Flash 3 | Reka AI | 4 | 21B | 128k | $0.3 | - | | Reka AI |
| Olmo 3 7B Think | Allen Institute for AI | 4 | 7B | 65.5k | - | - | | - |
| Molmo 7B-D | Allen Institute for AI | 4 | 8.02B | 4.10k | - | - | | - |
| Ling-mini-2.0 | InclusionAI | 4 | 16.3B (1.4B active) | 131k | - | - | | - |
| Llama 3.2 Instruct 11B (Vision) | Meta | 3 | 11B | 128k | $0.2 | 49 | | DeepInfra, Microsoft Azure, Amazon Bedrock |
| Olmo 3 7B Instruct | Allen Institute for AI | 3 | 7B | 65.5k | $0.1 | - | | Parasail |
| DeepHermes 3 - Llama-3.1 8B Preview (Non-reasoning) | Nous Research | 2 | 8B | 128k | - | - | | - |
| Molmo2-8B | Allen Institute for AI | 2 | 8.66B | 36.9k | - | - | | - |
| LFM2 8B A1B | Liquid AI | 2 | 8.34B (1.5B active) | 32.8k | - | - | | Liquid AI |
| Apertus 8B Instruct | Swiss AI Initiative | 1 | 8B | 65.5k | $0.1 | - | | Public AI |
| EXAONE 4.5 33B (Non-reasoning) | LG AI Research | - | 34.4B | 262k | - | - | | - |
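
The listing above abbreviates values such as 262k, 1.00M, 27.8B and $0.9, with - marking missing data. Anyone loading these figures into their own analysis may want plain numbers instead. The helper below is a rough sketch only: `parse_quantity` is a hypothetical name, and it assumes that k, M and B mean exact powers of ten, whereas the displayed values are rounded and the underlying limits may differ.

```python
from typing import Optional

# Assumed decimal multipliers for the suffixes used in the listing.
_SUFFIXES = {"k": 1e3, "M": 1e6, "B": 1e9}


def parse_quantity(text: str) -> Optional[float]:
    """Convert a display value such as '262k', '27.8B' or '$0.9' to a float."""
    text = text.strip()
    if not text or text == "-":
        return None                    # missing data
    if text.startswith("$"):
        return float(text[1:])         # price: strip the currency sign
    suffix = text[-1]
    if suffix in _SUFFIXES:
        return float(text[:-1]) * _SUFFIXES[suffix]
    return float(text)                 # plain number, e.g. an index score


for raw in ("262k", "1.00M", "27.8B", "$0.9", "-"):
    print(raw, "->", parse_quantity(raw))
```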