Comparisons of Small Open Source AI Models (4B-40B)
Open source AI models with between 4 billion and 40 billion parameters. Models are considered open source (also commonly referred to as open weights) when their weights are available to download. This allows self-hosting on your own infrastructure and customization of the model, for example through fine-tuning. For more details, including our methodology, see our FAQs.
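As a rough aid for the self-hosting point above, the memory needed just to hold a model's weights can be approximated as parameter count times bytes per parameter. The sketch below is a back-of-the-envelope estimate, not a sizing guide: the helper name `weight_memory_gb` and the precision labels are illustrative assumptions, and the estimate ignores the KV cache, activations, and runtime overhead.

```python
# Approximate bytes used to store one parameter at common precisions.
# These labels are illustrative assumptions, not part of the source page.
BYTES_PER_PARAM = {"fp32": 4.0, "bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(total_params_billions: float, precision: str = "bf16") -> float:
    """Estimate memory (in GB, 1 GB = 1e9 bytes) needed to hold the weights alone.

    1e9 parameters * bytes per parameter / 1e9 bytes per GB simplifies to
    billions of parameters * bytes per parameter.
    """
    return total_params_billions * BYTES_PER_PARAM[precision]


if __name__ == "__main__":
    # Bounds of the 4B-40B band discussed on this page, plus a mid-size example.
    for params_b in (4.0, 15.0, 40.0):
        for precision in ("bf16", "int4"):
            gb = weight_memory_gb(params_b, precision)
            print(f"{params_b:>5.1f}B @ {precision}: ~{gb:.1f} GB for weights")
```

Under these assumptions a 15B model needs roughly 30 GB for 16-bit weights. For mixture-of-experts models, the total parameter count (not the active count) is typically the relevant figure, since all experts usually have to be resident in memory.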
Apriel-v1.5-15B-Thinker and Qwen3 30B A3B 2507 are the highest-intelligence small open source models, defined as those with 4B-40B parameters, followed by gpt-oss-20B (high) and Magistral Small 1.2.
Highlights
[Chart: Intelligence. Artificial Analysis Intelligence Index; higher is better. Legend: Estimate (independent evaluation forthcoming).]
[Chart: Total Parameters. Trainable parameters in billions.]
Further details
Model | Creator | Intelligence Index | Total Parameters | Context Window | Blended Price (USD/1M tokens) | Median Output Speed (tokens/s) | Weights |
---|---|---|---|---|---|---|---|
Apriel-v1.5-15B-Thinker | ServiceNow | 52 | 15B | 128k | - | - | Not available |
Qwen3 30B A3B 2507 (Reasoning) | Alibaba | 46 | 30.5B (3.3B active at inference time) | 262k | $0.8 | 101 | 🤗 |
gpt-oss-20B (high) | OpenAI | 43 | 21B (3.6B active at inference time) | 131k | $0.1 | 156 | 🤗 |
Magistral Small 1.2 | Mistral | 43 | 24B | 128k | $0.8 | 217 | 🤗 |
EXAONE 4.0 32B (Reasoning) | LG AI Research | 43 | 32B | 131k | $0.7 | 64 | 🤗 |
Qwen3 Omni 30B A3B (Reasoning) | Alibaba | 40 | 35.3B (3.0B active at inference time) | 65.5k | $0.4 | 88 | 🤗 |
QwQ 32B | Alibaba | 38 | 32.8B | 131k | $0.5 | 46 | 🤗 |
Qwen3 30B A3B 2507 Instruct | Alibaba | 37 | 30.5B (3.3B active at inference time) | 262k | $0.3 | 87 | 🤗 |
NVIDIA Nemotron Nano 9B V2 (Reasoning) | NVIDIA | 37 | 9B | 131k | $0.1 | 126 | 🤗 |
NVIDIA Nemotron Nano 9B V2 (Non-reasoning) | NVIDIA | 36 | 9B | 131k | $0.1 | 125 | 🤗 |
DeepSeek R1 0528 Qwen3 8B | DeepSeek | 35 | 8.19B | 32.8k | $0.1 | 40 | 🤗 |
Qwen3 Coder 30B A3B Instruct | Alibaba | 33 | 30.5B (3.3B active at inference time) | 262k | $0.9 | 95 | 🤗 |
Reka Flash 3 | Reka AI | 33 | 21B | 128k | $0.3 | 51 | 🤗 |
EXAONE 4.0 32B (Non-reasoning) | LG AI Research | 30 | 32B | 131k | $0.7 | 58 | 🤗 |
Qwen3 Omni 30B A3B Instruct | Alibaba | 30 | 35.3B (3.0B active at inference time) | 65.5k | $0.4 | 80 | 🤗 |
Mistral Small 3.2 | Mistral | 29 | 24B | 128k | $0.1 | 99 | 🤗 |
Devstral Small (Jul '25) | Mistral | 27 | 24B | 256k | $0.1 | 157 | 🤗 |
Llama 3.1 Nemotron Nano 4B v1.1 (Reasoning) | NVIDIA | 26 | 4.51B | 128k | - | - | 🤗 |
Phi-4 | Microsoft Azure | 25 | 14B | 16.0k | $0.2 | 34 | 🤗 |
Granite 4.0 H Small | IBM | 23 | 32B (9B active at inference time) | 128k | $0.1 | 52 | Not available |
Gemma 3 27B Instruct | Google | 22 | 27.4B | 128k | - | 52 | 🤗 |
Gemma 3 12B Instruct | Google | 20 | 12.2B | 128k | - | 51 | 🤗 |
DeepHermes 3 - Mistral 24B Preview (Non-reasoning) | Nous Research | 16 | 24B | 32.0k | - | - | 🤗 |
Llama 3.2 Instruct 11B (Vision) | Meta | 16 | 11B | 128k | $0.2 | 58 | 🤗 |
Gemma 3n E4B Instruct | Google | 15 | 8.39B (4.0B active at inference time) | 32.0k | $0.0 | 51 | 🤗 |
Granite 3.3 8B (Non-reasoning) | IBM | 15 | 8.17B | 128k | $0.1 | 510 | 🤗 |
Phi-4 Multimodal Instruct | Microsoft Azure | 12 | 5.6B | 128k | - | 18 | 🤗 |
Gemma 3n E2B Instruct | Google | 8 | 5.98B (2.0B active at inference time) | 32.0k | - | 56 | 🤗 |
Ministral 8B | Mistral | 8 | 8B | 128k | $0.1 | 188 | 🤗 |
Aya Expanse 32B | Cohere | 6 | 32B | 128k | $0.8 | 70 | 🤗 |
DeepHermes 3 - Llama-3.1 8B Preview (Non-reasoning) | Nous Research | 2 | 8B | 128k | - | - | 🤗 |
Aya Expanse 8B | Cohere | 2 | 8B | 8.00k | $0.8 | 83 | 🤗 |
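Several rows in the table list both a total parameter count and a smaller count that is active at inference time. As a minimal sketch of how those figures relate, the snippet below copies a handful of rows by hand, ranks them by Intelligence Index, and computes the share of parameters that is active. The names `ModelRow`, `rank_by_intelligence`, and `in_small_band` are hypothetical helpers introduced for illustration; only the numbers come from the table.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ModelRow:
    """One table row; field names are illustrative, values are copied from the table."""
    name: str
    intelligence: int                  # Artificial Analysis Intelligence Index
    total_params_b: float              # total parameters, in billions
    active_params_b: Optional[float]   # active at inference time; None if not listed

    @property
    def active_fraction(self) -> float:
        """Share of parameters active at inference time (1.0 when none is listed)."""
        if self.active_params_b is None:
            return 1.0
        return self.active_params_b / self.total_params_b


# A hand-copied subset of the table, not the full listing.
ROWS: List[ModelRow] = [
    ModelRow("Apriel-v1.5-15B-Thinker", 52, 15.0, None),
    ModelRow("Qwen3 30B A3B 2507 (Reasoning)", 46, 30.5, 3.3),
    ModelRow("gpt-oss-20B (high)", 43, 21.0, 3.6),
    ModelRow("Magistral Small 1.2", 43, 24.0, None),
    ModelRow("Granite 4.0 H Small", 23, 32.0, 9.0),
]


def rank_by_intelligence(rows: List[ModelRow]) -> List[ModelRow]:
    """Sort rows by Intelligence Index, highest first (ties keep input order)."""
    return sorted(rows, key=lambda r: r.intelligence, reverse=True)


def in_small_band(row: ModelRow, low_b: float = 4.0, high_b: float = 40.0) -> bool:
    """Check the page's 4B-40B definition of 'small' against total parameters."""
    return low_b <= row.total_params_b <= high_b


if __name__ == "__main__":
    for row in rank_by_intelligence(ROWS):
        print(
            f"{row.name}: index {row.intelligence}, "
            f"{row.active_fraction:.0%} of {row.total_params_b}B parameters active"
        )
```

Note that the band check uses total parameters, which appears to be how the page classifies models: the rows with only about 3B active parameters are still listed here because their totals fall between 4B and 40B.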