Comparison of Open Source Models
Comparison and analysis of open source AI models across key metrics, including intelligence, price, inference speed, context window, parameter count, and licensing details. Models are considered Open Source (also commonly referred to as open weights) when their weights are available to download. This allows self-hosting on your own infrastructure and enables customizing the model, such as through fine-tuning. Click on any model to see detailed metrics. For more details on our methodology, see our FAQs.
GLM-5 and Kimi K2.5 are the highest-intelligence open source models, followed by Qwen3.5 397B A17B and GLM-4.7.
Openness
Artificial Analysis Openness Index: Results
Open Source Progress
Progress in Open Weights vs. Proprietary Intelligence
Artificial Analysis Intelligence Index v4.0 includes: GDPval-AA, 𝜏²-Bench Telecom, Terminal-Bench Hard, SciCode, AA-LCR, AA-Omniscience, IFBench, Humanity's Last Exam, GPQA Diamond, CritPt. See Intelligence Index methodology for further details, including a breakdown of each evaluation and how we run them.
Indicates whether the model weights are available. Models are labelled as 'Commercial Use Restricted' if the weights are available but commercial use is limited (typically requires obtaining a paid license).
Artificial Analysis Intelligence Index
Open Source Language Models Intelligence By Lab Over Time
Open Source Models Intelligence By Size Over Time
- Tiny: Less than or equal to 4B parameters. These are usually the smallest models in terms of resource demand.
- Small: Less than 40B parameters.
- Medium: Between 40B-150B parameters.
- Large: Over 150B parameters.
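The size buckets above can be expressed as a simple classifier. This is a minimal sketch using the thresholds from the definitions above; the function name is ours, and the example parameter counts are taken from the table below.

```python
def size_class(total_params_b: float) -> str:
    """Map a model's total parameter count (in billions) to a size bucket."""
    if total_params_b <= 4:
        return "Tiny"    # <= 4B parameters
    if total_params_b < 40:
        return "Small"   # < 40B parameters
    if total_params_b <= 150:
        return "Medium"  # 40B-150B parameters
    return "Large"       # > 150B parameters

print(size_class(1.28))  # Exaone 4.0 1.2B -> Tiny
print(size_class(27.8))  # Qwen3.5 27B     -> Small
print(size_class(117))   # gpt-oss-120B    -> Medium
print(size_class(744))   # GLM-5           -> Large
```

Note the boundary behavior: a nominally "4B" model with slightly more than 4B total parameters (e.g. Qwen3 4B at 4.02B) falls into the Small bucket under a strict reading of these thresholds.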
Intelligence Evaluations
While model intelligence generally translates across use cases, specific evaluations may be more relevant for certain use cases.
Size
Intelligence Index By Model Size
Model Size: Total and Active Parameters
The total number of trainable weights and biases in the model, expressed in billions. These parameters are learned during training and determine the model's ability to process and generate responses.
The number of parameters actually executed during each inference forward pass, expressed in billions. For Mixture of Experts (MoE) models, a routing mechanism selects a subset of experts per token, resulting in fewer active than total parameters. Dense models use all parameters, so active equals total.
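The relationship between total and active parameters can be illustrated with a rough count for a simplified MoE architecture. This is a hypothetical sketch: the function and the expert counts are illustrative, not taken from any specific model in the table.

```python
def moe_active_params(shared_b: float, expert_b: float,
                      num_experts: int, experts_per_token: int):
    """Return (total, active) parameter counts in billions for a simple MoE.

    shared_b: parameters always used (attention, embeddings, router), in B
    expert_b: parameters per expert FFN block, in B
    """
    total = shared_b + num_experts * expert_b
    active = shared_b + experts_per_token * expert_b  # router selects a subset
    return total, active

# Hypothetical model: 10B shared, 64 experts of 5B each, 4 routed per token
print(moe_active_params(10, 5, 64, 4))  # -> (330, 30)

# A dense model is the degenerate case: one "expert", always active
print(moe_active_params(10, 5, 1, 1))   # -> (15, 15), active == total
```

This is why MoE models in the table (e.g. "397B, 17B active") can run each forward pass far more cheaply than their total size suggests, while still needing memory for all parameters.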
Intelligence vs. Active Parameters
Intelligence vs. Total Parameters
Context Window
Larger context windows are relevant to RAG (Retrieval-Augmented Generation) workflows, which typically involve reasoning over and retrieving information from large amounts of data.
Maximum number of combined input and output tokens. Output tokens commonly have a significantly lower limit (varies by model).
| Model | Intelligence Index | Parameters | Context Window | Price | Output Speed | Weights | Provider Benchmarks | Details |
|---|---|---|---|---|---|---|---|---|
GLM-5 (Reasoning) Z AI | 50 | 744B (40B active at inference time) | 200k | $1.6 | 67 | 🤗 | +5 more | View |
Kimi K2.5 (Reasoning) Kimi | 47 | 1.0T (32B active at inference time) | 256k | $1.2 | 41 | 🤗 | +6 more | View |
Qwen3.5 397B A17B (Reasoning) Alibaba | 45 | 397B (17B active at inference time) | 262k | $1.4 | 87 | 🤗 | +1 more | View |
GLM-4.7 (Reasoning) Z AI | 42 | 357B (32B active at inference time) | 200k | $0.9 | 107 | 🤗 | +7 more | View |
Qwen3.5 27B (Reasoning) Alibaba | 42 | 27.8B | 262k | $0.8 | 100 | 🤗 | View | |
MiniMax-M2.5 MiniMax | 42 | 230B (10B active at inference time) | 205k | $0.5 | 57 | 🤗 | +6 more | View |
DeepSeek V3.2 (Reasoning) DeepSeek | 42 | 685B (37B active at inference time) | 128k | $0.3 | 46 | 🤗 | +5 more | View |
Qwen3.5 122B A10B (Reasoning) Alibaba | 42 | 125B (10B active at inference time) | 262k | $1.1 | 116 | 🤗 | View | |
MiMo-V2-Flash (Feb 2026) Xiaomi | 41 | 309B (15B active at inference time) | 256k | $0.1 | 154 | 🤗 | View | |
Kimi K2 Thinking Kimi | 41 | 1.0T (32B active at inference time) | 256k | $1.1 | 66 | 🤗 | +5 more | View |
GLM-5 (Non-reasoning) Z AI | 41 | 744B (40B active at inference time) | 200k | $1.6 | 45 | 🤗 | +1 more | View |
Qwen3.5 397B A17B (Non-reasoning) Alibaba | 40 | 397B (17B active at inference time) | 262k | $1.4 | 88 | 🤗 | View | |
MiniMax-M2.1 MiniMax | 39 | 230B (10B active at inference time) | 205k | $0.5 | 53 | 🤗 | +3 more | View |
MiMo-V2-Flash (Reasoning) Xiaomi | 39 | 309B (15B active at inference time) | 256k | $0.1 | 167 | 🤗 | View | |
Kimi K2.5 (Non-reasoning) Kimi | 37 | 1.0T (32B active at inference time) | 256k | $1.2 | 39 | 🤗 | +3 more | View |
Qwen3.5 35B A3B (Reasoning) Alibaba | 37 | 36B | 262k | $0.7 | 166 | 🤗 | View | |
MiniMax-M2 MiniMax | 36 | 230B (10B active at inference time) | 205k | $0.5 | 52 | 🤗 | +1 more | View |
GLM-4.7 (Non-reasoning) Z AI | 34 | 357B (32B active at inference time) | 200k | $0.9 | 104 | 🤗 | +5 more | View |
DeepSeek V3.2 Speciale DeepSeek | 34 | 685B (37B active at inference time) | 128k | - | - | 🤗 | - | View |
DeepSeek V3.1 Terminus (Reasoning) DeepSeek | 34 | 685B (37B active at inference time) | 128k | $0.8 | - | 🤗 | View | |
gpt-oss-120B (high) OpenAI | 33 | 117B (5.1B active at inference time) | 131k | $0.3 | 302 | 🤗 | +19 more | View |
DeepSeek V3.2 Exp (Reasoning) DeepSeek | 33 | 685B (37B active at inference time) | 128k | $0.3 | 44 | 🤗 | View | |
GLM-4.6 (Reasoning) Z AI | 33 | 357B (32B active at inference time) | 200k | $1.0 | 97 | 🤗 | +1 more | View |
K-EXAONE (Reasoning) LG AI Research | 32 | 236B (23B active at inference time) | 256k | - | - | 🤗 | - | View |
DeepSeek V3.2 (Non-reasoning) DeepSeek | 32 | 685B (37B active at inference time) | 128k | $0.3 | 45 | 🤗 | +7 more | View |
Kimi K2 0905 Kimi | 31 | 1.0T (32B active at inference time) | 256k | $1.2 | 65 | 🤗 | +2 more | View |
MiMo-V2-Flash (Non-reasoning) Xiaomi | 30 | 309B (15B active at inference time) | 256k | $0.1 | 141 | 🤗 | View | |
GLM-4.6 (Non-reasoning) Z AI | 30 | 357B (32B active at inference time) | 200k | $1.0 | 77 | 🤗 | View | |
GLM-4.7-Flash (Reasoning) Z AI | 30 | 31.2B (3B active at inference time) | 200k | $0.1 | 61 | 🤗 | View | |
Qwen3 235B A22B 2507 (Reasoning) Alibaba | 30 | 235B (22B active at inference time) | 256k | $2.6 | 39 | 🤗 | +2 more | View |
DeepSeek V3.1 Terminus (Non-reasoning) DeepSeek | 29 | 685B (37B active at inference time) | 128k | $0.6 | - | 🤗 | +1 more | View |
DeepSeek V3.2 Exp (Non-reasoning) DeepSeek | 28 | 685B (37B active at inference time) | 128k | $0.3 | 46 | 🤗 | View | |
Apriel-v1.5-15B-Thinker ServiceNow | 28 | 15B | 128k | - | 144 | 🤗 | View | |
Qwen3 Coder Next Alibaba | 28 | 79.7B (3B active at inference time) | 256k | $0.5 | 120 | 🤗 | View | |
DeepSeek V3.1 (Non-reasoning) DeepSeek | 28 | 685B (37B active at inference time) | 128k | $0.8 | - | 🤗 | +6 more | View |
DeepSeek V3.1 (Reasoning) DeepSeek | 28 | 685B (37B active at inference time) | 128k | $0.9 | - | 🤗 | +1 more | View |
Qwen3 VL 235B A22B (Reasoning) Alibaba | 28 | 235B (22B active at inference time) | 262k | $2.6 | 31 | 🤗 | View | |
Apriel-v1.6-15B-Thinker ServiceNow | 28 | 15B | 128k | - | 141 | 🤗 | View | |
DeepSeek R1 0528 (May '25) DeepSeek | 27 | 685B (37B active at inference time) | 128k | $2.4 | - | 🤗 | +6 more | View |
Qwen3 Next 80B A3B (Reasoning) Alibaba | 27 | 80B (3B active at inference time) | 262k | $1.9 | 129 | 🤗 | +4 more | View |
GLM-4.5 (Reasoning) Z AI | 26 | 355B (32B active at inference time) | 128k | $0.8 | 38 | 🤗 | View | |
Kimi K2 Kimi | 26 | 1.0T (32B active at inference time) | 128k | $1.1 | 44 | 🤗 | +4 more | View |
GLM-4.5-Air Z AI | 26 | 106B (12B active at inference time) | 128k | $0.4 | 121 | 🤗 | +1 more | View |
Seed-OSS-36B-Instruct ByteDance Seed | 25 | 36.2B | 512k | $0.3 | 32 | 🤗 | View | |
Qwen3 235B A22B 2507 Instruct Alibaba | 25 | 235B (22B active at inference time) | 256k | $1.2 | 43 | 🤗 | +7 more | View |
Qwen3 Coder 480B A35B Instruct Alibaba | 25 | 480B (35B active at inference time) | 262k | $3.0 | 39 | 🤗 | +7 more | View |
Qwen3 VL 32B (Reasoning) Alibaba | 25 | 33.4B | 256k | $2.6 | 85 | 🤗 | View | |
Qwen3 30B A3B 2507 (Reasoning) Alibaba | 25 | 30.5B (3.3B active at inference time) | 262k | $0.8 | 150 | 🤗 | View | |
K2-V2 (high) MBZUAI Institute of Foundation Models | 25 | 70B | 512k | - | - | 🤗 | - | View |
gpt-oss-120B (low) OpenAI | 24 | 117B (5.1B active at inference time) | 131k | $0.3 | 294 | 🤗 | +17 more | View |
gpt-oss-20B (high) OpenAI | 24 | 21B (3.6B active at inference time) | 131k | $0.1 | 293 | 🤗 | +7 more | View |
MiniMax M1 80k MiniMax | 24 | 456B (45.9B active at inference time) | 1.00M | $1.0 | - | 🤗 | View | |
NVIDIA Nemotron 3 Nano 30B A3B (Reasoning) NVIDIA | 24 | 31.6B (3.6B active at inference time) | 1.00M | $0.1 | 174 | 🤗 | View | |
K2 Think V2 MBZUAI Institute of Foundation Models | 24 | 70B | 262k | - | - | Not available | - | View |
Llama Nemotron Super 49B v1.5 (Reasoning) NVIDIA | 24 | 49B | 128k | $0.2 | 76 | 🤗 | View | |
INTELLECT-3 Prime Intellect | 24 | 107B | 131k | - | - | 🤗 | - | View |
HyperCLOVA X SEED Think (32B) Naver | 24 | 32B | 128k | - | - | 🤗 | - | View |
Qwen3 Next 80B A3B Instruct Alibaba | 24 | 80B (3B active at inference time) | 262k | $0.9 | 131 | 🤗 | +4 more | View |
Ling-1T InclusionAI | 24 | 1.0T (50B active at inference time) | 128k | - | - | 🤗 | - | View |
K-EXAONE (Non-reasoning) LG AI Research | 23 | 236B (23B active at inference time) | 256k | - | - | 🤗 | - | View |
Qwen3 VL 235B A22B Instruct Alibaba | 23 | 235B (22B active at inference time) | 262k | $1.2 | 44 | 🤗 | +2 more | View |
DeepSeek R1 (Jan '25) DeepSeek | 23 | 685B (37B active at inference time) | 128k | $2.4 | - | 🤗 | +6 more | View |
Mi:dm K 2.5 Pro Korea Telecom | 23 | 32B | 128k | - | - | Not available | - | View |
Mistral Large 3 Mistral | 23 | 675B (41B active at inference time) | 256k | $0.8 | 50 | 🤗 | View | |
Qwen3 4B 2507 (Reasoning) Alibaba | 23 | 4.02B | 262k | - | - | 🤗 | - | View |
Magistral Small 1.2 Mistral | 23 | 24B | 128k | $0.8 | 210 | 🤗 | View | |
EXAONE 4.0 32B (Reasoning) LG AI Research | 22 | 32B | 131k | $0.7 | 99 | 🤗 | View | |
DeepSeek V3 0324 DeepSeek | 22 | 671B (37B active at inference time) | 128k | $1.3 | - | 🤗 | +6 more | View |
GLM-4.7-Flash (Non-reasoning) Z AI | 22 | 31.2B (3B active at inference time) | 200k | $0.2 | 69 | 🤗 | View | |
Ring-1T InclusionAI | 22 | 1.0T (50B active at inference time) | 128k | - | - | 🤗 | - | View |
Qwen3 235B A22B (Reasoning) Alibaba | 22 | 235B (22B active at inference time) | 32.8k | $2.6 | 45 | 🤗 | View | |
Hermes 4 - Llama-3.1 405B (Reasoning) Nous Research | 22 | 406B | 128k | $1.5 | 33 | 🤗 | View | |
Qwen3 VL 32B Instruct Alibaba | 21 | 33.4B | 256k | $1.2 | 63 | 🤗 | View | |
GLM-4.6V (Reasoning) Z AI | 21 | 108B | 128k | $0.5 | 98 | 🤗 | +1 more | View |
NVIDIA Nemotron Nano 12B v2 VL (Reasoning) NVIDIA | 21 | 13.2B | 128k | $0.3 | 126 | 🤗 | View | |
MiniMax M1 40k MiniMax | 21 | 456B (45.9B active at inference time) | 1.00M | - | - | 🤗 | - | View |
K2-V2 (medium) MBZUAI Institute of Foundation Models | 21 | 70B | 512k | - | - | 🤗 | - | View |
Qwen3 Omni 30B A3B (Reasoning) Alibaba | 21 | 35.3B (3B active at inference time) | 65.5k | $0.4 | 87 | 🤗 | View | |
gpt-oss-20B (low) OpenAI | 21 | 21B (3.6B active at inference time) | 131k | $0.1 | 248 | 🤗 | +8 more | View |
Ring-flash-2.0 InclusionAI | 21 | 103B (6.1B active at inference time) | 128k | $0.2 | 81 | 🤗 | View | |
Hermes 4 - Llama-3.1 70B (Reasoning) Nous Research | 20 | 70.6B | 128k | $0.2 | 80 | 🤗 | View | |
Qwen3 32B (Reasoning) Alibaba | 20 | 32.8B | 32.8k | $2.6 | 89 | 🤗 | +4 more | View |
Llama 3.1 Nemotron Ultra 253B v1 (Reasoning) NVIDIA | 20 | 253B | 128k | $0.9 | 38 | 🤗 | View | |
Qwen3 VL 30B A3B Instruct Alibaba | 20 | 30B (3B active at inference time) | 256k | $0.3 | 100 | 🤗 | +1 more | View |
Ling-flash-2.0 InclusionAI | 20 | 103B (6.1B active at inference time) | 128k | $0.2 | 60 | 🤗 | View | |
QwQ 32B Alibaba | 20 | 32.8B | 131k | $0.5 | 29 | 🤗 | View | |
Qwen3 VL 30B A3B (Reasoning) Alibaba | 20 | 30B (3B active at inference time) | 256k | $0.8 | 79 | 🤗 | View | |
GLM-4.5V (Reasoning) Z AI | 19 | 108B (12B active at inference time) | 64.0k | $0.9 | 40 | 🤗 | View | |
Qwen3 30B A3B 2507 Instruct Alibaba | 19 | 30.5B (3.3B active at inference time) | 262k | $0.3 | 76 | 🤗 | View | |
Qwen3 30B A3B (Reasoning) Alibaba | 19 | 30.5B (3.3B active at inference time) | 32.8k | $0.8 | 71 | 🤗 | +1 more | View |
Devstral 2 Mistral | 19 | 125B | 256k | - | 77 | 🤗 | View | |
Olmo 3 32B Think Allen Institute for AI | 19 | 32.2B | 65.5k | - | - | 🤗 | - | View |
NVIDIA Nemotron Nano 9B V2 (Non-reasoning) NVIDIA | 19 | 9B | 131k | $0.1 | 118 | 🤗 | View | |
Qwen3 14B (Reasoning) Alibaba | 19 | 14.8B | 32.8k | $1.3 | 56 | 🤗 | View | |
Llama 3.3 Nemotron Super 49B v1 (Reasoning) NVIDIA | 18 | 49B | 128k | - | - | 🤗 | - | View |
Llama 4 Maverick Meta | 18 | 402B (17B active at inference time) | 1.00M | $0.5 | 115 | 🤗 | +9 more | View |
Qwen3 Coder 30B A3B Instruct Alibaba | 17 | 30.5B (3.3B active at inference time) | 262k | $0.9 | 20 | 🤗 | +2 more | View |
ERNIE 4.5 300B A47B Baidu | 17 | 300B (47B active at inference time) | 131k | $0.5 | 20 | 🤗 | View | |
DeepSeek R1 Distill Qwen 32B DeepSeek | 17 | 32B | 128k | $0.3 | 56 | 🤗 | View | |
Hermes 4 - Llama-3.1 405B (Non-reasoning) Nous Research | 17 | 406B | 128k | $1.5 | 31 | 🤗 | View | |
DeepSeek V3 (Dec '24) DeepSeek | 17 | 671B (37B active at inference time) | 128k | $0.6 | - | 🤗 | +2 more | View |
Olmo 3 7B Think Allen Institute for AI | 17 | 7B | 65.5k | $0.1 | 68 | 🤗 | View | |
Magistral Small 1 Mistral | 17 | 23.6B | 40.0k | - | - | 🤗 | - | View |
Devstral Small 2 Mistral | 17 | 24B | 256k | - | 200 | 🤗 | View | |
Qwen3 VL 8B (Reasoning) Alibaba | 17 | 8.77B | 256k | $0.7 | 118 | 🤗 | View | |
K2-V2 (low) MBZUAI Institute of Foundation Models | 16 | 70B | 512k | - | - | 🤗 | - | View |
DeepSeek R1 0528 Qwen3 8B DeepSeek | 16 | 8.19B | 32.8k | - | - | 🤗 | - | View |
Ministral 3 14B Mistral | 16 | 14B | 256k | $0.2 | 142 | 🤗 | View | |
GLM-4.6V (Non-reasoning) Z AI | 16 | 108B | 128k | $0.5 | 27 | 🤗 | View | |
Qwen3 4B 2507 Instruct Alibaba | 16 | 4.02B | 262k | - | - | 🤗 | - | View |
EXAONE 4.0 32B (Non-reasoning) LG AI Research | 16 | 32B | 131k | $0.7 | 90 | 🤗 | View | |
Qwen3 Omni 30B A3B Instruct Alibaba | 16 | 35.3B (3B active at inference time) | 65.5k | $0.4 | 89 | 🤗 | View | |
Qwen3 235B A22B (Non-reasoning) Alibaba | 16 | 235B (22B active at inference time) | 32.8k | $1.2 | 35 | 🤗 | View | |
DeepSeek R1 Distill Llama 70B DeepSeek | 16 | 70B | 128k | $0.9 | 58 | 🤗 | View | |
DeepSeek R1 Distill Qwen 14B DeepSeek | 16 | 14B | 128k | - | - | 🤗 | - | View |
Qwen3 14B (Non-reasoning) Alibaba | 16 | 14.8B | 32.8k | $0.6 | 53 | 🤗 | View | |
Qwen2.5 Instruct 72B Alibaba | 16 | 72B | 131k | - | 46 | 🤗 | View | |
Qwen3 8B (Reasoning) Alibaba | 15 | 8.19B | 131k | $0.7 | 62 | 🤗 | View | |
Ministral 3 8B Mistral | 15 | 8B | 256k | $0.1 | 166 | 🤗 | View | |
Llama 3.1 Instruct 405B Meta | 15 | 405B | 128k | $4.4 | 25 | 🤗 | +4 more | View |
QwQ 32B-Preview Alibaba | 15 | 32.8B | 32.8k | $0.1 | 55 | 🤗 | View | |
Ling-mini-2.0 InclusionAI | 15 | 16.3B (1.4B active at inference time) | 131k | $0.1 | 148 | 🤗 | View | |
Mistral Small 3.2 Mistral | 15 | 24B | 128k | $0.1 | 147 | 🤗 | View | |
Devstral Small (Jul '25) Mistral | 15 | 24B | 256k | $0.1 | 233 | 🤗 | View | |
Qwen3 VL 8B Instruct Alibaba | 15 | 8.77B | 256k | $0.3 | 116 | 🤗 | View | |
NVIDIA Nemotron Nano 9B V2 (Reasoning) NVIDIA | 15 | 9B | 131k | $0.1 | 93 | 🤗 | View | |
Command A Cohere | 15 | 111B | 256k | $4.4 | 44 | 🤗 | View | |
Mistral Large 2 (Nov '24) Mistral | 15 | 123B | 128k | $3.0 | 35 | 🤗 | View | |
Exaone 4.0 1.2B (Reasoning) LG AI Research | 15 | 1.28B | 64.0k | - | - | 🤗 | - | View |
Llama Nemotron Super 49B v1.5 (Non-reasoning) NVIDIA | 15 | 49B | 128k | $0.2 | 68 | 🤗 | View | |
Qwen3 30B A3B (Non-reasoning) Alibaba | 15 | 30.5B (3.3B active at inference time) | 32.8k | $0.3 | 64 | 🤗 | View | |
Qwen3 32B (Non-reasoning) Alibaba | 15 | 32.8B | 32.8k | $1.2 | 78 | 🤗 | +5 more | View |
Llama 3.3 Instruct 70B Meta | 14 | 70B | 128k | $0.6 | 83 | 🤗 | +18 more | View |
Llama 3.1 Nemotron Nano 4B v1.1 (Reasoning) NVIDIA | 14 | 4.51B | 128k | - | - | 🤗 | - | View |
Kimi Linear 48B A3B Instruct Kimi | 14 | 49.1B (3B active at inference time) | 1.00M | - | - | 🤗 | - | View |
GLM-4.5V (Non-reasoning) Z AI | 14 | 108B (12B active at inference time) | 64.0k | $0.9 | 38 | 🤗 | View | |
Reka Flash 3 Reka AI | 14 | 21B | 128k | $0.3 | 49 | 🤗 | View | |
Llama 3.3 Nemotron Super 49B v1 (Non-reasoning) NVIDIA | 14 | 49B | 128k | - | - | 🤗 | - | View |
Qwen3 4B (Reasoning) Alibaba | 14 | 4.02B | 32.0k | $0.4 | 92 | 🤗 | View | |
NVIDIA Nemotron 3 Nano 30B A3B (Non-reasoning) NVIDIA | 14 | 31.6B (3.6B active at inference time) | 1.00M | $0.1 | 112 | 🤗 | View | |
NVIDIA Nemotron Nano 12B v2 VL (Non-reasoning) NVIDIA | 14 | 13.2B | 128k | $0.3 | 134 | 🤗 | View | |
Llama 3.1 Tulu3 405B Allen Institute for AI | 14 | 405B | 128k | - | - | 🤗 | - | View |
Qwen3 VL 4B Instruct Alibaba | 14 | 4.44B | 256k | - | - | 🤗 | - | View |
Pixtral Large Mistral | 14 | 124B | 128k | $3.0 | 50 | 🤗 | View | |
Mistral Small 3.1 Mistral | 14 | 24B | 128k | $0.1 | 142 | 🤗 | +1 more | View |
Grok 2 (Dec '24) xAI | 14 | 270B | 131k | - | - | 🤗 | - | View |
Qwen3 VL 4B (Reasoning) Alibaba | 14 | 4.44B | 256k | - | - | 🤗 | - | View |
Hermes 4 - Llama-3.1 70B (Non-reasoning) Nous Research | 14 | 70.6B | 128k | $0.2 | 70 | 🤗 | View | |
Llama 4 Scout Meta | 14 | 109B (17B active at inference time) | 10.0M | $0.3 | 146 | 🤗 | +6 more | View |
Llama 3.1 Nemotron Instruct 70B NVIDIA | 14 | 70B | 128k | $1.2 | 31 | 🤗 | View | |
Qwen3 8B (Non-reasoning) Alibaba | 13 | 8.19B | 32.8k | $0.3 | 52 | 🤗 | View | |
Qwen2.5 Instruct 32B Alibaba | 13 | 32B | 128k | - | - | 🤗 | - | View |
Granite 4.0 H Small IBM | 13 | 32B (9B active at inference time) | 128k | $0.1 | 519 | 🤗 | View | |
Phi-4 Microsoft Azure | 13 | 14B | 16.0k | $0.2 | 8 | 🤗 | View | |
Llama 3.1 Instruct 70B Meta | 13 | 70B | 128k | $0.6 | 44 | 🤗 | +4 more | View |
Qwen3 1.7B (Reasoning) Alibaba | 13 | 2.03B | 32.0k | $0.4 | 124 | 🤗 | View | |
Mistral Large 2 (Jul '24) Mistral | 13 | 123B | 128k | $3.0 | - | 🤗 | View | |
Olmo 3 7B Instruct Allen Institute for AI | 13 | 7B | 65.5k | $0.1 | 38 | 🤗 | View | |
Qwen2.5 Coder Instruct 32B Alibaba | 13 | 32B | 131k | $0.2 | - | 🤗 | View | |
Ministral 3 3B Mistral | 13 | 3B | 256k | $0.1 | 292 | 🤗 | View | |
Mistral Small 3 Mistral | 13 | 24B | 32.0k | $0.1 | 241 | 🤗 | View | |
Jamba Reasoning 3B AI21 Labs | 13 | 3B | 262k | - | - | 🤗 | - | View |
Jamba 1.7 Large AI21 Labs | 13 | 398B (94B active at inference time) | 256k | $3.5 | 51 | 🤗 | View | |
DeepSeek-V2.5 (Dec '24) DeepSeek | 13 | 236B (21B active at inference time) | 128k | - | - | 🤗 | - | View |
Qwen3 4B (Non-reasoning) Alibaba | 12 | 4.02B | 32.0k | $0.2 | 84 | 🤗 | View | |
Exaone 4.0 1.2B (Non-reasoning) LG AI Research | 12 | 1.28B | 64.0k | - | - | 🤗 | - | View |
Gemma 3 12B Instruct Google | 12 | 12.2B | 128k | - | 32 | 🤗 | +2 more | View |
DeepSeek-V2.5 DeepSeek | 12 | 236B (21B active at inference time) | 128k | - | - | 🤗 | - | View |
Devstral Small (May '25) Mistral | 12 | 23.6B | 256k | $0.1 | - | 🤗 | View | |
DeepSeek R1 Distill Llama 8B DeepSeek | 12 | 8B | 128k | - | - | 🤗 | - | View |
R1 1776 Perplexity | 12 | 671B (37B active at inference time) | 128k | - | - | 🤗 | - | View |
Llama 3.2 Instruct 90B (Vision) Meta | 12 | 90B | 128k | $0.7 | 42 | 🤗 | +1 more | View |
Solar Mini Upstage | 12 | 10.7B | 4.10k | $0.1 | 82 | 🤗 | View | |
Grok-1 xAI | 12 | 314B (78B active at inference time) | 8.19k | - | - | 🤗 | - | View |
Qwen2 Instruct 72B Alibaba | 12 | 72B | 131k | - | - | 🤗 | - | View |
LFM2 8B A1B Liquid AI | 11 | 8.34B (1.5B active at inference time) | 32.8k | - | - | 🤗 | ? | View |
Llama 3.1 Instruct 8B Meta | 11 | 8B | 128k | $0.1 | 173 | 🤗 | +15 more | View |
Granite 4.0 Micro IBM | 11 | 3B | 128k | - | - | 🤗 | - | View |
Phi-4 Mini Instruct Microsoft Azure | 11 | 3.84B | 128k | - | 45 | 🤗 | View | |
DeepHermes 3 - Mistral 24B Preview (Non-reasoning) Nous Research | 11 | 24B | 32.0k | - | - | 🤗 | - | View |
Llama 3.2 Instruct 11B (Vision) Meta | 11 | 11B | 128k | $0.2 | 63 | 🤗 | View | |
Gemma 3n E4B Instruct Google | 11 | 8.39B (4B active at inference time) | 32.0k | $0.0 | 44 | 🤗 | View | |
Granite 3.3 8B (Non-reasoning) IBM | 11 | 8.17B | 128k | $0.1 | 486 | 🤗 | View | |
Jamba 1.5 Large AI21 Labs | 11 | 398B (94B active at inference time) | 256k | $3.5 | - | 🤗 | View | |
Jamba 1.7 Mini AI21 Labs | 11 | 52B (12B active at inference time) | 258k | - | - | 🤗 | - | View |
Gemma 3 4B Instruct Google | 11 | 4.3B | 128k | - | 32 | 🤗 | View | |
Hermes 3 - Llama-3.1 70B Nous Research | 11 | 70.6B | 128k | $0.3 | 32 | 🤗 | View | |
DeepSeek-Coder-V2 DeepSeek | 11 | 236B (21B active at inference time) | 128k | - | - | 🤗 | - | View |
Qwen3 1.7B (Non-reasoning) Alibaba | 11 | 2.03B | 32.0k | $0.2 | 117 | 🤗 | View | |
OLMo 2 32B Allen Institute for AI | 11 | 32.2B | 4.10k | - | - | 🤗 | - | View |
Jamba 1.6 Large AI21 Labs | 11 | 398B (94B active at inference time) | 256k | $3.5 | 55 | 🤗 | View | |
Qwen3 0.6B (Reasoning) Alibaba | 11 | 0.752B | 32.0k | $0.4 | 201 | 🤗 | View | |
LFM2 24B A2B Liquid AI | 10 | 23.8B (2.3B active at inference time) | 32.8k | $0.1 | 86 | 🤗 | View | |
Granite 4.0 H 1B IBM | 10 | 1.5B | 128k | - | - | 🤗 | - | View |
Gemma 3 27B Instruct Google | 10 | 27.4B | 128k | - | 34 | 🤗 | +2 more | View |
Granite 4.0 1B IBM | 10 | 1.6B | 128k | - | - | 🤗 | - | View |
Llama 3 Instruct 70B Meta | 10 | 70B | 8.19k | $0.9 | 38 | 🤗 | +1 more | View |
Mistral Small (Sep '24) Mistral | 10 | 22B | 32.8k | $0.3 | 147 | 🤗 | View | |
Phi-3 Mini Instruct 3.8B Microsoft Azure | 10 | 3.8B | 4.10k | $0.2 | - | 🤗 | View | |
Gemma 3n E4B Instruct Preview (May '25) Google | 10 | 8.39B (4B active at inference time) | 32.0k | - | - | 🤗 | - | View |
Phi-4 Multimodal Instruct Microsoft Azure | 10 | 5.6B | 128k | - | 17 | 🤗 | View | |
Qwen2.5 Coder Instruct 7B Alibaba | 10 | 7.62B | 131k | - | - | 🤗 | - | View |
LFM2 2.6B Liquid AI | 10 | 2.57B | 32.8k | - | - | 🤗 | ? | View |
Mixtral 8x22B Instruct Mistral | 10 | 141B (39B active at inference time) | 65.4k | - | - | 🤗 | - | View |
Llama 2 Chat 7B Meta | 10 | 7B | 4.10k | $0.1 | 117 | 🤗 | View | |
Gemma 3n E2B Instruct Google | 10 | 5.98B (2B active at inference time) | 32.0k | - | 47 | 🤗 | View | |
Llama 3.2 Instruct 3B Meta | 10 | 3B | 128k | $0.1 | 61 | 🤗 | +1 more | View |
Qwen3 0.6B (Non-reasoning) Alibaba | 10 | 0.752B | 32.0k | $0.2 | 193 | 🤗 | View | |
Qwen1.5 Chat 110B Alibaba | 10 | 110B | 32.0k | - | - | 🤗 | - | View |
LFM2 1.2B Liquid AI | 9 | 1.17B | 32.8k | - | - | 🤗 | ? | View |
OLMo 2 7B Allen Institute for AI | 9 | 7.3B | 4.10k | - | - | 🤗 | - | View |
Molmo 7B-D Allen Institute for AI | 9 | 8.02B | 4.10k | - | - | 🤗 | - | View |
Llama 3.2 Instruct 1B Meta | 9 | 1B | 128k | $0.1 | 139 | 🤗 | View | |
DeepSeek R1 Distill Qwen 1.5B DeepSeek | 9 | 1.5B | 128k | - | - | 🤗 | - | View |
DeepSeek-V2-Chat DeepSeek | 9 | 236B (21B active at inference time) | 128k | - | - | 🤗 | - | View |
Granite 4.0 H 350M IBM | 9 | 0.34B | 32.8k | - | - | 🤗 | - | View |
Granite 4.0 350M IBM | 9 | 0.35B | 32.8k | - | - | 🤗 | - | View |
Arctic Instruct Snowflake | 9 | 480B (17B active at inference time) | 4.00k | - | - | 🤗 | - | View |
Qwen Chat 72B Alibaba | 9 | 72B | 33.8k | - | - | 🤗 | - | View |
Llama 3 Instruct 8B Meta | 9 | 8B | 8.19k | $0.1 | 72 | 🤗 | +1 more | View |
Gemma 3 1B Instruct Google | 9 | 1B | 32.0k | - | 43 | 🤗 | View | |
DeepSeek Coder V2 Lite Instruct DeepSeek | 8 | 16B (2.4B active at inference time) | 128k | - | - | 🤗 | - | View |
Gemma 3 270M Google | 8 | 0.268B | 32.0k | - | - | 🤗 | - | View |
Llama 2 Chat 70B Meta | 8 | 70B | 4.10k | - | - | 🤗 | - | View |
DeepSeek LLM 67B Chat (V1) DeepSeek | 8 | 67B | 4.10k | - | - | 🤗 | - | View |
Llama 2 Chat 13B Meta | 8 | 13B | 4.10k | - | - | 🤗 | - | View |
Command-R+ (Apr '24) Cohere | 8 | 104B | 128k | $6.0 | - | 🤗 | View | |
OpenChat 3.5 (1210) OpenChat | 8 | 7B | 8.19k | - | - | 🤗 | - | View |
DBRX Instruct Databricks | 8 | 132B (36B active at inference time) | 32.8k | - | - | 🤗 | - | View |
Jamba 1.5 Mini AI21 Labs | 8 | 52B (12B active at inference time) | 256k | $0.3 | - | 🤗 | View | |
Jamba 1.6 Mini AI21 Labs | 8 | 52B (12B active at inference time) | 256k | $0.3 | 151 | 🤗 | View | |
Mixtral 8x7B Instruct Mistral | 8 | 46.7B (12.9B active at inference time) | 32.8k | $0.5 | - | 🤗 | View | |
DeepHermes 3 - Llama-3.1 8B Preview (Non-reasoning) Nous Research | 8 | 8B | 128k | - | - | 🤗 | - | View |
Llama 65B Meta | 7 | 65B | 2.05k | - | - | Not available | - | View |
Qwen Chat 14B Alibaba | 7 | 14B | 8.19k | - | - | Not available | - | View |
Mistral 7B Instruct Mistral | 7 | 7B | 8.19k | $0.3 | 148 | 🤗 | View | |
Command-R (Mar '24) Cohere | 7 | 35B | 128k | $0.8 | - | 🤗 | View | |
Falcon-H1R-7B TII UAE | - | 7B | 256k | - | - | Not available | - | View |
LFM2.5-VL-1.6B Liquid AI | - | 1.6B | 32.0k | - | - | 🤗 | - | View |
LFM2.5-1.2B-Thinking Liquid AI | - | 1.17B | 32.0k | - | - | 🤗 | - | View |
LFM2.5-1.2B-Instruct Liquid AI | - | 1.17B | 32.0k | - | - | 🤗 | ? | View |
Step3 VL 10B StepFun | - | 10.2B | 65.5k | - | - | 🤗 | - | View |
Molmo2-8B Allen Institute for AI | - | 8.66B | 36.9k | - | 113 | 🤗 | View | |
Olmo 3.1 32B Instruct Allen Institute for AI | - | 32.2B | 65.5k | $0.3 | 44 | 🤗 | View | |
Olmo 3.1 32B Think Allen Institute for AI | - | 32.2B | 65.5k | - | 74 | 🤗 | View | |
Cogito v2.1 (Reasoning) Deep Cogito | - | 671B (37B active at inference time) | 128k | $1.3 | 73 | 🤗 | View | |
Tri-21B-Think Trillion Labs | - | 21B | 32.0k | - | - | Not available | - | View |
Tri-21B-think Preview Trillion Labs | - | 21B | 32.0k | - | - | Not available | - | View |
Tiny Aya Global Cohere | - | 3.35B | 8.19k | - | - | 🤗 | - | View |