Artificial Analysis Openness Index

A composite measure providing an industry standard to communicate model openness for users and developers.

The Artificial Analysis Openness Index assesses how 'open' models are on the basis of their availability and transparency across different components (e.g. model weights, training data, and model architecture).
Availability represents the ability to use a model via API, self-hosting through open weights, and use freely with permissive licensing. Transparency captures the degree to which a model's methodology and data have been disclosed, shared, and permissively licensed for the community to use to understand a model's inputs and replicate or build on its approach.

All evaluations are conducted independently by Artificial Analysis. More information can be found on our Intelligence Benchmarking Methodology page.

Openness Index

Olmo 3.1 32B Instruct scores the highest on the Openness Index with a score of 89, tied with Olmo 3 7B Think and Olmo 3.1 32B Think, which also score 89.

Openness Index

o3 scores the lowest on the Openness Index with a score of 6, tied with Gemini 2.5 Flash-Lite Preview (Sep '25) (Reasoning) and Gemini 2.5 Pro, which also score 6.

Artificial Analysis Openness Index: Results

Openness Index assesses model openness on a 0 to 100 normalized scale (higher is more open)

Artificial Analysis Openness Index: Components

Contribution of each component to the underlying Openness Index score, up to a maximum of 18 points (higher is more open)
Legend: Model Availability, Transparency - Methodology, Transparency - Post-training Data, Transparency - Pre-training Data

Artificial Analysis Openness Index: Model Availability vs. Model Transparency

Model Availability reflects the availability of a model for usage and associated license (maximum 6 points); Model Transparency reflects methodology and data disclosures, data sharing, and code and licensing associated with a model's training process (maximum 12 points)
Chart annotation: Most attractive quadrant
Legend (model creators): Alibaba, Allen Institute for AI, Anthropic, Google, LG AI Research, MBZUAI Institute of Foundation Models, MiniMax, Mistral, NVIDIA, OpenAI, Z AI

Artificial Analysis Openness Index: Score vs. Release Date

Axes: Artificial Analysis Openness Index; Release Date
Chart annotation: Most attractive region
Legend (model creators): Alibaba, Allen Institute for AI, Anthropic, Google, LG AI Research, MBZUAI Institute of Foundation Models, MiniMax, Mistral, NVIDIA, OpenAI, Z AI

Artificial Analysis Openness Index vs. Artificial Analysis Intelligence Index

Axes: Artificial Analysis Openness Index; Artificial Analysis Intelligence Index
Chart annotation: Most attractive quadrant
Legend (model creators): Alibaba, Allen Institute for AI, Anthropic, Google, LG AI Research, MBZUAI Institute of Foundation Models, MiniMax, Mistral, NVIDIA, OpenAI, Z AI

Openness Index Composition

Detailed methodology
1. Model availability
Weights
  Access
    0: Closed weights, no API
    1: Closed weights, API limits token visibility
    2: Closed weights, API available
    3: Open weights
  License
    0: Closed weights or no commercial use
    1: Commercial use, attribution required
    2: Commercial use, no attribution required
    3: Commercial use, no attribution required, no meaningful limitations
2. Model transparency
Data: Pre- & Post-Training (score represents the average across each)
  Access
    0: No or limited disclosure
    1: Partial data source detail and categorization disclosed
    2: Full data mix disclosure, substantial data shared¹
    3: Full data shared
  License (most restrictive)
    0: No commercial use/no substantial data shared
    1: Commercial use, attribution required
    2: Commercial use, no attribution required
    3: Commercial use, no attribution required, no meaningful limitations
Methodology
  Disclosure
    0: No or limited disclosure
    1: Model architecture disclosure
    2: Limited general technical disclosure
    3: Full technical details disclosed
  License (most restrictive)
    0: No code disclosed/released
    1: Frameworks disclosed, openly available for commercial use
    2: End-to-end training pipeline code or guide released
    3: End-to-end training pipeline code or guide released, and commercial use allowed

Scoring methodology

Each component is scored on a 0-3 qualitative scale based on the best-fitting openness 'archetype', with each model assessed based on the full set of public first-party information available.

We synthesize these underlying factors into a unified metric, the Artificial Analysis Openness Index, as follows:

  • Data elements are averaged between pre- and post-training (to give a total of 6 possible points across data)
  • All component scores are added (up to a maximum of 18/18 points)
  • This score is normalized to a 0-100 scale

Where models are derived from a third-party base model, they may be constrained by the licensing or limited disclosure of the upstream model. For incremental/update releases, we only consider disclosures explicitly about the new release (including allowing model creators to declare which components remain consistent with an earlier release).
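
The aggregation described above can be sketched in a few lines of Python. This is an illustrative sketch only, not Artificial Analysis' implementation: the `ComponentScores` class, its field names, and the `openness_index` function are hypothetical, and it assumes that each data stage (pre- and post-training) contributes its access score plus its license score before the two stages are averaged. It does not model the "most restrictive" license rule or upstream-model constraints.

```python
from dataclasses import dataclass

MAX_POINTS = 18  # 6 (availability) + 6 (data, averaged) + 6 (methodology)


@dataclass
class ComponentScores:
    """Hypothetical container; every field is a 0-3 archetype score."""
    weights_access: int
    weights_license: int
    pretrain_data_access: int
    pretrain_data_license: int
    posttrain_data_access: int
    posttrain_data_license: int
    methodology_disclosure: int
    methodology_code: int


def openness_index(s: ComponentScores) -> float:
    """Sum the component scores and normalize to a 0-100 scale."""
    availability = s.weights_access + s.weights_license        # max 6
    pre = s.pretrain_data_access + s.pretrain_data_license     # max 6
    post = s.posttrain_data_access + s.posttrain_data_license  # max 6
    data = (pre + post) / 2                                    # averaged, max 6
    methodology = s.methodology_disclosure + s.methodology_code  # max 6
    total = availability + data + methodology                  # max 18
    return 100 * total / MAX_POINTS


# Hypothetical model: fully open weights and methodology, full data shared
# under an attribution-required license (6 + 4 + 6 = 16 of 18 points).
example = ComponentScores(3, 3, 3, 1, 3, 1, 3, 3)
print(round(openness_index(example), 2))
```

Under these assumptions, a model with 16 of 18 underlying points normalizes to about 88.89, and a closed-weights model whose only points come from an available API (2 of 18) normalizes to about 11.11.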

Openness Index Leaderboard

1 | Allen Institute for AI | Olmo 3.1 32B Instruct | 88.89 | 12.16 | 6.00 | 10.00 | 3.00 | 1.00 | 3.00 | 1.00
2 | Allen Institute for AI | Olmo 3 7B Think | 88.89 | 9.43 | 6.00 | 10.00 | 3.00 | 1.00 | 3.00 | 1.00
3 | Allen Institute for AI | Olmo 3.1 32B Think | 88.89 | 13.94 | 6.00 | 10.00 | 3.00 | 1.00 | 3.00 | 1.00
4 | Allen Institute for AI | Molmo 7B-D | 88.89 | 9.25 | 6.00 | 10.00 | 3.00 | 1.00 | 3.00 | 1.00
5 | Allen Institute for AI | Olmo 3 7B Instruct | 88.89 | 8.15 | 6.00 | 10.00 | 3.00 | 1.00 | 3.00 | 1.00
6 | MBZUAI Institute of Foundation Models | K2 Think V2 | 88.89 | 24.12 | 6.00 | 10.00 | 3.00 | 1.00 | 3.00 | 1.00
7 | MBZUAI Institute of Foundation Models | K2-V2 (medium) | 88.89 | 18.68 | 6.00 | 10.00 | 3.00 | 1.00 | 3.00 | 1.00
8 | MBZUAI Institute of Foundation Models | K2-V2 (low) | 88.89 | 14.44 | 6.00 | 10.00 | 3.00 | 1.00 | 3.00 | 1.00
9 | MBZUAI Institute of Foundation Models | K2-V2 (high) | 88.89 | 20.61 | 6.00 | 10.00 | 3.00 | 1.00 | 3.00 | 1.00
10 | Swiss AI Initiative | Apertus 8B Instruct | 88.89 | 5.88 | 6.00 | 10.00 | 3.00 | 1.00 | 3.00 | 1.00
11 | Swiss AI Initiative | Apertus 70B Instruct | 88.89 | 7.70 | 6.00 | 10.00 | 3.00 | 1.00 | 3.00 | 1.00
12 | Allen Institute for AI | Olmo 3 32B Think | 88.89 | 12.09 | 6.00 | 10.00 | 3.00 | 1.00 | 3.00 | 1.00
13 | Allen Institute for AI | OLMo 2 7B | 88.89 | 9.30 | 6.00 | 10.00 | 3.00 | 1.00 | 3.00 | 1.00
14 | Allen Institute for AI | OLMo 2 32B | 88.89 | 10.57 | 6.00 | 10.00 | 3.00 | 1.00 | 3.00 | 1.00
15 | NVIDIA | NVIDIA Nemotron 3 Super 120B A12B (Reasoning) | 83.33 | 35.97 | 6.00 | 9.00 | 2.00 | 1.00 | 2.00 | 1.00
16 | NVIDIA | NVIDIA Nemotron 3 Nano 30B A3B (Reasoning) | 83.33 | 24.27 | 6.00 | 9.00 | 2.00 | 1.00 | 2.00 | 1.00
17 | NVIDIA | NVIDIA Nemotron Nano 12B v2 VL (Non-reasoning) | 72.22 | 10.09 | 6.00 | 7.00 | 2.00 | 1.00 | 2.00 | 1.00
18 | NVIDIA | NVIDIA Nemotron Nano 9B V2 (Non-reasoning) | 72.22 | 13.16 | 6.00 | 7.00 | 2.00 | 1.00 | 2.00 | 1.00
19 | NVIDIA | NVIDIA Nemotron Nano 9B V2 (Reasoning) | 72.22 | 14.76 | 6.00 | 7.00 | 2.00 | 1.00 | 2.00 | 1.00
20 | NVIDIA | NVIDIA Nemotron Nano 12B v2 VL (Reasoning) | 72.22 | 14.89 | 6.00 | 7.00 | 2.00 | 1.00 | 2.00 | 1.00
21 | Allen Institute for AI | Molmo2-8B | 72.22 | 7.30 | 6.00 | 7.00 | 3.00 | 1.00 | 3.00 | 1.00
22 | Kimi | Kimi Linear 48B A3B Instruct | 61.11 | 14.41 | 6.00 | 5.00 | 1.00 | 0.00 | 1.00 | 0.00
23 | IBM | Granite 4.0 1B | 55.56 | 7.34 | 6.00 | 4.00 | 2.00 | 1.00 | 2.00 | 1.00
24 | IBM | Granite 4.0 H 1B | 55.56 | 7.99 | 6.00 | 4.00 | 2.00 | 1.00 | 2.00 | 1.00
25 | IBM | Granite 4.0 350M | 55.56 | 6.10 | 6.00 | 4.00 | 2.00 | 1.00 | 2.00 | 1.00
26 | IBM | Granite 4.0 H Small | 55.56 | 10.81 | 6.00 | 4.00 | 2.00 | 1.00 | 2.00 | 1.00
27 | IBM | Granite 4.0 Micro | 55.56 | 7.67 | 6.00 | 4.00 | 2.00 | 1.00 | 2.00 | 1.00
28 | IBM | Granite 4.0 H 350M | 55.56 | 5.44 | 6.00 | 4.00 | 2.00 | 1.00 | 2.00 | 1.00
29 | Baidu | ERNIE 4.5 300B A47B | 55.56 | 14.96 | 6.00 | 4.00 | 0.00 | 0.00 | 0.00 | 0.00
30 | Z AI | GLM-4.5 (Reasoning) | 55.56 | 26.42 | 6.00 | 4.00 | 0.00 | 0.00 | 0.00 | 0.00
31 | Z AI | GLM-4.5-Air | 55.56 | 23.17 | 6.00 | 4.00 | 0.00 | 0.00 | 0.00 | 0.00
32 | NVIDIA | Llama Nemotron Super 49B v1.5 (Non-reasoning) | 52.78 | 14.59 | 4.00 | 5.50 | 1.00 | 0.00 | 1.00 | 1.00
33 | NVIDIA | Llama 3.3 Nemotron Super 49B v1 (Reasoning) | 52.78 | 18.49 | 4.00 | 5.50 | 1.00 | 0.00 | 1.00 | 1.00
34 | NVIDIA | Llama Nemotron Super 49B v1.5 (Reasoning) | 52.78 | 18.68 | 4.00 | 5.50 | 1.00 | 0.00 | 1.00 | 1.00
35 | NVIDIA | Llama 3.3 Nemotron Super 49B v1 (Non-reasoning) | 52.78 | 14.35 | 4.00 | 5.50 | 1.00 | 0.00 | 1.00 | 1.00
36 | NVIDIA | Llama 3.1 Nemotron Ultra 253B v1 (Reasoning) | 52.78 | 15.02 | 4.00 | 5.50 | 1.00 | 0.00 | 1.00 | 1.00
37 | NVIDIA | Llama 3.1 Nemotron Nano 4B v1.1 (Reasoning) | 52.78 | 14.43 | 4.00 | 5.50 | 1.00 | 0.00 | 1.00 | 1.00
38 | Xiaomi | MiMo-V2-Flash (Reasoning) | 52.78 | 39.24 | 6.00 | 3.50 | 0.00 | 0.00 | 1.00 | 0.00
39 | Z AI | GLM-4.5V (Reasoning) | 52.78 | 15.09 | 6.00 | 3.50 | 1.00 | 0.00 | 0.00 | 0.00
40 | Z AI | GLM-4.5V (Non-reasoning) | 52.78 | 12.74 | 6.00 | 3.50 | 1.00 | 0.00 | 0.00 | 0.00
41 | Mistral | Magistral Small 1.2 | 50.00 | 18.16 | 6.00 | 3.00 | 0.00 | 0.00 | 1.00 | 1.00
42 | DeepSeek | DeepSeek R1 0528 (May '25) | 50.00 | 27.07 | 6.00 | 3.00 | 0.00 | 0.00 | 0.00 | 0.00
43 | Microsoft Azure | Phi-4 | 50.00 | 10.41 | 6.00 | 3.00 | 1.00 | 0.00 | 1.00 | 0.00
44 | Microsoft Azure | Phi-4 Mini Instruct | 50.00 | 8.39 | 6.00 | 3.00 | 1.00 | 0.00 | 1.00 | 0.00
45 | Microsoft Azure | Phi-4 Multimodal Instruct | 50.00 | 10.04 | 6.00 | 3.00 | 1.00 | 0.00 | 1.00 | 0.00
46 | StepFun | Step 3.5 Flash | 50.00 | 37.80 | 6.00 | 3.00 | 0.00 | 0.00 | 0.00 | 0.00
47 | Z AI | GLM-5 (Reasoning) | 50.00 | 49.77 | 6.00 | 3.00 | 0.00 | 0.00 | 0.00 | 0.00
48 | Google | Gemma 3n E2B Instruct | 50.00 | 4.76 | 6.00 | 3.00 | 0.00 | 0.00 | 0.00 | 0.00
49 | Google | Gemma 3 1B Instruct | 50.00 | 5.55 | 6.00 | 3.00 | 0.00 | 0.00 | 0.00 | 0.00
50 | Google | Gemma 3 4B Instruct | 50.00 | 6.30 | 6.00 | 3.00 | 0.00 | 0.00 | 0.00 | 0.00
51 | Google | Gemma 3 27B Instruct | 50.00 | 10.31 | 6.00 | 3.00 | 0.00 | 0.00 | 0.00 | 0.00
52 | Google | Gemma 3 12B Instruct | 50.00 | 8.79 | 6.00 | 3.00 | 0.00 | 0.00 | 0.00 | 0.00
53 | Google | Gemma 3n E4B Instruct | 50.00 | 6.38 | 6.00 | 3.00 | 0.00 | 0.00 | 0.00 | 0.00
54 | Alibaba | Qwen3 VL 30B A3B Instruct | 50.00 | 16.05 | 6.00 | 3.00 | 1.00 | 0.00 | 1.00 | 0.00
55 | Alibaba | Qwen3 VL 32B Instruct | 50.00 | 17.19 | 6.00 | 3.00 | 1.00 | 0.00 | 1.00 | 0.00
56 | Alibaba | Qwen3 VL 235B A22B Instruct | 50.00 | 20.75 | 6.00 | 3.00 | 1.00 | 0.00 | 1.00 | 0.00
57 | Alibaba | Qwen3 VL 235B A22B (Reasoning) | 50.00 | 27.64 | 6.00 | 3.00 | 1.00 | 0.00 | 1.00 | 0.00
58 | Alibaba | Qwen3 VL 30B A3B (Reasoning) | 50.00 | 19.68 | 6.00 | 3.00 | 1.00 | 0.00 | 1.00 | 0.00
59 | Alibaba | Qwen3 VL 8B Instruct | 50.00 | 14.30 | 6.00 | 3.00 | 1.00 | 0.00 | 1.00 | 0.00
60 | Alibaba | Qwen3 VL 4B Instruct | 50.00 | 9.55 | 6.00 | 3.00 | 1.00 | 0.00 | 1.00 | 0.00
61 | Alibaba | Qwen3 VL 4B (Reasoning) | 50.00 | 13.73 | 6.00 | 3.00 | 1.00 | 0.00 | 1.00 | 0.00
62 | Alibaba | Qwen3 VL 8B (Reasoning) | 50.00 | 16.66 | 6.00 | 3.00 | 1.00 | 0.00 | 1.00 | 0.00
63 | Alibaba | Qwen3 VL 32B (Reasoning) | 50.00 | 24.72 | 6.00 | 3.00 | 1.00 | 0.00 | 1.00 | 0.00
64 | DeepSeek | DeepSeek R1 0528 Qwen3 8B | 47.22 | 16.43 | 6.00 | 2.50 | 0.00 | 0.00 | 1.00 | 0.00
65 | Nous Research | Hermes 4 - Llama-3.1 70B (Reasoning) | 47.22 | 15.99 | 4.00 | 4.50 | 1.00 | 0.00 | 2.00 | 0.00
66 | Nous Research | Hermes 4 - Llama-3.1 405B (Reasoning) | 47.22 | 18.56 | 4.00 | 4.50 | 1.00 | 0.00 | 2.00 | 0.00
67 | Nous Research | Hermes 4 - Llama-3.1 70B (Non-reasoning) | 47.22 | 12.63 | 4.00 | 4.50 | 1.00 | 0.00 | 2.00 | 0.00
68 | Nous Research | Hermes 4 - Llama-3.1 405B (Non-reasoning) | 47.22 | 17.63 | 4.00 | 4.50 | 1.00 | 0.00 | 2.00 | 0.00
69 | ServiceNow | Apriel-v1.5-15B-Thinker | 47.22 | 28.33 | 6.00 | 2.50 | 0.00 | 0.00 | 1.00 | 0.00
70 | Google | Gemma 3 270M | 44.44 | 7.71 | 6.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
71 | TII UAE | Falcon-H1R-7B | 44.44 | 15.80 | 4.00 | 4.00 | 1.00 | 0.00 | 1.00 | 0.00
72 | NVIDIA | Llama 3.1 Nemotron Instruct 70B | 44.44 | 13.44 | 4.00 | 4.00 | 0.00 | 0.00 | 1.00 | 1.00
73 | LongCat | LongCat Flash Lite | 44.44 | 23.93 | 6.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
74 | Arcee AI | Trinity Large Thinking | 44.44 | 31.87 | 6.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
75 | Z AI | GLM-5.1 (Reasoning) | 44.44 | 51.41 | 6.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
76 | Alibaba | Qwen3 Next 80B A3B Instruct | 44.44 | 20.11 | 6.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
77 | Alibaba | Qwen3 Next 80B A3B (Reasoning) | 44.44 | 26.72 | 6.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
78 | Alibaba | Qwen3 Coder 480B A35B Instruct | 44.44 | 24.77 | 6.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
79 | Alibaba | Qwen3 Omni 30B A3B Instruct | 44.44 | 10.68 | 6.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
80 | Alibaba | Qwen3 Omni 30B A3B (Reasoning) | 44.44 | 15.62 | 6.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
81 | InclusionAI | Ling-mini-2.0 | 44.44 | 9.19 | 6.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
82 | InclusionAI | Ling-1T | 44.44 | 19.04 | 6.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
83 | InclusionAI | Ling-flash-2.0 | 44.44 | 15.74 | 6.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
84 | Mistral | Devstral Small (Jul '25) | 44.44 | 15.21 | 6.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
85 | DeepSeek | DeepSeek V3.2 Exp (Reasoning) | 44.44 | 32.94 | 6.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
86 | DeepSeek | DeepSeek V3.2 Exp (Non-reasoning) | 44.44 | 28.44 | 6.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
87 | Kimi | Kimi K2 | 44.44 | 26.32 | 4.00 | 4.00 | 1.00 | 0.00 | 1.00 | 0.00
88 | Z AI | GLM-4.7 (Reasoning) | 44.44 | 42.11 | 6.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
89 | Z AI | GLM-4.6 (Non-reasoning) | 44.44 | 30.24 | 6.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
90 | Z AI | GLM-4.7 (Non-reasoning) | 44.44 | 34.16 | 6.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
91 | Z AI | GLM-4.6 (Reasoning) | 44.44 | 32.51 | 6.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
92 | Z AI | GLM-4.7-Flash (Reasoning) | 44.44 | 30.15 | 6.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
93 | Z AI | GLM-4.7-Flash (Non-reasoning) | 44.44 | 22.07 | 6.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
94 | Alibaba | Qwen3 235B A22B 2507 Instruct | 44.44 | 24.96 | 6.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
95 | Alibaba | Qwen3 30B A3B 2507 (Reasoning) | 44.44 | 22.41 | 6.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
96 | Alibaba | Qwen3 30B A3B 2507 Instruct | 44.44 | 15.00 | 6.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
97 | Alibaba | Qwen3 235B A22B 2507 (Reasoning) | 44.44 | 29.54 | 6.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
98 | Alibaba | Qwen3 4B 2507 (Reasoning) | 44.44 | 18.18 | 6.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
99 | Alibaba | Qwen3 Coder 30B A3B Instruct | 44.44 | 19.98 | 6.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
100 | Alibaba | Qwen3 4B 2507 Instruct | 44.44 | 12.88 | 6.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
101 | ByteDance Seed | Seed-OSS-36B-Instruct | 44.44 | 25.16 | 6.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
102 | Alibaba | Qwen3 Coder Next | 41.67 | 28.28 | 6.00 | 1.50 | 0.00 | 0.00 | 1.00 | 0.00
103 | OpenAI | gpt-oss-20B (high) | 38.89 | 24.47 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
104 | OpenAI | gpt-oss-120B (high) | 38.89 | 33.27 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
105 | Meta | Llama 3.3 Instruct 70B | 38.89 | 14.49 | 4.00 | 3.00 | 1.00 | 0.00 | 1.00 | 0.00
106 | Meta | Llama 3.1 Instruct 405B | 38.89 | 17.38 | 4.00 | 3.00 | 1.00 | 0.00 | 1.00 | 0.00
107 | Meta | Llama 3.2 Instruct 90B (Vision) | 38.89 | 11.90 | 4.00 | 3.00 | 1.00 | 0.00 | 1.00 | 0.00
108 | Meta | Llama 3.2 Instruct 11B (Vision) | 38.89 | 8.73 | 4.00 | 3.00 | 1.00 | 0.00 | 1.00 | 0.00
109 | Google | Gemma 4 31B (Reasoning) | 38.89 | 39.18 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
110 | Google | Gemma 4 26B A4B (Reasoning) | 38.89 | 31.21 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
111 | Google | Gemma 4 E4B (Reasoning) | 38.89 | 18.76 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
112 | Google | Gemma 4 E2B (Reasoning) | 38.89 | 15.21 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
113 | Mistral | Mistral Small 4 (Reasoning) | 38.89 | 27.80 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
114 | Mistral | Mistral Small 4 (Non-reasoning) | 38.89 | 18.62 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
115 | Mistral | Mistral Large 3 | 38.89 | 22.80 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
116 | Perplexity | R1 1776 | 38.89 | 11.99 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
117 | Reka AI | Reka Flash 3 | 38.89 | 9.52 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
118 | Nous Research | DeepHermes 3 - Mistral 24B Preview (Non-reasoning) | 38.89 | 10.89 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
119 | Sarvam | Sarvam 30B (high) | 38.89 | 12.34 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
120 | Sarvam | Sarvam 105B (high) | 38.89 | 18.16 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
121 | Deep Cogito | Cogito v2.1 (Reasoning) | 38.89 | - | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
122 | AI21 Labs | Jamba Reasoning 3B | 38.89 | 9.60 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
123 | Alibaba | Qwen3.5 27B (Reasoning) | 38.89 | 42.07 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
124 | Alibaba | Qwen3.5 35B A3B (Non-reasoning) | 38.89 | 30.69 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
125 | Alibaba | Qwen3.5 27B (Non-reasoning) | 38.89 | 37.18 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
126 | Alibaba | Qwen3.5 35B A3B (Reasoning) | 38.89 | 37.12 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
127 | Alibaba | Qwen3.5 397B A17B (Non-reasoning) | 38.89 | 40.10 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
128 | Alibaba | Qwen3.5 397B A17B (Reasoning) | 38.89 | 45.05 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
129 | Alibaba | Qwen3.5 122B A10B (Non-reasoning) | 38.89 | 35.87 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
130 | Alibaba | Qwen3.5 0.8B (Reasoning) | 38.89 | 10.52 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
131 | Alibaba | Qwen3.5 4B (Reasoning) | 38.89 | 27.08 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
132 | Alibaba | Qwen3.5 2B (Reasoning) | 38.89 | 16.29 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
133 | Alibaba | Qwen3.5 0.8B (Non-reasoning) | 38.89 | 9.91 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
134 | Alibaba | Qwen3.5 4B (Non-reasoning) | 38.89 | 22.60 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
135 | Alibaba | Qwen3.5 9B (Reasoning) | 38.89 | 32.43 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
136 | Alibaba | Qwen3.5 9B (Non-reasoning) | 38.89 | 27.33 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
137 | Alibaba | Qwen3.5 2B (Non-reasoning) | 38.89 | 14.67 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
138 | Alibaba | Qwen3.5 122B A10B (Reasoning) | 38.89 | 41.60 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
139 | InclusionAI | Ring-1T | 38.89 | 22.78 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
140 | InclusionAI | Ring-flash-2.0 | 38.89 | 14.02 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
141 | Mistral | Mistral Small 3.2 | 38.89 | 15.07 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
142 | DeepSeek | DeepSeek V3.1 Terminus (Non-reasoning) | 38.89 | 28.52 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
143 | DeepSeek | DeepSeek V3.1 Terminus (Reasoning) | 38.89 | 33.93 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
144 | DeepSeek | DeepSeek R1 Distill Llama 70B | 36.11 | 15.95 | 4.00 | 2.50 | 0.00 | 0.00 | 1.00 | 0.00
145 | Liquid AI | LFM2 2.6B | 33.33 | 8.04 | 4.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
146 | Liquid AI | LFM2 8B A1B | 33.33 | 7.03 | 4.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
147 | Kimi | Kimi K2.5 (Reasoning) | 33.33 | 46.81 | 4.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
148 | Nous Research | DeepHermes 3 - Llama-3.1 8B Preview (Non-reasoning) | 33.33 | 7.58 | 5.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
149 | Cohere | Command A | 33.33 | 13.48 | 3.00 | 3.00 | 0.00 | 0.00 | 0.00 | 0.00
150 | Liquid AI | LFM2 1.2B | 33.33 | 6.33 | 4.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
151 | Naver | HyperCLOVA X SEED Think (32B) | 30.56 | 23.72 | 4.00 | 1.50 | 1.00 | 0.00 | 0.00 | 0.00
152 | Meta | Llama 4 Scout | 27.78 | 13.52 | 4.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
153 | Meta | Llama 4 Maverick | 27.78 | 18.36 | 4.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
154 | Mistral | Magistral Medium 1.2 | 27.78 | 27.10 | 2.00 | 3.00 | 0.00 | 0.00 | 1.00 | 1.00
155 | Liquid AI | LFM2 24B A2B | 27.78 | 10.49 | 4.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
156 | Liquid AI | LFM2.5-VL-1.6B | 27.78 | 6.18 | 4.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
157 | Liquid AI | LFM2.5-1.2B-Thinking | 27.78 | 8.08 | 4.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
158 | Liquid AI | LFM2.5-1.2B-Instruct | 27.78 | 8.04 | 4.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
159 | LG AI Research | Exaone 4.0 1.2B (Reasoning) | 27.78 | 8.26 | 3.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
160 | LG AI Research | K-EXAONE (Reasoning) | 27.78 | 32.12 | 4.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
161 | LG AI Research | Exaone 4.0 1.2B (Non-reasoning) | 27.78 | 8.11 | 3.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
162 | LG AI Research | EXAONE 4.0 32B (Non-reasoning) | 27.78 | 11.66 | 3.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
163 | LG AI Research | EXAONE 4.0 32B (Reasoning) | 27.78 | 16.68 | 3.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
164 | MiniMax | MiniMax-M2.5 | 27.78 | 41.93 | 4.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
165 | MiniMax | MiniMax-M2 | 27.78 | 36.09 | 4.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
166 | MiniMax | MiniMax-M2.1 | 27.78 | 39.42 | 4.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
167 | Kimi | Kimi K2 Thinking | 27.78 | 40.89 | 4.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
168 | Kimi | Kimi K2 0905 | 27.78 | 30.85 | 4.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
169 | MiniMax | MiniMax-M2.7 | 22.22 | 49.62 | 3.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
170 | AI21 Labs | Jamba 1.7 Large | 22.22 | 10.88 | 4.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
171 | AI21 Labs | Jamba 1.7 Mini | 22.22 | 8.07 | 4.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
172 | Alibaba | Qwen3 Max | 16.67 | 31.38 | 2.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
173 | Alibaba | Qwen3 Max Thinking (Preview) | 16.67 | 32.48 | 2.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
174 | Google | Gemini 2.5 Flash-Lite Preview (Sep '25) (Non-reasoning) | 11.11 | 19.42 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
175 | Anthropic | Claude 4.5 Haiku (Reasoning) | 11.11 | 37.09 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
176 | Anthropic | Claude 4.5 Haiku (Non-reasoning) | 11.11 | 31.05 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
177 | Mistral | Mistral Medium 3.1 | 11.11 | 21.25 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
178 | xAI | Grok 3 mini Reasoning (high) | 11.11 | 32.08 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
179 | xAI | Grok 4.1 Fast (Non-reasoning) | 11.11 | 23.56 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
180 | Amazon | Nova Micro | 11.11 | 10.27 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
181 | Amazon | Nova Premier | 11.11 | 19.01 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
182 | Upstage | Solar Pro 2 (Non-reasoning) | 11.11 | 13.59 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
183 | Upstage | Solar Pro 2 (Reasoning) | 11.11 | 14.92 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
184 | ByteDance Seed | Doubao Seed Code | 11.11 | 33.52 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
185 | OpenAI | GPT-5.1 (Non-reasoning) | 11.11 | 27.42 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
186 | OpenAI | GPT-5 (ChatGPT) | 11.11 | 21.83 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
187 | Google | Gemini 2.5 Flash Preview (Sep '25) (Non-reasoning) | 11.11 | 25.70 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
188 | Anthropic | Claude Opus 4.5 (Non-reasoning) | 11.11 | 43.09 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
189 | Anthropic | Claude Opus 4.5 (Reasoning) | 11.11 | 49.73 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
190 | Anthropic | Claude 4.5 Sonnet (Reasoning) | 11.11 | 43.03 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
191 | Anthropic | Claude 4.5 Sonnet (Non-reasoning) | 11.11 | 37.14 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
192 | Mistral | Devstral Medium | 11.11 | 18.66 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
193 | xAI | Grok 4 Fast (Non-reasoning) | 11.11 | 23.12 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
194 | Amazon | Nova Pro | 11.11 | 13.48 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
195 | Amazon | Nova Lite | 11.11 | 12.65 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
196 | OpenAI | o3 | 5.56 | 38.37 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
197 | Google | Gemini 2.5 Flash-Lite Preview (Sep '25) (Reasoning) | 5.56 | 21.65 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
198 | Google | Gemini 2.5 Pro | 5.56 | 34.63 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
199 | xAI | Grok Code Fast 1 | 5.56 | 28.74 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
200 | xAI | Grok 4.1 Fast (Reasoning) | 5.56 | 38.61 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
201 | OpenAI | GPT-5 (high) | 5.56 | 44.63 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
202 | OpenAI | GPT-5 Codex (high) | 5.56 | 44.63 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
203 | OpenAI | GPT-5 (medium) | 5.56 | 42.03 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
204 | OpenAI | GPT-5 (low) | 5.56 | 39.20 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
205 | OpenAI | GPT-5 mini (minimal) | 5.56 | 20.68 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
206 | OpenAI | GPT-5 nano (medium) | 5.56 | 25.88 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
207 | OpenAI | GPT-5 mini (medium) | 5.56 | 38.94 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
208 | OpenAI | GPT-5 mini (high) | 5.56 | 41.17 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
209 | OpenAI | GPT-5 nano (minimal) | 5.56 | 13.84 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
210 | OpenAI | GPT-5 (minimal) | 5.56 | 23.89 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
211 | OpenAI | GPT-5 nano (high) | 5.56 | 26.83 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
212 | OpenAI | GPT-5.1 (high) | 5.56 | 47.70 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
213 | Google | Gemini 3 Pro Preview (high) | 5.56 | 48.39 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
214 | Google | Gemini 2.5 Flash Preview (Sep '25) (Reasoning) | 5.56 | 31.14 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
215 | xAI | Grok 4 Fast (Reasoning) | 5.56 | 35.06 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
216 | xAI | Grok 4 | 5.56 | 41.52 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00

Explore Evaluations

Artificial Analysis Intelligence Index

A composite benchmark aggregating ten challenging evaluations to provide a holistic measure of AI capabilities across mathematics, science, coding, and reasoning.

GDPval-AA Leaderboard

GDPval-AA is Artificial Analysis' evaluation framework for OpenAI's GDPval dataset. It tests AI models on real-world tasks across 44 occupations and 9 major industries. Models are given shell access and web browsing capabilities in an agentic loop via Stirrup to solve tasks, with Elo ratings derived from blind pairwise comparisons.

APEX-Agents-AA Benchmark Leaderboard

Artificial Analysis' implementation of the APEX-Agents benchmark, testing AI agents on long-horizon, cross-application tasks in professional-services environments with realistic application tooling.

𝜏²-Bench Telecom Benchmark Leaderboard

A dual-control conversational AI benchmark simulating technical support scenarios where both agent and user must coordinate actions to resolve telecom service issues.

Terminal-Bench Hard Benchmark Leaderboard

An agentic benchmark evaluating AI capabilities in terminal environments through software engineering, system administration, and data processing tasks.

SciCode Benchmark Leaderboard

A scientist-curated coding benchmark featuring 288 test set subproblems from 80 laboratory problems across 16 scientific disciplines.

Artificial Analysis Long Context Reasoning Benchmark Leaderboard

A challenging benchmark measuring language models' ability to extract, reason about, and synthesize information from long-form documents ranging from 10k to 100k tokens (measured using the cl100k_base tokenizer).

AA-Omniscience: Knowledge and Hallucination Benchmark

A benchmark measuring factual recall and hallucination across various economically relevant domains.

IFBench Benchmark Leaderboard

A benchmark evaluating precise instruction-following generalization on 58 diverse, verifiable out-of-domain constraints that test models' ability to follow specific output requirements.

Humanity's Last Exam Benchmark Leaderboard

A frontier-level benchmark with 2,500 expert-vetted questions across mathematics, sciences, and humanities, designed to be the final closed-ended academic evaluation.

GPQA Diamond Benchmark Leaderboard

The most challenging 198 questions from GPQA, where PhD experts achieve 65% accuracy but skilled non-experts only reach 34% despite web access.

CritPt Benchmark Leaderboard

A benchmark designed to test LLMs on research-level physics reasoning tasks, featuring 71 composite research challenges.

Artificial Analysis Openness Index

A composite measure providing an industry standard to communicate model openness for users and developers.

MMLU-Pro Benchmark Leaderboard

An enhanced version of MMLU with 12,000 graduate-level questions across 14 subject areas, featuring ten answer options and deeper reasoning requirements.

Global-MMLU-Lite Benchmark Leaderboard

A lightweight, multilingual version of MMLU, designed to evaluate knowledge and reasoning skills across a diverse range of languages and cultural contexts.

LiveCodeBench Benchmark Leaderboard

A contamination-free coding benchmark that continuously harvests fresh competitive programming problems from LeetCode, AtCoder, and CodeForces, evaluating code generation, self-repair, and execution.

MATH-500 Benchmark Leaderboard

A 500-problem subset from the MATH dataset, featuring competition-level mathematics across six domains including algebra, geometry, and number theory.

AIME 2025 Benchmark Leaderboard

All 30 problems from the 2025 American Invitational Mathematics Examination, testing olympiad-level mathematical reasoning with integer answers from 000-999.

MMMU-Pro Benchmark Leaderboard

An enhanced MMMU benchmark that eliminates shortcuts and guessing strategies to more rigorously test multimodal models across 30 academic disciplines.