
Artificial Analysis Openness Index

A composite measure providing an industry standard to communicate model openness for users and developers.

The Artificial Analysis Openness Index assesses how 'open' models are on the basis of their availability and transparency across different components (e.g. model weights, training data, and model architecture).
Availability represents the ability to use a model via API, to self-host it through open weights, and to use it freely under permissive licensing. Transparency captures the degree to which a model's methodology and data have been disclosed, shared, and permissively licensed, so that the community can understand a model's inputs and replicate or build on its approach.

All evaluations are conducted independently by Artificial Analysis. More information can be found on our Intelligence Benchmarking Methodology page.

Openness Index

Olmo 3.1 32B Instruct, Molmo 7B-D, and Olmo 3 7B Instruct are among the models tied for the highest Openness Index score of 89 (88.89 before rounding).

Openness Index

o3, Gemini 2.5 Flash-Lite Preview (Sep '25) (Reasoning), and Gemini 2.5 Pro are among the models tied for the lowest Openness Index score of 6 (5.56 before rounding).

Artificial Analysis Openness Index: Results

Openness Index assesses model openness on a 0 to 100 normalized scale (higher is more open)
Reasoning models are indicated by a lightbulb icon

Artificial Analysis Openness Index: Components

Contribution of each component to the underlying Openness Index score, up to a maximum of 18 points (higher is more open)

Artificial Analysis Openness Index: Model Availability vs. Model Transparency

Model availability reflects the availability of a model for usage and associated license (maximum 6 points) · Model transparency reflects methodology and data disclosures, data sharing, and code and licensing associated with a model's training process (maximum 12 points)

Artificial Analysis Openness Index: Score vs. Release Date

Artificial Analysis Openness Index · Release date

Artificial Analysis Openness Index vs. Artificial Analysis Intelligence Index

Artificial Analysis Openness Index · Artificial Analysis Intelligence Index

Openness Index Composition

Detailed methodology
1. Model availability

  Weights: Access
    0  Closed weights, no API
    1  Closed weights, API limits token visibility
    2  Closed weights, API available
    3  Open weights

  Weights: License
    0  Closed weights or no commercial use
    1  Commercial use, attribution required
    2  Commercial use, no attribution required
    3  Commercial use, no attribution required, no meaningful limitations

2. Model transparency

  Data: Pre- and post-training (score represents the average across each)

  Data: Access
    0  No or limited disclosure
    1  Partial data source detail and categorization disclosed
    2  Full data mix disclosure, substantial data shared¹
    3  Full data shared

  Data: License (most restrictive)
    0  No commercial use/no substantial data shared
    1  Commercial use, attribution required
    2  Commercial use, no attribution required
    3  Commercial use, no attribution required, no meaningful limitations

  Methodology: Disclosure
    0  No or limited disclosure
    1  Model architecture disclosure
    2  Limited general technical disclosure
    3  Full technical details disclosed

  Methodology: License (most restrictive)
    0  No code disclosed/released
    1  Frameworks disclosed, openly available for commercial use
    2  End-to-end training pipeline code or guide released
    3  End-to-end training pipeline code or guide released, and commercial use allowed

Scoring methodology

Each component is scored on a 0-3 qualitative scale based on the best-fitting openness 'archetype', with each model assessed based on the full set of public first-party information available.

We synthesize these underlying factors into a unified metric, the Artificial Analysis Openness Index, as follows:

  • Data elements are averaged between pre- and post-training (to give a total of 6 possible points across data)
  • All component scores are added (up to a maximum of 18/18 points)
  • This score is normalized to a 0-100 scale
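
The aggregation steps above can be sketched in a few lines of Python. This is a minimal illustration, not Artificial Analysis' own implementation: the class name, field names, and the example scores are all hypothetical, and only the arithmetic (averaging the data scores across pre- and post-training, summing to a maximum of 18 points, normalizing to 0-100) comes from the description above.

```python
from dataclasses import dataclass

MAX_POINTS = 18  # 6 availability points + 12 transparency points


@dataclass
class ComponentScores:
    """Rubric scores, each on the 0-3 scale. Field names are illustrative only."""
    weights_access: int
    weights_license: int
    pretrain_data_access: int
    pretrain_data_license: int
    posttrain_data_access: int
    posttrain_data_license: int
    methodology_disclosure: int
    methodology_code: int


def availability(s: ComponentScores) -> float:
    """Model availability: weights access + weights license (maximum 6)."""
    return s.weights_access + s.weights_license


def transparency(s: ComponentScores) -> float:
    """Model transparency (maximum 12); data scores are averaged across
    pre- and post-training, giving at most 6 points for data."""
    data_access = (s.pretrain_data_access + s.posttrain_data_access) / 2
    data_license = (s.pretrain_data_license + s.posttrain_data_license) / 2
    return data_access + data_license + s.methodology_disclosure + s.methodology_code


def openness_index(s: ComponentScores) -> float:
    """Sum all components (maximum 18) and normalize to a 0-100 scale."""
    return 100 * (availability(s) + transparency(s)) / MAX_POINTS


# Hypothetical model: fully open weights, data shared under an
# attribution-required license, full methodology and code release.
example = ComponentScores(3, 3, 3, 1, 3, 1, 3, 3)
print(availability(example), transparency(example))  # 6 10.0
print(round(openness_index(example), 2))             # 88.89
```

With these hypothetical inputs the model earns 16 of 18 points. Leaderboard entries listing 6.00 for model availability and 10.00 for model transparency show the same 88.89.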

Where models are derived from a third-party base model, they may be constrained by the licensing or limited disclosure of the upstream model. For incremental/update releases, we only consider disclosures explicitly about the new release (including allowing model creators to declare which components remain consistent with an earlier release).

Openness Index Leaderboard

Columns: Rank | Creator | Model | Openness Index | Intelligence Index | Model availability | Model transparency | four component sub-scores (column labels not available)

1 | Allen Institute for AI | Olmo 3.1 32B Instruct | 88.89 | 12.16 | 6.00 | 10.00 | 3.00 | 1.00 | 3.00 | 1.00
2 | Allen Institute for AI | Molmo 7B-D | 88.89 | 9.25 | 6.00 | 10.00 | 3.00 | 1.00 | 3.00 | 1.00
3 | Allen Institute for AI | Olmo 3 7B Instruct | 88.89 | 8.15 | 6.00 | 10.00 | 3.00 | 1.00 | 3.00 | 1.00
4 | Allen Institute for AI | Olmo 3.1 32B Think | 88.89 | 13.94 | 6.00 | 10.00 | 3.00 | 1.00 | 3.00 | 1.00
5 | Allen Institute for AI | Olmo 3 7B Think | 88.89 | 9.43 | 6.00 | 10.00 | 3.00 | 1.00 | 3.00 | 1.00
6 | MBZUAI Institute of Foundation Models | K2 Think V2 | 88.89 | 24.12 | 6.00 | 10.00 | 3.00 | 1.00 | 3.00 | 1.00
7 | MBZUAI Institute of Foundation Models | K2-V2 (low) | 88.89 | 14.44 | 6.00 | 10.00 | 3.00 | 1.00 | 3.00 | 1.00
8 | MBZUAI Institute of Foundation Models | K2-V2 (high) | 88.89 | 20.61 | 6.00 | 10.00 | 3.00 | 1.00 | 3.00 | 1.00
9 | MBZUAI Institute of Foundation Models | K2-V2 (medium) | 88.89 | 18.68 | 6.00 | 10.00 | 3.00 | 1.00 | 3.00 | 1.00
10 | Swiss AI Initiative | Apertus 70B Instruct | 88.89 | 7.70 | 6.00 | 10.00 | 3.00 | 1.00 | 3.00 | 1.00
11 | Swiss AI Initiative | Apertus 8B Instruct | 88.89 | 5.88 | 6.00 | 10.00 | 3.00 | 1.00 | 3.00 | 1.00
12 | Allen Institute for AI | OLMo 2 7B | 88.89 | 9.30 | 6.00 | 10.00 | 3.00 | 1.00 | 3.00 | 1.00
13 | Allen Institute for AI | Olmo 3 32B Think | 88.89 | 12.09 | 6.00 | 10.00 | 3.00 | 1.00 | 3.00 | 1.00
14 | Allen Institute for AI | OLMo 2 32B | 88.89 | 10.57 | 6.00 | 10.00 | 3.00 | 1.00 | 3.00 | 1.00
15 | NVIDIA | NVIDIA Nemotron 3 Super 120B A12B (Reasoning) | 83.33 | 35.97 | 6.00 | 9.00 | 2.00 | 1.00 | 2.00 | 1.00
16 | NVIDIA | NVIDIA Nemotron 3 Nano 30B A3B (Reasoning) | 83.33 | 24.27 | 6.00 | 9.00 | 2.00 | 1.00 | 2.00 | 1.00
17 | NVIDIA | NVIDIA Nemotron Nano 9B V2 (Non-reasoning) | 72.22 | 13.16 | 6.00 | 7.00 | 2.00 | 1.00 | 2.00 | 1.00
18 | NVIDIA | NVIDIA Nemotron Nano 12B v2 VL (Reasoning) | 72.22 | 14.89 | 6.00 | 7.00 | 2.00 | 1.00 | 2.00 | 1.00
19 | NVIDIA | NVIDIA Nemotron Nano 9B V2 (Reasoning) | 72.22 | 14.76 | 6.00 | 7.00 | 2.00 | 1.00 | 2.00 | 1.00
20 | NVIDIA | NVIDIA Nemotron Nano 12B v2 VL (Non-reasoning) | 72.22 | 10.09 | 6.00 | 7.00 | 2.00 | 1.00 | 2.00 | 1.00
21 | Allen Institute for AI | Molmo2-8B | 72.22 | 7.30 | 6.00 | 7.00 | 3.00 | 1.00 | 3.00 | 1.00
22 | Kimi | Kimi Linear 48B A3B Instruct | 61.11 | 14.41 | 6.00 | 5.00 | 1.00 | 0.00 | 1.00 | 0.00
23 | IBM | Granite 4.1 3B | 61.11 | 8.54 | 6.00 | 5.00 | 2.00 | 1.00 | 2.00 | 1.00
24 | IBM | Granite 4.1 30B | 61.11 | 14.69 | 6.00 | 5.00 | 2.00 | 1.00 | 2.00 | 1.00
25 | IBM | Granite 4.1 8B | 61.11 | 12.38 | 6.00 | 5.00 | 2.00 | 1.00 | 2.00 | 1.00
26 | IBM | Granite 4.0 H 350M | 55.56 | 5.44 | 6.00 | 4.00 | 2.00 | 1.00 | 2.00 | 1.00
27 | IBM | Granite 4.0 H 1B | 55.56 | 7.99 | 6.00 | 4.00 | 2.00 | 1.00 | 2.00 | 1.00
28 | IBM | Granite 4.0 1B | 55.56 | 7.34 | 6.00 | 4.00 | 2.00 | 1.00 | 2.00 | 1.00
29 | IBM | Granite 4.0 H Small | 55.56 | 10.81 | 6.00 | 4.00 | 2.00 | 1.00 | 2.00 | 1.00
30 | IBM | Granite 4.0 Micro | 55.56 | 7.67 | 6.00 | 4.00 | 2.00 | 1.00 | 2.00 | 1.00
31 | IBM | Granite 4.0 350M | 55.56 | 6.10 | 6.00 | 4.00 | 2.00 | 1.00 | 2.00 | 1.00
32 | Baidu | ERNIE 4.5 300B A47B | 55.56 | 14.96 | 6.00 | 4.00 | 0.00 | 0.00 | 0.00 | 0.00
33 | Z AI | GLM-4.5 (Reasoning) | 55.56 | 26.42 | 6.00 | 4.00 | 0.00 | 0.00 | 0.00 | 0.00
34 | Z AI | GLM-4.5-Air | 55.56 | 23.17 | 6.00 | 4.00 | 0.00 | 0.00 | 0.00 | 0.00
35 | NVIDIA | Llama Nemotron Super 49B v1.5 (Non-reasoning) | 52.78 | 14.59 | 4.00 | 5.50 | 1.00 | 0.00 | 1.00 | 1.00
36 | NVIDIA | Llama 3.3 Nemotron Super 49B v1 (Reasoning) | 52.78 | 18.49 | 4.00 | 5.50 | 1.00 | 0.00 | 1.00 | 1.00
37 | NVIDIA | Llama Nemotron Super 49B v1.5 (Reasoning) | 52.78 | 18.68 | 4.00 | 5.50 | 1.00 | 0.00 | 1.00 | 1.00
38 | NVIDIA | Llama 3.1 Nemotron Ultra 253B v1 (Reasoning) | 52.78 | 15.02 | 4.00 | 5.50 | 1.00 | 0.00 | 1.00 | 1.00
39 | NVIDIA | Llama 3.3 Nemotron Super 49B v1 (Non-reasoning) | 52.78 | 14.35 | 4.00 | 5.50 | 1.00 | 0.00 | 1.00 | 1.00
40 | NVIDIA | Llama 3.1 Nemotron Nano 4B v1.1 (Reasoning) | 52.78 | 14.43 | 4.00 | 5.50 | 1.00 | 0.00 | 1.00 | 1.00
41 | Xiaomi | MiMo-V2-Flash (Reasoning) | 52.78 | 39.24 | 6.00 | 3.50 | 0.00 | 0.00 | 1.00 | 0.00
42 | Z AI | GLM-4.5V (Reasoning) | 52.78 | 15.09 | 6.00 | 3.50 | 1.00 | 0.00 | 0.00 | 0.00
43 | Z AI | GLM-4.5V (Non-reasoning) | 52.78 | 12.74 | 6.00 | 3.50 | 1.00 | 0.00 | 0.00 | 0.00
44 | Mistral | Magistral Small 1.2 | 50.00 | 18.16 | 6.00 | 3.00 | 0.00 | 0.00 | 1.00 | 1.00
45 | DeepSeek | DeepSeek R1 0528 (May '25) | 50.00 | 27.07 | 6.00 | 3.00 | 0.00 | 0.00 | 0.00 | 0.00
46 | DeepSeek | DeepSeek V4 Flash (Reasoning, Max Effort) | 50.00 | 46.52 | 6.00 | 3.00 | 0.00 | 0.00 | 0.00 | 0.00
47 | DeepSeek | DeepSeek V4 Pro (Reasoning, High Effort) | 50.00 | 49.79 | 6.00 | 3.00 | 0.00 | 0.00 | 0.00 | 0.00
48 | DeepSeek | DeepSeek V4 Pro (Reasoning, Max Effort) | 50.00 | 51.51 | 6.00 | 3.00 | 0.00 | 0.00 | 0.00 | 0.00
49 | DeepSeek | DeepSeek V4 Flash (Reasoning, High Effort) | 50.00 | 44.87 | 6.00 | 3.00 | 0.00 | 0.00 | 0.00 | 0.00
50 | Microsoft | Phi-4 | 50.00 | 10.41 | 6.00 | 3.00 | 1.00 | 0.00 | 1.00 | 0.00
51 | Microsoft | Phi-4 Multimodal Instruct | 50.00 | 10.04 | 6.00 | 3.00 | 1.00 | 0.00 | 1.00 | 0.00
52 | Microsoft | Phi-4 Mini Instruct | 50.00 | 8.39 | 6.00 | 3.00 | 1.00 | 0.00 | 1.00 | 0.00
53 | Z AI | GLM-5 (Reasoning) | 50.00 | 49.77 | 6.00 | 3.00 | 0.00 | 0.00 | 0.00 | 0.00
54 | Google | Gemma 3 1B Instruct | 50.00 | 5.55 | 6.00 | 3.00 | 0.00 | 0.00 | 0.00 | 0.00
55 | Google | Gemma 3 4B Instruct | 50.00 | 6.30 | 6.00 | 3.00 | 0.00 | 0.00 | 0.00 | 0.00
56 | Google | Gemma 3 27B Instruct | 50.00 | 10.31 | 6.00 | 3.00 | 0.00 | 0.00 | 0.00 | 0.00
57 | Google | Gemma 3 12B Instruct | 50.00 | 8.79 | 6.00 | 3.00 | 0.00 | 0.00 | 0.00 | 0.00
58 | Google | Gemma 3n E2B Instruct | 50.00 | 4.76 | 6.00 | 3.00 | 0.00 | 0.00 | 0.00 | 0.00
59 | Google | Gemma 3n E4B Instruct | 50.00 | 6.38 | 6.00 | 3.00 | 0.00 | 0.00 | 0.00 | 0.00
60 | StepFun | Step 3.5 Flash | 50.00 | 37.80 | 6.00 | 3.00 | 0.00 | 0.00 | 0.00 | 0.00
61 | Alibaba | Qwen3 VL 32B Instruct | 50.00 | 17.19 | 6.00 | 3.00 | 1.00 | 0.00 | 1.00 | 0.00
62 | Alibaba | Qwen3 VL 235B A22B Instruct | 50.00 | 20.75 | 6.00 | 3.00 | 1.00 | 0.00 | 1.00 | 0.00
63 | Alibaba | Qwen3 VL 30B A3B Instruct | 50.00 | 16.05 | 6.00 | 3.00 | 1.00 | 0.00 | 1.00 | 0.00
64 | Alibaba | Qwen3 VL 30B A3B (Reasoning) | 50.00 | 19.68 | 6.00 | 3.00 | 1.00 | 0.00 | 1.00 | 0.00
65 | Alibaba | Qwen3 VL 235B A22B (Reasoning) | 50.00 | 27.64 | 6.00 | 3.00 | 1.00 | 0.00 | 1.00 | 0.00
66 | Alibaba | Qwen3 VL 4B (Reasoning) | 50.00 | 13.73 | 6.00 | 3.00 | 1.00 | 0.00 | 1.00 | 0.00
67 | Alibaba | Qwen3 VL 4B Instruct | 50.00 | 9.55 | 6.00 | 3.00 | 1.00 | 0.00 | 1.00 | 0.00
68 | Alibaba | Qwen3 VL 8B (Reasoning) | 50.00 | 16.66 | 6.00 | 3.00 | 1.00 | 0.00 | 1.00 | 0.00
69 | Alibaba | Qwen3 VL 8B Instruct | 50.00 | 14.30 | 6.00 | 3.00 | 1.00 | 0.00 | 1.00 | 0.00
70 | Alibaba | Qwen3 VL 32B (Reasoning) | 50.00 | 24.72 | 6.00 | 3.00 | 1.00 | 0.00 | 1.00 | 0.00
71 | Nous Research | Hermes 4 - Llama-3.1 70B (Non-reasoning) | 47.22 | 12.63 | 4.00 | 4.50 | 1.00 | 0.00 | 2.00 | 0.00
72 | Nous Research | Hermes 4 - Llama-3.1 405B (Reasoning) | 47.22 | 18.56 | 4.00 | 4.50 | 1.00 | 0.00 | 2.00 | 0.00
73 | Nous Research | Hermes 4 - Llama-3.1 405B (Non-reasoning) | 47.22 | 17.63 | 4.00 | 4.50 | 1.00 | 0.00 | 2.00 | 0.00
74 | Nous Research | Hermes 4 - Llama-3.1 70B (Reasoning) | 47.22 | 15.99 | 4.00 | 4.50 | 1.00 | 0.00 | 2.00 | 0.00
75 | DeepSeek | DeepSeek R1 0528 Qwen3 8B | 47.22 | 16.43 | 6.00 | 2.50 | 0.00 | 0.00 | 1.00 | 0.00
76 | ServiceNow | Apriel-v1.5-15B-Thinker | 47.22 | 28.33 | 6.00 | 2.50 | 0.00 | 0.00 | 1.00 | 0.00
77 | Google | Gemma 3 270M | 44.44 | 7.71 | 6.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
78 | TII UAE | Falcon-H1R-7B | 44.44 | 15.80 | 4.00 | 4.00 | 1.00 | 0.00 | 1.00 | 0.00
79 | NVIDIA | Llama 3.1 Nemotron Instruct 70B | 44.44 | 13.44 | 4.00 | 4.00 | 0.00 | 0.00 | 1.00 | 1.00
80 | LongCat | LongCat Flash Lite | 44.44 | 23.93 | 6.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
81 | OpenBMB | MiniCPM-V 4.6 1.3B | 44.44 | 12.65 | 6.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
82 | Arcee AI | Trinity Large Thinking | 44.44 | 31.87 | 6.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
83 | Z AI | GLM-5.1 (Reasoning) | 44.44 | 51.41 | 6.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
84 | Alibaba | Qwen3 Next 80B A3B (Reasoning) | 44.44 | 26.72 | 6.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
85 | Alibaba | Qwen3 Omni 30B A3B (Reasoning) | 44.44 | 15.62 | 6.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
86 | Alibaba | Qwen3 Next 80B A3B Instruct | 44.44 | 20.11 | 6.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
87 | Alibaba | Qwen3 Omni 30B A3B Instruct | 44.44 | 10.68 | 6.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
88 | InclusionAI | Ling-flash-2.0 | 44.44 | 15.74 | 6.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
89 | InclusionAI | Ling-1T | 44.44 | 19.04 | 6.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
90 | InclusionAI | Ling-mini-2.0 | 44.44 | 9.19 | 6.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
91 | Mistral | Devstral Small (Jul '25) | 44.44 | 15.21 | 6.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
92 | DeepSeek | DeepSeek V3.2 Exp (Non-reasoning) | 44.44 | 28.44 | 6.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
93 | DeepSeek | DeepSeek V3.2 Exp (Reasoning) | 44.44 | 32.94 | 6.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
94 | Kimi | Kimi K2 | 44.44 | 26.32 | 4.00 | 4.00 | 1.00 | 0.00 | 1.00 | 0.00
95 | Z AI | GLM-4.7 (Non-reasoning) | 44.44 | 34.16 | 6.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
96 | Z AI | GLM-4.7-Flash (Non-reasoning) | 44.44 | 22.07 | 6.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
97 | Z AI | GLM-4.7 (Reasoning) | 44.44 | 42.11 | 6.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
98 | Z AI | GLM-4.6 (Reasoning) | 44.44 | 32.51 | 6.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
99 | Z AI | GLM-4.6 (Non-reasoning) | 44.44 | 30.24 | 6.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
100 | Z AI | GLM-4.7-Flash (Reasoning) | 44.44 | 30.15 | 6.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
101 | Alibaba | Qwen3 235B A22B 2507 (Reasoning) | 44.44 | 29.54 | 6.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
102 | Alibaba | Qwen3 235B A22B 2507 Instruct | 44.44 | 24.96 | 6.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
103 | Alibaba | Qwen3 Coder 480B A35B Instruct | 44.44 | 24.77 | 6.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
104 | Alibaba | Qwen3 Coder 30B A3B Instruct | 44.44 | 19.98 | 6.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
105 | Alibaba | Qwen3 30B A3B 2507 Instruct | 44.44 | 15.00 | 6.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
106 | Alibaba | Qwen3 30B A3B 2507 (Reasoning) | 44.44 | 22.41 | 6.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
107 | Alibaba | Qwen3 4B 2507 (Reasoning) | 44.44 | 18.18 | 6.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
108 | Alibaba | Qwen3 4B 2507 Instruct | 44.44 | 12.88 | 6.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
109 | ByteDance Seed | Seed-OSS-36B-Instruct | 44.44 | 25.16 | 6.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
110 | Alibaba | Qwen3 Coder Next | 41.67 | 28.28 | 6.00 | 1.50 | 0.00 | 0.00 | 1.00 | 0.00
111 | OpenAI | gpt-oss-120B (high) | 38.89 | 33.27 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
112 | OpenAI | gpt-oss-20B (high) | 38.89 | 24.47 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
113 | Meta | Llama 3.3 Instruct 70B | 38.89 | 14.49 | 4.00 | 3.00 | 1.00 | 0.00 | 1.00 | 0.00
114 | Meta | Llama 3.1 Instruct 405B | 38.89 | 17.38 | 4.00 | 3.00 | 1.00 | 0.00 | 1.00 | 0.00
115 | Meta | Llama 3.2 Instruct 90B (Vision) | 38.89 | 11.90 | 4.00 | 3.00 | 1.00 | 0.00 | 1.00 | 0.00
116 | Meta | Llama 3.2 Instruct 11B (Vision) | 38.89 | 8.73 | 4.00 | 3.00 | 1.00 | 0.00 | 1.00 | 0.00
117 | Google | Gemma 4 31B (Reasoning) | 38.89 | 39.18 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
118 | Google | Gemma 4 26B A4B (Reasoning) | 38.89 | 31.21 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
119 | Google | Gemma 4 E2B (Reasoning) | 38.89 | 15.21 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
120 | Google | Gemma 4 E4B (Reasoning) | 38.89 | 18.76 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
121 | Mistral | Mistral Large 3 | 38.89 | 22.80 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
122 | Mistral | Mistral Small 4 (Non-reasoning) | 38.89 | 18.62 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
123 | Mistral | Mistral Small 4 (Reasoning) | 38.89 | 27.80 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
124 | Perplexity | R1 1776 | 38.89 | 11.99 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
125 | Reka AI | Reka Flash 3 | 38.89 | 9.52 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
126 | Nous Research | DeepHermes 3 - Mistral 24B Preview (Non-reasoning) | 38.89 | 10.89 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
127 | Xiaomi | MiMo-V2.5 | 38.89 | 49.03 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
128 | Xiaomi | MiMo-V2.5-Pro | 38.89 | 53.83 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
129 | Sarvam | Sarvam 105B (high) | 38.89 | 18.16 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
130 | Sarvam | Sarvam 30B (high) | 38.89 | 12.34 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
131 | Deep Cogito | Cogito v2.1 (Reasoning) | 38.89 | - | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
132 | AI21 Labs | Jamba Reasoning 3B | 38.89 | 9.60 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
133 | Alibaba | Qwen3.5 122B A10B (Reasoning) | 38.89 | 41.60 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
134 | Alibaba | Qwen3.5 9B (Non-reasoning) | 38.89 | 27.33 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
135 | Alibaba | Qwen3.5 397B A17B (Reasoning) | 38.89 | 45.05 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
136 | Alibaba | Qwen3.6 27B (Reasoning) | 38.89 | 45.82 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
137 | Alibaba | Qwen3.5 9B (Reasoning) | 38.89 | 32.43 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
138 | Alibaba | Qwen3.5 2B (Non-reasoning) | 38.89 | 14.67 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
139 | Alibaba | Qwen3.6 35B A3B (Reasoning) | 38.89 | 43.49 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
140 | Alibaba | Qwen3.5 122B A10B (Non-reasoning) | 38.89 | 35.87 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
141 | Alibaba | Qwen3.5 397B A17B (Non-reasoning) | 38.89 | 40.10 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
142 | Alibaba | Qwen3.5 0.8B (Reasoning) | 38.89 | 10.52 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
143 | Alibaba | Qwen3.5 4B (Reasoning) | 38.89 | 27.08 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
144 | Alibaba | Qwen3.5 2B (Reasoning) | 38.89 | 16.29 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
145 | Alibaba | Qwen3.5 0.8B (Non-reasoning) | 38.89 | 9.91 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
146 | Alibaba | Qwen3.5 4B (Non-reasoning) | 38.89 | 22.60 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
147 | Alibaba | Qwen3.5 35B A3B (Non-reasoning) | 38.89 | 30.69 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
148 | InclusionAI | Ling 2.6 Flash | 38.89 | 26.16 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
149 | InclusionAI | Ring-1T | 38.89 | 22.78 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
150 | InclusionAI | Ling-2.6-1T | 38.89 | 33.61 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
151 | InclusionAI | Ring-flash-2.0 | 38.89 | 14.02 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
152 | Mistral | Mistral Small 3.2 | 38.89 | 15.07 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
153 | DeepSeek | DeepSeek V3.1 Terminus (Non-reasoning) | 38.89 | 28.52 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
154 | DeepSeek | DeepSeek V3.1 Terminus (Reasoning) | 38.89 | 33.93 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
155 | Alibaba | Qwen3.5 35B A3B (Reasoning) | 38.89 | 37.12 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
156 | Alibaba | Qwen3.5 27B (Reasoning) | 38.89 | 42.07 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
157 | Alibaba | Qwen3.5 27B (Non-reasoning) | 38.89 | 37.18 | 6.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
158 | DeepSeek | DeepSeek R1 Distill Llama 70B | 36.11 | 15.95 | 4.00 | 2.50 | 0.00 | 0.00 | 1.00 | 0.00
159 | Mistral | Mistral Medium 3.5 | 33.33 | 39.23 | 5.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
160 | Liquid AI | LFM2 2.6B | 33.33 | 8.04 | 4.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
161 | Liquid AI | LFM2 8B A1B | 33.33 | 7.03 | 4.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
162 | Kimi | Kimi K2.6 | 33.33 | 53.90 | 4.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
163 | Nous Research | DeepHermes 3 - Llama-3.1 8B Preview (Non-reasoning) | 33.33 | 7.58 | 5.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
164 | Tencent | Hy3-preview (Reasoning) | 33.33 | 41.85 | 5.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
165 | Tencent | Hy3-preview (Non-reasoning) | 33.33 | 33.66 | 5.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
166 | Cohere | Command A | 33.33 | 13.48 | 3.00 | 3.00 | 0.00 | 0.00 | 0.00 | 0.00
167 | Liquid AI | LFM2 1.2B | 33.33 | 6.33 | 4.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
168 | Kimi | Kimi K2.5 (Reasoning) | 33.33 | 46.81 | 4.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
169 | Naver | HyperCLOVA X SEED Think (32B) | 30.56 | 23.72 | 4.00 | 1.50 | 1.00 | 0.00 | 0.00 | 0.00
170 | Meta | Llama 4 Scout | 27.78 | 13.52 | 4.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
171 | Meta | Llama 4 Maverick | 27.78 | 18.36 | 4.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
172 | Mistral | Magistral Medium 1.2 | 27.78 | 27.10 | 2.00 | 3.00 | 0.00 | 0.00 | 1.00 | 1.00
173 | Liquid AI | LFM2.5-1.2B-Thinking | 27.78 | 8.08 | 4.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
174 | Liquid AI | LFM2.5-1.2B-Instruct | 27.78 | 8.04 | 4.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
175 | Liquid AI | LFM2 24B A2B | 27.78 | 10.49 | 4.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
176 | Liquid AI | LFM2.5-VL-1.6B | 27.78 | 6.18 | 4.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
177 | LG AI Research | Exaone 4.0 1.2B (Non-reasoning) | 27.78 | 8.11 | 3.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
178 | LG AI Research | Exaone 4.0 1.2B (Reasoning) | 27.78 | 8.26 | 3.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
179 | LG AI Research | K-EXAONE (Reasoning) | 27.78 | 32.12 | 4.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
180 | LG AI Research | EXAONE 4.0 32B (Reasoning) | 27.78 | 16.68 | 3.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
181 | LG AI Research | EXAONE 4.0 32B (Non-reasoning) | 27.78 | 11.66 | 3.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00
182 | MiniMax | MiniMax-M2.1 | 27.78 | 39.42 | 4.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
183 | MiniMax | MiniMax-M2.5 | 27.78 | 41.93 | 4.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
184 | MiniMax | MiniMax-M2 | 27.78 | 36.09 | 4.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
185 | Kimi | Kimi K2 0905 | 27.78 | 30.85 | 4.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
186 | Kimi | Kimi K2 Thinking | 27.78 | 40.89 | 4.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
187 | MiniMax | MiniMax-M2.7 | 22.22 | 49.62 | 3.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
188 | AI21 Labs | Jamba 1.7 Mini | 22.22 | 8.07 | 4.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
189 | AI21 Labs | Jamba 1.7 Large | 22.22 | 10.88 | 4.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
190 | Alibaba | Qwen3 Max | 16.67 | 31.38 | 2.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
191 | Alibaba | Qwen3 Max Thinking (Preview) | 16.67 | 32.48 | 2.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00
192 | Google | Gemini 2.5 Flash-Lite Preview (Sep '25) (Non-reasoning) | 11.11 | 19.42 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
193 | Anthropic | Claude 4.5 Haiku (Non-reasoning) | 11.11 | 31.05 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
194 | Anthropic | Claude 4.5 Haiku (Reasoning) | 11.11 | 37.09 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
195 | xAI | Grok 3 mini Reasoning (high) | 11.11 | 32.08 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
196 | xAI | Grok 4.1 Fast (Non-reasoning) | 11.11 | 23.56 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
197 | Amazon | Nova Micro | 11.11 | 10.27 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
198 | Amazon | Nova Premier | 11.11 | 19.01 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
199 | Upstage | Solar Pro 2 (Reasoning) | 11.11 | 14.92 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
200 | Upstage | Solar Pro 2 (Non-reasoning) | 11.11 | 13.59 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
201 | ByteDance Seed | Doubao Seed Code | 11.11 | 33.52 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
202 | OpenAI | GPT-5.1 (Non-reasoning) | 11.11 | 27.42 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
203 | OpenAI | GPT-5 (ChatGPT) | 11.11 | 21.83 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
204 | Google | Gemini 2.5 Flash Preview (Sep '25) (Non-reasoning) | 11.11 | 25.70 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
205 | Anthropic | Claude Opus 4.5 (Non-reasoning) | 11.11 | 43.09 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
206 | Anthropic | Claude 4.5 Sonnet (Non-reasoning) | 11.11 | 37.14 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
207 | Anthropic | Claude 4.5 Sonnet (Reasoning) | 11.11 | 43.03 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
208 | Anthropic | Claude Opus 4.5 (Reasoning) | 11.11 | 49.73 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
209 | Mistral | Devstral Medium | 11.11 | 18.66 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
210 | Mistral | Mistral Medium 3.1 | 11.11 | 21.25 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
211 | xAI | Grok 4 Fast (Non-reasoning) | 11.11 | 23.12 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
212 | Amazon | Nova Pro | 11.11 | 13.48 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
213 | Amazon | Nova Lite | 11.11 | 12.65 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
214 | OpenAI | o3 | 5.56 | 38.37 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
215 | Google | Gemini 2.5 Flash-Lite Preview (Sep '25) (Reasoning) | 5.56 | 21.65 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
216 | Google | Gemini 2.5 Pro | 5.56 | 34.63 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
217 | xAI | Grok 4.1 Fast (Reasoning) | 5.56 | 38.61 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
218 | xAI | Grok Code Fast 1 | 5.56 | 28.74 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
219 | OpenAI | GPT-5 (minimal) | 5.56 | 23.89 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
220 | OpenAI | GPT-5 (medium) | 5.56 | 42.03 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
221 | OpenAI | GPT-5 Codex (high) | 5.56 | 44.63 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
222 | OpenAI | GPT-5.1 (high) | 5.56 | 47.70 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
223 | OpenAI | GPT-5 (high) | 5.56 | 44.63 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
224 | OpenAI | GPT-5 mini (high) | 5.56 | 41.17 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
225 | OpenAI | GPT-5 nano (high) | 5.56 | 26.83 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
226 | OpenAI | GPT-5 nano (minimal) | 5.56 | 13.84 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
227 | OpenAI | GPT-5 (low) | 5.56 | 39.20 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
228 | OpenAI | GPT-5 nano (medium) | 5.56 | 25.88 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
229 | OpenAI | GPT-5 mini (minimal) | 5.56 | 20.68 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
230 | OpenAI | GPT-5 mini (medium) | 5.56 | 38.94 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
231 | Google | Gemini 2.5 Flash Preview (Sep '25) (Reasoning) | 5.56 | 31.14 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
232 | Google | Gemini 3 Pro Preview (high) | 5.56 | 48.39 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
233 | xAI | Grok 4 Fast (Reasoning) | 5.56 | 35.06 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
234 | xAI | Grok 4 | 5.56 | 41.52 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
