
Artificial Analysis Openness Index

A composite measure providing an industry standard to communicate model openness for users and developers.

Background

The Artificial Analysis Openness Index assesses how 'open' models are on the basis of their availability and transparency across different components (e.g. model weights, training data, and model architecture).
Availability represents the ability to use a model via API, to self-host it through open weights, and to use it freely under permissive licensing. Transparency captures the degree to which a model's methodology and data have been disclosed, shared, and permissively licensed, so that the community can understand a model's inputs and replicate or build on its approach.

Methodology

All evaluations are conducted independently by Artificial Analysis. More information can be found on our Intelligence Benchmarking Methodology page.

Highlights

  • Olmo 3.1 32B Instruct has the highest Openness Index score at 89, tied with Olmo 3 7B Think and Olmo 3 7B Instruct, which also score 89
  • o3 has the lowest Openness Index score at 6, tied with Gemini 2.5 Flash-Lite Preview (Sep '25) (Reasoning) and Gemini 2.5 Pro, which also score 6

Artificial Analysis Openness Index: Results

Openness Index assesses model openness on a 0 to 100 normalized scale (higher is more open)

Artificial Analysis Openness Index: Components

Openness Index underlying score contribution by components, up to a maximum of 18 (higher is more open)
Components shown: Model Availability; Transparency - Methodology; Transparency - Post-training Data; Transparency - Pre-training Data

Artificial Analysis Openness Index: Model Availability vs. Model Transparency

Model Availability reflects the availability of a model for usage and associated license (maximum 6 points); Model Transparency reflects methodology and data disclosures, data sharing, and code and licensing associated with a model's training process (maximum 12 points)

Artificial Analysis Openness Index: Score vs. Release Date

Artificial Analysis Openness Index; Release Date

Artificial Analysis Openness Index vs. Artificial Analysis Intelligence Index

Artificial Analysis Openness Index; Artificial Analysis Intelligence Index

Openness Index Composition

Detailed methodology

1. Model availability

Weights: Access
  • 0: Closed weights, no API
  • 1: Closed weights, API limits token visibility
  • 2: Closed weights, API available
  • 3: Open weights

Weights: License
  • 0: Closed weights or no commercial use
  • 1: Commercial use, attribution required
  • 2: Commercial use, no attribution required
  • 3: Commercial use, no attribution required, no meaningful limitations

2. Model transparency

Data: Pre- & Post-Training (score represents the average across each)

Data: Access
  • 0: No or limited disclosure
  • 1: Partial data source detail and categorization disclosed
  • 2: Full data mix disclosure, substantial data shared¹
  • 3: Full data shared

Data: License (most restrictive)
  • 0: No commercial use/no substantial data shared
  • 1: Commercial use, attribution required
  • 2: Commercial use, no attribution required
  • 3: Commercial use, no attribution required, no meaningful limitations

Methodology: Disclosure
  • 0: No or limited disclosure
  • 1: Model architecture disclosure
  • 2: Limited general technical disclosure
  • 3: Full technical details disclosed

Methodology: License (most restrictive)
  • 0: No code disclosed/released
  • 1: Frameworks disclosed, openly available for commercial use
  • 2: End-to-end training pipeline code or guide released
  • 3: End-to-end training pipeline code or guide released, and commercial use allowed

Scoring methodology

Each component is scored on a 0-3 qualitative scale based on the best-fitting openness 'archetype', with each model assessed based on the full set of public first-party information available.

We synthesize these underlying factors into a unified metric, the Artificial Analysis Openness Index, as follows:

  • Data elements are averaged between pre- and post-training (to give a total of 6 possible points across data)
  • All component scores are added (up to a maximum of 18/18 points)
  • This score is normalized to a 0-100 scale
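
The aggregation steps above can be sketched in a few lines of Python. This is only an illustration of the arithmetic as described, not Artificial Analysis' own implementation: the class name, field names, and function names are hypothetical, and the example profile is invented so that it totals 16 of 18 points.

```python
from dataclasses import asdict, dataclass

MAX_POINTS = 18  # 6 availability + 6 data (averaged) + 6 methodology


@dataclass(frozen=True)
class OpennessScores:
    """Hypothetical container for the eight 0-3 component scores."""
    weights_access: int
    weights_license: int
    pretrain_data_access: int
    pretrain_data_license: int
    posttrain_data_access: int
    posttrain_data_license: int
    methodology_disclosure: int
    methodology_license: int

    def __post_init__(self):
        # Every component is scored on the 0-3 qualitative scale.
        for name, value in asdict(self).items():
            if not 0 <= value <= 3:
                raise ValueError(f"{name} must be between 0 and 3, got {value}")


def availability_points(s: OpennessScores) -> float:
    """Model availability: access plus license (maximum 6 points)."""
    return s.weights_access + s.weights_license


def transparency_points(s: OpennessScores) -> float:
    """Model transparency: averaged data points plus methodology (maximum 12)."""
    pre = s.pretrain_data_access + s.pretrain_data_license
    post = s.posttrain_data_access + s.posttrain_data_license
    data = (pre + post) / 2  # pre- and post-training are averaged (max 6)
    methodology = s.methodology_disclosure + s.methodology_license
    return data + methodology


def openness_index(s: OpennessScores) -> float:
    """Sum all points (out of 18) and normalize to a 0-100 scale."""
    raw = availability_points(s) + transparency_points(s)
    return round(100 * raw / MAX_POINTS, 2)


# Invented profile: open weights, fully permissive weights license,
# full data shared under an attribution-required license, and full
# methodology and code disclosure. Total: 6 + 4 + 6 = 16 of 18 points.
example = OpennessScores(3, 3, 3, 1, 3, 1, 3, 3)
print(openness_index(example))  # 88.89
```

Because the data score is an average of two integer totals, transparency can land on a half point, which is why some leaderboard entries show values such as 5.50 or 3.50.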

Where models are derived from a third-party base model, they may be constrained by the licensing or limited disclosure of the upstream model. For incremental/update releases, we only consider disclosures explicitly about the new release (including allowing model creators to declare which components remain consistent with an earlier release).
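
One way to picture the upstream constraint is to treat the license score of a derived model as capped by the score of its base model. The sketch below assumes that "most restrictive" can be modelled as the minimum of two 0-3 license scores. That is an interpretation for illustration only, and the function name is hypothetical.

```python
from typing import Optional


def effective_license_score(own_score: int, upstream_score: Optional[int] = None) -> int:
    """Return a license score after applying an upstream constraint.

    Assumption: the most restrictive license wins, modelled here as the
    minimum of the two 0-3 scores. Not the published scoring rule.
    """
    for score in (own_score, upstream_score):
        if score is not None and not 0 <= score <= 3:
            raise ValueError("license scores must be between 0 and 3")
    if upstream_score is None:
        # No third-party base model: the release's own license applies.
        return own_score
    return min(own_score, upstream_score)


# A fine-tune released under a fully permissive license (3) on top of a
# base model whose license requires attribution (1) is capped at 1.
print(effective_license_score(3, 1))  # 1
```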

Openness Index Leaderboard

Rank | Creator | Model | Openness Index | Intelligence Index | Model Availability | Model Transparency | Data sub-scores
1 | Allen Institute for AI | Olmo 3.1 32B Instruct | 88.89 | 12.16 | 6.00 | 10.00 | 3.00, 1.00, 3.00, 1.00
2 | Allen Institute for AI | Olmo 3 7B Think | 88.89 | 9.43 | 6.00 | 10.00 | 3.00, 1.00, 3.00, 1.00
3 | Allen Institute for AI | Olmo 3 7B Instruct | 88.89 | 8.15 | 6.00 | 10.00 | 3.00, 1.00, 3.00, 1.00
4 | Allen Institute for AI | Molmo 7B-D | 88.89 | 9.25 | 6.00 | 10.00 | 3.00, 1.00, 3.00, 1.00
5 | Allen Institute for AI | Olmo 3.1 32B Think | 88.89 | 13.94 | 6.00 | 10.00 | 3.00, 1.00, 3.00, 1.00
6 | MBZUAI Institute of Foundation Models | K2-V2 (high) | 88.89 | 20.61 | 6.00 | 10.00 | 3.00, 1.00, 3.00, 1.00
7 | MBZUAI Institute of Foundation Models | K2-V2 (low) | 88.89 | 14.44 | 6.00 | 10.00 | 3.00, 1.00, 3.00, 1.00
8 | MBZUAI Institute of Foundation Models | K2 Think V2 | 88.89 | 24.12 | 6.00 | 10.00 | 3.00, 1.00, 3.00, 1.00
9 | MBZUAI Institute of Foundation Models | K2-V2 (medium) | 88.89 | 18.68 | 6.00 | 10.00 | 3.00, 1.00, 3.00, 1.00
10 | Swiss AI Initiative | Apertus 70B Instruct | 88.89 | 7.70 | 6.00 | 10.00 | 3.00, 1.00, 3.00, 1.00
11 | Swiss AI Initiative | Apertus 8B Instruct | 88.89 | 5.88 | 6.00 | 10.00 | 3.00, 1.00, 3.00, 1.00
12 | Allen Institute for AI | Olmo 3 32B Think | 88.89 | 12.09 | 6.00 | 10.00 | 3.00, 1.00, 3.00, 1.00
13 | Allen Institute for AI | OLMo 2 7B | 88.89 | 9.30 | 6.00 | 10.00 | 3.00, 1.00, 3.00, 1.00
14 | Allen Institute for AI | OLMo 2 32B | 88.89 | 10.57 | 6.00 | 10.00 | 3.00, 1.00, 3.00, 1.00
15 | NVIDIA | NVIDIA Nemotron 3 Super 120B A12B (Reasoning) | 83.33 | 35.97 | 6.00 | 9.00 | 2.00, 1.00, 2.00, 1.00
16 | NVIDIA | NVIDIA Nemotron 3 Nano 30B A3B (Reasoning) | 83.33 | 24.27 | 6.00 | 9.00 | 2.00, 1.00, 2.00, 1.00
17 | NVIDIA | NVIDIA Nemotron Nano 9B V2 (Non-reasoning) | 72.22 | 13.16 | 6.00 | 7.00 | 2.00, 1.00, 2.00, 1.00
18 | NVIDIA | NVIDIA Nemotron Nano 12B v2 VL (Reasoning) | 72.22 | 14.89 | 6.00 | 7.00 | 2.00, 1.00, 2.00, 1.00
19 | NVIDIA | NVIDIA Nemotron Nano 12B v2 VL (Non-reasoning) | 72.22 | 10.09 | 6.00 | 7.00 | 2.00, 1.00, 2.00, 1.00
20 | NVIDIA | NVIDIA Nemotron Nano 9B V2 (Reasoning) | 72.22 | 14.76 | 6.00 | 7.00 | 2.00, 1.00, 2.00, 1.00
21 | Allen Institute for AI | Molmo2-8B | 72.22 | 7.30 | 6.00 | 7.00 | 3.00, 1.00, 3.00, 1.00
22 | Kimi | Kimi Linear 48B A3B Instruct | 61.11 | 14.41 | 6.00 | 5.00 | 1.00, 0.00, 1.00, 0.00
23 | IBM | Granite 4.0 H 1B | 55.56 | 7.99 | 6.00 | 4.00 | 2.00, 1.00, 2.00, 1.00
24 | IBM | Granite 4.0 1B | 55.56 | 7.34 | 6.00 | 4.00 | 2.00, 1.00, 2.00, 1.00
25 | IBM | Granite 4.0 Micro | 55.56 | 7.67 | 6.00 | 4.00 | 2.00, 1.00, 2.00, 1.00
26 | IBM | Granite 4.0 H 350M | 55.56 | 5.44 | 6.00 | 4.00 | 2.00, 1.00, 2.00, 1.00
27 | IBM | Granite 4.0 350M | 55.56 | 6.10 | 6.00 | 4.00 | 2.00, 1.00, 2.00, 1.00
28 | IBM | Granite 4.0 H Small | 55.56 | 10.81 | 6.00 | 4.00 | 2.00, 1.00, 2.00, 1.00
29 | Baidu | ERNIE 4.5 300B A47B | 55.56 | 14.96 | 6.00 | 4.00 | 0.00, 0.00, 0.00, 0.00
30 | Z AI | GLM-4.5-Air | 55.56 | 23.17 | 6.00 | 4.00 | 0.00, 0.00, 0.00, 0.00
31 | Z AI | GLM-4.5 (Reasoning) | 55.56 | 26.42 | 6.00 | 4.00 | 0.00, 0.00, 0.00, 0.00
32 | NVIDIA | Llama 3.1 Nemotron Nano 4B v1.1 (Reasoning) | 52.78 | 14.43 | 4.00 | 5.50 | 1.00, 0.00, 1.00, 1.00
33 | NVIDIA | Llama Nemotron Super 49B v1.5 (Non-reasoning) | 52.78 | 14.59 | 4.00 | 5.50 | 1.00, 0.00, 1.00, 1.00
34 | NVIDIA | Llama 3.1 Nemotron Ultra 253B v1 (Reasoning) | 52.78 | 15.02 | 4.00 | 5.50 | 1.00, 0.00, 1.00, 1.00
35 | NVIDIA | Llama 3.3 Nemotron Super 49B v1 (Reasoning) | 52.78 | 18.49 | 4.00 | 5.50 | 1.00, 0.00, 1.00, 1.00
36 | NVIDIA | Llama 3.3 Nemotron Super 49B v1 (Non-reasoning) | 52.78 | 14.35 | 4.00 | 5.50 | 1.00, 0.00, 1.00, 1.00
37 | NVIDIA | Llama Nemotron Super 49B v1.5 (Reasoning) | 52.78 | 18.68 | 4.00 | 5.50 | 1.00, 0.00, 1.00, 1.00
38 | Xiaomi | MiMo-V2-Flash (Reasoning) | 52.78 | 39.24 | 6.00 | 3.50 | 0.00, 0.00, 1.00, 0.00
39 | Z AI | GLM-4.5V (Non-reasoning) | 52.78 | 12.74 | 6.00 | 3.50 | 1.00, 0.00, 0.00, 0.00
40 | Z AI | GLM-4.5V (Reasoning) | 52.78 | 15.09 | 6.00 | 3.50 | 1.00, 0.00, 0.00, 0.00
41 | Google | Gemma 3n E4B Instruct | 50.00 | 6.38 | 6.00 | 3.00 | 0.00, 0.00, 0.00, 0.00
42 | Google | Gemma 3 12B Instruct | 50.00 | 8.79 | 6.00 | 3.00 | 0.00, 0.00, 0.00, 0.00
43 | Google | Gemma 3 4B Instruct | 50.00 | 6.30 | 6.00 | 3.00 | 0.00, 0.00, 0.00, 0.00
44 | Google | Gemma 3 27B Instruct | 50.00 | 10.31 | 6.00 | 3.00 | 0.00, 0.00, 0.00, 0.00
45 | Google | Gemma 3 1B Instruct | 50.00 | 5.55 | 6.00 | 3.00 | 0.00, 0.00, 0.00, 0.00
46 | Google | Gemma 3n E2B Instruct | 50.00 | 4.76 | 6.00 | 3.00 | 0.00, 0.00, 0.00, 0.00
47 | Mistral | Magistral Small 1.2 | 50.00 | 18.16 | 6.00 | 3.00 | 0.00, 0.00, 1.00, 1.00
48 | DeepSeek | DeepSeek R1 0528 (May '25) | 50.00 | 27.07 | 6.00 | 3.00 | 0.00, 0.00, 0.00, 0.00
49 | Microsoft Azure | Phi-4 | 50.00 | 10.41 | 6.00 | 3.00 | 1.00, 0.00, 1.00, 0.00
50 | Microsoft Azure | Phi-4 Mini Instruct | 50.00 | 8.39 | 6.00 | 3.00 | 1.00, 0.00, 1.00, 0.00
51 | Microsoft Azure | Phi-4 Multimodal Instruct | 50.00 | 10.04 | 6.00 | 3.00 | 1.00, 0.00, 1.00, 0.00
52 | StepFun | Step 3.5 Flash | 50.00 | 37.80 | 6.00 | 3.00 | 0.00, 0.00, 0.00, 0.00
53 | Z AI | GLM-5 (Reasoning) | 50.00 | 49.77 | 6.00 | 3.00 | 0.00, 0.00, 0.00, 0.00
54 | Alibaba | Qwen3 VL 30B A3B Instruct | 50.00 | 16.05 | 6.00 | 3.00 | 1.00, 0.00, 1.00, 0.00
55 | Alibaba | Qwen3 VL 30B A3B (Reasoning) | 50.00 | 19.68 | 6.00 | 3.00 | 1.00, 0.00, 1.00, 0.00
56 | Alibaba | Qwen3 VL 32B (Reasoning) | 50.00 | 24.72 | 6.00 | 3.00 | 1.00, 0.00, 1.00, 0.00
57 | Alibaba | Qwen3 VL 32B Instruct | 50.00 | 17.19 | 6.00 | 3.00 | 1.00, 0.00, 1.00, 0.00
58 | Alibaba | Qwen3 VL 235B A22B (Reasoning) | 50.00 | 27.64 | 6.00 | 3.00 | 1.00, 0.00, 1.00, 0.00
59 | Alibaba | Qwen3 VL 8B (Reasoning) | 50.00 | 16.66 | 6.00 | 3.00 | 1.00, 0.00, 1.00, 0.00
60 | Alibaba | Qwen3 VL 235B A22B Instruct | 50.00 | 20.75 | 6.00 | 3.00 | 1.00, 0.00, 1.00, 0.00
61 | Alibaba | Qwen3 VL 4B Instruct | 50.00 | 9.55 | 6.00 | 3.00 | 1.00, 0.00, 1.00, 0.00
62 | Alibaba | Qwen3 VL 4B (Reasoning) | 50.00 | 13.73 | 6.00 | 3.00 | 1.00, 0.00, 1.00, 0.00
63 | Alibaba | Qwen3 VL 8B Instruct | 50.00 | 14.30 | 6.00 | 3.00 | 1.00, 0.00, 1.00, 0.00
64 | DeepSeek | DeepSeek R1 0528 Qwen3 8B | 47.22 | 16.43 | 6.00 | 2.50 | 0.00, 0.00, 1.00, 0.00
65 | Nous Research | Hermes 4 - Llama-3.1 70B (Reasoning) | 47.22 | 15.99 | 4.00 | 4.50 | 1.00, 0.00, 2.00, 0.00
66 | Nous Research | Hermes 4 - Llama-3.1 405B (Reasoning) | 47.22 | 18.56 | 4.00 | 4.50 | 1.00, 0.00, 2.00, 0.00
67 | Nous Research | Hermes 4 - Llama-3.1 70B (Non-reasoning) | 47.22 | 12.63 | 4.00 | 4.50 | 1.00, 0.00, 2.00, 0.00
68 | Nous Research | Hermes 4 - Llama-3.1 405B (Non-reasoning) | 47.22 | 17.63 | 4.00 | 4.50 | 1.00, 0.00, 2.00, 0.00
69 | ServiceNow | Apriel-v1.5-15B-Thinker | 47.22 | 28.33 | 6.00 | 2.50 | 0.00, 0.00, 1.00, 0.00
70 | Google | Gemma 3 270M | 44.44 | 7.71 | 6.00 | 2.00 | 0.00, 0.00, 0.00, 0.00
71 | TII UAE | Falcon-H1R-7B | 44.44 | 15.80 | 4.00 | 4.00 | 1.00, 0.00, 1.00, 0.00
72 | NVIDIA | Llama 3.1 Nemotron Instruct 70B | 44.44 | 13.44 | 4.00 | 4.00 | 0.00, 0.00, 1.00, 1.00
73 | LongCat | LongCat Flash Lite | 44.44 | 23.93 | 6.00 | 2.00 | 0.00, 0.00, 0.00, 0.00
74 | Alibaba | Qwen3 Next 80B A3B (Reasoning) | 44.44 | 26.72 | 6.00 | 2.00 | 0.00, 0.00, 0.00, 0.00
75 | Alibaba | Qwen3 Coder 480B A35B Instruct | 44.44 | 24.77 | 6.00 | 2.00 | 0.00, 0.00, 0.00, 0.00
76 | Alibaba | Qwen3 Next 80B A3B Instruct | 44.44 | 20.11 | 6.00 | 2.00 | 0.00, 0.00, 0.00, 0.00
77 | Alibaba | Qwen3 Omni 30B A3B Instruct | 44.44 | 10.68 | 6.00 | 2.00 | 0.00, 0.00, 0.00, 0.00
78 | Alibaba | Qwen3 Omni 30B A3B (Reasoning) | 44.44 | 15.62 | 6.00 | 2.00 | 0.00, 0.00, 0.00, 0.00
79 | InclusionAI | Ling-flash-2.0 | 44.44 | 15.74 | 6.00 | 2.00 | 0.00, 0.00, 0.00, 0.00
80 | InclusionAI | Ling-mini-2.0 | 44.44 | 9.19 | 6.00 | 2.00 | 0.00, 0.00, 0.00, 0.00
81 | InclusionAI | Ling-1T | 44.44 | 19.04 | 6.00 | 2.00 | 0.00, 0.00, 0.00, 0.00
82 | Mistral | Devstral Small (Jul '25) | 44.44 | 15.21 | 6.00 | 2.00 | 0.00, 0.00, 0.00, 0.00
83 | DeepSeek | DeepSeek V3.2 Exp (Reasoning) | 44.44 | 32.94 | 6.00 | 2.00 | 0.00, 0.00, 0.00, 0.00
84 | DeepSeek | DeepSeek V3.2 Exp (Non-reasoning) | 44.44 | 28.44 | 6.00 | 2.00 | 0.00, 0.00, 0.00, 0.00
85 | Kimi | Kimi K2 | 44.44 | 26.32 | 4.00 | 4.00 | 1.00, 0.00, 1.00, 0.00
86 | Z AI | GLM-4.7 (Reasoning) | 44.44 | 42.11 | 6.00 | 2.00 | 0.00, 0.00, 0.00, 0.00
87 | Z AI | GLM-4.7 (Non-reasoning) | 44.44 | 34.16 | 6.00 | 2.00 | 0.00, 0.00, 0.00, 0.00
88 | Z AI | GLM-4.6 (Reasoning) | 44.44 | 32.51 | 6.00 | 2.00 | 0.00, 0.00, 0.00, 0.00
89 | Z AI | GLM-4.7-Flash (Reasoning) | 44.44 | 30.15 | 6.00 | 2.00 | 0.00, 0.00, 0.00, 0.00
90 | Z AI | GLM-4.7-Flash (Non-reasoning) | 44.44 | 22.07 | 6.00 | 2.00 | 0.00, 0.00, 0.00, 0.00
91 | Z AI | GLM-4.6 (Non-reasoning) | 44.44 | 30.24 | 6.00 | 2.00 | 0.00, 0.00, 0.00, 0.00
92 | Alibaba | Qwen3 Coder 30B A3B Instruct | 44.44 | 19.98 | 6.00 | 2.00 | 0.00, 0.00, 0.00, 0.00
93 | Alibaba | Qwen3 30B A3B 2507 Instruct | 44.44 | 15.00 | 6.00 | 2.00 | 0.00, 0.00, 0.00, 0.00
94 | Alibaba | Qwen3 30B A3B 2507 (Reasoning) | 44.44 | 22.41 | 6.00 | 2.00 | 0.00, 0.00, 0.00, 0.00
95 | Alibaba | Qwen3 235B A22B 2507 (Reasoning) | 44.44 | 29.54 | 6.00 | 2.00 | 0.00, 0.00, 0.00, 0.00
96 | Alibaba | Qwen3 235B A22B 2507 Instruct | 44.44 | 24.96 | 6.00 | 2.00 | 0.00, 0.00, 0.00, 0.00
97 | Alibaba | Qwen3 4B 2507 (Reasoning) | 44.44 | 18.18 | 6.00 | 2.00 | 0.00, 0.00, 0.00, 0.00
98 | Alibaba | Qwen3 4B 2507 Instruct | 44.44 | 12.88 | 6.00 | 2.00 | 0.00, 0.00, 0.00, 0.00
99 | ByteDance Seed | Seed-OSS-36B-Instruct | 44.44 | 25.16 | 6.00 | 2.00 | 0.00, 0.00, 0.00, 0.00
100 | Alibaba | Qwen3 Coder Next | 41.67 | 28.28 | 6.00 | 1.50 | 0.00, 0.00, 1.00, 0.00
101 | OpenAI | gpt-oss-120B (high) | 38.89 | 33.27 | 6.00 | 1.00 | 0.00, 0.00, 0.00, 0.00
102 | OpenAI | gpt-oss-20B (high) | 38.89 | 24.47 | 6.00 | 1.00 | 0.00, 0.00, 0.00, 0.00
103 | Meta | Llama 3.3 Instruct 70B | 38.89 | 14.49 | 4.00 | 3.00 | 1.00, 0.00, 1.00, 0.00
104 | Meta | Llama 3.1 Instruct 405B | 38.89 | 17.38 | 4.00 | 3.00 | 1.00, 0.00, 1.00, 0.00
105 | Meta | Llama 3.2 Instruct 90B (Vision) | 38.89 | 11.90 | 4.00 | 3.00 | 1.00, 0.00, 1.00, 0.00
106 | Meta | Llama 3.2 Instruct 11B (Vision) | 38.89 | 8.73 | 4.00 | 3.00 | 1.00, 0.00, 1.00, 0.00
107 | Mistral | Mistral Small 4 (Non-reasoning) | 38.89 | 18.62 | 6.00 | 1.00 | 0.00, 0.00, 0.00, 0.00
108 | Mistral | Mistral Large 3 | 38.89 | 22.80 | 6.00 | 1.00 | 0.00, 0.00, 0.00, 0.00
109 | Mistral | Mistral Small 4 (Reasoning) | 38.89 | 27.19 | 6.00 | 1.00 | 0.00, 0.00, 0.00, 0.00
110 | Perplexity | R1 1776 | 38.89 | 11.99 | 6.00 | 1.00 | 0.00, 0.00, 0.00, 0.00
111 | Reka AI | Reka Flash 3 | 38.89 | 9.52 | 6.00 | 1.00 | 0.00, 0.00, 0.00, 0.00
112 | Nous Research | DeepHermes 3 - Mistral 24B Preview (Non-reasoning) | 38.89 | 10.89 | 6.00 | 1.00 | 0.00, 0.00, 0.00, 0.00
113 | Sarvam | Sarvam 30B (high) | 38.89 | 12.34 | 6.00 | 1.00 | 0.00, 0.00, 0.00, 0.00
114 | Sarvam | Sarvam 105B (high) | 38.89 | 18.16 | 6.00 | 1.00 | 0.00, 0.00, 0.00, 0.00
115 | Deep Cogito | Cogito v2.1 (Reasoning) | 38.89 | - | 6.00 | 1.00 | 0.00, 0.00, 0.00, 0.00
116 | AI21 Labs | Jamba Reasoning 3B | 38.89 | 9.60 | 6.00 | 1.00 | 0.00, 0.00, 0.00, 0.00
117 | Alibaba | Qwen3.5 4B (Non-reasoning) | 38.89 | 22.60 | 6.00 | 1.00 | 0.00, 0.00, 0.00, 0.00
118 | Alibaba | Qwen3.5 0.8B (Non-reasoning) | 38.89 | 9.91 | 6.00 | 1.00 | 0.00, 0.00, 0.00, 0.00
119 | Alibaba | Qwen3.5 4B (Reasoning) | 38.89 | 27.08 | 6.00 | 1.00 | 0.00, 0.00, 0.00, 0.00
120 | Alibaba | Qwen3.5 9B (Reasoning) | 38.89 | 32.43 | 6.00 | 1.00 | 0.00, 0.00, 0.00, 0.00
121 | Alibaba | Qwen3.5 397B A17B (Non-reasoning) | 38.89 | 40.10 | 6.00 | 1.00 | 0.00, 0.00, 0.00, 0.00
122 | Alibaba | Qwen3.5 397B A17B (Reasoning) | 38.89 | 45.05 | 6.00 | 1.00 | 0.00, 0.00, 0.00, 0.00
123 | Alibaba | Qwen3.5 122B A10B (Reasoning) | 38.89 | 41.60 | 6.00 | 1.00 | 0.00, 0.00, 0.00, 0.00
124 | Alibaba | Qwen3.5 35B A3B (Reasoning) | 38.89 | 37.12 | 6.00 | 1.00 | 0.00, 0.00, 0.00, 0.00
125 | Alibaba | Qwen3.5 27B (Reasoning) | 38.89 | 42.07 | 6.00 | 1.00 | 0.00, 0.00, 0.00, 0.00
126 | Alibaba | Qwen3.5 2B (Reasoning) | 38.89 | 16.29 | 6.00 | 1.00 | 0.00, 0.00, 0.00, 0.00
127 | Alibaba | Qwen3.5 0.8B (Reasoning) | 38.89 | 10.52 | 6.00 | 1.00 | 0.00, 0.00, 0.00, 0.00
128 | Alibaba | Qwen3.5 27B (Non-reasoning) | 38.89 | 37.18 | 6.00 | 1.00 | 0.00, 0.00, 0.00, 0.00
129 | Alibaba | Qwen3.5 122B A10B (Non-reasoning) | 38.89 | 35.87 | 6.00 | 1.00 | 0.00, 0.00, 0.00, 0.00
130 | Alibaba | Qwen3.5 35B A3B (Non-reasoning) | 38.89 | 30.69 | 6.00 | 1.00 | 0.00, 0.00, 0.00, 0.00
131 | Alibaba | Qwen3.5 9B (Non-reasoning) | 38.89 | 27.33 | 6.00 | 1.00 | 0.00, 0.00, 0.00, 0.00
132 | Alibaba | Qwen3.5 2B (Non-reasoning) | 38.89 | 14.67 | 6.00 | 1.00 | 0.00, 0.00, 0.00, 0.00
133 | InclusionAI | Ring-1T | 38.89 | 22.78 | 6.00 | 1.00 | 0.00, 0.00, 0.00, 0.00
134 | InclusionAI | Ring-flash-2.0 | 38.89 | 14.02 | 6.00 | 1.00 | 0.00, 0.00, 0.00, 0.00
135 | Mistral | Mistral Small 3.2 | 38.89 | 15.07 | 6.00 | 1.00 | 0.00, 0.00, 0.00, 0.00
136 | DeepSeek | DeepSeek V3.1 Terminus (Reasoning) | 38.89 | 33.93 | 6.00 | 1.00 | 0.00, 0.00, 0.00, 0.00
137 | DeepSeek | DeepSeek V3.1 Terminus (Non-reasoning) | 38.89 | 28.52 | 6.00 | 1.00 | 0.00, 0.00, 0.00, 0.00
138 | DeepSeek | DeepSeek R1 Distill Llama 70B | 36.11 | 15.95 | 4.00 | 2.50 | 0.00, 0.00, 1.00, 0.00
139 | Liquid AI | LFM2 8B A1B | 33.33 | 7.03 | 4.00 | 2.00 | 0.00, 0.00, 0.00, 0.00
140 | Liquid AI | LFM2 2.6B | 33.33 | 8.04 | 4.00 | 2.00 | 0.00, 0.00, 0.00, 0.00
141 | Kimi | Kimi K2.5 (Reasoning) | 33.33 | 46.81 | 4.00 | 2.00 | 0.00, 0.00, 0.00, 0.00
142 | Nous Research | DeepHermes 3 - Llama-3.1 8B Preview (Non-reasoning) | 33.33 | 7.58 | 5.00 | 1.00 | 0.00, 0.00, 0.00, 0.00
143 | Cohere | Command A | 33.33 | 13.48 | 3.00 | 3.00 | 0.00, 0.00, 0.00, 0.00
144 | Liquid AI | LFM2 1.2B | 33.33 | 6.33 | 4.00 | 2.00 | 0.00, 0.00, 0.00, 0.00
145 | Naver | HyperCLOVA X SEED Think (32B) | 30.56 | 23.72 | 4.00 | 1.50 | 1.00, 0.00, 0.00, 0.00
146 | Meta | Llama 4 Maverick | 27.78 | 18.36 | 4.00 | 1.00 | 0.00, 0.00, 0.00, 0.00
147 | Meta | Llama 4 Scout | 27.78 | 13.52 | 4.00 | 1.00 | 0.00, 0.00, 0.00, 0.00
148 | Mistral | Magistral Medium 1.2 | 27.78 | 27.10 | 2.00 | 3.00 | 0.00, 0.00, 1.00, 1.00
149 | Liquid AI | LFM2.5-1.2B-Instruct | 27.78 | 8.04 | 4.00 | 1.00 | 0.00, 0.00, 0.00, 0.00
150 | Liquid AI | LFM2 24B A2B | 27.78 | 10.49 | 4.00 | 1.00 | 0.00, 0.00, 0.00, 0.00
151 | Liquid AI | LFM2.5-VL-1.6B | 27.78 | 6.18 | 4.00 | 1.00 | 0.00, 0.00, 0.00, 0.00
152 | Liquid AI | LFM2.5-1.2B-Thinking | 27.78 | 8.08 | 4.00 | 1.00 | 0.00, 0.00, 0.00, 0.00
153 | LG AI Research | K-EXAONE (Reasoning) | 27.78 | 32.12 | 4.00 | 1.00 | 0.00, 0.00, 0.00, 0.00
154 | LG AI Research | Exaone 4.0 1.2B (Reasoning) | 27.78 | 8.26 | 3.00 | 2.00 | 0.00, 0.00, 0.00, 0.00
155 | LG AI Research | Exaone 4.0 1.2B (Non-reasoning) | 27.78 | 8.11 | 3.00 | 2.00 | 0.00, 0.00, 0.00, 0.00
156 | LG AI Research | EXAONE 4.0 32B (Reasoning) | 27.78 | 16.68 | 3.00 | 2.00 | 0.00, 0.00, 0.00, 0.00
157 | LG AI Research | EXAONE 4.0 32B (Non-reasoning) | 27.78 | 11.66 | 3.00 | 2.00 | 0.00, 0.00, 0.00, 0.00
158 | MiniMax | MiniMax-M2.1 | 27.78 | 39.42 | 4.00 | 1.00 | 0.00, 0.00, 0.00, 0.00
159 | MiniMax | MiniMax-M2.5 | 27.78 | 41.93 | 4.00 | 1.00 | 0.00, 0.00, 0.00, 0.00
160 | MiniMax | MiniMax-M2 | 27.78 | 36.09 | 4.00 | 1.00 | 0.00, 0.00, 0.00, 0.00
161 | Kimi | Kimi K2 0905 | 27.78 | 30.85 | 4.00 | 1.00 | 0.00, 0.00, 0.00, 0.00
162 | Kimi | Kimi K2 Thinking | 27.78 | 40.89 | 4.00 | 1.00 | 0.00, 0.00, 0.00, 0.00
163 | AI21 Labs | Jamba 1.7 Mini | 22.22 | 8.07 | 4.00 | 0.00 | 0.00, 0.00, 0.00, 0.00
164 | AI21 Labs | Jamba 1.7 Large | 22.22 | 10.88 | 4.00 | 0.00 | 0.00, 0.00, 0.00, 0.00
165 | Alibaba | Qwen3 Max Thinking (Preview) | 16.67 | 32.48 | 2.00 | 1.00 | 0.00, 0.00, 0.00, 0.00
166 | Alibaba | Qwen3 Max | 16.67 | 31.38 | 2.00 | 1.00 | 0.00, 0.00, 0.00, 0.00
167 | Google | Gemini 2.5 Flash-Lite Preview (Sep '25) (Non-reasoning) | 11.11 | 19.42 | 2.00 | 0.00 | 0.00, 0.00, 0.00, 0.00
168 | Anthropic | Claude 4.5 Haiku (Reasoning) | 11.11 | 37.09 | 2.00 | 0.00 | 0.00, 0.00, 0.00, 0.00
169 | Anthropic | Claude 4.5 Haiku (Non-reasoning) | 11.11 | 31.05 | 2.00 | 0.00 | 0.00, 0.00, 0.00, 0.00
170 | Mistral | Mistral Medium 3.1 | 11.11 | 21.25 | 2.00 | 0.00 | 0.00, 0.00, 0.00, 0.00
171 | xAI | Grok 3 mini Reasoning (high) | 11.11 | 32.08 | 2.00 | 0.00 | 0.00, 0.00, 0.00, 0.00
172 | Amazon | Nova Micro | 11.11 | 10.27 | 2.00 | 0.00 | 0.00, 0.00, 0.00, 0.00
173 | Amazon | Nova Premier | 11.11 | 19.01 | 2.00 | 0.00 | 0.00, 0.00, 0.00, 0.00
174 | Upstage | Solar Pro 2 (Non-reasoning) | 11.11 | 13.59 | 2.00 | 0.00 | 0.00, 0.00, 0.00, 0.00
175 | Upstage | Solar Pro 2 (Reasoning) | 11.11 | 14.92 | 2.00 | 0.00 | 0.00, 0.00, 0.00, 0.00
176 | ByteDance Seed | Doubao Seed Code | 11.11 | 33.52 | 2.00 | 0.00 | 0.00, 0.00, 0.00, 0.00
177 | OpenAI | GPT-5.1 (Non-reasoning) | 11.11 | 27.42 | 2.00 | 0.00 | 0.00, 0.00, 0.00, 0.00
178 | OpenAI | GPT-5 (ChatGPT) | 11.11 | 21.83 | 2.00 | 0.00 | 0.00, 0.00, 0.00, 0.00
179 | Google | Gemini 2.5 Flash Preview (Sep '25) (Non-reasoning) | 11.11 | 25.70 | 2.00 | 0.00 | 0.00, 0.00, 0.00, 0.00
180 | Anthropic | Claude 4.5 Sonnet (Reasoning) | 11.11 | 43.03 | 2.00 | 0.00 | 0.00, 0.00, 0.00, 0.00
181 | Anthropic | Claude Opus 4.5 (Non-reasoning) | 11.11 | 43.09 | 2.00 | 0.00 | 0.00, 0.00, 0.00, 0.00
182 | Anthropic | Claude 4.5 Sonnet (Non-reasoning) | 11.11 | 37.14 | 2.00 | 0.00 | 0.00, 0.00, 0.00, 0.00
183 | Anthropic | Claude Opus 4.5 (Reasoning) | 11.11 | 49.73 | 2.00 | 0.00 | 0.00, 0.00, 0.00, 0.00
184 | Mistral | Devstral Medium | 11.11 | 18.66 | 2.00 | 0.00 | 0.00, 0.00, 0.00, 0.00
185 | xAI | Grok 4 Fast (Non-reasoning) | 11.11 | 23.12 | 2.00 | 0.00 | 0.00, 0.00, 0.00, 0.00
186 | xAI | Grok 4.1 Fast (Non-reasoning) | 11.11 | 23.56 | 2.00 | 0.00 | 0.00, 0.00, 0.00, 0.00
187 | Amazon | Nova Pro | 11.11 | 13.48 | 2.00 | 0.00 | 0.00, 0.00, 0.00, 0.00
188 | Amazon | Nova Lite | 11.11 | 12.65 | 2.00 | 0.00 | 0.00, 0.00, 0.00, 0.00
189 | OpenAI | o3 | 5.56 | 38.37 | 1.00 | 0.00 | 0.00, 0.00, 0.00, 0.00
190 | Google | Gemini 2.5 Flash-Lite Preview (Sep '25) (Reasoning) | 5.56 | 21.65 | 1.00 | 0.00 | 0.00, 0.00, 0.00, 0.00
191 | Google | Gemini 2.5 Pro | 5.56 | 34.63 | 1.00 | 0.00 | 0.00, 0.00, 0.00, 0.00
192 | xAI | Grok Code Fast 1 | 5.56 | 28.74 | 1.00 | 0.00 | 0.00, 0.00, 0.00, 0.00
193 | OpenAI | GPT-5 (minimal) | 5.56 | 23.89 | 1.00 | 0.00 | 0.00, 0.00, 0.00, 0.00
194 | OpenAI | GPT-5 mini (minimal) | 5.56 | 20.68 | 1.00 | 0.00 | 0.00, 0.00, 0.00, 0.00
195 | OpenAI | GPT-5 nano (medium) | 5.56 | 25.88 | 1.00 | 0.00 | 0.00, 0.00, 0.00, 0.00
196 | OpenAI | GPT-5 nano (high) | 5.56 | 26.83 | 1.00 | 0.00 | 0.00, 0.00, 0.00, 0.00
197 | OpenAI | GPT-5 mini (medium) | 5.56 | 38.94 | 1.00 | 0.00 | 0.00, 0.00, 0.00, 0.00
198 | OpenAI | GPT-5 (high) | 5.56 | 44.63 | 1.00 | 0.00 | 0.00, 0.00, 0.00, 0.00
199 | OpenAI | GPT-5 (medium) | 5.56 | 42.03 | 1.00 | 0.00 | 0.00, 0.00, 0.00, 0.00
200 | OpenAI | GPT-5 (low) | 5.56 | 39.20 | 1.00 | 0.00 | 0.00, 0.00, 0.00, 0.00
201 | OpenAI | GPT-5 nano (minimal) | 5.56 | 13.84 | 1.00 | 0.00 | 0.00, 0.00, 0.00, 0.00
202 | OpenAI | GPT-5.1 (high) | 5.56 | 47.70 | 1.00 | 0.00 | 0.00, 0.00, 0.00, 0.00
203 | OpenAI | GPT-5 Codex (high) | 5.56 | 44.63 | 1.00 | 0.00 | 0.00, 0.00, 0.00, 0.00
204 | OpenAI | GPT-5 mini (high) | 5.56 | 41.17 | 1.00 | 0.00 | 0.00, 0.00, 0.00, 0.00
205 | Google | Gemini 3 Pro Preview (high) | 5.56 | 48.39 | 1.00 | 0.00 | 0.00, 0.00, 0.00, 0.00
206 | Google | Gemini 2.5 Flash Preview (Sep '25) (Reasoning) | 5.56 | 31.14 | 1.00 | 0.00 | 0.00, 0.00, 0.00, 0.00
207 | xAI | Grok 4 Fast (Reasoning) | 5.56 | 35.06 | 1.00 | 0.00 | 0.00, 0.00, 0.00, 0.00
208 | xAI | Grok 4.1 Fast (Reasoning) | 5.56 | 38.61 | 1.00 | 0.00 | 0.00, 0.00, 0.00, 0.00
209 | xAI | Grok 4 | 5.56 | 41.52 | 1.00 | 0.00 | 0.00, 0.00, 0.00, 0.00

Explore Evaluations

Artificial Analysis Intelligence Index

A composite benchmark aggregating ten challenging evaluations to provide a holistic measure of AI capabilities across mathematics, science, coding, and reasoning.

GDPval-AA Leaderboard

GDPval-AA is Artificial Analysis' evaluation framework for OpenAI's GDPval dataset. It tests AI models on real-world tasks across 44 occupations and 9 major industries. Models are given shell access and web browsing capabilities in an agentic loop via Stirrup to solve tasks, with ELO ratings derived from blind pairwise comparisons.

𝜏²-Bench Telecom Benchmark Leaderboard

A dual-control conversational AI benchmark simulating technical support scenarios where both agent and user must coordinate actions to resolve telecom service issues.

Terminal-Bench Hard Benchmark Leaderboard

An agentic benchmark evaluating AI capabilities in terminal environments through software engineering, system administration, and data processing tasks.

SciCode Benchmark Leaderboard

A scientist-curated coding benchmark featuring 338 sub-tasks derived from 80 genuine laboratory problems across 16 scientific disciplines.

Artificial Analysis Long Context Reasoning Benchmark Leaderboard

A challenging benchmark measuring language models' ability to extract, reason about, and synthesize information from long-form documents ranging from 10k to 100k tokens (measured using the cl100k_base tokenizer).

AA-Omniscience: Knowledge and Hallucination Benchmark

A benchmark measuring factual recall and hallucination across various economically relevant domains.

IFBench Benchmark Leaderboard

A benchmark evaluating precise instruction-following generalization on 58 diverse, verifiable out-of-domain constraints that test models' ability to follow specific output requirements.

Humanity's Last Exam Benchmark Leaderboard

A frontier-level benchmark with 2,500 expert-vetted questions across mathematics, sciences, and humanities, designed to be the final closed-ended academic evaluation.

GPQA Diamond Benchmark Leaderboard

The most challenging 198 questions from GPQA, where PhD experts achieve 65% accuracy but skilled non-experts only reach 34% despite web access.

CritPt Benchmark Leaderboard

A benchmark designed to test LLMs on research-level physics reasoning tasks, featuring 71 composite research challenges.

Artificial Analysis Openness Index

A composite measure providing an industry standard to communicate model openness for users and developers.

MMLU-Pro Benchmark Leaderboard

An enhanced version of MMLU with 12,000 graduate-level questions across 14 subject areas, featuring ten answer options and deeper reasoning requirements.

Global-MMLU-Lite Benchmark Leaderboard

A lightweight, multilingual version of MMLU, designed to evaluate knowledge and reasoning skills across a diverse range of languages and cultural contexts.

LiveCodeBench Benchmark Leaderboard

A contamination-free coding benchmark that continuously harvests fresh competitive programming problems from LeetCode, AtCoder, and CodeForces, evaluating code generation, self-repair, and execution.

MATH-500 Benchmark Leaderboard

A 500-problem subset from the MATH dataset, featuring competition-level mathematics across six domains including algebra, geometry, and number theory.

AIME 2025 Benchmark Leaderboard

All 30 problems from the 2025 American Invitational Mathematics Examination, testing olympiad-level mathematical reasoning with integer answers from 000-999.

MMMU-Pro Benchmark Leaderboard

An enhanced MMMU benchmark that eliminates shortcuts and guessing strategies to more rigorously test multimodal models across 30 academic disciplines.