LLM Leaderboard - Comparison of over 100 AI models from OpenAI, Google, DeepSeek & others
This leaderboard compares and ranks the performance of over 100 AI models (LLMs) across key metrics, including intelligence, price, speed (output speed in tokens per second and latency as time to first token, TTFT), context window, and others. For more details, including our methodology, see our FAQs.
Frequently Asked Questions
Gemini 3.1 Pro Preview currently ranks #1 on the Artificial Analysis LLM Leaderboard with an Intelligence Index score of 57, out of 295 models ranked.
The top models by Intelligence Index are: 1. Gemini 3.1 Pro Preview (57), 2. GPT-5.4 (xhigh) (57), 3. GPT-5.3 Codex (xhigh) (54), 4. Claude Opus 4.6 (Adaptive Reasoning, Max Effort) (53), 5. Claude Sonnet 4.6 (Adaptive Reasoning, Max Effort) (52).
Mercury 2 is the fastest at 877.3 tokens per second, followed by Granite 4.0 H Small (474.9 t/s) and NVIDIA Nemotron 3 Super 120B A12B (Reasoning) (433.5 t/s).
Gemma 3n E4B Instruct is the most affordable at $0.03 per 1M tokens (blended 3:1 input-to-output), followed by LFM2 24B A2B ($0.05) and Nova Micro ($0.06).
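A blended price at a 3:1 input-to-output ratio can be read as a weighted average of the per-token input and output prices. The sketch below assumes that interpretation; the function name `blended_price` and the example prices are hypothetical and are not taken from the leaderboard.

```python
def blended_price(input_price: float, output_price: float,
                  input_weight: float = 3.0, output_weight: float = 1.0) -> float:
    """Weighted average of input and output prices (USD per 1M tokens).

    Assumes the 3:1 blend means three input tokens are counted for every
    output token; this is an illustrative reading, not a confirmed formula.
    """
    total_weight = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total_weight


# Hypothetical prices: $1.00 per 1M input tokens, $5.00 per 1M output tokens.
example = blended_price(1.00, 5.00)
print(f"${example:.2f} per 1M tokens (blended 3:1)")
```

Under this reading, a model with cheap input tokens and expensive output tokens still lands closer to its input price, because input tokens carry three quarters of the weight.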
GLM-5 (Reasoning) is the highest-ranked open weights model with an Intelligence Index score of 50. There are 193 open weights models out of 295 total on the leaderboard.
The top open weights models by Intelligence Index are: 1. GLM-5 (Reasoning) (50), 2. Kimi K2.5 (Reasoning) (47), 3. Qwen3.5 397B A17B (Reasoning) (45).
Gemini 3.1 Pro Preview leads among 146 reasoning models with an Intelligence Index score of 57. Reasoning models use extended thinking to solve complex problems before responding.
The leaderboard includes filters to narrow results by model type (reasoning vs non-reasoning), openness (open weights vs proprietary), and other criteria. You can also adjust prompt options to see how performance varies with different input lengths.
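The effect of these filters can be sketched offline as a simple selection over model records. The example below is a minimal illustration, not the site's implementation: the `Model` record and `filter_models` helper are hypothetical, and the four rows reuse only the names and Intelligence Index scores quoted on this page.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Model:
    name: str
    intelligence_index: int
    open_weights: bool
    reasoning: bool


def filter_models(models: List[Model],
                  open_weights: Optional[bool] = None,
                  reasoning: Optional[bool] = None) -> List[Model]:
    """Keep models matching each filter that is set, ranked by score (highest first)."""
    selected = [
        m for m in models
        if (open_weights is None or m.open_weights == open_weights)
        and (reasoning is None or m.reasoning == reasoning)
    ]
    return sorted(selected, key=lambda m: m.intelligence_index, reverse=True)


# Rows taken from the figures quoted on this page.
models = [
    Model("Kimi K2.5 (Reasoning)", 47, open_weights=True, reasoning=True),
    Model("Gemini 3.1 Pro Preview", 57, open_weights=False, reasoning=True),
    Model("Qwen3.5 397B A17B (Reasoning)", 45, open_weights=True, reasoning=True),
    Model("GLM-5 (Reasoning)", 50, open_weights=True, reasoning=True),
]

for m in filter_models(models, open_weights=True):
    print(m.name, m.intelligence_index)
```

Leaving a filter as `None` means "do not filter on this criterion", which mirrors a leaderboard view where a filter is left unset.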
Click on any model name in the leaderboard to visit its dedicated comparison page with detailed charts covering intelligence, pricing, speed, latency, and more. You can also compare API providers for each model.