Articles

Qwen3.5 small models: Everything you need to know

March 5, 2026

Gemini 3.1 Pro Preview: The new leader in AI

February 19, 2026

Sonnet 4.6 - Everything you need to know

February 18, 2026

AA-WER v2.0: Speech to Text Accuracy Benchmark

February 18, 2026

Claude Sonnet 4.6 - New leader in GDPval-AA

February 17, 2026

Qwen3.5-397B-A17B - Everything you need to know

February 17, 2026

MiniMax-M2.5: Everything you need to know

February 14, 2026

GLM-5 - Everything you need to know

February 11, 2026

Opus 4.6 - Everything you need to know

February 7, 2026

Opus 4.6 Takes Lead in Agentic Real-World Knowledge Tasks

February 5, 2026

Qwen3 Max Thinking Benchmarks and Analysis

January 29, 2026

Kimi K2.5 - Everything you need to know

January 28, 2026

Gemini 3 Flash - Everything you need to know

December 17, 2025

Stirrup: Our new open source framework for building agents

December 11, 2025

Introducing the Artificial Analysis Openness Index

December 1, 2025

Claude Opus 4.5 Benchmarks and Analysis

November 25, 2025

Gemini 3 Pro - Everything you need to know

November 18, 2025

AA-Omniscience: Knowledge and Hallucination Benchmark

November 16, 2025

Kimi K2-Thinking - Everything you need to know

November 7, 2025

MiniMax M2 Benchmarks & Analysis

October 27, 2025

GPT-5 Benchmarks and Analysis

August 7, 2025

Analysis of OpenAI's gpt-oss models

August 6, 2025

Announcing Artificial Analysis Long Context Reasoning (AA-LCR)

August 5, 2025

Independent Performance Analysis of Leading GPUs

June 9, 2025

DeepSeek R1 Update

May 29, 2025

Overview of Google I/O Benchmarking Results

May 21, 2025