Articles

Qwen3.5 small models: Everything you need to know
March 5, 2026

Gemini 3.1 Pro Preview: The new leader in AI
February 19, 2026

Sonnet 4.6 - Everything you need to know
February 18, 2026

AA-WER v2.0: Speech to Text Accuracy Benchmark
February 18, 2026

Claude Sonnet 4.6 - New leader in GDPval-AA
February 17, 2026

Qwen3.5-397B-A17B - Everything you need to know
February 17, 2026

MiniMax-M2.5: Everything you need to know
February 14, 2026

GLM-5 - Everything you need to know
February 11, 2026

Opus 4.6 - Everything you need to know
February 7, 2026

Opus 4.6 Takes Lead in Agentic Real-World Knowledge Tasks
February 5, 2026

Qwen3 Max Thinking Benchmarks and Analysis
January 29, 2026

Kimi K2.5 - Everything you need to know
January 28, 2026

Gemini 3 Flash - Everything you need to know
December 17, 2025

Stirrup: Our new open source framework for building agents
December 11, 2025

Introducing the Artificial Analysis Openness Index
December 1, 2025

Claude Opus 4.5 Benchmarks and Analysis
November 25, 2025

Gemini 3 Pro - Everything you need to know
November 18, 2025

AA-Omniscience: Knowledge and Hallucination Benchmark
November 16, 2025

Kimi K2-Thinking - Everything you need to know
November 7, 2025

MiniMax M2 Benchmarks & Analysis
October 27, 2025

GPT-5 Benchmarks and Analysis
August 7, 2025

Analysis of OpenAI's gpt-oss models
August 6, 2025

Announcing Artificial Analysis Long Context Reasoning (AA-LCR)
August 5, 2025

Independent Performance Analysis of Leading GPUs
June 9, 2025

DeepSeek R1 Update
May 29, 2025

Overview of Google I/O Benchmarking Results
May 21, 2025