
Legal Index

Measures performance on capabilities that matter most for legal work, including legal knowledge, non-hallucination, agentic execution, long-context reading, and reasoning. Weights are derived from the relative frequency of those capabilities across the top tasks performed by lawyers, paralegals, and compliance officers.


The Artificial Analysis Legal Index combines performance across benchmarks chosen for legal practice. Weights follow how often each capability appears in example tasks for lawyers, paralegals, and compliance officers. Tasks are grouped using O*NET-style occupational task groupings, in which shared work matters more than job labels alone.

As a composite, the index rewards breadth over narrow specialization and provides a single score for tracking model performance across legal tasks.

Each capability sub-score is normalised to a 0-100 scale, then combined using the weights below. All underlying benchmarks are run independently by Artificial Analysis. See our Intelligence Benchmarking Methodology for how evaluations are conducted.

Category | Weight | Evaluations
Legal Knowledge | 30% | AA-Omniscience Law Accuracy
Agentic | 25% | GDPval-AA v2
Non-Hallucination | 15% | AA-Omniscience Non-Hallucination
Long-Context | 15% | LCR
Reasoning | 15% | HLE
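
The weighted combination described above can be sketched in a few lines of Python. This is a minimal illustration, not Artificial Analysis's implementation: the weights come from the table, while the function name `legal_index` and the sample sub-scores are hypothetical, and each sub-score is assumed to be already normalised to the 0-100 scale.

```python
# Weights taken from the table above; they sum to 1.0.
WEIGHTS = {
    "Legal Knowledge": 0.30,
    "Agentic": 0.25,
    "Non-Hallucination": 0.15,
    "Long-Context": 0.15,
    "Reasoning": 0.15,
}


def legal_index(sub_scores: dict[str, float]) -> float:
    """Weighted average of sub-scores that are already on a 0-100 scale."""
    missing = WEIGHTS.keys() - sub_scores.keys()
    if missing:
        raise ValueError(f"missing sub-scores: {sorted(missing)}")
    for name in WEIGHTS:
        if not 0 <= sub_scores[name] <= 100:
            raise ValueError(f"{name} must be between 0 and 100")
    return sum(WEIGHTS[name] * sub_scores[name] for name in WEIGHTS)


# Hypothetical sub-scores for illustration only (not real model results).
example = {
    "Legal Knowledge": 60.0,
    "Agentic": 40.0,
    "Non-Hallucination": 80.0,
    "Long-Context": 70.0,
    "Reasoning": 20.0,
}
print(round(legal_index(example), 2))  # 53.5
```

Because Legal Knowledge and Agentic together carry 55% of the weight, a change in either of those sub-scores moves the composite more than the same change in any of the three 15% categories.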

Score

Capability Breakdown

Representative Workflows

Real-world workflows that exercise the capabilities the Legal Index weights most heavily.

Release Date

Speed

Pricing

Pricing: Input and Output Prices (USD per 1M tokens, blended)

Price per token included in the request or message sent to the API, expressed in USD per million tokens.
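
To make the per-million-token unit concrete, the sketch below converts a quoted price into the cost of a single request. The function name `request_cost_usd`, the price, and the token count are all hypothetical values chosen for illustration, not figures from this page.

```python
def request_cost_usd(tokens: int, usd_per_million_tokens: float) -> float:
    """Cost in USD of `tokens` tokens at a price quoted per one million tokens."""
    if tokens < 0 or usd_per_million_tokens < 0:
        raise ValueError("tokens and price must be non-negative")
    return tokens / 1_000_000 * usd_per_million_tokens


# Hypothetical example: a 20,000-token request at 3.00 USD per 1M tokens.
cost = request_cost_usd(20_000, 3.00)
print(f"{cost:.4f} USD")  # 0.0600 USD
```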

Figures are the median (P50) of measurements taken over the past 72 hours, so they reflect sustained changes in performance rather than momentary fluctuations.
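
A rolling 72-hour median of this kind can be sketched as follows. This is an illustration under stated assumptions rather than the site's actual pipeline: the function name `p50_last_72h`, the `(timestamp, value)` sample format, and the sample values are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from statistics import median


def p50_last_72h(samples, now):
    """Median of values whose timestamp falls within the 72 hours before `now`."""
    cutoff = now - timedelta(hours=72)
    window = [value for ts, value in samples if cutoff <= ts <= now]
    if not window:
        raise ValueError("no measurements in the last 72 hours")
    return median(window)


# Hypothetical measurements as (timestamp, value) pairs.
now = datetime(2025, 1, 4, 12, 0, tzinfo=timezone.utc)
samples = [
    (now - timedelta(hours=100), 900.0),  # older than 72 hours, ignored
    (now - timedelta(hours=60), 50.0),
    (now - timedelta(hours=40), 51.0),
    (now - timedelta(hours=30), 55.0),
    (now - timedelta(hours=5), 52.0),
    (now - timedelta(hours=1), 400.0),  # one-off spike
]
print(p50_last_72h(samples, now))  # 52.0
```

The single spike of 400.0 does not move the result, which is the reason a median is preferred over a mean when the goal is to show sustained changes.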

Token Usage

Cost

Frequently Asked Questions

The Legal Index is a composite benchmark from Artificial Analysis that measures performance on capabilities that matter most for legal work, including legal knowledge, non-hallucination, agentic execution, long-context reading, and reasoning. Weights are derived from the relative frequency of those capabilities across the top tasks performed by lawyers, paralegals, and compliance officers.

The Legal Index is calculated as a weighted average of capability sub-scores, each normalised to a 0–100 scale. The sub-scores and their weights are: Legal Knowledge (30%), Agentic (25%), Non-Hallucination (15%), Long-Context (15%), and Reasoning (15%).

The Legal Index includes AA-Omniscience Law Accuracy, GDPval-AA v2, AA-Omniscience Non-Hallucination, LCR, and HLE.

Claude Fable 5 (Adaptive Reasoning, Max Effort, Opus 4.8 Fallback) currently has the highest Legal Index score among models with published results, with a score of 62.

A higher Legal Index score indicates stronger overall performance across the benchmarks that make up the index. For a specific use case, individual benchmark results may be more informative than the composite score.