Legal Index
Measures performance on capabilities that matter most for legal work, including legal knowledge, non-hallucination, agentic execution, long-context reading, and reasoning. Weights are derived from the relative frequency of those capabilities across the top tasks performed by lawyers, paralegals, and compliance officers.
The Artificial Analysis Legal Index combines performance across benchmarks chosen for legal practice. Weights follow how often each capability appears in example tasks for lawyers, paralegals, and compliance officers, using O*NET-style occupational task groupings in which shared work matters more than job labels alone.
Because the composite spans several capabilities, a model cannot score well through narrow specialization alone, and the index provides a single score for tracking model performance across legal tasks.
Each capability sub-score is normalised to a 0-100 scale, then combined using the weights below. All underlying benchmarks are run independently by Artificial Analysis. See our Intelligence Benchmarking Methodology for how evaluations are conducted.
| Category | Weight | Evaluations |
|---|---|---|
| Legal Knowledge | 30% | AA-Omniscience Law Accuracy |
| Agentic | 25% | GDPval-AA v2 |
| Non-Hallucination | 15% | AA-Omniscience Non-Hallucination |
| Long-Context | 15% | LCR |
| Reasoning | 15% | HLE |
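The weighting in the table can be illustrated with a minimal sketch. It assumes each sub-score has already been normalised to 0-100 and that the index is a plain weighted sum of those sub-scores. The function name `legal_index` and the example sub-scores are hypothetical and are not published results. How Artificial Analysis normalises or rounds scores is not covered here.

```python
# Weights as listed in the table above, expressed as fractions of 1.
WEIGHTS = {
    "Legal Knowledge": 0.30,
    "Agentic": 0.25,
    "Non-Hallucination": 0.15,
    "Long-Context": 0.15,
    "Reasoning": 0.15,
}


def legal_index(sub_scores: dict[str, float]) -> float:
    """Weighted average of capability sub-scores (each assumed to be 0-100)."""
    missing = WEIGHTS.keys() - sub_scores.keys()
    if missing:
        raise ValueError(f"missing sub-scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * sub_scores[name] for name in WEIGHTS)


# Hypothetical sub-scores for illustration only (not real model results).
example = {
    "Legal Knowledge": 60.0,
    "Agentic": 40.0,
    "Non-Hallucination": 80.0,
    "Long-Context": 70.0,
    "Reasoning": 20.0,
}

print(round(legal_index(example), 1))  # 53.5
```

With these made-up inputs, Legal Knowledge contributes 18 points (30% of 60) while Reasoning contributes only 3 (15% of 20), which shows how the heavier weights dominate the composite.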
Score
[Charts: Legal Index; Legal Index: Capability Breakdown; Legal Index: Legal Knowledge]
Representative Workflows
Real-world workflows that exercise the capabilities the Legal Index weights most heavily.
Release Date
[Chart: Legal Index vs. Release Date]
Speed
[Chart: Legal Index vs. Output Speed]
Pricing
[Charts: Pricing: Input and Output Prices; Legal Index vs. Price]
Token Usage
[Chart: Legal Index: Output Token Composition]
Cost
[Chart: Legal Index: Cost Breakdown]
Frequently Asked Questions
The Legal Index is a composite benchmark from Artificial Analysis that measures performance on capabilities that matter most for legal work, including legal knowledge, non-hallucination, agentic execution, long-context reading, and reasoning. Weights are derived from the relative frequency of those capabilities across the top tasks performed by lawyers, paralegals, and compliance officers.
The Legal Index is calculated as a weighted average of capability sub-scores, each normalised to a 0–100 scale. The sub-scores and their weights are: Legal Knowledge (30%), Agentic (25%), Non-Hallucination (15%), Long-Context (15%), and Reasoning (15%).
The Legal Index includes AA-Omniscience Law Accuracy, GDPval-AA v2, AA-Omniscience Non-Hallucination, LCR, and HLE.
Among models with published results, Claude Fable 5 (Adaptive Reasoning, Max Effort, Opus 4.8 Fallback) currently has the highest Legal Index score, at 62.
A higher Legal Index score indicates stronger overall performance across the benchmarks that make up the index. For a specific use case, individual benchmark results may be more informative than the composite score.